It’s what we do

Abstract

Automunge is an open source python library that has formalized and automated the data preparations for tabular learning in between the workflow boundaries of received “tidy data” (one column per feature and one row per sample) and returned dataframes suitable for the direct application of machine learning. Under automation numeric…


The only constants are laws of nature

For those who haven’t been following along, I’ve been using this forum to document the development of Automunge, a Python library that automates the preparation of tabular data for machine learning. The tool is intended for data scientists comfortable with working at the coding layer, such as in the context…


Determinism is overrated

Abstract

This paper offers a full introduction to the practice of stochastic noise injections into the features of tabular data fed to machine learning training or inference. Noise injection to the entries of continuous numeric feature sets may be applied by sampling from discrete distributions to select entry injection targets and sampling…
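As a rough illustration of the idea in this abstract, here is a minimal sketch of noise injection into a continuous numeric feature. It is not the Automunge API: the function name `inject_noise`, the parameters `flip_prob` and `sigma`, and the choice of a Bernoulli draw for selecting targets with zero-mean Gaussian noise for the perturbation are all assumptions made for illustration, since the excerpt is truncated before it names the noise distribution.

```python
import numpy as np


def inject_noise(values, flip_prob=0.03, sigma=0.06, seed=None):
    """Perturb a random subset of entries in a numeric feature.

    Hypothetical helper for illustration only, not part of Automunge.
    flip_prob: assumed per-entry probability of being selected for injection.
    sigma: assumed standard deviation of the zero-mean Gaussian noise.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)

    # Discrete (Bernoulli) draw per entry selects the injection targets.
    targets = rng.random(values.shape) < flip_prob

    # Continuous draw supplies the noise added to the selected entries.
    noise = rng.normal(0.0, sigma, size=values.shape)

    # Untargeted entries pass through unchanged.
    return np.where(targets, values + noise, values), targets


feature = np.linspace(0.0, 1.0, 1000)
noisy, targets = inject_noise(feature, flip_prob=0.03, sigma=0.06, seed=0)
```

Returning the boolean mask alongside the perturbed feature makes it easy to check which entries were altered; a real pipeline would probably discard it.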


All learners are welcome

For anyone who hasn’t been following along, I’ve been using this forum to document the development of Automunge, an open source Python library for tabular learning. One way to think about Automunge is that it is an abstraction that greatly simplifies pipelines of Pandas operations. Specifically we focus on the…


Introducing anonymized dataframes

For those who haven’t been following along, I’ve been using this forum to document the development of Automunge, an open source Python library for preparing tabular data for machine learning. The library’s interface is channeled through two master functions: automunge(.) to prepare training data for machine learning, and postmunge(.) to…


From the Diaries of John Henry

Introduction

The essays of Book 5, I think, did a good job of keeping a balance between Automunge and other interests, with treatments ranging across machine learning, physics, music, quantum computing, and even a little space sprinkled in.

The book starts off with an introduction to a prominent foundation model for natural language…


From the Diaries of John Henry

Introduction

Book 4 was the year I started taking incremental software refinements from weekly rollouts to a much more rapid pace, and the opening essay’s disorganized structure was, I think, a nice counter to the close of Book 3. …


From the Diaries of John Henry

Introduction

Book 3 was the year of Automunge. The first few essays are not exactly elegant writing; as I was figuring out how to code, I was in parallel figuring out how to document code. So these opening chapters were really just setting the groundwork for some of the more elaborate to…


From the Diaries of John Henry

Introduction

Depending on whether you’re reading these books in chronological or inverse order, this book may be either the second or the next to last. I will approach this introduction assuming the former.

In Book 2 I started spending more time on these musings, approaching and I believe never exceeding about a…


From the Diaries of John Henry

Introduction

It’s been a fun trip writing in this medium. Having ventured down so many roads over the last year or two, it seemed appropriate to consolidate these writings into a sort of table of contents for ease of browsing. Collected here are all of my posts in chronological order. Although…

Nicholas Teague

Writing for fun and because it helps me organize my thoughts. I also write software to prepare data for machine learning at automunge.com
