Automunge is an open-source Python library that formalizes and automates data preparation for tabular learning, spanning the workflow from received “tidy data” (one column per feature and one row per sample) to returned dataframes suitable for the direct application of machine learning. Under automation numeric…

As followers of this blog may be aware, the author has recently offered some hypotheses regarding the potential benefits of noise injection in tabular learning applications. It is worth reiterating that several aspects of these suggestions are merely that: hypotheses. We have been building out features…

Determinism is overrated


This paper offers a full introduction to the practice of stochastic noise injection into the features of tabular data fed to machine learning training or inference. Noise injection to the entries of continuous numeric feature sets may be applied by sampling from discrete distributions to select entry injection targets and sampling…
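
As a rough illustration of the two-stage sampling described in that excerpt, here is a minimal sketch in plain Python. It is not Automunge's API: the function name `inject_noise` and the parameters `flip_prob` and `sigma` are hypothetical, and the choice of a Bernoulli draw for target selection and a zero-mean Gaussian for the noise itself is an assumption made only for this example.

```python
import random


def inject_noise(values, flip_prob=0.03, sigma=0.06, seed=None):
    """Return a copy of `values` with noise added to a random subset of entries.

    Hypothetical helper for illustration; all names and defaults are assumptions.
    """
    rng = random.Random(seed)
    noised = []
    for v in values:
        # Discrete (Bernoulli) draw: is this entry an injection target?
        if rng.random() < flip_prob:
            # Continuous draw: noise sampled from a zero-mean Gaussian (assumed).
            noised.append(v + rng.gauss(0.0, sigma))
        else:
            # Non-target entries pass through unchanged.
            noised.append(v)
    return noised


# Example: a small numeric feature column, noised with a fixed seed.
feature = [0.1 * i for i in range(10)]
noised = inject_noise(feature, flip_prob=0.3, sigma=0.05, seed=0)
```

Passing a seed makes the injection repeatable for debugging; leaving it as `None` gives a fresh draw on every call, which is presumably closer to the stochastic behavior the post argues for.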

Nicholas Teague

Writing for fun and because it helps me organize my thoughts. I also write software to prepare data for machine learning at

