Small update to appendix A associated with parameter consolidations in 5.94
Just finished reading a really impressive book on the fundamentals of quantum computing. In my experience many books that cover this territory get somewhat lost in the formality of quantum notation and linear algebra formulations without imparting any real intuition for the mechanics behind quantum algorithms. This book, Programming Quantum Computers by Eric Johnston, Nic Harrigan, and Mercedes Gimeno-Segovia, turned out to be the single most helpful book I’ve come across for clearly articulating what is taking place in fundamental algorithms like Grover’s search and Shor’s factoring algorithm. …
Missing data is a fundamental obstacle in the practice of data science. This paper surveys a few conventions for imputation available in the Automunge open source python library platform for tabular data preprocessing, including ML infill, in which auto ML models are trained for target features from partitioned extracts of a training set. A series of validation experiments was performed to benchmark imputation scenarios against downstream model performance, in which it was found for the given benchmark sets that ML infill performed best for numeric target columns in cases of missing not at random, and was otherwise at minimum…
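To ground the idea, here is a minimal sketch of the ML infill pattern using scikit-learn rather than the Automunge implementation: a model is fit on the rows where a target column is populated, with the remaining features serving as the training basis, and then used to predict the missing entries. The column names, model choice, and naive fill of the basis features are illustrative assumptions, not the library’s internals.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

def ml_infill_column(df, target):
    """Impute missing entries of `target` by training a model on the
    rows where `target` is populated (illustrative sketch only)."""
    # basis features: everything except the target, with a naive fill
    features = df.drop(columns=[target]).fillna(0)
    known = df[target].notna()

    model = RandomForestRegressor(n_estimators=100, random_state=0)
    model.fit(features[known], df.loc[known, target])

    out = df.copy()
    out.loc[~known, target] = model.predict(features[~known])
    return out

# hypothetical frame with a partially populated numeric target column
df = pd.DataFrame({'feature_1': [1.0, 2.0, 3.0, 4.0],
                   'feature_2': [0.5, 0.1, 0.7, 0.2],
                   'target_col': [10.0, np.nan, 30.0, np.nan]})
imputed = ml_infill_column(df, 'target_col')
```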
The developers of the Automunge open source platform for tabular data preprocessing have taken a somewhat unorthodox approach to documentation and communications, making use of multimedia, blogging, tweets, and jupyter notebooks, as well as music and photography in publication. This submission will offer an exhibited excerpt of such communication practices, featuring elements of multimedia videos with narration, accompanied by hand-drawn slides and a transcript, presented as both a brief introduction and an extended walkthrough. We believe this form of presentation is a low cost option for communicating complex subject matter in a concise and accessible form. …
Having spent the better part of the last five years writing these essays, I have finally come to the realization that the table of contents is getting a little out of hand, to put it mildly. Yeah, I mean the goal is to contribute and share lessons, build connections, and so on, and I suspect the full table of contents may now be interfering with that by way of signal getting lost in the noise, so to speak.
So, without further ado, here now is a greatly abbreviated collection of essays that I think is somewhat representative of the better parts of…
Took a week off from working on the preprocessing library to play a little catch up in another interesting domain, one occurring at the intersection of quantum computing and machine learning aka quantum machine learning — a topic this blog has explored previously such as in our 2018 presentation on the same subject.
This sidetrack manifested primarily as a deep dive into the TensorFlow Quantum library, sort of an extension of the well-known TensorFlow library for training neural networks. The progress made in the field since that 2018 presentation has been considerable, with nearly every nook of machine learning…
“Hashing is a form of cryptography in which a message is transformed into an encoded representation.”
=> '0f44cb01d838c981156d9f0c030159fb'
In common practice hashing may be used to validate the veracity of a message’s sender, such as by comparing a received hash of a bank account number to a hash of that number on file, without having to transmit the actual account number through a channel which may be exposed to an eavesdropper. Thus, hashing is a deterministic transform, where consistently received data will return a consistent encoding. …
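As a minimal sketch of that verification pattern, the snippet below hashes a hypothetical account number on both sides and compares the digests. MD5 is used only because the encoding shown above has the shape of an MD5 digest; that choice, along with the account number itself, is an assumption for illustration, and a production check would favor a stronger, salted or keyed construction.

```python
import hashlib

def digest(message: str) -> str:
    """Deterministic hash of a message; the same input always
    yields the same encoding."""
    return hashlib.md5(message.encode('utf-8')).hexdigest()

# hypothetical account number kept on file by the receiver
stored_hash = digest('123456789')

# the sender transmits only the hash, never the raw account number
received_hash = digest('123456789')

# matching digests validate the message without exposing the number
print(received_hash == stored_hash)          # True
print(digest('123456780') == stored_hash)    # False: any change alters the hash
```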
This will be a short essay; I just wanted to document a theory that I think is a helpful way to think about deep learning. There is an open question in research as to why deep over-parameterized models have a regularizing effect even when the number of parameters exceeds the number of training data points, which intuition might suggest would result in a model simply memorizing the training points; in practice this type of deep learning instead successfully achieves a kind of generalization. …
Writing for fun and because it helps me organize my thoughts. I also write software to prepare data for machine learning at automunge.com