Data Science Articles
My articles range widely in topic and scope. I’ve written hundreds of articles now, covering paper reviews, concept explanations, code demonstrations, topical surveys, and more. For my writing, I was recognized as Top Writer in AI and in Technology by Medium and awarded a Gold and a Silver medal by KDnuggets, a leading data science site with 700k+ visitors monthly.
These days, I write fewer articles because other work competes for my time.
Perhaps the highlight of my short blog/article writing career is this article, which kicked off a big debate among Steven Pinker, Yann LeCun, Francois Chollet, Gary Marcus, and others on Twitter and beyond. You can find coverage of the spectacle on AI Coffee Break, Gowri Shankar’s blog, ML Street Talk, and Yann LeCun’s Twitter.
View a selection of my articles below, organized by topic and chosen for their popularity, my personal interest, or both. (These links are configured to bypass the Medium subscription paywall.)
Concepts and Developments
- “What Does It Really Mean for an Algorithm to ‘Learn’?”
- “The Beauty of Bayesian Optimization”. Intuitive and visual exploration of the Bayesian Optimization paradigm.
- “The Most Controversial Neural Network Ever Created”. A piece of ‘technical journalism’ exploring different perspectives on the Extreme Learning Machine (ELM) network design.
- “Batch Normalization: An Incredibly Versatile Deep Learning Tool”. An overview of how batch normalization is implemented and of various theories for why it works so well.
- “How Injecting Randomness Can Improve Model Accuracy”. Why does bagging work? Why exactly does a Random Forest usually perform better than a single Decision Tree? In this short article, I present a visual, concrete example to demonstrate the intuition for bagging.
- “Decentralizing AI & Championing Privacy: The Genius of Federated Learning”. A review of Google’s Federated Learning, a distributed deployment design intended to encourage greater AI decentralization and privacy.
Surveys of the Field
- “5 Exciting Deep Learning Advancements to Keep Your Eye on in 2021”. A survey of five modern developments in deep learning.
- “The Future of Deep Learning Can Be Broken Down Into These 3 Learning Paradigms”. Original analysis of three key themes in current deep learning research.
- “Machine Learning Goes Quantum”. An overview of developments at the intersection of deep learning and quantum computing.
Code-Centric Posts
- “7 Pandas Functions That Will Reduce Your Data Manipulation Stress”. Presentation and example usage of seven handy pandas functions.
- “Your Ultimate Data Mining and Machine Learning Cheat Sheet”. Comprehensive overview of models and functions from various data mining and machine learning libraries.
- “Stop One-Hot Encoding Your Categorical Variables”. Explores the weaknesses of, and alternatives to, the traditionally used one-hot encoding paradigm.
Data Science in Business
- “Customer Segmentation Tutorial”. Example code walkthrough of a customer segmentation analysis on a business dataset.
- “How Instacart Uses Data Science to Tackle Complex Business Problems”. Summary of the challenges faced and solutions innovated by Instacart’s data science team.