Books

Read these in roughly this order if you are starting from scratch.

  • Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, by Aurélien Géron

    Best practical starting point. Covers the full ML stack from regression to deep learning with real code.

  • An Introduction to Statistical Learning (ISLR), by James, Witten, Hastie, Tibshirani

    The cleanest conceptual foundation for supervised learning. Free PDF at statlearning.com.

  • Deep Learning, by Goodfellow, Bengio, Courville

    Rigorous theory behind neural networks. Dense, but worth it once you have the basics.

  • Designing Machine Learning Systems, by Chip Huyen

    How ML actually works in production. Data pipelines, feature stores, monitoring, and drift. Essential for practitioners.

  • The Elements of Statistical Learning (ESL), by Hastie, Tibshirani, Friedman

    The graduate-level companion to ISLR. Go here when ISLR feels too light.

  • Python for Data Analysis, by Wes McKinney

    Definitive reference for pandas from its creator. Still the fastest way to get fluent with data wrangling.

Courses

Structured learning with feedback loops. Pick one and finish it before starting another.

People to Follow

Reading how good practitioners think out loud is an underrated way to learn.

  • Lilian Weng lilianweng.github.io

    Former OpenAI safety researcher. Writes the clearest long-form explanations of complex ML papers anywhere.

  • Andrej Karpathy karpathy.ai

    Former Tesla AI director. His "Neural Networks: Zero to Hero" YouTube series is exceptional.

  • Eugene Yan eugeneyan.com

    Applied ML at Amazon / Humans of AI. Writes deeply about production ML systems and the craft of data science.

  • Sebastian Raschka sebastianraschka.com

    Author of Python Machine Learning. Writes detailed, well-cited posts on everything from LLMs to vision transformers.

  • Chip Huyen huyenchip.com

    MLOps and ML systems. Her writing is the standard reference for production machine learning.

Tools and Libraries

What actually shows up in production projects, not just tutorials.

  • scikit-learn

    Where you learn ML algorithms. Clean API, great docs, battle-tested.

  • pandas + polars

    pandas for exploration, polars when you need speed at scale.

  • PyTorch

    The research standard. Intuitive, Pythonic, and transferable to most deep learning jobs.

  • Weights and Biases

    Experiment tracking done right. Free tier is generous enough for personal projects.

  • Gradio

    Fastest way to put an ML demo in front of a non-technical stakeholder.

  • DVC

    Git for datasets and models. Essential once you have more than one version of anything.

  • Great Expectations

    Data validation and quality checks. Catches data drift before your model silently breaks.
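
To make the data validation idea concrete, here is a minimal sketch in plain Python. It is not the Great Expectations API; the `Expectation` class, the `validate` helper, and the sample rows are all hypothetical, made up only to show the pattern of declaring checks up front and reporting which rows fail them.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Expectation:
    """A named check applied to one column (hypothetical helper, not a library class)."""
    name: str
    column: str
    check: Callable[[object], bool]


def validate(rows, expectations):
    """Return {expectation name: indices of failing rows}, omitting checks that pass."""
    failures = {}
    for exp in expectations:
        bad = [i for i, row in enumerate(rows) if not exp.check(row.get(exp.column))]
        if bad:
            failures[exp.name] = bad
    return failures


# Made-up sample data: one negative age, one missing age, one unknown country.
rows = [
    {"age": 34, "country": "DE"},
    {"age": -2, "country": "FR"},
    {"age": None, "country": "XX"},
]

expectations = [
    Expectation("age_not_null", "age", lambda v: v is not None),
    # Nulls are reported by the check above, so the range check skips them.
    Expectation("age_in_range", "age", lambda v: v is None or 0 <= v <= 120),
    Expectation("country_known", "country", lambda v: v in {"DE", "FR", "US"}),
]

report = validate(rows, expectations)
print(report)
# {'age_not_null': [2], 'age_in_range': [1], 'country_known': [2]}
```

Running checks like these on every new batch, before training or inference, is the habit that matters. A dedicated tool adds a large library of built-in checks, reports, and integrations on top of it.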