Can You Really Teach Machine Learning to Teens? Absolutely.
One of the questions I’m often asked is whether you can teach machine learning to teenagers who don’t yet have a background in calculus or linear algebra. My answer is a resounding yes. As someone who teaches both calculus and machine learning, I’ve had firsthand experience bridging this gap.
Calculus is probably my favorite high school subject to teach, and I also teach machine learning. Many of my machine learning students haven’t taken calculus yet, let alone multivariable calculus or linear algebra, which are traditionally prerequisites for entering a field like machine learning in college. But with the right approach, young students can dive in and build models successfully.
The Core Idea: Gradient Descent
The main concept that unites calculus and machine learning is gradient descent. Gradient descent is a mathematical technique used to minimize the error in a function. In calculus terms, it’s about finding the minimum point of a curve, the point where the derivative (or, in higher dimensions, the gradient) equals zero. Machine learning uses gradient descent to iteratively reduce a model’s error.
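To see the idea in miniature, here is a toy sketch of my own (not from any library) that slides down the curve f(w) = (w - 3)², whose derivative 2(w - 3) equals zero at the minimum, w = 3:

```python
# Toy example: minimize f(w) = (w - 3)**2 with gradient descent.
# Calculus says the minimum sits where the derivative 2 * (w - 3) is zero: w = 3.

def slope(w):
    return 2 * (w - 3)  # derivative of f at w

w = 0.0              # starting guess
learning_rate = 0.1  # step size

for _ in range(100):
    w -= learning_rate * slope(w)  # step downhill, against the slope

print(round(w, 4))  # about 3.0, where the derivative equals zero
```

Each step moves w against the slope, so the sequence settles exactly where the derivative vanishes.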
Here’s how it works: machine learning models start with random parameters, known as weights. These weights are essentially guesses about how to relate input data to the output. For instance, if you’re predicting house prices from features like the number of rooms, bathrooms, and square footage, the algorithm initially assigns a random weight to each of these features.
After making its first guess, the algorithm calculates the error: how far off the prediction was from the actual price. Gradient descent then kicks in, using the derivative of the error (the slope) to adjust the weights. If the slope is negative, the model adjusts a weight in one direction; if it’s positive, it moves in the opposite direction. This process repeats until the error is minimized, producing a model that can make accurate predictions.
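Here is a hedged sketch of that whole loop for the house-price example. All the numbers are invented, and the model is deliberately simple (a linear model trained with plain gradient descent), but it shows the three ingredients together: random weights, an error measurement, and slope-based updates:

```python
import numpy as np

# Invented training data: [rooms, bathrooms, square footage] per house.
X = np.array([[3, 2, 1500],
              [4, 3, 2200],
              [2, 1, 900],
              [5, 3, 2800]], dtype=float)
y = np.array([300_000, 450_000, 180_000, 560_000], dtype=float)

# Scale the features so one step size works for every weight.
X = (X - X.mean(axis=0)) / X.std(axis=0)

rng = np.random.default_rng(0)
weights = rng.normal(size=3)  # the initial random guesses
bias = 0.0
learning_rate = 0.2

for _ in range(2000):
    predictions = X @ weights + bias  # current guesses at the prices
    errors = predictions - y          # how far off each guess is
    # Slopes of the mean squared error with respect to each parameter:
    grad_w = 2 * X.T @ errors / len(y)
    grad_b = 2 * errors.mean()
    weights -= learning_rate * grad_w  # step opposite the slope
    bias -= learning_rate * grad_b

print(np.round(X @ weights + bias))  # predictions now track the actual prices
```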
Understanding Without Calculus
Now, while calculus is fundamental to understanding gradient descent, it’s not a barrier to teaching machine learning. Most practitioners, myself included, don’t write from scratch the machine learning algorithms where gradient descent is needed. Instead, we use existing algorithms and frameworks developed by experts in the field. In other words, we use algorithms with gradient descent already built in.
Think of it like flying a jet. Pilots don’t need to build jet engines to fly planes, just as most machine learning practitioners don’t need to write their own algorithms to build effective models. At Berkeley Coding Academy, we focus on teaching students how to fly jets: how to use state-of-the-art Python libraries like sklearn to initialize machine learning algorithms that have already been written, then fit and tune them on their data. None of this work requires calculus.
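For a sense of what flying the jet looks like, here is the typical sklearn pattern on invented data: initialize an algorithm that has already been written, fit it, and predict. No calculus appears anywhere:

```python
from sklearn.linear_model import LinearRegression

# Invented data: [rooms, bathrooms, square footage] and sale prices.
X = [[3, 2, 1500],
     [4, 3, 2200],
     [2, 1, 900],
     [5, 3, 2800]]
y = [300_000, 450_000, 180_000, 560_000]

model = LinearRegression()  # initialize an algorithm someone else wrote
model.fit(X, y)             # the library handles the math internally
print(model.predict([[4, 2, 2000]]))  # estimate a new house's price
```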
Using Tools to Build Understanding
To be clear, I’m not devaluing the importance of a research-based approach to teaching machine learning that includes calculus. Building machine learning algorithms from scratch with gradient descent is highly valuable and definitely has an important place in education. But for most machine learning practitioners, in academia and in industry, the focus is on applying machine learning algorithms effectively, not reinventing them.
For example, when most people build machine learning models, they use existing libraries and frameworks. They don’t write gradient descent from scratch; instead, they tune parameters, adjust data, and evaluate results using modern tools. For most of us, the machine learning world starts with the algorithms that have already been written and goes forward from there.
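As one illustration (the dataset and parameter grid here are invented for the sketch), tuning a model with sklearn looks like this:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV

# Synthetic regression data standing in for a real dataset.
X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)

# Tune hyperparameters instead of rewriting the algorithm itself.
params = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}
search = GridSearchCV(RandomForestRegressor(random_state=0), params, cv=5)
search.fit(X, y)

print(search.best_params_)  # the settings that scored best
print(search.best_score_)   # cross-validated score for those settings
```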
In my teaching, I explain along the way how gradient descent works conceptually. Depending on the audience, I may avoid "calculus speak" entirely, but the core ideas still come through. At the end of the day, though, while understanding is helpful, what matters most is the output: models making real predictions. The goal is not to expound theory, but to make meaningful predictions about the future using big data.
A Practical Philosophy: Focus on the Work
There’s a philosophy in physics often attributed to the Copenhagen interpretation of quantum mechanics, the school of thought associated with Niels Bohr: "Shut up and compute." The idea is that you don’t need to understand a theory’s underlying mechanisms to use it masterfully. This approach revolutionized quantum mechanics and became the dominant philosophical point of view to emerge from the 1920s because it worked so well. The physicists who built up quantum mechanics accepted this radical viewpoint and got busy doing physics.
A similar philosophy holds true in machine learning. One doesn’t have to understand all theoretical details underpinning machine learning algorithms to become masterful at using them. While deep understanding is valuable, the priority for practitioners more often than not is to get meaningful results by building models and solving real-world problems.
Similarly, most of us don’t do linear algebra per se when building machine learning models (in this context, building a model means applying an existing algorithm to a particular dataset, not inventing a new one). The data at hand is stored in a matrix, and linear algebra computations rapidly occur behind the scenes, but at the level of code, it’s a familiar world of input and output.
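In code, that matrix is simply a 2-D array: rows are samples, columns are features, and the linear algebra stays hidden inside fit and predict. A quick sketch with random placeholder data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

X = np.random.rand(100, 3)  # a 100 x 3 matrix: 100 samples, 3 features
y = np.random.rand(100)     # one target value per row of the matrix

model = LinearRegression().fit(X, y)  # the matrix math runs inside the library
print(model.predict(X[:2]))           # rows of a matrix in, predictions out
```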
Go Forward With Code
Over the years, I’ve seen teenagers with no prior experience in calculus or machine learning successfully build models. Are they creating cutting-edge research? Not yet. But they learn to use the right tools, write code, and understand enough to adjust their models to make meaningful predictions—and that’s a significant achievement.
The primary focus is not on mathematics, but on programming. The numbers are extremely important, and comfort analyzing numbers within matrices is essential. The mathematics of calculus and linear algebra, by contrast, is non-essential at this stage, the stage where most professionals work on a daily basis.
If you want to get good at building machine learning models, get good at building machine learning models. Start with the code, and go from there.
Corey Wade
Corey Wade is the director and founder of Berkeley Coding Academy, where he teaches machine learning to teenagers. He is the author of Hands-On Gradient Boosting with XGBoost and scikit-learn, a machine learning book for practitioners. His data science mentor, Kevin Glynn, showed him that he did not have to learn every theoretical detail of machine learning to get good at building models, an approach Wade was initially hesitant to adopt but ultimately grateful to employ.