Zen of Machine Learning

2024-03-17

Machine learning can be a frustrating endeavour, and I know several ML engineers who have transitioned to SWE roles, with the unpredictability of machine learning at the heart of their frustrations. You never know if or when your model will be good enough for production, or whether your new algorithm or data idea will yield any improvement. Usually, it won’t. Additionally, the feedback loop in ML is typically slow. Combined, these factors can lead to a sense of loss of control.

By comparison, developing non-ML functionality is typically more straightforward. You can usually determine its feasibility and create a timeline and milestones. The feedback cycle is much faster; tests run in seconds and changes show up when you refresh the browser window.

To persevere as ML engineers, we can draw inspiration from other fields that face randomness, delayed rewards, and loss of control. For example, poker players cannot judge their performance by the outcome of a single hand; they have to look at their average performance over longer periods of time*. After individual hands, they analyze their actions and observations throughout the game, identify possible flaws, and update their tactics. Every new hand becomes an opportunity to begin again and make incremental improvements. Through this, they can regain control, not over individual hands but over their mastery.

To achieve this kind of detachment from immediate results as ML engineers, we can also focus on our learning and understanding rather than on the outcome of individual experiments. We can learn to start fresh at any given point by asking what the best possible next action is, given everything we know so far.

Consider an experiment that runs for a week and yields results only slightly better than a naive baseline. Frustrating as that is, we can’t change the results themselves. What we do control is our next action. We can review our code and configurations for bugs, analyze individual examples to see where the model makes mistakes, or consider trying a different approach. However, we should have a clear reason to believe that a particular action is the best one to take. Choosing actions randomly means giving up control again.

By focusing on the present and engaging in deliberate decision-making, we can regain a sense of control and improve our skills. Ultimately, the difference between an expert and a novice is the ability to make better decisions on average.

(*) A caveat: one key difference between poker and ML is that the feedback in poker is fast but noisy, so we need many repeated games to get good signal. In ML, the feedback is slow but (hopefully) has a high signal-to-noise ratio. The similarity between the two is that, either way, it takes some time to get good signal.
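
To make the "fast but noisy" point concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration: the function names are hypothetical, each hand's result is modelled as a Gaussian draw, and the `edge` and `noise` values are made up rather than taken from real poker data. It estimates a player's average winnings per hand many times over and measures how much those estimates scatter for different numbers of hands.

```python
import random
import statistics


def mean_winnings(n_hands, edge=0.05, noise=1.0, rng=None):
    """Average result over n_hands simulated hands.

    Assumption: each hand's result is Gaussian with a small positive
    mean (`edge`) and a much larger standard deviation (`noise`).
    """
    rng = rng or random.Random()
    return statistics.fmean(rng.gauss(edge, noise) for _ in range(n_hands))


def spread_of_estimates(n_hands, n_repeats=200, seed=0):
    """Standard deviation of the estimated average across repeated sessions."""
    rng = random.Random(seed)
    estimates = [mean_winnings(n_hands, rng=rng) for _ in range(n_repeats)]
    return statistics.stdev(estimates)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, round(spread_of_estimates(n), 3))
```

Under these assumptions, the scatter of the estimate shrinks roughly with the square root of the number of hands, so a small edge only becomes visible after many hands. A single long ML experiment plays the opposite role: one slow sample, but hopefully with far less noise.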