I remember clearly my disappointment when I learned, as a college student, how computers were able to play skilled chess. I was taking my first course in artificial intelligence, and had assumed that such a mysterious and powerful topic must have similarly impressive methodologies. In fact, the Minimax algorithm used by Deep Blue and other game-playing computers is quite intuitive (1). Ultimately, though, my disappointment was replaced by excitement: I could build an algorithm to play chess!
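To show how intuitive the idea is, here is a minimal sketch of Minimax. It is not Deep Blue's implementation. Instead of chess it uses a toy game chosen purely for illustration: players alternate taking one, two or three stones from a pile, and whoever takes the last stone wins. The function names `minimax` and `best_move` are my own. The algorithm assumes both players play perfectly, scores each final position, and works backward to pick the move with the best guaranteed outcome.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def minimax(stones: int, maximizing: bool) -> int:
    """Return +1 if the maximizing player wins with best play, else -1."""
    if stones == 0:
        # The pile is empty, so the player who just moved took the last
        # stone and won. If it is now the maximizer's turn, the maximizer lost.
        return -1 if maximizing else 1

    # Score every legal move by assuming the opponent then plays perfectly.
    scores = [
        minimax(stones - take, not maximizing)
        for take in (1, 2, 3)
        if take <= stones
    ]
    # The maximizer picks the best score; the minimizer picks the worst.
    return max(scores) if maximizing else min(scores)


def best_move(stones: int) -> int:
    """Choose how many stones the maximizing player should take."""
    legal = [take for take in (1, 2, 3) if take <= stones]
    return max(legal, key=lambda take: minimax(stones - take, False))


if __name__ == "__main__":
    for pile in (4, 5, 7):
        print(pile, minimax(pile, True), best_move(pile))
```

A chess program cannot search all the way to the end of the game like this toy does, so it has to stop early and estimate how good a position is. That estimate is where much of the real engineering lies.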
I’d like to provide you with that same sense of disappointment regarding machine learning. Machine learning is powerful but needn’t be mysterious. Its basic capabilities, like regression, classification and clustering, are easy to understand, and nontechnical knowledge of those capabilities is sufficient to make clear requests of data scientists.
Businesses that do not derive value from machine learning will soon be at a massive competitive disadvantage as the rest of the world learns to use this powerful technology. Yet today, most companies I’ve worked with have invested in data science teams that operate as expensive research silos disconnected from business outcomes. This is a major strategic blunder. For a businessperson, the ability to make specific, actionable requests of data scientists, hold them to a high standard and connect their work to relevant action is basic contemporary corporate literacy.
As a field, data science has grown preoccupied with ever-more-sophisticated algorithms rather than the operational challenge of connecting data to action. On its face, this preoccupation seems ridiculous: who cares if the model classifies with 2% more accuracy after three months of effort if it is never operationalized?
I’ve seen a clear cause of this disconnect across several customers: Business stakeholders ask too little of their data scientists (and too vaguely), and the data scientists then spend tremendous time and effort building overly complex solutions. In one instance, a data scientist had six months to build a model predicting a business unit’s quarterly revenue from a tiny data set, a task that should take two weeks. Given that this was his full-time job, of course he spent the remaining months creating increasingly elaborate solutions that were likely overfitted (i.e., tuned so closely to the quirks of that tiny data set that they would not hold up on new data for the same problem, let alone on similar problems).
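Overfitting is easier to see than to define. The sketch below uses invented numbers, not data from the project described above: six quarters of revenue to learn from and two later quarters held back for checking. It compares a plain straight-line fit with a deliberately elaborate model, a polynomial forced through every training point. All names and figures here are hypothetical.

```python
# Made-up quarterly revenue: a steady trend plus a little noise.
train_quarters = [1, 2, 3, 4, 5, 6]
train_revenue = [12.5, 13.3, 16.6, 17.6, 20.8, 21.4]
test_quarters = [7, 8]          # held back, never shown to either model
test_revenue = [24.2, 25.7]


def fit_line(xs, ys):
    """Simple model: ordinary least-squares straight line."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Elaborate model: Lagrange polynomial passing through every point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            weight = 1.0
            for j, xj in enumerate(xs):
                if j != i:
                    weight *= (x - xj) / (xi - xj)
            total += yi * weight
        return total
    return predict


def mean_abs_error(model, xs, ys):
    """Average size of the model's miss, in revenue units."""
    return sum(abs(model(x) - y) for x, y in zip(xs, ys)) / len(xs)


line = fit_line(train_quarters, train_revenue)
poly = fit_interpolant(train_quarters, train_revenue)

line_train = mean_abs_error(line, train_quarters, train_revenue)
poly_train = mean_abs_error(poly, train_quarters, train_revenue)
line_test = mean_abs_error(line, test_quarters, test_revenue)
poly_test = mean_abs_error(poly, test_quarters, test_revenue)

print(f"line: train error {line_train:.2f}, held-out error {line_test:.2f}")
print(f"poly: train error {poly_train:.2f}, held-out error {poly_test:.2f}")
```

The elaborate model looks perfect on the data it was built from and falls apart on the two quarters it has not seen, while the simple line stays close. That is the practical question to put to a data scientist: how does the model perform on data it was not trained on?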
So how can you demand more from machine learning? Communication and process are the two key factors. On the communication front, basic machine learning literacy is important and achievable. What kinds of problems can machine learning solve? How can I best frame a business problem in that form? This blog will break down topics like regression, classification and clustering — what kinds of problems they solve, and how to talk about them with data scientists.
Regarding process, business leaders need to organize teams suited not to open-ended research projects but to solving specific problems and operationalizing the solutions. Data scientists play a part, but other roles like data engineers, app engineers, project managers and business analysts are equally important. I’ll cover the basic process steps that accompany any machine learning project in the enterprise: what stages projects go through, who you need involved at each stage and how to move models to production.
Tomorrow’s most successful businesses will excel at organizing people to put data into action with machine learning. It’s time to get literate and demand more from this transformational technology.
(1) To be fair, much of the ingenuity behind Deep Blue resided in its ability to evaluate board states.