Analytics Blog

A Conversation about Data Science with UC Berkeley

Earlier this month, we were invited by the UC Berkeley School of Information to host a conversation on the real-world applications of data science. It was an honor to be invited and we were happy to host what turned out to be their most popular webinar yet.

A worry of any presenter is reaching the question and answer portion of a presentation and not having any engagement from the audience; it’s usually a sign that your presentation has missed its mark. Fortunately in this case, the questions were coming in so rapidly that we could barely read them as they scrolled by!

My favorite was whether I thought software like Alpine might eventually replace the data scientist altogether. My answer at the time was “no”, more or less: data science will always be something of an art. You need to select the problems first, which is a very human process of interacting with the business. And crafting models will always require the construction of features that involve subtly translating real-world phenomena into mathematical expressions. For example, if you’re going to model the effects of TV ads on consumer sales, you’d better have a good understanding of how humans react to advertising.

But maybe the answer isn’t so clear-cut. It’s surely true that, over time, more aspects of a data scientist’s work will be done by software. Feature generation has already become less important as models become more sophisticated. Model parameter selection will become increasingly automated — model deployment entirely so.

Think about how the work of the software engineer has changed fundamentally in the last 20 years. She no longer needs to write her own logging module or database access layer or UI widget. And agile methods have brought the ‘customer’ more immediately into the development process. More and more, the job of the engineer is to stitch together higher-level components and to collaborate with product managers and UX designers.

Similarly, the job of the data scientist will be to take advantage of pre-built components in order to solve a greater variety of business problems. We see our job at Alpine in terms of simplifying the analytics process and bringing the entire analytics team together. Instead of a few six-month analytics projects that focus on model accuracy and algorithmic niceties, companies will be able to work on hundreds of projects that emphasize time to real business action.

So yes, analytics software is getting more powerful, but the result should be more data scientists, not fewer!

Later on, I might have scared the attendees by suggesting that I was seeing a bit of a slowdown in data science hiring. We’re certainly looking for data scientists here at Alpine, but we now see an increasing number of better-qualified candidates, and greater persistence from them. But worry not. If anything, this may imply that the demand for data scientists is shifting from “insanely high” to “very high”.

If that’s true, I suspect it may be because companies are learning that analytics projects are more challenging than they expected: the technical environment is very complex and cumbersome, especially for big data, and projects can get bogged down in the mathematical weeds.

That’s exactly why we want to simplify the analytics process and make life easier for both the data science and business teams. If you’d like to learn more about the presentation, you can revisit the content here or contact us.