Analytics Blog

On the Latest Release: Custom Operators

Recently we announced the release of our new Alpine “Custom Operator” framework, a plug-in mechanism for adding new operators to the Alpine workbench. The framework lets Alpine’s engineering team add new algorithms more rapidly than before, and it lets our customers build their own analytics operators that integrate directly into the Alpine environment. That means we can more easily add functions from open-source projects like MLlib and MADlib, and users can share their own code with others who might find it useful.

It used to be that developing a new algorithm meant doing two things: implementing the algorithm itself so that it runs in parallel, and then fitting it into the Alpine framework. The first of these is, of course, inherently challenging, and it is part of what makes machine learning at Alpine fun. But the second was more complicated than it needed to be. For example, you needed to write UI code for the operator to show up in the workflow designer. You had to write code for how the algorithm would handle its data. You had to think about how to handle errors and bad data. You even had to think about how the code would be submitted to the cluster for execution, and all of the security, logging, and monitoring that goes along with that!

Now, with the Custom Operator framework, developers have a straightforward, declarative way to define the metadata and presentation aspects of an operator. It takes very little time to define the input parameters, the output formats, the documentation, and so on. In addition, connectivity and error handling come with almost no additional programming. In this way, we’ve cut several days off the process of adding a new algorithm.
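To make the declarative idea concrete, here is a minimal Python sketch of what such an operator definition could look like. It is not the Alpine SDK: `Parameter`, `OperatorSpec`, and `execute` are hypothetical names invented for this illustration, and `execute` is only a stand-in for the parameter checking and error handling that a framework would do on the developer's behalf.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass(frozen=True)
class Parameter:
    """One declared input parameter (hypothetical, not an Alpine class)."""
    name: str
    kind: type
    default: Any = None   # None means the user must supply a value
    doc: str = ""


@dataclass(frozen=True)
class OperatorSpec:
    """Declarative description of an operator: metadata plus the algorithm."""
    name: str
    doc: str
    parameters: List[Parameter]
    output_columns: List[str]
    run: Callable[[List[Row], Dict[str, Any]], List[Row]]


def execute(spec: OperatorSpec, rows: List[Row], user_params: Dict[str, Any]) -> List[Row]:
    """Stand-in for the framework: validate inputs, run, validate outputs."""
    declared = {p.name for p in spec.parameters}
    unknown = set(user_params) - declared
    if unknown:
        raise ValueError(f"{spec.name}: unknown parameters {sorted(unknown)}")

    resolved: Dict[str, Any] = {}
    for p in spec.parameters:
        value = user_params.get(p.name, p.default)
        if value is None:
            raise ValueError(f"{spec.name}: missing required parameter '{p.name}'")
        if not isinstance(value, p.kind):
            raise TypeError(f"{spec.name}: '{p.name}' must be {p.kind.__name__}")
        resolved[p.name] = value

    result = spec.run(rows, resolved)
    for row in result:
        missing = [c for c in spec.output_columns if c not in row]
        if missing:
            raise ValueError(f"{spec.name}: output is missing columns {missing}")
    return result


def _scale(rows: List[Row], params: Dict[str, Any]) -> List[Row]:
    # The only algorithm-specific code the operator author writes.
    col, factor = params["column"], params["factor"]
    return [{**row, "scaled": row[col] * factor} for row in rows]


SCALE = OperatorSpec(
    name="Scale Column",
    doc="Multiplies a numeric column by a constant factor.",
    parameters=[
        Parameter("column", str, doc="Name of the column to scale."),
        Parameter("factor", float, default=2.0, doc="Multiplier to apply."),
    ],
    output_columns=["scaled"],
    run=_scale,
)

if __name__ == "__main__":
    print(execute(SCALE, [{"x": 1.0}, {"x": 2.5}], {"column": "x"}))
```

In this sketch the operator author writes only `_scale` and the `SCALE` declaration. The validation and error reporting live in one shared place, which is the division of labor described above.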

But the most exciting result of this release is for our customers. Data science teams often have their own proprietary algorithms that they would like to use within Alpine, without having to find an equivalent in an open-source library or the Alpine workbench. Now, any Alpine user with programming skills can create fully parallel algorithms without worrying about the complexities and overhead of Spark, JDBC, or MapReduce, and anyone else on the team can then use these algorithms without touching a line of code.

It’s often true that adding even the simplest extension mechanism to a product can magnify its power significantly. (Think VBA macros in Excel, or iPhone apps.) Our new Custom Operator framework is a prime example of this pattern and of the extensibility of our platform. If you’d like to learn more about this release, I recommend the latest post from our VP of Engineering, Lawrence Spracklen, as well as a recent article published in eWeek’s Enterprise Apps section.