Analytics Blog

Deploying Machine Learning to the Cloud

While enterprises have traditionally deployed Hadoop clusters in their own data centers, a growing number are creating clusters in the cloud. Cloud providers such as AWS and GCP make it almost effortless to spin up and tear down Hadoop clusters on demand, offering a cost-effective approach to big data systems. However, the current analytics solutions offered… Read more »

How to Use the YARN API to Determine Resources Available for Spark Application Submission (Part 1)

At Alpine, we continue to deliver new enterprise analytics features within Chorus. With Chorus 6.1, we launched sophisticated auto-tuning for Spark jobs. Chorus automatically determines the settings needed to launch a Spark application by using information on the size of the data being analyzed, the analytical operations being used in the… Read more »

An Introduction to PFA

With Chorus 6.1, we have introduced support for PFA, the Portable Format for Analytics. Before we get into what PFA is, let’s make some observations about the data science process. There are a few important questions we can ask about the process in general: 1.) What is our processing model? 2.) What are our… Read more »