IBM and Cloudera Join Forces to Expand Data Science Access

On June 21, IBM and Cloudera jointly announced that they were expanding their existing relationship to bring more advanced data science solutions to Hadoop users by developing a shared go-to-market program. IBM will now resell Cloudera Enterprise Data Hub and Cloudera DataFlow, while Cloudera will resell IBM Watson Studio and IBM BigSQL.

In bulking up their joint go-to-market programs, IBM and Cloudera are reaffirming their pre-existing partnership to amplify each other’s capabilities, particularly in heavy data workflows. Cloudera Hadoop is a common enterprise data source, but Cloudera’s existing base of data science users is small despite the growing demand for data science options, and its Data Science Workbench is coder-centric. Offering the more user-friendly IBM Watson Studio gives Cloudera’s existing data customers a convenient option for doing data science without necessarily needing to know Python, R, or Scala. IBM, for its part, can now sell Watson Studio, BigSQL, and IBM consulting and services more deeply into Cloudera’s customer base, which broadens its ability to upsell additional offerings.

Because IBM and Cloudera each hold significant amounts of on-prem data, it’s interesting to look at this partnership in terms of the 800-pound gorilla of cloud data: AWS. IBM, Cloudera, and Amazon are all leaders when it comes to the sheer amount of data each holds. But while Amazon is the biggest cloud provider on the planet and holds a plurality of the cloud hosting market, most of IBM and Cloudera’s customers’ data is on-prem. Because that data is hosted on-prem, Amazon doesn’t have access to it; IBM and Cloudera are teaming up to sell their own data science and machine learning capabilities on that on-prem data, where there may be security or policy reasons to keep it out of the cloud.

A key differentiator in comparing AWS with the IBM-Cloudera partnership lies in AWS’ breadth of machine learning offerings. In addition to having a general-purpose data science and machine learning platform in SageMaker, AWS also offers task-specific tools like Amazon Personalize and Textract that address precise use cases for a number of Amazon customers who don’t need a full-blown data science platform. IBM has some APIs for visual recognition, natural language classification, and decision optimization, but AWS has developed their own APIs into higher-level services. Cloudera customers building custom machine learning models may find that IBM’s Watson Studio suits their needs. However, IBM lacks the variety of off-the-shelf machine learning applications that AWS provides. IBM supplies their machine learning capabilities as individual APIs that an application development team will need to fit together to create their own in-house apps.

Recommendations

  • For Cloudera customers looking to do broad data science, IBM Watson Studio is now an option. This offers Cloudera customers an alternative to Data Science Workbench; in particular, an option that has a more visual interface, with more drag-and-drop capabilities and some level of automation, rather than a more code-centric environment.
  • IBM customers can now choose Cloudera Enterprise Data Hub for Hadoop. IBM and Hortonworks had a long-term partnership; IBM supporting and cross-selling Enterprise Data Hub demonstrates that IBM will continue to sell enterprise Hadoop in some flavor.