IBM and Cloudera Join Forces to Expand Data Science Access

On June 21, IBM and Cloudera jointly announced that they were expanding their existing relationship to bring more advanced data science solutions to Hadoop users by developing a shared go-to-market program. IBM will now resell Cloudera Enterprise Data Hub and Cloudera DataFlow, while Cloudera will resell IBM Watson Studio and IBM BigSQL.

In bulking up their joint go-to-market programs, IBM and Cloudera are reaffirming their pre-existing partnership to amplify each other’s capabilities, particularly in heavy data workflows. Cloudera Hadoop is a common enterprise data source, but Cloudera’s existing base of data science users is small despite the growing demand for data science options, and its Data Science Workbench is coder-centric. Being able to offer the more user-friendly IBM Watson Studio gives Cloudera’s existing data customers a convenient option for doing data science without necessarily needing to know Python, R, or Scala. IBM, in turn, can now sell Watson Studio, BigSQL, and IBM consulting and services more deeply into Cloudera’s customer base, which broadens its ability to upsell additional offerings.

Because IBM and Cloudera each hold significant amounts of on-prem data, it’s interesting to look at this partnership in terms of the 800-pound gorilla of cloud data: AWS. IBM, Cloudera, and Amazon are all leaders when it comes to the sheer amount of data each holds. But Amazon is the biggest cloud provider on the planet; it holds the plurality of the cloud hosting market, and most of IBM and Cloudera’s customers’ data is on-prem. Because that data is hosted on-prem, it’s data Amazon doesn’t have access to; IBM and Cloudera are teaming up to sell their own data science and machine learning capabilities on that on-prem data where there may be security or policy reasons to keep it out of the cloud.

A key differentiator in comparing AWS with the IBM-Cloudera partnership lies in AWS’ breadth of machine learning offerings. In addition to having a general-purpose data science and machine learning platform in SageMaker, AWS also offers task-specific tools like Amazon Personalize and Textract that address precise use cases for a number of Amazon customers who don’t need a full-blown data science platform. IBM has some APIs for visual recognition, natural language classification, and decision optimization, but AWS has developed their own APIs into higher-level services. Cloudera customers building custom machine learning models may find that IBM’s Watson Studio suits their needs. However, IBM lacks the variety of off-the-shelf machine learning applications that AWS provides. IBM supplies their machine learning capabilities as individual APIs that an application development team will need to fit together to create their own in-house apps.

Recommendations

  • For Cloudera customers looking to do broad data science, IBM Watson Studio is now an option. This offers Cloudera customers an alternative to Data Science Workbench; in particular, an option that has a more visual interface, with more drag-and-drop capabilities and some level of automation, rather than a more code-centric environment.
  • IBM customers can now choose Cloudera Enterprise Data Hub for Hadoop. IBM and Hortonworks had a long-term partnership; IBM supporting and cross-selling Enterprise Data Hub demonstrates that IBM will continue to sell enterprise Hadoop in some flavor.

Data Science Platforms News Roundup, September 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Please register or log into your Amalgam Insights Community account to read more.

Data Science Platforms News Roundup, August 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.


Code-Free to Code-Based: The Power Spectrum of Data Science Platforms

Codeless to Code-Based

The spectrum of code-centricity on data science platforms ranges from “code-free” to “code-based.” Data science platforms frequently boast that they provide environments that require no coding, and that are code-friendly as well. Where a given platform falls along this spectrum affects who can successfully use a given data science platform, and what tasks they are…


Data Science Platforms News Roundup, July 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.


What Data Science Platform Suits Your Organization’s Needs?

This summer, my Amalgam Insights colleague Hyoun Park and I will be teaming up to address that question. When it comes to data science platforms, there’s no such thing as “one size fits all.” We are writing this landscape because understanding the processes of scaling data science beyond individual experiments and integrating it into your business is difficult. By breaking down the key characteristics of the data science platform market, this landscape will help potential buyers choose the appropriate platform for their organizational needs. We will examine the following questions, which serve as key differentiators in data science platform purchasing, to figure out which characteristics, functionalities, and policies distinguish platforms supporting introductory data science workflows from those supporting scaled-up, enterprise-grade workflows.


Domino Debuts Data Science Framework

Domino Model Management

On May 22, Domino held its first Analyst Seminar in advance of its Rev conference for data science leaders. Domino provides an open data science platform to coordinate data science initiatives across enterprises, integrating data scientists, IT, and line of business.

At the Analyst Seminar, Domino introduced its Model Management framework: five pillars supporting a core belief that, under data science best practices, data science should not be a siloed department or team; its resulting models should drive the business. For this to be possible, all relevant stakeholders across the enterprise will need to buy into data science initiatives, as this will involve changes to existing business processes in order to take advantage of the knowledge gained from data science projects.


Alter(yx)ing Everything at Inspire 2018

In early June, Amalgam Insights attended Alteryx Inspire ’18, where Alteryx Chairman and CEO Dean Stoecker led an energetic keynote to inspire Alteryx users to “Alter(yx) Everything.” Based on conversations I had with Alteryx executives, partners, and end-users, I came away with the strong impression that Alteryx wants to make advanced analytics and data science tasks as easy and quick as possible for a broad audience that may not know code – and they want to expand that community and its capabilities as quickly as possible. Data scientists and analytics-knowledgeable employees are in high demand, and the shortage is projected to worsen as the demand for these capabilities grows; data is growing faster than the existing data analyst and data scientist community can keep up with.


Market Milestone: Oracle Builds Data Science Gravity By Purchasing DataScience.com

Bridging the Gap

Industry: Data Science Platforms

Key Stakeholders: IT managers, data scientists, data analysts, database administrators, application developers, enterprise statisticians, machine learning directors and managers, current DataScience.com customers, current Oracle customers

Why It Matters: Oracle released a number of AI tools in Q4 2017, but until now, it lacked a data science platform to support complete data science workflows. With this acquisition, Oracle now has an end-to-end platform to manage these workflows and support collaboration among teams of data scientists and business users, and it joins other major enterprise software companies in being able to operationalize data science.

Top Takeaways: Oracle acquired DataScience.com to retain customers with data science needs in-house rather than risk losing their data science-based business to competitors. However, Oracle has not yet defined a timeline for rolling out the unified data science platform, or its future availability on the Oracle Cloud.

Oracle Acquires DataScience.com

On May 16, 2018, Oracle announced that it had agreed to acquire DataScience.com, an enterprise data science platform that Oracle expects to add to the Oracle Cloud environment. With Oracle’s debut of a number of AI tools last fall, this latest acquisition telegraphs Oracle’s intent to expedite its entrance into the data science platform market by buying its way in.

Oracle is reviewing DataScience.com’s existing product roadmap and will supply guidance in the future, but it intends to provide a single unified data science platform in concert with Oracle Cloud Infrastructure and its existing SaaS and PaaS offerings, empowering customers with a broader suite of machine learning tools and a complete workflow.
