
Data Science Platforms News Roundup, July 2018

Each month, I round up key news in the Data Science Platforms space for Amalgam Insights. Companies covered include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.

New SAS Viya Release Fueled by AI Capabilities, Allowing Customers a Look Under the Hood of Machine Learning Techniques

SAS Viya’s latest release addresses two concerns that have been top-of-mind in the data science world lately. First, it provides transparency around recommendations from complex machine learning models using open source frameworks such as PDP, LIME, and ICE. Second, it helps keep customers’ personal information private by automatically identifying and tagging such data (particularly in light of GDPR and similar legislation). The embrace of open source also extends to the ability to use Python and R models within Viya.

Machine Learning in Google BigQuery

Google announced the beta of BigQuery ML, a new capability within its BigQuery software that lets data analysts use simple SQL extensions to apply two machine learning modeling techniques (linear regression and binary logistic regression) to data residing in Google Cloud Storage. BigQuery ML can be accessed from within BigQuery, and also via external tools such as Jupyter notebooks and BI tools. Looker announced its support of BigQuery ML the day it debuted, and other Google Cloud Platform partners are likely to follow suit, though the capability remains in beta. I provide recommendations for organizations considering testing out the BigQuery ML capabilities in an earlier piece.

H2O.ai and Google Cloud Announce Collaboration to Drive Enterprise AI Adoption

H2O.ai announced a partnership with Google Cloud that brings H2O.ai’s H2O-3, Sparkling Water, and Driverless AI to the Google Cloud Platform. The partnership makes the entire suite available on GCP, allowing customers to bring automated machine learning and AI capabilities to their data in Google Cloud in an accelerated timeframe.

Databricks Survey Gets To The Heart of the AI Dilemma: Nearly 90% of Organizations Investing in AI, Very Few Succeeding

A Databricks-commissioned survey by IDC reveals that the vast majority of large organizations pursuing AI initiatives run into significant trouble along the way. Companies commonly encounter obstacles in three areas: timeliness (timely data aggregation and preparation is a challenge because data is siloed and inconsistent; AI projects take more than six months to be deployed into production, and only one-third of these projects succeed anyway), complexity (nearly 90% of large organizations have invested in multiple machine learning tools), and collaboration (data science at scale is a “team sport” but communication is scattered). The proposed solution: an “end-to-end” platform uniting data preparation, machine learning, and collaboration capabilities; in other words, a data science platform.

Domino Data Lab Partners With SAS to Accelerate Data Science Work in the Cloud

Domino users can now run SAS Analytics for Containers in the public cloud on AWS while using Domino as the orchestration layer. The ability to shift on-prem SAS Analytics compute work to the cloud as necessary can provide much-needed flexibility around speed and cost for “spiky” workloads while letting data scientists using Domino treat these SAS containers like any other model. I covered the Domino-SAS partnership in more detail earlier this month.

DataRobot Acquires Automated Machine Learning Startup Nexosis

DataRobot announced the acquisition of Nexosis, an automated machine learning company. Nexosis’ primary offering is an automated machine learning platform called Axon, which combines multiple data sources to produce actionable insights. Though the terms of the acquisition remain undisclosed, DataRobot continues to push its vision of automating AI development as the way to accelerate the deployment of machine learning and AI initiatives.


Finally, a reminder that I’m currently taking briefings for Amalgam Insights’ Vendor SmartList for the Data Science Platforms space. If you’d like to learn more about this research initiative, or set up a briefing with Amalgam Insights for potential inclusion, please email me at
