Data Science and Machine Learning News Roundup, May 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Domino Data Lab Champions Expert Data Scientists While Outpacing Walled-Garden Data Science Platforms

Domino announced key updates to its data science platform at Rev 2, its annual data science leader summit. For data science managers, the new Control Center provides information on what an organization’s data science team members are doing, helping managers address any blocking issues and prioritize projects appropriately. The Experiment Manager’s new Activity Feed supplies data scientists with better organizational and tracking capabilities on their experiments. The Compute Grid and Compute Engine, built on Kubernetes, will make it easier for IT teams to install and administer Domino, even in complex hybrid cloud environments. Finally, the beta Domino Community Forum will allow Domino users to share best practices with each other, as well as submit feature requests and feedback to Domino directly. With governance becoming a top priority across data science practices, Domino’s platform improvements around monitoring and making experiments repeatable will make governance easier for its users.

Informatica Unveils AI-Powered Product Innovations and Strengthens Industry Partnerships at Informatica World 2019

At Informatica World, Informatica publicized a number of key partnerships, both new and enhanced. Most of these partnerships involve additional support for cloud services. This includes storage, both data warehouses (Amazon Redshift) and data lakes (Azure, Databricks). Informatica also announced a new Tableau Dashboard Extension that makes Informatica Enterprise Data Catalog accessible from within the Tableau platform. Finally, Informatica and Google Cloud are broadening their existing partnership by making Intelligent Cloud Services available on Google Cloud Platform, and providing increased support for Google BigQuery and Google Cloud Dataproc within Informatica. Amalgam Insights attended Informatica World and provides a deeper assessment of Informatica’s partnerships, as well as CLAIRE-ity on Informatica’s AI initiatives.

Microsoft delivers new advancements in Azure from cloud to edge ahead of Microsoft Build conference

Microsoft announced a number of new Azure Machine Learning and Azure AI capabilities. Azure Machine Learning has been integrated with Azure DevOps to provide “MLOps” capabilities that enable reproducibility, auditability, and automation of the full machine learning lifecycle. This marks a notable step toward making the machine learning model process more governable and compliant with regulatory needs. Azure Machine Learning also has a new visual drag-and-drop interface to facilitate codeless machine learning model creation, making the process of building machine learning models more user-friendly. On the Azure AI side, Azure Cognitive Services launched Personalizer, which provides users with specific recommendations to inform their decision-making process. Personalizer is part of the new “Decisions” category within Azure Cognitive Services; other Decisions services include Content Moderator, an API to assist in moderating and reviewing text, images, and videos; and Anomaly Detector, an API that ingests time-series data and chooses an appropriate anomaly detection model for that data. Finally, Microsoft added a “cognitive search” capability to Azure Search, which allows customers to apply Cognitive Services algorithms to search results of their structured and unstructured content.

Microsoft and General Assembly launch partnership to close the global AI skills gap

Microsoft also announced a partnership with General Assembly to address the dearth of qualified data workers, with the goal of training 15,000 workers by 2022 for various artificial intelligence and machine learning roles. The two companies will found an AI Standards Board to create standards and credentials for artificial intelligence skills. In addition, Microsoft and General Assembly will develop scalable training solutions for Microsoft customers, and establish an AI Talent network to connect qualified candidates to AI jobs. This continues the trend of major enterprises building internal training programs to bridge the data skills gap.

Data Science and Machine Learning News, October 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Data Science Platforms News Roundup, August 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.

Data Science Platforms News Roundup, July 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.

Domino Deploys SAS Analytics Into a Model-Driven Cloud

The announcement: On July 10, Domino Data Lab announced a partnership with SAS Analytics that will let Domino users run SAS Analytics for Containers in the public cloud on AWS while using Domino’s data science platform as the orchestration layer for infrastructure provisioning and management. This partnership will allow SAS customers to use Domino as an orchestration layer to access multiple SAS environments for model building, deploy multiple SAS applications on AWS, and track each SAS experiment in detail while retaining reproducibility of prior work.

What does this mean?

Domino customers with SAS Analytics workloads currently running on-prem will now be able to deploy those workloads to the public cloud on AWS by using SAS Analytics for Containers via the Domino platform. Domino plans to follow up with support for Microsoft Azure and Google Cloud Platform to further enable enterprises to offload containerized SAS workloads to the cloud. By running SAS Analytics for Containers via Domino, Domino users will be able to track, provide feedback on, and reproduce their containerized SAS experiments the same way they do with other experiments they’ve constructed using Python, R, or other tools within Domino.

What Data Science Platform Suits Your Organization’s Needs?

This summer, my Amalgam Insights colleague Hyoun Park and I will be teaming up to address that question. When it comes to data science platforms, there’s no such thing as “one size fits all.” We are writing this landscape because understanding the processes of scaling data science beyond individual experiments and integrating it into your business is difficult. By breaking down the key characteristics of the data science platform market, this landscape will help potential buyers choose the appropriate platform for their organizational needs. We will examine the following questions, which serve as key differentiators in data science platform purchasing decisions, to determine which characteristics, functionalities, and policies differentiate platforms supporting introductory data science workflows from those supporting scaled-up, enterprise-grade workflows.

Domino Debuts Data Science Framework

Domino Model Management

On May 22, Domino held its first Analyst Seminar in advance of its Rev conference for data science leaders. Domino provides an open data science platform to coordinate data science initiatives across enterprises, integrating data scientists, IT, and line of business.

At the Analyst Seminar, Domino introduced its Model Management framework: five pillars supporting a core belief that, under data science best practices, data science should not be just a siloed department or team; its resulting models should drive the business. For this to be possible, all relevant stakeholders across the enterprise will need to buy into data science initiatives, as this will involve changes to existing business processes in order to take advantage of the knowledge gained from data science projects.

Lynne Baer: Clarifying Data Science Platforms for Business

Word cloud of data science software and terms

My name is Lynne Baer, and I’ll be covering the world of data science software for Amalgam Insights. I’ll investigate data science platforms and apps to solve the puzzle of getting the right tools to the right people and organizations.

“Data science” is on the tip of every executive’s tongue right now. The idea that new business initiatives (and improvements to existing ones) can be found in the data a company is already collecting is compelling. Perhaps your organization has already dipped its toes in the data discovery and analysis waters – your employees may be managing your company’s data in Informatica, or performing statistical analysis in Statistica, or experimenting with Tableau to transform data into visualizations.

But what is a Data Science Platform? Right now, if you’re looking to buy software for your company to do data science-related tasks, it’s difficult to know which applications will actually suit your needs. Do you already have a data workflow you’d like to build on, or are you looking to the structure of an end-to-end platform to set your data science initiative up for success? How do you coordinate a team of data scientists to take better advantage of existing resources they’ve already created? Do you have coders in-house already who can work with a platform designed for people writing in Python, R, Scala, or Julia? Are there more user-friendly tools out there your company can use if you don’t? What do you do if some of your data requires tighter security protocols around it? Or if some of your data models themselves are proprietary and/or confidential?

All of these questions are part and parcel of the big one: How can companies tell what makes a good data science platform for their needs before investing time and money? Are traditional enterprise software vendors like IBM, Microsoft, SAP, and SAS dependable in this space? What about companies like Alteryx, H2O.ai, KNIME, and RapidMiner? Other popular platforms under consideration should also include Anaconda, Angoss (recently acquired by Datawatch), Domino, Databricks, Dataiku, MapR, MathWorks, Teradata, and TIBCO. And then there are new startups like Sentenai, focused on streaming sensor data, and slightly more established companies like Cloudera looking to expand from their existing offerings.

Over the next several months, I’ll be digging deeply to answer these questions, speaking with vendors, users, and investors in the data science market. I would love to speak with you, and I look forward to continuing this discussion. And if you’ll be at Alteryx Inspire in June, I’ll see you there.