On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.
The announcement: On July 10, Domino Data Lab announced a partnership with SAS Analytics that will let Domino users run SAS Analytics for Containers in the public cloud on AWS while using Domino’s data science platform as the orchestration layer for infrastructure provisioning and management. The partnership will allow SAS customers to use Domino as an orchestration layer to access multiple SAS environments for model building, deploy multiple SAS applications on AWS, track each SAS experiment in detail, and reproduce prior work.
What does this mean?
Domino customers with SAS Analytics workloads currently running on-prem will now be able to deploy those workloads to the public cloud on AWS by using SAS Analytics for Containers via the Domino platform. Domino plans to follow up with support for Microsoft Azure and Google Cloud Platform to further enable enterprises to offload containerized SAS workloads to the cloud. By running SAS Analytics for Containers via Domino, Domino users will be able to track, provide feedback on, and reproduce their containerized SAS experiments the same way they do with other experiments they’ve constructed using Python, R, or other tools within Domino.
My name is Lynne Baer, and I’ll be covering the world of data science software for Amalgam Insights. I’ll investigate data science platforms and apps to solve the puzzle of getting the right tools to the right people and organizations.
“Data science” is on the tip of every executive’s tongue right now. The idea that new business initiatives (and improvements to existing ones) can be found in the data a company is already collecting is compelling. Perhaps your organization has already dipped its toes in the data discovery and analysis waters – your employees may be managing your company’s data in Informatica, or performing statistical analysis in Statistica, or experimenting with Tableau to transform data into visualizations.
But what is a Data Science Platform? Right now, if you’re looking to buy software for your company to do data science-related tasks, it’s difficult to know which applications will actually suit your needs. Do you already have a data workflow you’d like to build on, or are you looking to the structure of an end-to-end platform to set your data science initiative up for success? How do you coordinate a team of data scientists to take better advantage of existing resources they’ve already created? Do you have coders in-house already who can work with a platform designed for people writing in Python, R, Scala, or Julia? Are there more user-friendly tools out there your company can use if you don’t? What do you do if some of your data requires tighter security protocols around it? Or if some of your data models themselves are proprietary and/or confidential?
All of these questions are part and parcel of the big one: How can companies tell what makes a good data science platform for their needs before investing time and money? Are traditional enterprise software vendors like IBM, Microsoft, SAP, and SAS dependable in this space? What about companies like Alteryx, H2O.ai, KNIME, and RapidMiner? Other popular platforms to consider include Anaconda, Angoss (recently acquired by Datawatch), Domino, Databricks, Dataiku, MapR, MathWorks, Teradata, and TIBCO. And then there are new startups like Sentenai, focused on streaming sensor data, and slightly more established companies like Cloudera looking to expand from their existing offerings.
Over the next several months, I’ll be digging deeply to answer these questions, speaking with vendors, users, and investors in the data science market. I would love to speak with you, and I look forward to continuing this discussion. And if you’ll be at Alteryx Inspire in June, I’ll see you there.
Recommended Audience: CIOs, Enterprise Architects, Data Managers, Analytics Managers, Data Scientists, IT Managers
Vendors Mentioned: Trifacta, Paxata, Datameer, Datawatch, Lavastorm, Alation, Tamr, Unifi, 1010Data, Podium Data, IBM, Domo, Microsoft, Information Builders, Board, MicroStrategy, Cloudera, H2O.ai, RapidMiner, Domino Data Lab, Dataiku, TIBCO, SAS, Amazon Web Services, Google, DataRobot.
In case you missed it, I just finished up my webinar on Data and Analytic Strategies for Developing Ethical IT. We are headed into a new algorithmic, statistical, and heterogeneous data-defined model of IT where IT ethics and relevance are being challenged. In this webinar, we discussed:
- Why IT is broken from a support and business perspective
- The aspects of IT that can be fixed
- What we can do as IT managers to fix IT
- Data Prep, Data Unification, Business Intelligence, Data Science, and Machine Learning vendors that can help unlock the Black Boxes and Opt-Out disasters in IT
- Key Recommendations
This webinar provides context to my ongoing research tracks of “BI to AI on Shared Data” and “IT Management at Scale.” To attend the webinar, please check the embedded view below or click to watch on BrightTALK.