On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta.
Companies Mentioned: Aberdeen Group, Actian, Alation, Arcadia Data, Attunity, BMC, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataKitchen, Datameer, Datarobot, Domino Data Lab, EMA, HPE, Hurwitz and Associates, IBM, Informatica, Kogentix, LogTrust, Looker, < MesoSphere, Micro Focus, Microstrategy, Ovum, Paxata, Podium Data, Qubole, SAP, Snowflake, Strata Data, Tableau, Tamr, Tellius, Trifacta.
Last week, I attended Strata Data Conference at the Javitz Center in New York City to catch up with a wide variety of data science and machine learning users, enablers, and thought leaders. In the process, I had the opportunity to listen to some fantastic keynotes and to chat with 30+ companies looking for solutions, 30+ vendors presenting at the show, and attend with a number of luminary industry analysts and thought leaders including Ovum’s Tony Baer, EMA’s John Myers, Aberdeen Group’s Mike Lock, and Hurwitz & Associates’ Judith Hurwitz.
From this whirwind tour of executives, I took a lot of takeaways from the keynotes and vendors that I can share and from end users that I unfortunately have to keep confidential. To give you an idea of what an industry analyst notes, following are a short summary of takeaways I took from the keynotes and from each vendor that I spoke to:
Keynotes: The key themes that really got my attention is the idea that AI requires ethics, brought up by Joanna Bryson, and that all data is biased, which danah boyd discussed. This idea that data and machine learning have their own weaknesses that require human intervention, training, and guidance is incredibly important. Over the past decade, technologists have put their trust in Big Data and the idea that data will provide answers, only to find that a naive and “unbiased” analysis of data has its own biases. Context and human perspective are inherent to translating data into value: this does not change just because our analytic and data training tools are increasingly nuanced and intelligent in nature.
Behind the hype of data science, Big Data, analytic modeling, robotic process automation, DevOps, DataOps, and artifical intelligence is this fundamental need to understand that data, algorithms, and technology all have inherent biases as the following tweet shows: