Last week at H2O World San Francisco, H2O.ai announced a number of improvements to Driverless AI, H2O, Sparkling Water, and AutoML, as well as several new partnerships for Driverless AI. The product updates are incremental gains across the platform, while the partnerships reflect H2O.ai expanding its audience and capabilities. This piece is intended to provide guidance to data analysts, data scientists, and analytic professionals working to include machine learning in their workflows.
Announcements
H2O.ai has integrated H2O Driverless AI with Alteryx Designer; the connector is available for download in the Alteryx Analytics Gallery. This will permit Alteryx users to incorporate more advanced, automated machine learning algorithms into analytic workflows in Designer, and to perform automatic feature engineering for their machine learning models. In addition, Driverless AI models can be deployed to Alteryx Promote for model management and monitoring, reducing time to deployment. Both of these new capabilities give Alteryx-using business analysts and citizen data scientists more direct and expanded access to machine learning via H2O.ai.
H2O.ai is integrating Kx’s time-series database, kdb+, into Driverless AI. This will extend Driverless AI’s ability to process large datasets, resulting in faster identification of more performant predictive capabilities and machine learning models. Kx users will be able to perform feature engineering for machine learning models on their time series datasets within Driverless AI, and create time-series specific queries.
H2O.ai also announced a collaboration with Intel that will focus on accelerating H2O.ai technology on Intel platforms, including the Intel Xeon Scalable processor and H2O.ai’s implementation of XGBoost. The collaboration also covers Driverless AI on Intel, globally. Accelerating H2O on Intel will help establish Intel’s credibility in machine learning and artificial intelligence for heavy compute loads. Other aspects of this collaboration will include expanding the reach of data science and machine learning by supporting efforts to integrate AI into analytics workflows and using Intel’s AI Academy to teach relevant skills. The details of the technical projects will remain under wraps until spring.
Finally, H2O.ai announced numerous improvements to both Driverless AI and its open-source H2O, Sparkling Water, and AutoML, mostly focused on expanding support for more algorithms and heavier workloads across its product suite. Among the improvements that caught my eye was the new ability to inspect trees thoroughly for all of the tree-based algorithms that the open-source H2O platform supports. There is ongoing concern about “black-box” models: a lack of insight into how a given model performs its analysis and why it yields the results it does for any given experiment. Providing an API for tree inspection is a practical step towards making the logic behind model performance and output more transparent for at least some machine learning models.
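To make the idea of tree inspection concrete, here is a minimal sketch in plain Python. It is not H2O's API: the `Node` class, the `explain` function, and the example features and values are all invented for illustration. It only shows why access to a tree's splits and thresholds lets an analyst state, in plain terms, how a prediction was reached.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    """One node of a toy binary decision tree (hypothetical, not H2O's structure)."""
    feature: Optional[str] = None       # feature to split on; None for a leaf
    threshold: Optional[float] = None   # go left when the value is below this
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    prediction: Optional[float] = None  # set only on leaf nodes


def explain(node: Node, row: Dict[str, float]) -> List[str]:
    """Return the readable sequence of decisions the tree makes for one row."""
    path = []
    while node.prediction is None:
        value = row[node.feature]
        if value < node.threshold:
            path.append(f"{node.feature} = {value} < {node.threshold}")
            node = node.left
        else:
            path.append(f"{node.feature} = {value} >= {node.threshold}")
            node = node.right
    path.append(f"predict {node.prediction}")
    return path


# Invented example: a tiny tree scoring customer churn risk.
tree = Node("tenure_months", 12.0,
            left=Node("support_calls", 3.0,
                      left=Node(prediction=0.2),
                      right=Node(prediction=0.8)),
            right=Node(prediction=0.1))

steps = explain(tree, {"tenure_months": 6.0, "support_calls": 5.0})
print("\n".join(steps))
```

In a real tree-based model there may be hundreds of such trees, so per-tree inspection is a building block for governance rather than a complete explanation on its own.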
Recommendations
Alteryx users seeking to incorporate machine learning models into analytic workflows should take advantage of increased access to H2O Driverless AI. Giving business analysts and citizen data scientists more machine learning options broadens what their data analytics workflows can do; Driverless AI’s existing AutoDoc capability will be particularly useful for ensuring Alteryx users understand the results of the more advanced techniques they now have access to.
If your organization collects time-series data but has not yet analyzed it with machine learning, consider trialing Kx’s kdb+ and H2O’s Driverless AI. With this integration, Driverless AI will be able to quickly and automatically process time-series data stored in kdb+, allowing swift identification of performant models and predictive capabilities.
If your organization is considering making significant investments in heavy-duty computing assets for heavy machine learning loads in the medium-term future, keep an eye on the work Intel will be doing to design chips for specific types of machine learning workloads. NVIDIA has its GPUs and Google its TPUs; by partnering with H2O, Intel is declaring its intentions to remain relevant in this market.
If your organization is concerned about the effects of “black box” machine learning models, the ability to inspect tree-based models in H2O, along with the AutoDoc functionality in Driverless AI, is starting to make the logic behind machine learning models in H2O more transparent. This new ability to inspect tree-based algorithms is a key step towards more thorough governance surrounding the results of machine learning endeavors.