On May 22, Domino held its first Analyst Seminar in advance of its Rev conference for data science leaders. Domino provides an open data science platform to coordinate data science initiatives across enterprises, integrating data scientists, IT, and line of business.
At the Analyst Seminar, Domino introduced its Model Management framework: five pillars supporting a core belief that data science should not be just a siloed department or team, but that its resulting models should drive the business. For this to be possible, all relevant stakeholders across the enterprise will need to buy into data science initiatives, as this will involve changes to existing business processes in order to take advantage of the knowledge gained from data science projects.
Domino finds itself doing a fair amount of primary education surrounding the need for data science platforms in general. Among Domino’s customers, few are doing data science at scale – only 10% of the 250-plus organizations surveyed have more than 50 models in production. So presenting a framework to help data science teams, IT, and line of business get on the same page has the potential to accelerate adoption by making the data science process tangible and understandable.
Fleshing Out Model Management
“Model management” is commonly used among data science platforms to refer to the general practice of tracking models while they are in production. Domino’s definition expands the term beyond this specific practice to encompass building, delivering, validating, and monitoring models at scale to create a competitive advantage – an end-to-end model creation methodology. From Domino’s perspective, the five pillars of a good Model Management practice are:
- Model Technology: You can’t “do data science” without computers that can handle the massive compute loads imposed by analyzing large data sets with complex models. Likewise, you need software tools that your data scientists are familiar with in order to do their work – and above a certain team size, you’ll be looking at a data science platform to permit collaboration on projects.
- Model Development: The value-added work of data science practitioners is to create and modify models; in Domino’s framework, this is home base for data scientists.
- Model Production: Here, data science work gets operationalized. It’s where models are deployed either into business-relevant reports, or into models or APIs that can be turned into a service or plugged into a larger software application.
- Model Governance: This is where Domino envisions data science leaders spending most of their time, monitoring their data scientists’ projects.
- Model Context: Domino’s Model Context should be considered a library or catalog of all of the products of data science work: models, reports, APIs.
Domino goes into more depth on these concepts in their whitepaper Introducing Model Management.
If Domino wants the market to think of data science in terms of this framework, then how does this framework map onto the Domino data science platform?
- Data scientists begin their research within their company’s Model Context in Domino’s Knowledge Center.
- To augment that, they then pull the appropriate tools into Domino’s Lab, where Model Development occurs.
- When a model is finished, data scientists move it into Model Production to turn it into a report, an app, or a similar deliverable that line of business users can then reference in Domino’s Launchpad.
- Organization leaders can monitor their team’s data science initiatives in Domino’s Control Center as Model Governance.
- Model Technology is the available toolkit for the entire endeavor, encompassing Domino as a whole.
From a macro view, Domino’s data science platform is on par with other data science platforms, and its Model Management framework reflects the typical enterprise data science workflow at a high level. The difference between that and the workflow for smaller data science teams or individual contributors is in the level of collaboration necessary. An individual contributor data scientist typically does a subset of three things: preparing data for use in their model, actual model development, then preparing the model to be operationalized by a software developer. Enterprise-grade data science adds a level of collaboration by centralizing digital resources (such as data, models, reports, APIs) for all relevant stakeholders to access, as well as a level of governance from a higher-level view concerned with relevant business and regulatory requirements.
The need for adding these capabilities when moving from individual contributor-scale data science to enterprise-grade data science isn’t always obvious. This drives the need both for educating enterprises on best practices in conducting collaborative data science and for providing a platform that will accept the work that individual data science contributors have already created.
Amalgam believes that Domino’s Model Management framework will be most useful for organizations fitting one of the following profiles.
1. If your enterprise already has a dedicated data science department in place, and is seeking to address core business issues where data science can help, the Model Management framework provides a guide for scaling data science initiatives.
2. If your organization has active data science initiatives within a particular department, and is looking to expand data science activities throughout the company, the Model Management framework provides a map of where you are trying to go. To be fully prepared, though, you need an executive champion leading the process of getting all relevant stakeholders on the same page when it comes to using data science to drive business for your company: data scientists and data science leads, IT, and line of business. Building a culture of data science in your organization will be key to doing enterprise-grade data science.
3. If data science activities form the core of your business, the Model Management framework provides a guide to doing collaborative enterprise-grade data science.
For Domino: Domino recognizes that they are in an emerging market with their data science platform. By providing the Model Management framework, they are conducting an important primary education task for the data science platforms market: teaching enterprises how to effectively operationalize data science. But the vast majority of enterprises haven’t formalized the practice of data science in their organizations to the same extent that they have for data management, business intelligence, and analytics. They want to do data science, but don’t know how to get there.
Right now, trying to find useful, relevant information on implementing enterprise-grade data science initiatives is like trying to find the proverbial needle in a haystack. For their message to stand out, Domino will need to double down on educating the market, likely in concert with partners and key allies to help spread the word. In this context of prioritizing education, Domino’s guide to managing data science at scale is just as important as their Model Management framework for the level of training and education they need to provide.
Based on my observations of how difficult the data science platforms market is to navigate, I am working on a Data Science Platforms Vendor Landscape, scheduled for publication in Fall 2018. In this Vendor Landscape, I will be evaluating relevant data science platforms in the context of key market trends and challenges.