Enterprise Data World 2019: Data Science Will Take Over The World! … Eventually.

Amalgam Insights attended Enterprise Data World, a conference focused on data management, in late March. Though the conference tracks covered a wide variety of data practices, our primary interest was in the sessions on the AI and Machine Learning track. We came away with the impression that the data management world is starting to understand and support some of the challenges that organizations face when trying to get complex data initiatives off the ground, but that the learning process will continue to have growing pains.

Data Strategy Bootcamp

I began my time at Enterprise Data World with the Data Strategy Bootcamp on Monday. Organizations often focus on getting smaller data projects done quickly in a tactical fashion at the expense of consciously developing their broader data strategy. The bootcamp addressed how to incorporate these “quick wins” into the bigger picture, and delved into what a data strategy should include and what the process of building one looks like. For people in data analyst and data scientist roles, understanding and contributing to your organization’s data strategy matters because well-documented and properly managed data means analysts and data scientists can spend more of their time doing analytics and building machine learning models; the “data scientists spend 80% of their time cleaning and preparing data” figure continues to circulate without measurable improvement. To build a successful data strategy, organizations need to identify data-centric business goals that align the data strategy with the business strategy, assess the organization’s maturity and capabilities across its data ecosystem, and determine both long-term goals and “quick wins” that provide measurable progress toward them.

Getting Started with Data Science, Machine Learning, and Artificial Intelligence Initiatives

Actually getting started on data science, machine learning, and artificial intelligence initiatives remains a point of confusion for many organizations looking to expand beyond the basic data analytics they’re currently doing. Kristin Serafin and Lizzie Westin of FINRA, along with Vinay Seth Mohta of Manifold, led sessions on turning talk about machine learning and artificial intelligence into action in your organization, and doing so in a way that can scale up quickly. The key takeaways: your organization needs to understand its data well enough to know which of its questions require a machine learning approach; it needs to understand what tools are necessary to move forward; it needs to know who already has pertinent data skills in-house and who is best positioned to build the skills that are missing; and it needs buy-in from the relevant stakeholders.

Data Job Roles

Data job roles were discussed in multiple sessions; I attended one on how analytical jobs themselves are evolving, and one on analytical career development. Despite the hype, not everyone is a data scientist, even if they perform some tasks that are part of a data science pipeline. Data engineers are the difference between data scientists’ experiments sitting in silos and models running in production where they can affect your company. Data analysts aren’t going anywhere – yet. (Though Michael Stonebraker, in his keynote Tuesday morning, stated that he believed data science would eventually replace BI, pending the upskilling of a sufficient number of data workers.) And data scientists spend 80% of their time doing data prep instead of building machine learning models; they’d like to do more of the latter, and because they’re an expensive asset, the business needs them doing less prep and more building as well.

By the same token, there are many different specialties across the data environment, and the tool landscape is enormous. No one will know everything; even relatively junior people will need to provide leadership in their particular roles to bridge the much-bemoaned gap between IT and the business. How can data people do that? They’ll need to learn to talk about their initiatives and accomplishments in business terms – increasing revenue, decreasing cost, managing risk. By doing this, data strategy can be tied to business strategy, and this barrier to success can be surmounted.

Data Integration at Scale

Michael Stonebraker’s keynote highlighted the growing need for people with data science capabilities, but the real meat of his talk centered on how to support complex data science initiatives: doing data integration at scale. One example: General Electric’s procurement system problem. The ideal number of procurement systems in any company is, of course, one. Thanks to mergers and acquisitions, GE had accumulated *75* procurement systems over time. The company stood to save $100M if it could bring all of these systems together, along with the terms and conditions negotiated with each vendor through each of them – but that required a rather complex data integration effort. And once that effort was done, the same process remained for the supplier databases, the customer databases, and a whole host of other data. Machine learning can help with this – once there are enough people with machine learning skills to tackle problems of this size. For now, data integration at scale will remain a significant challenge for enterprises: machine learning skills are relatively costly and rare, data accumulation continues to grow exponentially, and third-party data is increasingly brought in to supplement existing analyses.
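Conceptually, much of this machine-assisted integration work comes down to scoring candidate record pairs for similarity and flagging likely duplicates for review. A minimal sketch of that idea, using only pandas and the Python standard library on hypothetical vendor records (not GE’s actual data or Stonebraker’s specific method), might look like this:

```python
import pandas as pd
from difflib import SequenceMatcher

# Hypothetical vendor records from two of the many procurement systems.
system_a = pd.DataFrame({
    "vendor_id": [101, 102],
    "vendor_name": ["Acme Industrial Supply Co.", "Globex Corporation"],
})
system_b = pd.DataFrame({
    "vendor_id": ["B-77", "B-78"],
    "vendor_name": ["ACME Industrial Supply", "Initech LLC"],
})

def name_similarity(a: str, b: str) -> float:
    """Rough string similarity between two vendor names (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

# Score every cross-system pair and keep likely duplicates.
pairs = system_a.merge(system_b, how="cross", suffixes=("_a", "_b"))
pairs["score"] = pairs.apply(
    lambda r: name_similarity(r["vendor_name_a"], r["vendor_name_b"]), axis=1
)
matches = pairs[pairs["score"] > 0.8]
print(matches[["vendor_name_a", "vendor_name_b", "score"]])
```

A production entity-resolution system would train a model over many such similarity features (names, addresses, tax IDs) and route low-confidence pairs to human reviewers, but the pairwise-scoring core is the same.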

Knowledge Graphs and Semantic AI

A number of sessions discussed knowledge graphs and their importance for supporting both data management and data science tasks. Knowledge graphs provide a “semantic” layer over standard relational databases – they prioritize documenting the relationships between entities, making it easier to understand how different parts of your organization’s data are interrelated. Because a knowledge graph provides natural-language context around data relationships, it can make machine learning models built on that data more “explainable,” thanks to the additional human-legible information available for interpretation. Search is another example: most search results rely on exact matches, but a knowledge graph makes it simple to pull up “related” results based on the relationships it documents.
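To make the “related results” idea concrete, here is a minimal sketch using networkx and a handful of hypothetical entities. Real knowledge graphs typically live in RDF or property-graph stores rather than in-memory Python objects, but the traversal principle is the same:

```python
import networkx as nx

# A tiny, hypothetical knowledge graph: nodes are business entities,
# edges carry the relationship as a label.
kg = nx.Graph()
kg.add_edge("Customer Churn Model", "customer_accounts", relation="trained_on")
kg.add_edge("customer_accounts", "CRM System", relation="sourced_from")
kg.add_edge("customer_accounts", "Marketing Team", relation="owned_by")
kg.add_edge("Customer Churn Model", "Retention Campaign", relation="feeds")

def related(entity: str):
    """Return entities one hop away, with the relationship that connects them."""
    return [(nbr, kg[entity][nbr]["relation"]) for nbr in kg.neighbors(entity)]

# A "related results" lookup instead of an exact-match search.
print(related("customer_accounts"))
# [('Customer Churn Model', 'trained_on'), ('CRM System', 'sourced_from'),
#  ('Marketing Team', 'owned_by')]
```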

Data Access, Control, and Usage

My big takeaway from Scott Taylor’s Data Architecture session: data should be a shared, centralized asset for your entire organization; it must be 1) accessible by its consumers, 2) in the format they require, 3) via the method they require, 4) only if they have permission to access it (security), and 5) used in a way that abides by governance standards and laws. Data scientists care about this because they need data to do their jobs, and any hurdle in accessing usable data makes it more likely they’ll bypass official access methods. Nobody has three months to wait for a data requisition from IT’s data warehouses to be turned around anymore; instead, “I’ll just use this data copy on my desktop” – or, more likely these days, in a cloud-hosted data silo. Making centralized access easy to use makes data users far more likely to comply with data usage and access policies, which in turn helps secure data properly, govern its use appropriately, and prevent data silos from forming.

Digging a bit more into the security and governance aspects mentioned above, it’s surprisingly easy to identify individuals in a set of anonymized data. In his presentation, Matt Vogt of Immuta demonstrated this with a dataset of anonymized NYC taxi data, even as more and more information was redacted from it. Jeff Jonas of Senzing took this further in his keynote – as context accumulates around data, it gets easier to make inferences, even when your data is far from clean. With GDPR already in force and CCPA coming into effect in nine months, how data workers can use data, ethically and legally, will shift, significantly affecting data workflows. Both the use of data and the results produced by black-box machine learning models will be challenged.
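The re-identification risk is easy to see with a toy example: if “anonymized” trip records still carry quasi-identifiers like pickup time and location, a simple join against outside knowledge recovers the individual. The data below is entirely synthetic, and the attack shown is the generic linkage approach rather than Vogt’s specific demonstration:

```python
import pandas as pd

# Synthetic, "anonymized" trip records: the rider's name is gone, but
# quasi-identifiers (pickup time and location) remain.
trips = pd.DataFrame({
    "trip_id": [1, 2, 3],
    "pickup_time": ["2019-03-18 08:03", "2019-03-18 08:05", "2019-03-18 23:41"],
    "pickup_block": ["5th Ave & 42nd St", "Broadway & W 50th St", "5th Ave & 42nd St"],
    "fare": [11.50, 8.75, 23.00],
})

# Outside knowledge an attacker might have (e.g., a geotagged social media post).
known = pd.DataFrame({
    "person": ["Alice"],
    "pickup_time": ["2019-03-18 08:03"],
    "pickup_block": ["5th Ave & 42nd St"],
})

# A simple join on the quasi-identifiers re-identifies the trip -- and its fare.
reidentified = known.merge(trips, on=["pickup_time", "pickup_block"])
print(reidentified[["person", "trip_id", "fare"]])
```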

Recommendations

Data scientists and machine learning practitioners should familiarize themselves with the broader data management ecosystem. Practitioners already understand why dirty data is problematic – they spend most of their work hours cleaning it before they can get to actual model-building – but numerous tools exist to help with that process, and in some cases to remove the need for a cleaning job that has already been done once. As enterprise data catalogs become more common, they will keep data scientists from spending hours on duplicative work when someone else has already cleaned the dataset they were planning to use and made it available to the organization.

Data scientists and data science managers should also learn how to communicate the business value of their data initiatives when speaking to business stakeholders. From a technical point of view, making a model more accurate is an achievement in and of itself; the harder part is articulating what that improved accuracy or speed means for the business as a whole. Maybe your 1% improvement in model accuracy means you save your company tens of thousands of dollars by more accurately targeting potential customers who are ready to buy your product – that’s what will get the attention of your line-of-business partners.

Data science directors and Chief Data or Chief Analytics Officers should approach building their organization’s data strategy and culture with the long-term view in mind. Aligning the data strategy with the organization’s business strategy is crucial to success. Rather than having data and business teams pulling in different directions, develop an understanding of each other’s needs and capabilities and apply that knowledge to keep everyone focused on the same goal.

Chief Data Officers and Chief Analytics Officers should understand their organization’s capabilities by assessing both the data capabilities and capacity available at the individual level and the general maturity of each data practice area (such as Master Data Management, Data Integration, and Data Architecture). Knowing the availability of both technical and human resources is necessary to develop a scalable set of data processes that delivers consistent results no matter which data scientist or analyst executes a given project.

As part of developing their organization’s data strategy, Chief Data Officers and Chief Analytics Officers must work with their legal department to develop rules and processes for accumulating, storing, accessing, and using data appropriately. As laws like GDPR and the California Consumer Privacy Act start being enforced, data access and usage will come under much greater scrutiny, and companies that fail to adhere to the letter of those laws will find themselves fined heavily. Data scientists and data science managers working on projects that involve sensitive or personal data should talk to their general counsel to ensure they remain on the right side of the law.

Data Science and Machine Learning News Roundup, March 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Dataiku Releases Version 5.1 in Anticipation of AI’s Surge in the Enterprise

Dataiku released version 5.1 of its software platform. The release includes a GDPR framework for governance and control, as well as user-experience upgrades: the ability to copy and reuse analytic workflows in new projects, support for coders to work in their preferred development environment from within Dataiku, and easier navigation of complex analytics projects whose data sources may number in the hundreds.

Being able to document when sensitive data is being used, and to prevent inappropriate use of such data, is key for companies trying to work within GDPR and similar laws without losing significant funds to violations. Dataiku’s inclusion of a governance component within its data science platform distinguishes it from competitors, many of whom lack such a component natively, and enhances Dataiku’s attractiveness as a data science platform.

Domino Data Lab Platform Enhancements Improve Productivity of Data Science Teams Across the Entire Model Lifecycle

Domino announced three new capabilities for its data science platform. Datasets is a high-performance data store that makes it easier for data scientists to find, share, and reuse large data resources across multiple projects, saving time in the search process. Experiment Manager gives data science teams a system of record for ongoing experiments, making it easier to avoid unnecessary duplicate work. Activity Feed surfaces a running record of project changes for data science leads who may be tracking multiple projects at once. Together, these three collaboration capabilities enhance Domino users’ ability to do data science in a documented, repeatable, and mature fashion.

SAS Announces $1 Billion Investment in Artificial Intelligence (AI)

SAS announced a $1B investment in AI across three key areas: research and development, education initiatives, and a Center of Excellence. The goals are to enable SAS users to apply AI to some degree even without a significant baseline of AI skills, to help SAS users improve that baseline through training, and to help organizations using SAS bring AI projects into production more quickly with the help of AI experts as consultants. A significant percentage of SAS users aren’t currently using SAS to perform complex machine learning and artificial intelligence tasks; helping these users get actual SAS-based AI projects into production enhances SAS’ ability to sell its AI software.

NVIDIA-Related Announcements

H2O.ai and SAS both announced partnerships with NVIDIA this month. H2O.ai’s Driverless AI and H2O4GPU are now optimized for NVIDIA’s Data Science Workstations, and NVIDIA RAPIDS will be integrated into H2O as well. SAS disclosed plans to expand NVIDIA GPU support across SAS Viya and to use these GPUs and the CUDA-X AI acceleration libraries to support SAS’ AI software. Both H2O.ai and SAS are using NVIDIA’s GPUs and CUDA-X to make certain types of machine learning algorithms run more quickly and efficiently.

These follow prior announcements of NVIDIA partnerships with IBM, Oracle, Anaconda, and MathWorks, reflecting NVIDIA’s importance in machine learning. With NVIDIA holding an estimated 70% of the worldwide GPU market, data science and machine learning programs and platforms need to work well on the de facto default GPU.

At IBM Think, Watson Expands “Anywhere”

At IBM Think in February, IBM made several announcements around the expansion of Watson’s availability and capabilities, framing these announcements as the launch of “Watson Anywhere.” This piece is intended to provide guidance to data analysts, data scientists, and analytic professionals seeking to implement machine learning and artificial intelligence capabilities and evaluating the capabilities of IBM Watson’s AI and machine learning services for their data.

Announcements

IBM declared that Watson is now available “anywhere” – both on-prem and in any cloud configuration, whether private, public, multi-cloud, or hybrid. Data that needs to remain in place for privacy and security reasons can now have Watson microservices act on it where it resides, and the obstacle of cloud vendor lock-in can be avoided by bringing the code to the data instead of vice versa. This ubiquity is made possible via a connector from IBM Cloud Private for Data that makes these services available via Kubernetes containers. New Watson services available via this connector include Watson Assistant, IBM’s virtual assistant, and Watson OpenScale, an AI operation and automation platform.

Watson OpenScale is an environment for managing AI applications that puts IBM’s Trust and Transparency principles into practice around machine learning models. It builds trust in these models by providing explanations of how they come to the conclusions they do, opening up what is often seen as a “black box” by making model processes auditable and traceable. OpenScale also claims the ability to automatically identify and mitigate bias in models, suggesting new data for model retraining. Finally, OpenScale provides monitoring of AI in production, validating ongoing model accuracy and health from a central management console.
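IBM does not detail OpenScale’s internals here, but the per-prediction explanations it describes are the same kind of output produced by general-purpose feature-attribution methods. As a generic illustration only – not Watson’s API – a SHAP-based sketch on a scikit-learn model looks like this:

```python
# Generic feature-attribution sketch (not Watson OpenScale's API): SHAP values
# show how much each feature pushed a single prediction away from the baseline.
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

data = load_diabetes()
model = RandomForestRegressor(n_estimators=100, random_state=0).fit(data.data, data.target)

explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(data.data[:1])  # explain one prediction

# Rank features by how strongly they influenced this particular prediction.
contributions = sorted(
    zip(data.feature_names, shap_values[0]), key=lambda x: abs(x[1]), reverse=True
)
print(contributions[:5])  # the five most influential features for this prediction
```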

Watson Assistant lets organizations build conversational bot interfaces into applications and devices. When interacting with end users, it can perform searches of relevant documentation, ask the user for further clarification, or redirect the user to a person for sufficiently complex queries. Its availability as part of Watson Anywhere permits organizations to implement and run virtual assistants in clouds outside of the IBM Cloud.

These new services join other Watson services currently available via the IBM Cloud Private for Data connector including Watson Studio and Watson Machine Learning, IBM’s programs for creating and deploying machine learning models. Additional Watson services being made available for Watson Anywhere later this year include Watson Knowledge Studio and Watson Natural Language Understanding.

IBM also announced IBM Business Automation with Watson, a future AI capability that will let businesses further automate existing work processes by analyzing workflows for commonly repeated tasks. Currently, this capability is available via limited early access; general availability is anticipated later in 2019.

Recommendations

Organizations seeking to analyze data “in place” have a new option with Watson services now accessible outside of the IBM Cloud. Data that must remain where it is for security and privacy reasons can now have Watson analytics processes brought to it via a secure container, whether that data resides on-prem or in any cloud, not just the IBM Cloud. This opens up the possibility of using Watson for enterprises in regulated industries like finance, government, and healthcare, as well as in departments where governance and auditability are core requirements, such as legal and HR.

With the IBM Cloud Private for Data connector enabling Watson Anywhere, companies now have a net-new reason to consider IBM products and services in their data workflow. While Amazon and Azure dominate the cloud market, Watson’s AI and machine learning tools are generally easier to use out of the box. For companies who have made significant commitments to other cloud providers, Watson Anywhere represents an opportunity to bring more user-friendly data services to their data residing in non-IBM clouds.

Companies concerned about the “explainability” of machine learning models, particularly in regulated industries or for governance purposes, should consider using Watson OpenScale to monitor models in production. Because OpenScale can provide visibility into how models behave and make decisions, concerns about “black box models” can be mitigated with the ability to automatically audit a model, trace a given iteration, and explain how the model determined its outcomes. This transparency boosts the ability for line of business and executive users to understand what the model is doing from a business perspective, and justify subsequent actions based on that model’s output. For a company to depend on data-driven models, those models need to prove themselves trustworthy partners to those driving the business, and explainability bridges the gap between the model math and the business initiatives.

Finally, companies planning for long-term model usage need to consider how they will support model monitoring and maintenance. Longevity is a real concern for machine learning models in production: model drift reflects changes in the underlying data and business conditions that your company needs to be aware of. How do companies ensure that model performance and accuracy hold up over the long haul? What thresholds determine when a model requires retraining, or should be pulled from production? Consistent monitoring and maintenance of operationalized models is key to their ongoing dependability.
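As a minimal sketch of what such a monitoring check might look like in practice – assuming labeled outcomes eventually arrive for scored predictions, and with the window size and threshold chosen purely for illustration:

```python
from collections import deque

class AccuracyMonitor:
    """Track rolling accuracy of a deployed model and flag when it degrades.

    A minimal sketch: real monitoring would also watch input-data drift,
    latency, and prediction distributions, not just labeled outcomes.
    """

    def __init__(self, window: int = 500, retrain_threshold: float = 0.85):
        self.outcomes = deque(maxlen=window)   # 1 = correct, 0 = incorrect
        self.retrain_threshold = retrain_threshold

    def record(self, predicted, actual) -> None:
        self.outcomes.append(1 if predicted == actual else 0)

    @property
    def rolling_accuracy(self) -> float:
        return sum(self.outcomes) / len(self.outcomes) if self.outcomes else 1.0

    def needs_retraining(self) -> bool:
        # Only alert once the window has enough observations to be meaningful.
        return (len(self.outcomes) == self.outcomes.maxlen
                and self.rolling_accuracy < self.retrain_threshold)

# Usage: feed each scored-and-resolved prediction into the monitor.
monitor = AccuracyMonitor(window=500, retrain_threshold=0.85)
monitor.record(predicted=1, actual=0)
if monitor.needs_retraining():
    print("Rolling accuracy below threshold; schedule model retraining/review.")
```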

Data Science and Machine Learning News Roundup, February 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Four Key Announcements from H2O World in San Francisco

At H2O World in San Francisco, H2O.ai made several important announcements. Partnerships with Alteryx, Kx, and Intel will extend Driverless AI’s accessibility, capabilities, and speed, while improvements to Driverless AI, H2O, Sparkling Water, and AutoML focused on expanding support for more algorithms and heavier workloads. Amalgam Insights covered H2O.ai’s H2O World announcements.

IBM Watson Now Available Anywhere

At IBM Think in San Francisco, IBM announced the expansion of Watson’s availability “anywhere” – on-prem, and in any cloud configuration, whether private or public, singular or multi-cloud. Data no longer has to be hosted on the IBM Cloud to use Watson on it – instead, a connector from IBM Cloud Private for Data permits organizations to bring various Watson services to data that cannot be moved for privacy and security reasons. Update: Amalgam Insights now has a more in-depth evaluation of IBM Watson Anywhere.

Databricks’ $250 Million Funding Supports Explosive Growth and Global Demand for Unified Analytics; Brings Valuation to $2.75 Billion

Databricks has raised $250M in a Series E funding round, bringing its total funding to just shy of $500M. The funding round raises Databricks’ valuation to $2.75B in advance of a possible IPO. Microsoft joins this funding round, reflecting continuing commitment to the Azure Databricks collaboration between the two companies. This continued increase in valuation and financial commitment demonstrates that funders are satisfied with Databricks’ vision and execution.

Four Key Announcements from H2O World San Francisco

Last week at H2O World San Francisco, H2O.ai announced a number of improvements to Driverless AI, H2O, Sparkling Water, and AutoML, as well as several new partnerships for Driverless AI. The updates provide incremental improvements across the platform, while the partnerships reflect H2O.ai expanding its audience and capabilities. This piece is intended to provide guidance…

Please register or log into your Free Amalgam Insights Community account to read more.

Data Science and Machine Learning News Roundup, January 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Cloudera and Hortonworks Complete Planned Merger

In early January, Cloudera and Hortonworks completed their planned merger. With this, Cloudera becomes the default machine learning ecosystem for Hadoop-based data, while providing Hortonworks customers an easy pathway to expanded machine learning and analytics capabilities.

Study: 89 Percent of Finance Teams Yet to Embrace Artificial Intelligence

A study conducted by the Association of International Certified Professional Accountants (AICPA) and Oracle revealed that 89% of organizations have not deployed AI to their finance groups. Although a correlation exists between companies with revenue growth and companies that are using AI, the key takeaway is that artificial intelligence is still in the early adopter phase for most organizations.

Gartner Magic Quadrant for Data Science and Machine Learning Platforms

In late January, Gartner released its Magic Quadrant for Data Science and Machine Learning Platforms. New to the Data Science and Machine Learning MQ this year are both DataRobot and Google – two machine learning offerings with completely different audiences and scope. DataRobot offers an automated machine learning service targeted towards “citizen data scientists,” while Google’s machine learning tools, though part of Google Cloud Platform, are more of a DIY data pipeline targeted towards developers. By contrast, I find it curious that Amazon’s SageMaker machine learning platform – and its own collection of task-specific machine learning tools, despite their similarity to Google’s – failed to make the quadrant, given this quadrant’s large umbrella.

While data science and machine learning are still emerging markets, the contrasting demands that citizen data scientists and cutting-edge developers place on these technologies warrant splitting the next Data Science and Machine Learning Magic Quadrant into separate reports targeted to the considerations of each audience. In particular, the continued growth of automated machine learning technologies will likely drive such a split, as citizen data scientists pursue “good enough” solutions that provide quick results.

Amazon Expands Toolkit of Machine Learning Services at AWS re:Invent

At AWS re:Invent, Amazon Web Services expanded its toolkit of machine learning application services with the announcements of Amazon Comprehend Medical, Amazon Forecast, Amazon Personalize, and Amazon Textract. These new services augment the capabilities Amazon provides to end users for text analysis, personalized recommendations, and time series forecasting. The continued growth of these individual services removes obstacles for companies looking to get started with common machine learning tasks on a smaller scale; rather than building a wholesale data science pipeline in-house, these services let companies quickly get one task done, permitting an incremental introduction to machine learning. Forecast, Personalize, and Textract are in preview, while Comprehend Medical is available now.

Amazon Comprehend Medical, Forecast, Personalize, and Textract join a collection of machine learning services that include speech recognition (Transcribe) and translation (Translate), speech-to-text and text-to-speech (Lex and Polly) to power machine conversation such as chatbots and Alexa, general text analytics (Comprehend), and image and video analysis (Rekognition).

New Capabilities

Amazon Personalize lets developers add personalized recommendations to their apps, based on an activity stream from that app and a corpus of what can be recommended, whether products, articles, or other items. Beyond recommendations, Personalize can also be used to customize search results and notifications. By combining a given search string or location with contextual behavioral data, Amazon aims to help customers deliver more relevant results and, in turn, build their users’ trust.
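Once a Personalize solution has been trained and deployed to a campaign, serving recommendations is a single runtime call. A brief sketch using boto3 – with a placeholder campaign ARN and user ID – might look like:

```python
import boto3

# Once a Personalize campaign has been trained and deployed, recommendations
# are one runtime call away. The campaign ARN and user ID below are placeholders.
personalize_runtime = boto3.client("personalize-runtime", region_name="us-east-1")

response = personalize_runtime.get_recommendations(
    campaignArn="arn:aws:personalize:us-east-1:123456789012:campaign/example-campaign",
    userId="user-42",
    numResults=10,
)

for item in response["itemList"]:
    print(item["itemId"])
```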

Amazon Forecast builds private, custom time-series models that predict future trends from a customer’s own data. Customers provide both historical data and related causal data, and Forecast analyzes it to determine the relevant factors when building its models and generating forecasts.
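Once a forecast has been generated, retrieving predictions for a single item is a lightweight query. A sketch with boto3 follows – the forecast ARN and item ID are placeholders, and the service was still in preview at the time of writing:

```python
import boto3

# Query an existing Amazon Forecast output for one time series.
forecast_query = boto3.client("forecastquery", region_name="us-east-1")

response = forecast_query.query_forecast(
    ForecastArn="arn:aws:forecast:us-east-1:123456789012:forecast/example-demand",
    Filters={"item_id": "sku-1234"},  # which time series to return
)

# Forecasts come back as quantiles (e.g., p10/p50/p90), not a single number.
for point in response["Forecast"]["Predictions"]["p50"]:
    print(point["Timestamp"], point["Value"])
```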

Amazon Textract extracts text and data from scanned documents without requiring manual data entry or custom code. In particular, using machine learning to recognize when data sits in a table or form field, and to treat it accordingly, will save a significant amount of time over standard OCR.
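As a minimal boto3 sketch of the synchronous API – with placeholder S3 bucket and document names – asking Textract for table and form structure in addition to raw text looks like this:

```python
import boto3

# Extract tables and form fields from a scanned document stored in S3.
# Bucket and object names below are placeholders.
textract = boto3.client("textract", region_name="us-east-1")

response = textract.analyze_document(
    Document={"S3Object": {"Bucket": "example-bucket", "Name": "scanned-invoice.png"}},
    FeatureTypes=["TABLES", "FORMS"],  # ask Textract to detect structure, not just text
)

# Blocks describe pages, lines, words, tables, cells, and key-value pairs.
for block in response["Blocks"]:
    if block["BlockType"] == "LINE":
        print(block["Text"])
```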

Finally, Amazon Comprehend Medical, an extension of last year’s Amazon Comprehend, uses natural language processing to analyze unstructured medical text such as doctors’ notes or clinical trial records and extract the relevant information from that text.
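A minimal boto3 sketch of a Comprehend Medical call on an invented example note:

```python
import boto3

# Pull medical entities (conditions, medications, dosages) out of free-text notes.
comprehend_medical = boto3.client("comprehendmedical", region_name="us-east-1")

note = "Patient reports chest pain. Prescribed 81 mg aspirin daily."
response = comprehend_medical.detect_entities(Text=note)

for entity in response["Entities"]:
    print(entity["Category"], entity["Type"], entity["Text"], round(entity["Score"], 2))
```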

Recommendations

Organizations doing resource planning, financial planning, or similar forecasting that currently lack in-house time series capabilities should consider using Amazon Forecast to predict product demand, staffing levels, inventory levels, and material availability, and to perform financial forecasting. Outsourcing the building of complex forecasting models lets departments focus on acting on the predictions.

Consumer-oriented organizations that currently provide generic, uncontextualized recommendations (based on popularity or other simple measures) and want to build higher levels of engagement with their customers should consider using Amazon Personalize to provide personalized recommendations, search results, and notifications via their apps and websites. Providing high-quality, relevant recommendations in the moment builds customer trust in the quality of an organization’s engagement efforts, particularly compared to the average spray-and-pray marketing communication.

Organizations that still depend on physical documents, or that have an archive of physical documents to scan and analyze, should consider using Amazon Textract. OCR’s limits are well known, especially when it comes to accurately interpreting and formatting semi-structured blocks of text such as form fields and tables, and those limits result in significant time spent on manual post-processing correction. Textract handles complex documents without custom code or template maintenance; automating text interpretation and analysis further accelerates document processing workflows and better enables organizations to maintain compliance.

Medical organizations using software that depends on manually implemented rules to process medical text should consider using Amazon Comprehend Medical. By removing the need to maintain a list of rules in-house, Comprehend Medical accelerates the extraction and analysis of medical information from unstructured text such as doctors’ notes and health records, improving processes such as medical coding, cohort analysis for clinical trial recruitment, and patient health monitoring.

All organizations looking to use machine learning services from external providers need to consider whether outsourcing will work for their circumstances. Data privacy is a key concern, and even more so in regulated verticals with industry-specific rules such as HIPAA. Does the service you want to use respect those rules? From a compliance perspective, why a model gives the results it does needs to be explained as well; merely accepting results from the black box at face value is insufficient. Machine learning products that automatically provide such an explanation in plain English do exist, but this feature is still uncommon and in its infancy.

Conclusion

With its latest announcements, Amazon continues to broaden the scope of customer issues it addresses with machine learning services. Medical companies need better text analytics yesterday, but struggle to comply with HIPAA while assessing the data they have. Customer-facing organizations face stiff competition when their competitors are only a click away. And any company trying to plan for the future based on past data grapples with understanding which factors affect future results. Amazon’s machine learning application services address common tactical business issues by reducing task-specific machine learning to inputs and outputs that customers can consume, presenting outsourcing opportunities for overworked departments struggling to keep up.

Data Science and Machine Learning News, November 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, SnapLogic, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Continue reading “Data Science and Machine Learning News, November 2018”

Data Science and Machine Learning News, October 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Please register or log into your Free Amalgam Insights Community account to read more.

Data Science Platforms News Roundup, September 2018

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Please register or log into your Free Amalgam Insights Community account to read more.