5 MegaThemes for the 2020s That Will Transform IT

2020s Tech MegaTrends
As we get ready for 2020, Amalgam Insights is here to prepare companies for the future.  In the past few weeks, we’ve been posting insights on what to look for in 2020 with posts including:



and our four-part series on Ethical AI for the future:


and


Over this decade, we have learned how to work with technology at massive scale and with unprecedented power as the following technology trends surfaced in the 2010s:


  • The birth and death of Big Data in supporting massive scale as the terabyte shifted from an intimidating amount of data to a standard unit of measurement
  • The evolution of cloud computing from niche tool to a rapidly growing market that is roughly $150 billion a year now and will likely be well over a trillion dollars a year by the end of the 2020s
  • The Internet of Things, which will enable a future of distributed and specialized computing based on billions of processors and build on this decade’s massive progress in creating mobile and wireless smart devices.
  • The democratization of artificial intelligence tools including machine learning, deep learning, and data science services and platforms that have opened up the world of AI to developers and data analysts
  • The use of CRISPR Cas9 to programmatically edit genes, which has changed the biological world just as AI has changed the world of technology
  • Brain biofeedback and Brain-Computer Interfaces, which provide direct neural interfaces to control and affect a physical environment.
  • Extended Reality, through the development of augmented and virtual reality which are starting to provide realistic sensory simulations available on demand
2010s Tech Drivers
These bullet points describe where we already are today as of the end of 2019. So, how will all of these technologies affect the way we work in the 2020s? From our perspective, these trends fit into 5 MegaThemes of Personalization, Ubiquity, Computational Augmentation, Biologically Influenced Computing, and Renewability.


We believe the following five themes have both significantly evolved during the 2010s and will create the opportunity for ongoing transformative change that will fundamentally affect enterprise technology. Each of these MegaThemes has three key trends that will affect the ways that businesses use technology in the 2020s. This piece provides an introduction to these trends that will be contextualized from an IT, data, and finance perspective in future work, including blogs, webinars, vendor landscapes, and other analyst insights.
2020s Tech MegaTrends
Over the rest of January, we’ll explore each of these five MegaThemes in greater detail, as these primary themes will end up driving innovation, change, and transformation within our tactical coverage areas including AI, analytics, Business Planning, DevOps, Finance and Accounting, Technology Expense Management, and Extended Reality.


In our next blog, we’ll cover MegaTheme 1 on Individualization and how the 2020s will build on generational shifts, subscription and on-demand economies, and distributed identity to support specific and contextualized experiences on a ubiquitous basis.

Developing a Practical Model for Ethical AI in the Business World: Introduction

As we head into 2020, the concept of “AI (Artificial Intelligence) for Good” is becoming an increasingly common phrase. Individuals and organizations with AI skillsets (including data management, data integration, statistical analysis, machine learning, algorithmic model development, and application deployment skills) have put effort into pursuing ethical AI.

Amalgam Insights believes that these efforts have largely been piecemeal. Given the breadth of AI and its potential repercussions on business outcomes, they are inadequate to meet common-sense definitions that would allow companies to state that they are pursuing, documenting, and practicing true ethical AI. This is not due to a lack of interest, but to two key considerations. First, AI is a relatively new capability in the enterprise IT portfolio that often lacks formal practices and guidelines and has been managed as a “skunkworks” or experimental project. Second, businesses have seen AI not as a business practice but as a purely technical practice, and in skipping to technical development they have made a number of assumptions that would typically not have been made for more mature technical capabilities and projects.

In the past, Amalgam Insights has provided frameworks to help organizations take the next step to AI through our BI to AI progression.

Figure 1: Amalgam’s Framework from BI to AI

 

 

 

To pursue a more ethical model of AI, Amalgam Insights believes that AI efforts need to be analyzed through three key lenses:

  • Executive Design
  • Technical Development
  • Operational Deployment

Figure 2: Amalgam’s Three Key Areas for Ethical AI

In each of these areas, businesses must ask the right questions and adequately prepare for the deployment of ethical AI. In this framework, AI is not just a set of machine learning algorithms to be utilized, but an enabler to effectively augment problem-solving for appropriate challenges.

Over the next week, Amalgam Insights will explore 12 areas of bias across these three categories with the goal of developing a straightforward framework that companies can use to guide their AI initiatives and take a structured approach to enforcing a consistent set of ethical guidelines to support governance across the executive, technical, and operational aspects of initiating, developing, and deploying AI.

In our next blog, we will explore Executive Design with a focus on the five key questions that an executive must consider as they start considering the use of AI within their enterprise.

Data Science and Machine Learning News Roundup, May 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Domino Data Lab Champions Expert Data Scientists While Outpacing Walled-Garden Data Science Platforms

Domino announced key updates to its data science platform at Rev 2, its annual data science leader summit. For data science managers, the new Control Center provides information on what an organization’s data science team members are doing, helping managers address any blocking issues and prioritize projects appropriately. The Experiment Manager’s new Activity Feed supplies data scientists with better organizational and tracking capabilities on their experiments. The Compute Grid and Compute Engine, built on Kubernetes, will make it easier for IT teams to install and administer Domino, even in complex hybrid cloud environments. Finally, the beta Domino Community Forum will allow Domino users to share best practices with each other, as well as submit feature requests and feedback to Domino directly. With governance becoming a top priority across data science practices, Domino’s platform improvements around monitoring and making experiments repeatable will make this important ability easier for its users.

Informatica Unveils AI-Powered Product Innovations and Strengthens Industry Partnerships at Informatica World 2019

At Informatica World, Informatica publicized a number of key partnerships, both new and enhanced. Most of these partnerships involve additional support for cloud services. This includes storage, both data warehouses (Amazon Redshift) and data lakes (Azure, Databricks). Informatica also announced a new Tableau Dashboard Extension that enables Informatica Enterprise Data Catalog from within the Tableau platform. Finally, Informatica and Google Cloud are broadening their existing partnership by making Intelligent Cloud Services available on Google Cloud Platform, and providing increased support for Google BigQuery and Google Cloud Dataproc within Informatica. Amalgam Insights attended Informatica World and provides a deeper assessment of Informatica’s partnerships, as well as CLAIRE-ity on Informatica’s AI initiatives.

Microsoft delivers new advancements in Azure from cloud to edge ahead of Microsoft Build conference

Microsoft announced a number of new Azure Machine Learning and Azure AI capabilities. Azure Machine Learning has been integrated with Azure DevOps to provide “MLOps” capabilities that enable reproducibility, auditability, and automation of the full machine learning lifecycle. This marks a notable increase in making the machine learning model process more governable and compliant with regulatory needs. Azure Machine Learning also has a new visual drag-and-drop interface to facilitate codeless machine learning model creation, making the process of building machine learning models more user-friendly. On the Azure AI side, Azure Cognitive Services launched Personalizer, which provides users with specific recommendations to inform their decision-making process. Personalizer is part of the new “Decisions” category within Azure Cognitive Services; other Decisions services include Content Moderator, an API to assist in moderation and reviewing of text, images, and videos; and Anomaly Detector, an API that ingests time-series data and chooses an appropriate anomaly detection model for that data. Finally, Microsoft added a “cognitive search” capability to Azure Search, which allows customers to apply Cognitive Services algorithms to search results of their structured and unstructured content.

Microsoft and General Assembly launch partnership to close the global AI skills gap

Microsoft also announced a partnership with General Assembly to address the dearth of qualified data workers, with the goal of training 15,000 workers by 2022 for various artificial intelligence and machine learning roles. The two companies will found an AI Standards Board to create standards and credentials for artificial intelligence skills. In addition, Microsoft and General Assembly will develop scalable training solutions for Microsoft customers, and establish an AI Talent network to connect qualified candidates to AI jobs. This continues the trend of major enterprises building internal training programs to bridge the data skills gap.

Data Science and Machine Learning News Roundup, April 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta, TROVE.

Alteryx Acquires ClearStory Data to Accelerate Innovation in Data Science and Analytics

Alteryx acquired ClearStory Data, an analytics solution for complex and unstructured data with a focus on automating Big Data profiling, discovery, and data modeling. This acquisition reflects Alteryx’s interest in expanding its native capabilities to include more in-house data visualization tools. ClearStory Data provides a visual focus on data prep, blending, and dashboarding with its Interactive Storyboards, which complement Alteryx’s ongoing augmentation of internal visualization capabilities throughout the workflow, such as Visualytics.

Dataiku Announces the Release of Dataiku Lite Edition

Dataiku released two new versions of its machine learning platform, Dataiku Free and Dataiku Lite, targeted towards small and medium businesses. Dataiku Free will allow teams of up to three users to work together simultaneously; it is available both on-prem and on AWS and Azure. Dataiku Lite will provide support for Hadoop and job scheduling beyond the capabilities of Dataiku Free. Since Dataiku already partners with over 1000 small and medium businesses, creating versions of its existing platform that are more financially accessible to such organizations lowers a significant barrier to entry and grooms smaller companies to grow their nascent data science practices within the Dataiku family.

DataRobot Celebrates One Billion Models Built on Its Cloud Platform

DataRobot announced that as of mid-April, its customers had built one billion models on its automatic machine learning program. Vice President of Product Management Phil Gurbacki noted that DataRobot customers build more than 2.5 million models per day. Given that the majority of models created are never successfully deployed – a common theme cited this month at both Enterprise Data World and at last week’s Open Data Science Conference – it seems likely that DataRobot customers don’t currently have one billion models operationalized. If the percentage of deployed models is significantly higher than the norm, though, this would certainly boost DataRobot in potential customers’ eyes, and serve to further legitimize AutoML software solutions as plausible options.

Microsoft, SAS, TIBCO Continue Investments in AI and Data Skills Training

Microsoft announced a new partnership with OpenClassrooms to train students for the AI job marketplace via online coursework and projects. Given an estimate that projects 30% of AI and data jobs will go unfilled by 2022, OpenClassrooms’ recruiting 1000 promising candidates seems like just the beginning of a much-needed effort to address the skills gap.

SAS provided more details on the AI education initiatives they announced last month. First, they launched SAS Viya for Learners, which will allow academic institutions to access SAS AI and machine learning tools for free. A new SAS machine learning course and two new Coursera courses will provide access to SAS Viya for Learners to those wanting to learn AI skills without being affiliated with a traditional academic institution. SAS also expanded on the new certifications they plan to offer: three SAS specialist certifications in machine learning, natural language and computer vision, and forecasting and optimization. Classroom and online options for pursuing these certifications will be available.

Meanwhile, TIBCO continued expanding its partnerships with educational institutions in Asia to broaden analytics knowledge in the region. Most recently, it has augmented its existing partnership with Singapore Polytechnic to train 1000 students in analytics and IoT skillsets by 2020. Other analytics education partnerships TIBCO has announced in the last year include Yuan Ze University in Taiwan, Asia Pacific University of Technology and Innovation in Malaysia, and BINUS University in Indonesia.

The big picture: existing data science degree programs and machine learning and AI bootcamps are not providing a large enough volume of highly-skilled job candidates quickly enough to fill many of these data-centric positions. Expect to hear more about additional educational efforts forthcoming from data science, machine learning, and AI vendors.

Quick AI Insights at #MSBuild in an Overstuffed Tech Event Week

We are in the midst of one of the most packed tech event weeks in recent memory. This week alone, Amalgam Insights is tracking *six* different events:

This means a lot of announcements this week that will be directly comparable. For instance, Google, Microsoft, Red Hat, SAP, and ServiceNow should all have a variety of meaty DevOps and platform access announcements. Google, Microsoft, SAP, and possibly IBM and ServiceNow should have interesting new AI announcements. ServiceNow and Red Hat will both undoubtedly be working to one-up each other when it comes to revolutionizing IT. We’ll be providing some insights and give you an idea of what to look forward to.


How is Salesforce Taking on AI: a look at Einstein at Salesforce World Tour Boston

On April 3rd, Amalgam Insights attended Salesforce World Tour 2019 in Boston. Salesforce users may know this event as an opportunity to meet with their account managers and catch up with new functionalities and partners without having to fly to San Francisco and navigate through the colossus that is Dreamforce.

Salesforce also uses this tour as an opportunity to present analysts with the latest and greatest changes in their offerings. Amalgam Insights was interested both in learning more about Salesforce’s current positioning from a data perspective, including the vendor’s acquisition of Mulesoft as well as its progression in both the Einstein Analytics and Einstein Platform in providing value-added insights and artificial intelligence to Salesforce clients.


Enterprise Data World 2019: Data Science Will Take Over The World! … Eventually.

Amalgam Insights attended Enterprise Data World, a conference focused on data management, in late March. Though the conference tracks covered a wide variety of data practices, our primary interest was in the sessions on the AI and Machine Learning track. We came away with the impression that the data management world is starting to understand and support some of the challenges that organizations face when trying to get complex data initiatives off the ground, but that the learning process will continue to have growing pains.

Data Strategy Bootcamp

I began my time at Enterprise Data World with the Data Strategy Bootcamp on Monday. Often, organizations focus on getting smaller data projects done quickly in a tactical fashion at the expense of consciously developing their broader data strategy. The bootcamp addressed how to incorporate these “quick wins” into the bigger picture, and delved into the details of what a data strategy should include and what the process of building one looks like. For people in data analyst and data scientist roles, understanding and contributing to your organization’s data strategy is important because well-documented and properly managed data means data analysts and data scientists can spend more of their time doing analytics and building machine learning models. The “data scientists spend 80% of their time cleaning and preparing data” number continues to circulate without measurable improvement. To build a successful data strategy, organizations will need to identify data-centric business goals that align the organization’s data strategy with its business strategy, assess the organization’s maturity and capabilities across its data ecosystem, and determine long-term goals and “quick wins” that will provide measurable progress towards those goals.

Getting Started with Data Science, Machine Learning, and Artificial Intelligence Initiatives

Actually getting started on data science, machine learning, and artificial intelligence initiatives remains a point of confusion for many organizations looking to expand beyond the basic data analytics they’re currently doing. Kristin Serafin and Lizzie Westin of FINRA, and Vinay Seth Mohta of Manifold, led sessions discussing how to turn talk about machine learning and artificial intelligence into action in your organization, and how to do so in a way that can scale up quickly. Key takeaways: your organization needs to understand its data to understand what questions it wants answered that require a machine learning approach; it needs to understand what tools are necessary to move forward; it needs to understand who already has pertinent data capabilities within the organization, and who is best positioned to improve their skills in the necessary manner; and it needs to obtain buy-in from relevant stakeholders.

Data Job Roles

Data job roles were discussed in multiple sessions; I attended one from the perspective of how analytical jobs themselves are evolving, and one from the perspective of analytical career development. Despite the hype, not everyone is a data scientist, even if they may perform some tasks that are part of a data science pipeline! Data engineers are the difference between data scientists’ experiments sitting in silos and getting them into production where they can affect your company. Data analysts aren’t going anywhere – yet. (Though Michael Stonebraker, in his keynote Tuesday morning, stated that he believed data science would eventually replace BI, pending upskilling a sufficient number of data workers.) And data scientists spend 80% of their time doing data prep instead of building machine learning models; they’d like to do more of the latter, and because they’re an expensive asset, the business needs them to be doing less prep and more building as well.

By the same token, there are so many different specialties across the data environment, and the tool landscape is incredibly large. No one will know everything; even relatively low-level people will need to provide leadership in their particular roles to bridge the much-bemoaned gap between IT and Business. So how can data people do that? They’ll need to learn to talk about their initiatives and accomplishments in business terms – increasing revenue, decreasing cost, managing risk. By doing this, data strategy can be tied to business strategy, and this barrier to success can be surmounted.

Data Integration at Scale

Michael Stonebraker’s keynote highlighted the growing need for people with data science capabilities, but the real meat of his talk centered around how to support complex data science initiatives: doing data integration at scale. One example: General Electric’s procurement system problem. Clearly, the ideal number of procurement systems in any company is “one.” Given mergers and acquisitions, over time, GE had accumulated *75* procurement systems. They could save $100M if they could bring together all of these systems, with all of the information on the terms and conditions negotiated with each vendor via each of these systems. But this required a rather complex data integration process. Once that was done, the same process remained for dealing with their supplier databases, and their customer databases, and a whole host of other data. Machine learning can help with this – once there are sufficient people with machine learning skills to address these large problems. But doing data integration at scale will remain a significant challenge for enterprises for now, with machine learning skills being relatively costly and rare, data accumulation continuing to grow exponentially, and third-party data increasingly being brought in to supplement existing analyses.
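To make the integration problem concrete, here is a minimal sketch of one small piece of it: deciding whether vendor records from two procurement systems refer to the same company. This is not the approach described in the keynote; it swaps in simple string similarity from Python's standard `difflib` module for the machine learning step, and the vendor names, the `normalize` and `match_vendors` helpers, and the 0.85 threshold are all hypothetical.

```python
from difflib import SequenceMatcher

# Hypothetical list of corporate suffixes to ignore when comparing names.
SUFFIXES = {"inc", "corp", "corporation", "co", "ltd", "llc"}

def normalize(name: str) -> str:
    """Lowercase, drop punctuation, and remove common corporate suffixes."""
    cleaned = "".join(ch for ch in name.lower() if ch.isalnum() or ch.isspace())
    return " ".join(tok for tok in cleaned.split() if tok not in SUFFIXES)

def similarity(a: str, b: str) -> float:
    """Similarity ratio between two normalized names, from 0.0 to 1.0."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def match_vendors(system_a, system_b, threshold=0.85):
    """Return (name_a, name_b, score) for every pair at or above threshold."""
    matches = []
    for a in system_a:
        for b in system_b:
            score = similarity(a, b)
            if score >= threshold:
                matches.append((a, b, round(score, 2)))
    return matches

# Made-up vendor lists standing in for two of the many procurement systems.
system_a = ["Acme Corp.", "Globex Inc", "Initech LLC"]
system_b = ["ACME Corporation", "Globex, Inc.", "Umbrella Co"]

matches = match_vendors(system_a, system_b)
for name_a, name_b, score in matches:
    print(f"{name_a!r} <-> {name_b!r} (score {score})")
```

Comparing every record against every other record grows quadratically with the number of records, which is part of why this kind of matching becomes hard at enterprise scale and why more sophisticated techniques are needed.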

Knowledge Graphs and Semantic AI

A number of sessions discussed knowledge graphs and their importance for supporting both data management and data science tasks. Knowledge graphs provide a “semantic” layer over standard relational databases – they prioritize documenting the relationships between entities, making it easier to understand how different parts of your organization’s data are interrelated. Because having a knowledge graph about your organization’s data provides natural-language context around data relationships, it can make machine learning models based on that data more “explainable” due to the additional human-legible information available for interpretation and understanding. Another example: if you’re trying to perform a search, most results rely on exact matches. Having a knowledge graph makes it simple to pull up “related” results based on the relationships documented in that knowledge graph.
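The “related results” idea can be sketched with a toy graph of subject, relation, object triples. This is a minimal illustration only: the `KnowledgeGraph` class, the dataset names, and the relation labels are hypothetical, and real knowledge graph products typically use dedicated graph stores and query languages rather than an in-memory dictionary.

```python
from collections import defaultdict

class KnowledgeGraph:
    """Toy in-memory graph of (subject, relation, object) triples."""

    def __init__(self):
        self._edges = defaultdict(set)

    def add(self, subject, relation, obj):
        # Store the relationship in both directions so either entity
        # can be used as the starting point for a lookup.
        self._edges[subject].add((relation, obj))
        self._edges[obj].add((f"inverse of {relation}", subject))

    def related(self, entity):
        """Return every (relation, neighbor) pair documented for an entity."""
        return sorted(self._edges.get(entity, set()))

kg = KnowledgeGraph()
kg.add("customer_orders", "references", "customers")
kg.add("customer_orders", "feeds", "churn_model")
kg.add("customers", "owned by", "sales_ops")

# An exact-match search for "customers" would return only that one table;
# the graph also surfaces what it is connected to, and how.
related = kg.related("customers")
for relation, neighbor in related:
    print(f"customers -- {relation} --> {neighbor}")
```

The relation labels are what provide the human-legible context: the lookup reports not just that two things are connected, but in what way.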

Data Access, Control, and Usage

My big takeaway from Scott Taylor’s Data Architecture session: data should be a shared, centralized asset for your entire organization; it must be 1) accessible by its consumers 2) in the format they require 3) via the method they require 4) if they have permission to access it (security) 5) and they will use it in a way that abides by governance standards and laws. Data scientists care about this because they need data to do their job, and any hurdle in accessing usable data makes it more likely they’ll avoid using official methods to access the data. Nobody has three months to wait for a data requisition from IT’s data warehouses to be turned around anymore; instead, “I’ll just use this data copy on my desktop” – or more likely these days, in a cloud-hosted data silo. Making centralized access easy to use makes data users much more likely to comply with data usage and access policies, which helps secure data properly, govern its use appropriately, and prevent data silos from forming.

Digging a bit more into the security and governance aspects mentioned above, it’s surprisingly easy to identify individuals in a set of anonymized data. Matt Vogt of Immuta demonstrated this with a dataset consisting of anonymized NYC taxi data, even as more and more information was redacted from it. Jeff Jonas of Senzing took this further in his keynote: as context accumulates around data, it gets easier to make inferences, even when your data is far from clean. With GDPR on the table, and CCPA coming into effect in nine months, how data workers can use data, ethically and legally, will shift, significantly affecting data workflows. Both the use of data and the results provided by black-box machine learning models will be challenged.
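One rough way to see why anonymized data can still identify people is to count how many records are unique on a combination of seemingly harmless fields. The sketch below is not the demonstration given at the conference; the `unique_fraction` helper, the field names, and the four trip records are made up for illustration.

```python
from collections import Counter

def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose combination of the given fields is unique."""
    keys = [tuple(record[field] for field in quasi_identifiers) for record in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records)

# Made-up trip records with no names or license numbers attached.
trips = [
    {"pickup_zone": "A", "pickup_hour": 8, "dropoff_zone": "X"},
    {"pickup_zone": "A", "pickup_hour": 8, "dropoff_zone": "Y"},
    {"pickup_zone": "A", "pickup_hour": 9, "dropoff_zone": "X"},
    {"pickup_zone": "B", "pickup_hour": 8, "dropoff_zone": "X"},
]

frac_zone = unique_fraction(trips, ["pickup_zone"])
frac_zone_hour = unique_fraction(trips, ["pickup_zone", "pickup_hour"])
frac_all = unique_fraction(trips, ["pickup_zone", "pickup_hour", "dropoff_zone"])

print(frac_zone, frac_zone_hour, frac_all)
```

In this toy data, each added field makes more records unique. A record that is unique on fields an outsider could plausibly know is a candidate for re-identification, which is why removing names alone is not sufficient.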

Recommendations

Data scientists and machine learning practitioners should familiarize themselves with the broader data management ecosystem. Said practitioners understand why dirty data is problematic, given that they spend most of their work hours cleaning that data so they can do the actual machine learning model-building, but there are numerous tools available to help with this process, and possibly obviate the need for a particular cleaning job that’s already been done once. As enterprise data catalogs become more common, this will prevent data scientists from spending hours on duplicative work when someone else has already cleaned the set they were planning to use and made it available for the organization’s use.

Data scientists and data science managers should also learn how to communicate the business value of their data initiatives when speaking to business stakeholders. From a technical point of view, making a model more accurate is an achievement in and of itself. But knowing what it means from a business standpoint builds understanding of what that improved accuracy or speed means for the business as a whole. Maybe your 1% improvement in model accuracy means you save your company tens of thousands of dollars by more accurately targeting potential customers who are ready to buy your product – that’s what will get the attention of your line-of-business partners.

Data science directors and Chief Data or Chief Analytics Officers should approach building their organization’s data strategy and culture with the long-term view in mind. Aligning your data strategy with the organization’s business strategy is crucial to your organization’s success. Rather than having both departments tugging on opposite ends of the rope and pulling in different directions, develop an understanding of each other’s needs and capabilities and apply that knowledge to keep everyone focused on the same goal.

Chief Data Officers and Chief Analytics Officers should understand their organization’s capabilities by assessing both the data capabilities and capacity available by individual and the general maturity in each data practice area (such as Master Data Management, Data Integration, Data Architecture, etc.). Knowing the availability of both technical and people-based resources is necessary to develop a scalable set of data processes for your organization that delivers consistent results no matter which data scientist or analyst is in charge of executing the process for any given project.

As part of developing their organization’s data strategy, Chief Data Officers and Chief Analytics Officers must work with their legal department to develop rules and processes for accumulating, storing, accessing, and using data appropriately. As laws like GDPR and the California Consumer Privacy Act (CCPA) start being enforced, data access and usage will be much more scrutinized; companies not adhering to the letter of those laws will find themselves fined heavily. Data scientists and data science managers who are working on projects that involve sensitive or personal data should talk to their general counsel to ensure they remain on the right side of the law.

At IBM Think, Watson Expands “Anywhere”

At IBM Think in February, IBM made several announcements around the expansion of Watson’s availability and capabilities, framing these announcements as the launch of “Watson Anywhere.” This piece is intended to provide guidance to data analysts, data scientists, and analytic professionals seeking to implement machine learning and artificial intelligence capabilities and evaluating the capabilities of…


Data Science and Machine Learning News Roundup, February 2019

On a monthly basis, I will be rounding up key news associated with the Data Science Platforms space for Amalgam Insights. Companies covered will include: Alteryx, Amazon, Anaconda, Cambridge Semantics, Cloudera, Databricks, Dataiku, DataRobot, Datawatch, Domino, Elastic, Google, H2O.ai, IBM, Immuta, Informatica, KNIME, MathWorks, Microsoft, Oracle, Paxata, RapidMiner, SAP, SAS, Tableau, Talend, Teradata, TIBCO, Trifacta,…


Four Key Announcements from H2O World San Francisco

Last week at H2O World San Francisco, H2O.ai announced a number of improvements to Driverless AI, H2O, Sparkling Water, and AutoML, as well as several new partnerships for Driverless AI. The improvements provide incremental improvements across the platform, while the partnerships reflect H2O.ai expanding their audience and capabilities. This piece is intended to provide guidance…
