The birth and death of Big Data in supporting massive scale as the terabyte shifted from an intimidating amount of data to a standard unit of measurement
The evolution of cloud computing from niche tool to a rapidly growing market that is roughly $150 billion a year now and will likely be well over a trillion dollars a year by the end of the 2020s
The Internet of Things, which will enable a future of distributed and specialized computing based on billions of processors and build on this decade’s massive progress in creating mobile and wireless smart devices.
The democratization of artificial intelligence tools including machine learning, deep learning, and data science services and platforms that have opened up the world of AI to developers and data analysts
The use of CRISPR Cas9 to programmatically edit genes, which has changed the biological world just as AI has changed the world of technology
Brain biofeedback and Brain-Computer Interfaces, which provide direct neural interfaces to control and affect a physical environment.
Extended Reality, through the development of augmented and virtual reality which are starting to provide realistic sensory simulations available on demand
Some years are easier to predict than others. Stability in a market makes tracing the trend line much easier. 2020 looks to be that kind of year for the migration to microservices: stable with steady progression toward mainstream acceptance.
There is little doubt that IT organizations are moving toward microservices architectures. Microservices, which deconstruct applications into many small parts, remove much of the friction that is common in n-Tier applications when it comes to development velocity. The added resiliency and scalability of microservices in a distributed system are also highly desirable. These attributes promote better business agility, allowing IT to respond to business needs more quickly and with less disruption while helping to ensure that customers have the best experience possible.
Little in this upcoming year seems disruptive or radical; the big changes have already occurred. Instead, this is a year for building out and consolidating, moving past the “what” and “why” and into the “how” and “do”.
Kubernetes will be top of mind to IT in the coming year. From its roots as a humble container orchestrator – one of many in the market – Kubernetes has evolved into a platform for deploying microservices into container clusters. There is more work to do with Kubernetes, especially to help autoscale clusters, but it is now a solid base on which to build modern applications.
No one should delude themselves into thinking that microservices, containers, and Kubernetes are mainstream yet. The vast majority of applications are still based on n-Tier design deployed to VMs. That’s fine for a lot of applications but businesses know that it’s not enough going forward. We’ve already seen more traditional companies begin to adopt microservices for at least some portion of their applications. This trend will accelerate in the upcoming year. At some point, microservices and containers will become the default architecture for enterprise applications. That’s a few years from now but we’re already on the path.
From a vendor perspective, all the biggest companies are now in the Kubernetes market with at least a plain vanilla Kubernetes offering. This includes HPE and Cisco in addition to the companies that have been selling Kubernetes all along, especially IBM/Red Hat, Canonical, Google, AWS, VMware/Pivotal, and Microsoft. The trick for these companies will be to add enough unique value that their offerings don’t appear generic. Leveraging traditional strengths, such as storage for HPE, networking for Cisco, and Java for Red Hat and VMware/Pivotal, is the key to standing out in the market.
The entry of the giants in the Kubernetes space will pose challenges to the smaller vendors such as Mirantis and Rancher. With more than 30 Kubernetes vendors in the market, consolidation and loss are inevitable. There’s plenty of value in the smaller firms but it will be too easy for them to get trampled underfoot.
Expect M&A activity in the Kubernetes space as bigger companies acquihire or round out their portfolios. Kubernetes is now a big vendor market and the market dynamics favor them.
If there is a big danger sign on the horizon, it’s those traditional n-Tier applications that are still in production. At some point, IT will get around to thinking beyond the shiny new greenfield applications and want to migrate the older ones. Since these apps are based on radically different architectures, that won’t be easy. There just aren’t the tools to do this migration well. In short, it’s going to be a lot of work. It’s a hard sell to say that the only choices are either expensive migration projects (on top of all that digital transformation money that’s already been spent) or continuing to support and update applications that no longer meet business needs. Replatforming, or deploying the old parts to the new container platform, will provide less ROI and less value overall. The industry will need another solution.
This may be an opportunity to use all that fancy AI technology that vendors have been investing in to create software to break down an old app into a container cluster. In any event, the migration issue will be a drag on the market in 2020 as IT waits for solutions to a nearly intractable problem.
2020 is the year of the microservice architecture.
Even if that seems too dramatic, it’s not unreasonable to expect that there will be significant growth and acceleration in the deployment of Kubernetes-based microservices applications. The market has already begun the process of maturation as it adapts to the needs of larger, mainstream, corporations with more stringent requirements. The smart move is to follow that trend line.
(Note: This presentation is also available in a presentation format on Slideshare)
One of the most frequent questions Amalgam Insights receives is how Technology Business Management and Technology Expense Management are related to each other. And what do these topics have to do with “FinOps,” the new phrase that is starting to appear? Amalgam’s perspective is definitely different.
What are the key IT cost trends that you need to be aware of in 2020? In case you missed my webinars with MDSL yesterday, you can get a short summary right here.
Continue reading “Nine Big Trends for Technology Expense Management in 2020”
To read the introduction, click here.
To read about Stage 1: Executive Design, click here
To read about Stage 2: Technical Development, click here.
This blog focuses on Operational Deployment, the third of the Three Keys to Ethical AI described in the introduction.
Figure 1: The Three Keys to Ethical AI
Stage 3: Operational Deployment
Once an AI model is developed, organizations have to translate this model into actual value, whether it be by providing the direct outputs to relevant users or by embedding these outputs into relevant applications and process automation. But this part of AI also requires its own set of ethical considerations for companies to truly maintain an ethical perspective.
- Who has access to the outputs?
- How can users trace the lineage of the data and analysis?
- How will the outputs be used to support decisions and actions?
Who has access to the outputs?
Just as with data and analytics, the value of AI scales as it goes out to additional relevant users. The power of Amazon, Apple, Facebook, Google, and Microsoft in today’s global economy shows the power of opening up AI to billions of users. But as organizations open up AI to additional users, they have to provide appropriate context to users. Otherwise, these new users are effectively consuming AI blindly rather than as informed consumers. At this point, AI ethics expands beyond a technical problem into an operational business problem that affects every end user affected by AI.
Understanding the context and impact of AI at scale is especially important for AI initiatives that pursue continuous improvement aimed at increasing user value. Amalgam Insights recommends a focus on directly engaging user feedback for user experience and preference rather than simply depending on A/B testing. It takes a combination of quantitative and qualitative experience to optimize AI at a time when we are still far from truly understanding how the brain works and how people interact with relevant data and algorithms. Human feedback is a vital aspect of AI training and of understanding the perception and impact of AI.
How can users trace the lineage of the data and analysis?
Users accessing AI in an ethical manner should have basic access to the data and assumptions used to support the AI. This means both providing quantitative logic and qualitative assumptions that can communicate the sources, assumptions, and intended results of the AI to relevant users. This context is important in supporting an ethical AI project as AI is fundamentally based not just on a basic transformation of data, but on a set of logical assumptions that may not be inherently obvious to the user.
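As a minimal sketch of what that basic access could look like, the sources, assumptions, and intended results can be kept in a small record that travels with the model and can be shown to users. The `LineageRecord` class, its field names, and the sample values below are hypothetical illustrations, not an established standard.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical plain-language lineage summary attached to an AI output."""
    model_name: str
    intended_result: str
    data_sources: list = field(default_factory=list)
    assumptions: list = field(default_factory=list)

    def summary(self) -> str:
        # Render the record as text that a non-technical user can read.
        lines = [
            f"Model: {self.model_name}",
            f"Intended result: {self.intended_result}",
        ]
        lines += [f"Source: {source}" for source in self.data_sources]
        lines += [f"Assumption: {item}" for item in self.assumptions]
        return "\n".join(lines)


# Made-up example values for illustration only.
record = LineageRecord(
    model_name="churn-risk-score",
    intended_result="Rank accounts for proactive outreach",
    data_sources=["CRM account history", "Support ticket log"],
    assumptions=["Past cancellations resemble future cancellations"],
)
print(record.summary())
```

Keeping the record in plain language is the design choice here: it is meant for the user reading the output, not for the data scientist who built the model.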
From a practical perspective, most users will not fully understand the mathematical logic associated with AI, but users will understand the data and basic conceptual assumptions being made to provide AI-based outputs. Although Amalgam Insights believes that the rise of AI will lead to a broader grasp of statistics, modeling, and transformations over time, it is more important that both executive and technical stakeholders are able to explain how AI technologies in production are productive, relevant, and ethical based on both a business and technical basis.
How will the outputs be used to support decisions and actions?
Although this topic should already have been explored at the executive level, operational users will have deeper knowledge of how the technology will be used on a day-to-day basis and should revisit this topic based on their understanding of processes, internal operations, and customer-facing outcomes.
There are a variety of ways that AI can be used to support the decisions we make. In some cases, such as with search engines and basic prioritization exercises, AI is typically used as the primary source of output. For a more complex scenario, such as sales and marketing use cases or complex business or organizational decisions, AI may be a secondary source to provide an additional perspective or an exploratory and experimental perspective simply to provide context for how an AI perspective would differ from a human-oriented perspective.
But it is important for ethical AI outputs to be matched up with appropriate decisions and outcomes. A current example making headlines is the recent launch of the Apple credit card and the decisions being made about disparate credit limits for a married man and woman based on “the algorithm.” In this example, the man was initially given a much larger credit limit than the woman despite the fact that the couple filed taxes jointly and effectively shared joint income.
In this case, the challenge of giving “the algorithm” an automated and primary (and, likely, exclusive) role in determining a credit limit has created issues that are now in the public eye. Although this is a current and prominent example, it is less of a statement about Apple in particular and more of a statement regarding the increasing dependence that financial services has on non-transparent algorithms to accelerate decisions and provide an initial experience to new customers.
A more ethical and human approach would have been to figure out if there were inherent biases in the algorithm. If the algorithm had not been sufficiently tested, it should have been a secondary source for a credit limit decision that would ultimately be made by a human.
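One hedged way to look for this kind of inherent bias is to compare the algorithm's outputs across groups before giving it a primary role. The sketch below uses made-up applicant records and a hypothetical `disparity_ratio` helper. The 0.8 review threshold is an illustrative assumption, not a regulatory standard or a figure from the Apple example.

```python
from collections import defaultdict
from statistics import mean


def disparity_ratio(records, group_key, value_key):
    """Return (lowest group mean / highest group mean, per-group means)."""
    by_group = defaultdict(list)
    for row in records:
        by_group[row[group_key]].append(row[value_key])
    means = {group: mean(values) for group, values in by_group.items()}
    return min(means.values()) / max(means.values()), means


# Hypothetical model outputs for applicants with comparable finances.
applicants = [
    {"gender": "F", "credit_limit": 5000},
    {"gender": "F", "credit_limit": 7000},
    {"gender": "M", "credit_limit": 20000},
    {"gender": "M", "credit_limit": 10000},
]

REVIEW_THRESHOLD = 0.8  # illustrative assumption, not a regulatory standard

ratio, group_means = disparity_ratio(applicants, "gender", "credit_limit")
# A low ratio routes the decision to a human instead of the algorithm alone.
needs_human_review = ratio < REVIEW_THRESHOLD
print(group_means, round(ratio, 2), needs_human_review)
```

A low ratio does not prove the model is unfair, since the groups may differ in legitimate ways. It is a signal that the algorithm should act as a secondary source until a person has reviewed the gap.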
So, based on these explorations, we create a starting point for practical business AI ethics.
Maintain a set of basic ethical precepts for each AI project across design, development, and deployment. As mentioned in Part 1, these ethical statements should be focused on a few key goals that should be consistently explored from executive to technical to operational deployment. These should be short enough to fit onto every major project update memo and key documentation associated with the project. By providing a consistent starting point of what is considered ethical and must be governed, AI can be managed more consistently.
Conduct due diligence across bias, funding, champions, development, and users to improve ethical AI usage. The due diligence on AI currently focuses too heavily on the construction of models, rather than the full business context of AI. Companies continue to hurt their brands and reputation by putting out models and AI logic that would not pass a basic business or operational review.
Align AI to responsibilities that reflect the maturity, transparency, and fit of models. For instance, experimental models should not be used to run core business processes. For AI to take over significant operational responsibilities from an automation, analytical, or prescriptive perspective, the algorithms and production of AI need to be enterprise-ready just as traditional IT is. Just because AI is new does not mean that it should bypass key business and technical deployment rules.
Review and update AI on a regular basis. Once an AI project has been successfully brought to the wild and is providing results, it must be managed and reviewed on a regular basis. Over time, the models will need to be tweaked to reflect real-life changes in business processes, customer preferences, macroeconomic changes, or strategic goals. AI that is abandoned or ignored will become technical debt just as any outdated technology is. If there is no dedicated review and update process for AI, the models and algorithms used will eventually become outdated and potentially less ethical and accurate from a business perspective.
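The last recommendation can be made concrete with a simple scheduled check. The sketch below is one possible approach under stated assumptions: the `needs_review` function, the 90-day review window, and the 0.05 accuracy-drop tolerance are all hypothetical values chosen for illustration.

```python
from datetime import date, timedelta


def needs_review(last_reviewed, today, baseline_accuracy, recent_accuracy,
                 max_age_days=90, max_accuracy_drop=0.05):
    """Flag a model when its last review is stale or its accuracy has slipped."""
    too_old = (today - last_reviewed) > timedelta(days=max_age_days)
    degraded = (baseline_accuracy - recent_accuracy) > max_accuracy_drop
    return too_old or degraded


today = date(2020, 6, 1)
# Reviewed five months ago: flagged on age alone.
stale = needs_review(date(2020, 1, 1), today, 0.90, 0.89)
# Reviewed last month, but accuracy fell from 0.90 to 0.80: flagged.
slipping = needs_review(date(2020, 5, 1), today, 0.90, 0.80)
# Reviewed last month and still performing: not flagged.
healthy = needs_review(date(2020, 5, 1), today, 0.90, 0.89)
print(stale, slipping, healthy)
```

Accuracy is only one possible trigger. A fuller process would also cover the changes in business processes, customer preferences, and strategic goals described above, which usually require human judgment.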
We hope this guide and framework are helpful in supporting more ethical and practical AI projects. If you are seeking additional information on ethical AI, the ROI of AI, or guidance across data management, analytics, machine learning, and application development, please feel free to contact us at firstname.lastname@example.org and send us your questions. We would love to work with you.
When we started Amalgam Insights, we oh-so-cleverly chose the AI initials with the understanding that artificial intelligence (the other AI…), data science, machine learning, programmatic automation, augmented analytics, and neural inputs would lead to the greatest advances in technology. At the same time, we sought to provide practical guidance for companies seeking to bridge the gaps between their current data and analytics environments and the future of AI. With that in mind, here are 8 predictions we’re providing for 2020 for Analytics Centers of Excellence and Chief Data Officers to keep in mind to stay ahead while remaining practical.
1. In 2020, AI becomes a $50 billion market, creating a digital divide between the haves and have-nots: those prepared to algorithmically assess their work in real time and those who are not. Retail, Financial Services, and Manufacturing will be over half of this market.
2. The data warehouse becomes less important as a single source of truth. Today’s single source replaces data aggregation and duplication with data linkages and late-binding of data sources to bring together the single source of truth on a real-time basis. This doesn’t mean that data warehouses aren’t still useful; it just means that the single source of truth can change on a real-time basis and corporate data structures need to support that reality. And it becomes increasingly important to conduct analytics on data, wherever the data may be, rather than be dependent on the need to replicate and transfer data back to a single warehouse.
3. Asking “What were our 2020 revenues?” will be an available option in every major BI solution by the end of 2020, with the biggest challenge then being how companies will need to upgrade and configure their solutions to support these searches. We have maxed out our ability to spread analytics through IT. To get beyond 25% analytics adoption in 2020, businesses will need to take advantage of natural language queries and searches, which are becoming a general capability for analytics, either as a native or partner-enabled capability.
4. 2020 will see an increased focus on integrating analytics with automation, process mapping, and direct collaboration. Robotic Process Automation is a sexy technology, but what makes the robots intelligent? Prioritized data, good business rules, and algorithmic feedback for constant improvement. When we talk about “augmented analytics” at Amalgam Insights, we think this means augmenting business processes with analytic and algorithmic logic, not just augmenting data management and analytic tasks.
5. By 2025, analytic model testing and Python will become standard data analyst and business analyst capabilities to handle models rather than specific data. Get started now in learning Python, statistics, an Auto Machine Learning method, and model testing. IT needs to level up from admins to architects. All aspects of IT are becoming more abstracted through cloud computing, process automation, and machine learning. Data and analytics are no exception. Specifically, data analysts will start conducting the majority of “data science” tasks conducted in the enterprise, either as standalone or machine-guided tasks. If a business is dependent on a “unicorn” or a singular talent to conduct a business process, that process is not scalable and repeatable. As data science and machine learning projects start becoming part of the general IT portfolio, businesses will push down more data management, cleansing, and even modeling and testing tasks to the most dependable talent of the data ecosystem, the data analyst.
6. Amalgam Insights predicts that the biggest difference between high ROI and low ROI analytics in 2020 will come from data polishing, not data hoarding. The days of data hoarding for value creation are over. True data champions will focus on cleansing, defining, prioritizing, and separating the 1% of data that truly matters from the 99% more suited to mandatory and compliance-based storage.
7. On a related note, Amalgam Insights believes the practice of data deletion will be greatly formalized by Chief Data Protection Officers in 2020. With the emergence of CCPA along with the continuance of GDPR, data ownership is now potentially risky for organizations holding the wrong data.
8. The accounting world will make progress on defining data as a tangible asset. My expectations: changes to the timeframes of depreciation and guidance on how to value contextually specific data such as customer lists and business transactions. Currently, data cannot be formally capitalized as an asset. Now that companies are generally starting to realize that data may be their greatest asset outside of their talent, accountants will bring up more concerns for FASB Statements 141 and 142.
To read the introduction, click here.
To read about Stage 1: Executive Design, click here
This blog focuses on Technical Development, the second of the Three Keys to Ethical AI described in the introduction.
Figure 1: The Three Keys to Ethical AI
Stage 2: Technical Development
Technical Development is the area of AI that gets the most attention as machine learning and data science start to mature. Understandably, the current focus in this Early Adopter era (which is just starting to move into Early Majority status in 2020) is simply on how to conduct machine learning, data science efforts, and potentially deep learning projects in a rapid, accurate, and potentially repeatable manner. However, as companies conduct their initial proofs of concepts and build out AI services and portfolios, the following four questions are important to take into account.
- Where does the data come from?
- Who is conducting the analysis?
- What aspects of bias are being taken into account?
- What algorithms and toolkits are being used to analyze and optimize?
Figure 2: Technical Development
Where does the data come from?
Garbage In, Garbage Out has been a truism for IT and data projects for many decades. However, the irony is that much of the data that is used for AI projects used to literally be considered “garbage” and archival exhaust up until the practical emergence of the “Big Data” era at the beginning of this decade. As companies use these massive new data sources as a starting point for AI, they must check on the quality, availability, timeliness, and context of the data. It is no longer good enough to just pour all data into a “data lake” and hope that this creates a quality training data sample.
The quality of the data is determined by the completeness, accuracy, and consistency of the data. If the data have a lot of gaps, errors, or significant formatting issues, the AI will need to account for these issues in a way that maintains trust. For instance, a long-standing historical database may be full of null values as the data source has been augmented over time and data collection practices have improved. If those null values are incorrectly accounted for, AI can end up defining or ignoring a “best practice” or recommendation.
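To make the null-value risk concrete, the sketch below compares two ways of handling missing entries in a small, made-up series. The values and variable names are hypothetical, and the point is only that the choice changes the answer.

```python
from statistics import mean

# Hypothetical order values; None marks years when the field was not collected.
order_values = [None, None, None, 120.0, 80.0, 100.0]

# Naive handling: treat every missing value as zero.
nulls_as_zero = mean(v if v is not None else 0.0 for v in order_values)

# Alternative: average only observed values and report how much is missing.
observed = [v for v in order_values if v is not None]
observed_mean = mean(observed)
missing_share = 1 - len(observed) / len(order_values)

print(nulls_as_zero, observed_mean, missing_share)
```

Treating nulls as zero halves the apparent average in this example. Reporting the share of missing data alongside the result lets downstream users judge how much to trust it.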
From a practical perspective, consider as an example how Western culture has recently started to formalize non-binary gender or transgender identity. Just because data may not show these identities prior to this decade does not mean that these identities didn’t exist. Amalgam Insights would consider a gap like this to be a systemic data gap that needs to be taken into account to avoid unexpected bias, perhaps through the use of adversarial de-biasing that actively takes the bias into account.
The Availability and Timeliness of the data refers to the accessibility, uptime, and update frequency of the data source. Data sources that may be transient or migratory may serve as a risk for making consistent assumptions from an AI perspective. If an AI project is depending on a data source that may be hand-curated, bespoke in nature, or inconsistently hosted and updated, this variability needs to be taken into account in determining the relative accuracy of the AI project and its ability to consistently meet ethical and compliance standards.
Data context refers to the relevance of the data both for solving the problem and for providing guidance to downstream users. Correlation is not causation, as the hilarious website “Spurious Correlations” run by Tyler Vigen shows us. One of my favorite examples shows how famed actor Nicolas Cage’s movies are “obviously” tied to the number of people who drown in swimming pools.
Figure 3: Drownings as a Function of Nicolas Cage Movies
(Thanks to Spurious Correlations! Buy the book!)
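The same effect is easy to reproduce. In the sketch below, two synthetic series that merely trend upward together produce a correlation coefficient close to 1, even though nothing connects them. The numbers are invented for illustration and are not taken from Spurious Correlations.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic, made-up yearly counts that both happen to rise over time.
films_per_year = [1, 2, 2, 3, 4, 4]
unrelated_metric = [95, 98, 102, 104, 109, 110]

r = pearson(films_per_year, unrelated_metric)
print(round(r, 2))
```

A high coefficient like this is what a model will happily learn from unless someone with business context questions whether the data source belongs in the analysis at all.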
But beyond the humor is a serious issue: what happens if AI assumptions are built on faulty and irrelevant data? And who is checking the hyperparameter settings and the contributors to parameter definitions? Data assumptions need to go through some level of line of business review. This isn’t to say that every business manager is going to suddenly have a Ph.D. level of data science understanding, but business managers will be able to either provide confirmation that data is relevant or provide relevant feedback on why a data source may or may not be relevant.
Who is conducting the analysis?
In this related question, the deification of the unicorn data scientist has been well-documented over the last few years. But just as business intelligence and analytics evolved from the realm of the database master and report builder to a combination of IT management and self-service conducted by data-savvy analysts, data science and AI must also be conducted by a team of roles that include the data analyst, data scientist, business analyst, and business manager. In small companies, an individual may end up holding multiple roles on this team.
But if AI is being developed by a single “unicorn” focused on the technical and mathematical aspects of AI development, companies need to make sure that the data scientist or AI developer is taking sufficient business context into account and fully considering the fundamental biases and assumptions that were made during the Executive Design phase.
What aspects of bias are being taken into account?
Any data scientist with basic statistical training will be familiar with Type I (false positive) and Type II (false negative) errors as a starting point for identifying bias. However, this statistical bias should not be considered the end-all and be-all of defining AI bias.
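For readers less familiar with the terms, the sketch below counts both error types from a small set of made-up labels and predictions. The `error_rates` helper and the sample data are hypothetical.

```python
def error_rates(actual, predicted):
    """Return Type I (false positive) and Type II (false negative) rates."""
    false_positives = sum(1 for a, p in zip(actual, predicted) if not a and p)
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    return {
        "type_i_rate": false_positives / negatives,
        "type_ii_rate": false_negatives / positives,
    }


# Made-up ground truth and model predictions.
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, True, False, False]

rates = error_rates(actual, predicted)
print(rates)
```

These rates describe statistical error only. They say nothing about the organizational, cultural, or contextual bias that may be built into how the labels were defined in the first place.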
As parameters and outputs become defined, data scientists must also consider organizational bias, cultural bias, and contextual bias. Simply stating that “the data will speak for itself” does not mean that the AI lacks bias; this only means that the AI project is actively ignoring any bias that may be in place. As I said before, the most honest approach to AI is to acknowledge and document bias rather than to simply try to “eliminate” bias. Bias documentation is a sign of understanding both the problem and the methods, not a weakness.
An extreme example is Microsoft’s “Tay” chatbot released in 2016. This bot was released “without bias” to support conversational understanding. The practical aspect of this lack of bias was that the bot lacked the context to filter racist messages and to differentiate between strongly emotional terms and culturally appropriate conversation. In this case, the lack of bias led to the AI’s inability to be practically useful. In a vacuum, the most prevalent signals and inputs will take precedence over the most relevant or appropriate signals.
Unless the goal of the AI is to reflect the data that is most commonly entered, an “unbiased” AI approach is generally going to reflect the “GIGO” aspect of programming that has been understood for decades. This challenge reflects the foundational need to understand the training and distribution of data associated with the building of AI.
What algorithms and toolkits are being used to analyze and optimize?
The good news about AI is that it is easier to access than ever before. Python resources and a plethora of machine learning libraries including PyTorch, scikit-learn, Keras, and, of course, TensorFlow, make machine learning relatively easy to access for developers and quantitatively trained analysts.
The bad news is that it becomes easy for someone to implement an algorithm without fully understanding the consequences. For instance, a current darling in the data science world is XGBoost (Extreme Gradient Boosting), which has been a winning algorithmic approach for recent data science contests because it reduces data to an efficient minimum more quickly than standard gradient boosting. But it also requires expertise in starting with appropriate features, stopping the model training before the algorithm overfits, and appropriately fine-tuning the model for production.
So, it is not enough to simply use the right tools or the most “efficient” algorithms. Teams must also effectively fit, stop, and tune models based on the tools being used, both to create models that are most appropriate for the real world and to keep AI bias from propagating and gaining outsized influence.
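The "stopping" step can be illustrated without any particular library. The sketch below is a generic, dependency-free version of early stopping, assuming a validation loss is recorded after each boosting round. The `best_round_with_early_stopping` function, the `patience` value, and the loss figures are hypothetical. Libraries such as XGBoost expose their own early-stopping options, which this sketch does not use.

```python
def best_round_with_early_stopping(validation_losses, patience=2):
    """Return (best_round, rounds_run), halting after `patience` rounds
    in a row without a new best validation loss."""
    best_loss = float("inf")
    best_round = -1
    for round_index, loss in enumerate(validation_losses):
        if loss < best_loss:
            best_loss, best_round = loss, round_index
        elif round_index - best_round >= patience:
            # No improvement for `patience` rounds: stop training here.
            return best_round, round_index + 1
    return best_round, len(validation_losses)


# Made-up validation losses: improving at first, then rising as the
# model starts to fit noise in the training data.
losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.53, 0.60, 0.75]

best_round, rounds_run = best_round_with_early_stopping(losses)
print(best_round, rounds_run)
```

With these numbers, training stops after six rounds and the model from round index 3 is kept, rather than running all eight rounds and shipping a model that has already begun to degrade on unseen data.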
In our next blog, we will explore Operational Deployment with a focus on the line of business concerns that business analysts and managers consider as they actually use the AI application or service and the challenges that occur as the AI logic becomes obsolete or flawed over time.
In this blog post series, Amalgam Insights is providing a practical model for businesses to plan the ethical governance of their AI projects. To read the introduction, click here. This blog focuses on Executive Design, the first of the Three Keys to Ethical AI introduced in the last blog. Stage I: Executive Design As a…
As we head into 2020, the concept of “AI (Artificial Intelligence) for Good” is becoming an increasingly common phrase. Individuals and organizations with AI skillsets (including data management, data integration, statistical analysis, machine learning, algorithmic model development, and application deployment skills) have put effort into pursuing ethical AI efforts.
Amalgam Insights believes that these efforts have largely been piecemeal and inadequate to meet common-sense definitions for companies to effectively state that they are pursuing, documenting, and practicing true ethical AI because of the breadth and potential repercussions of AI on business outcomes. This is not due to a lack of interest, but based on a couple of key considerations. First, AI is a relatively new capability in the enterprise IT portfolio that often lacks formal practices and guidelines and has been managed as a “skunkworks” or experimental project. Second, businesses have not seen AI as a business practice, but as a purely technical practice and made a number of assumptions in skipping to the technical development that would typically not have been made for more mature technical capabilities and projects.
In the past, Amalgam Insights has provided frameworks to help organizations take the next step to AI through our BI to AI progression.
To pursue a more ethical model of AI, Amalgam Insights believes that AI efforts need to be analyzed through three key lenses:
- Executive Design
- Technical Development
- Operational Deployment
Figure 2: Amalgam’s Three Key Areas for Ethical AI
In each of these areas, businesses must ask the right questions and adequately prepare for the deployment of ethical AI. In this framework, AI is not just a set of machine learning algorithms to be utilized, but an enabler to effectively augment problem-solving for appropriate challenges.
Over the next week, Amalgam Insights will explore 12 areas of bias across these three categories with the goal of developing a straightforward framework that companies can use to guide their AI initiatives and take a structured approach to enforcing a consistent set of ethical guidelines to support governance across the executive, technical, and operational aspects of initiating, developing, and deploying AI.
In our next blog, we will explore Executive Design with a focus on the five key questions that an executive must consider as they start considering the use of AI within their enterprise.
This year’s KubeCon+CloudNativeCon was, to say the least, an experience. Normally sunny San Diego treated conference-goers to torrential downpours. The unusual weather turned the block party event into a bit of a sog. My shoes are still drying out. The record crowds – this year’s attendance was 12,000, up from last year’s 8,000 in Seattle – made navigating the show floor a challenge for many attendees.
Despite the weather and the crowds, this was an exciting KubeCon+CloudNativeCon. On display was the maturation of the Kubernetes and container market. Both the technology and the best practices discussions were less about “what is Kubernetes” and, instead, more about “how does this fit into my architecture?” and “how enterprise-ready is this stuff?” This shift from the “what” to the “how” is a sign that Kubernetes is heading quickly to the mainstream. There are other indicators at KubeCon+CloudNativeCon that, to me, show Kubernetes maturing into a real enterprise technology.
First, the makeup of the Kubernetes community is clearly changing. Two years ago, almost every company at KubeCon+CloudNativeCon was some form of digital forward company like Lyft or cloud technology vendor such as Google or Red Hat. Now, there are many more traditional companies on both the IT and vendor side. Vendors such as HPE, Oracle, Intel, and Microsoft, mainstays of technology for the past 30 years, are here in force. Industries like telecommunications (drawn by the promise of edge computing), finance, manufacturing, and retail are much more visible than they were just a short time ago. While microservices and Kubernetes are not yet as widely deployed as more traditional n-Tier architectures and classic middleware, the mainstream is clearly interested.
Another indicator of the changes in the Kubernetes space is the prominence of security in the community. Not only are there more vendors than ever, but we are seeing more keynote time given to security practices. Security is, of course, a major component of making Kubernetes enterprise-ready. Without solid security practices and technology, Kubernetes will never be acceptable to a broad swath of large to mid-sized businesses. That said, there is still so much more that needs to be done with Kubernetes security. The good news is that the community is working on it.
Finally, there is clearly more attention being paid to operating Kubernetes in a production environment. That’s most evident in the proliferation of tracing and logging technology, from both new and older companies, that was on display on the show floor and mainstage. Policy management was also an important area of discussion at the conference. These are all examples of the type of infrastructure that Operations teams will need to manage Kubernetes at scale and a sign that the community is thinking seriously about what happens after deployment.
It certainly helps that a lot of basic issues with Kubernetes have been solved but there is still more work to do. There are difficult challenges that need attention. How to migrate existing stateful apps originally written in Java and based on n-Tier architectures is still mostly an open question. Storage is another area that needs more innovation, though there’s serious work underway in that space. Despite the need for continued work, the progress seen at KubeCon+CloudNativeCon NA 2019 points to a future where Kubernetes is a major platform for enterprise applications. 2020 will be another pivotal year for Kubernetes, containers, and microservices architectures. It may even be the year of mainstream adoption. We’ll be watching.