Posted on 1 Comment

Why Technology Business Management and Technology Expense Management Are Misaligned

(Note: This post is also available in a presentation format on Slideshare)

One of the most frequent questions Amalgam Insights receives is how Technology Business Management and Technology Expense Management are related to each other. And what do these topics have to do with the new phrase “FinOps” that is starting to appear? Amalgam’s perspective is definitely different.

Why FinOps is a Misnomer

Posted on 2 Comments

Developing a Practical Model for Ethical AI in the Business World: Stage 3 – Operational Deployment

In this blog post series, Amalgam Insights is providing a practical model for businesses to plan the ethical governance of their AI projects.

To read the introduction, click here.

To read about Stage 1: Executive Design, click here.

To read about Stage 2: Technical Development, click here.

This blog focuses on Operational Deployment, the third of the Three Keys to Ethical AI described in the introduction.

Figure 1: The Three Keys to Ethical AI

Stage 3: Operational Deployment

Once an AI model is developed, organizations have to translate this model into actual value, whether it be by providing the direct outputs to relevant users or by embedding these outputs into relevant applications and process automation. But this part of AI also requires its own set of ethical considerations for companies to truly maintain an ethical perspective.

  • Who has access to the outputs?
  • How can users trace the lineage of the data and analysis?
  • How will the outputs be used to support decisions and actions?

Figure 2: Deployment Strategy

Who has access to the outputs?

Just as with data and analytics, the value of AI scales as it goes out to additional relevant users. The power of Amazon, Apple, Facebook, Google, and Microsoft in today’s global economy shows the power of opening up AI to billions of users. But as organizations open up AI to additional users, they have to provide appropriate context to users. Otherwise, these new users are effectively consuming AI blindly rather than as informed consumers. At this point, AI ethics expands beyond a technical problem into an operational business problem that affects every end user affected by AI.

Understanding the context and impact of AI at scale is especially important for AI initiatives built around continuous improvement and increasing user value. Amalgam Insights recommends a focus on directly engaging user feedback for user experience and preference rather than simply depending on A/B testing. It takes a combination of quantitative and qualitative experience to optimize AI at a time when we are still far from truly understanding how the brain works and how people interact with relevant data and algorithms. Human feedback is vital both for training AI and for understanding its perception and impact.

How can users trace the lineage of the data and analysis?

Users accessing AI in an ethical manner should have basic access to the data and assumptions used to support the AI. This means both providing quantitative logic and qualitative assumptions that can communicate the sources, assumptions, and intended results of the AI to relevant users. This context is important in supporting an ethical AI project as AI is fundamentally based not just on a basic transformation of data, but on a set of logical assumptions that may not be inherently obvious to the user.
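As a minimal sketch of what this context could look like in practice, the Python example below attaches a plain-language lineage record to an AI output. The `LineageRecord` class, its field names, and the sample values are hypothetical and chosen only for illustration; they are not drawn from any specific product or standard.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LineageRecord:
    """Hypothetical plain-language context shipped alongside an AI output."""
    output_name: str
    data_sources: List[str]   # where the data came from
    assumptions: List[str]    # qualitative assumptions behind the model
    intended_use: str         # what the output is meant to support

    def summary(self) -> str:
        """Render the record as text a non-technical user can read."""
        lines = [
            f"Output: {self.output_name}",
            "Data sources: " + "; ".join(self.data_sources),
            "Assumptions: " + "; ".join(self.assumptions),
            f"Intended use: {self.intended_use}",
        ]
        return "\n".join(lines)


# Illustrative values only.
record = LineageRecord(
    output_name="churn_risk_score",
    data_sources=["CRM account history", "support ticket log"],
    assumptions=["accounts younger than 90 days are excluded"],
    intended_use="prioritize outreach; not for pricing decisions",
)
print(record.summary())
```

The point of a record like this is not mathematical detail but giving each user the sources, assumptions, and intended results in one readable place.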

From a practical perspective, most users will not fully understand the mathematical logic associated with AI, but users will understand the data and basic conceptual assumptions being made to provide AI-based outputs. Although Amalgam Insights believes that the rise of AI will lead to a broader grasp of statistics, modeling, and transformations over time, it is more important that both executive and technical stakeholders are able to explain how AI technologies in production are productive, relevant, and ethical on both a business and technical basis.

How will the outputs be used to support decisions and actions?

Although this topic should already have been explored at the executive level, operational users will have deeper knowledge of how the technology will be used on a day-to-day basis and should revisit this topic based on their understanding of processes, internal operations, and customer-facing outcomes.

There are a variety of ways that AI can be used to support the decisions we make. In some cases, such as with search engines and basic prioritization exercises, AI is typically used as the primary source of output. For a more complex scenario, such as sales and marketing use cases or complex business or organizational decisions, AI may be a secondary source to provide an additional perspective or an exploratory and experimental perspective simply to provide context for how an AI perspective would differ from a human-oriented perspective.

But it is important for ethical AI outputs to be matched up with appropriate decisions and outcomes. A current example making headlines is the recent launch of the Apple credit card, where disparate credit limits were assigned to a married man and woman based on “the algorithm.” In this example, the man was initially given a much larger credit limit than the woman despite the fact that the couple filed taxes jointly and effectively shared joint income.

In this case, the challenge of giving “the algorithm” an automated and primary (and, likely, exclusive) role in determining a credit limit has created issues that are now in the public eye. Although this is a current and prominent example, it is less of a statement about Apple in particular and more of a statement regarding the increasing dependence that financial services has on non-transparent algorithms to accelerate decisions and provide an initial experience to new customers.

A more ethical and human approach would have been to figure out if there were inherent biases in the algorithm. If the algorithm had not been sufficiently tested, it should have been a secondary source for a credit limit decision that would ultimately be made by a human.

So, based on these explorations, we create a starting point for practical business AI ethics.

Figure 3: A Practical Framework

Recommendations

Maintain a set of basic ethical precepts for each AI project across design, development, and deployment. As mentioned in Part 1, these ethical statements should be focused on a few key goals that should be consistently explored from executive to technical to operational deployment. These should be short enough to fit onto every major project update memo and key documentation associated with the project. By providing a consistent starting point of what is considered ethical and must be governed, AI can be managed more consistently.

Conduct due diligence across bias, funding, champions, development, and users to improve ethical AI usage. The due diligence on AI currently focuses too heavily on the construction of models, rather than the full business context of AI. Companies continue to hurt their brands and reputation by putting out models and AI logic that would not pass a basic business or operational review.

Align AI to responsibilities that reflect the maturity, transparency, and fit of models. For instance, experimental models should not be used to run core business processes. For AI to take over significant operational responsibilities from an automation, analytical, or prescriptive perspective, the algorithms and production of AI need to be enterprise-ready just as traditional IT is. Just because AI is new does not mean that it should bypass key business and technical deployment rules.

Review and update AI on a regular basis. Once an AI project has been successfully released into the wild and is providing results, it must be managed and reviewed on a regular basis. Over time, the models will need to be tweaked to reflect real-life changes in business processes, customer preferences, macroeconomic changes, or strategic goals. AI that is abandoned or ignored will become technical debt just as any outdated technology is. If there is no dedicated review and update process for AI, the models and algorithms used will eventually become outdated and potentially less ethical and accurate from a business perspective.
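One simple way to make a regular review concrete is to compare recent model performance against the level measured at launch. The Python sketch below is illustrative only: the `needs_review` function, the 0.05 tolerance, and the accuracy figures are assumptions, and accuracy is just one of many signals a real review would examine.

```python
from typing import Sequence


def needs_review(baseline_accuracy: float,
                 recent_accuracies: Sequence[float],
                 tolerance: float = 0.05) -> bool:
    """Flag a model when its recent average accuracy falls more than
    `tolerance` below the accuracy measured at launch."""
    if not recent_accuracies:
        # No recent measurements means nobody is watching the model.
        return True
    recent_average = sum(recent_accuracies) / len(recent_accuracies)
    return recent_average < baseline_accuracy - tolerance


# Hypothetical monthly accuracy checks after a launch measured at 0.90.
print(needs_review(0.90, [0.90, 0.89, 0.91]))  # False: still near baseline
print(needs_review(0.90, [0.88, 0.84, 0.80]))  # True: average has slipped
```

A check like this only says when to look; deciding whether the model is still ethical and relevant remains a business review.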

We hope this guide and framework are helpful in supporting more ethical and practical AI projects. If you are seeking additional information on ethical AI, the ROI of AI, or guidance across data management, analytics, machine learning, and application development, please feel free to contact us at research@amalgaminsights.com and send us your questions. We would love to work with you.

Posted on 1 Comment

AI on AI – 8 Predictions for the Data Savvy Pro

When we started Amalgam Insights, we oh-so-cleverly chose the AI initials with the understanding that artificial intelligence (the other AI…), data science, machine learning, programmatic automation, augmented analytics, and neural inputs would lead to the greatest advances in technology. At the same time, we sought to provide practical guidance for companies seeking to bridge the gaps between their current data and analytics environments and the future of AI. With that in mind, here are 8 predictions we’re providing for 2020 for Analytics Centers of Excellence and Chief Data Officers to keep in mind to stay ahead while remaining practical.

1. In 2020, AI becomes a $50 billion market, creating a digital divide between the haves and have-nots prepared to algorithmically assess their work in real time. Retail, Financial Services, and Manufacturing will be over half of this market.

2. The data warehouse becomes less important as a single source of truth. Today’s single source replaces data aggregation and duplication with data linkages and late-binding of data sources to bring together the single source of truth on a real-time basis. This doesn’t mean that data warehouses aren’t still useful; it just means that the single source of truth can change on a real-time basis and corporate data structures need to support that reality. And it becomes increasingly important to conduct analytics on data, wherever the data may be, rather than be dependent on the need to replicate and transfer data back to a single warehouse.

3. Asking “What were our 2020 revenues?” will be an available option in every major BI solution by the end of 2020, with the biggest challenge then being how companies will need to upgrade and configure their solutions to support these searches. We have maxed out our ability to spread analytics through IT. To get beyond 25% analytics adoption in 2020, businesses will need to take advantage of natural language queries and searches, which are becoming a general capability for analytics, either as a native or partner-enabled capability.

4. 2020 will see an increased focus on integrating analytics with automation, process mapping, and direct collaboration. Robotic Process Automation is a sexy technology, but what makes the robots intelligent? Prioritized data, good business rules, and algorithmic feedback for constant improvement. When we talk about “augmented analytics” at Amalgam Insights, we think this means augmenting business processes with analytic and algorithmic logic, not just augmenting data management and analytic tasks.

5. By 2025, analytic model testing and Python will become standard data analyst and business analyst capabilities to handle models rather than specific data. Get started now in learning Python, statistics, an Auto Machine Learning method, and model testing. IT needs to level up from admins to architects. All aspects of IT are becoming more abstracted through cloud computing, process automation, and machine learning. Data and analytics are no exception. Specifically, data analysts will start performing the majority of “data science” tasks conducted in the enterprise, either as standalone or machine-guided tasks. If a business is dependent on a “unicorn” or a singular talent to conduct a business process, that process is not scalable and repeatable. As data science and machine learning projects start becoming part of the general IT portfolio, businesses will push down more data management, cleansing, and even modeling and testing tasks to the most dependable talent of the data ecosystem, the data analyst.

6. Amalgam Insights predicts that the biggest difference between high ROI and low ROI analytics in 2020 will come from data polishing, not data hoarding. The days of data hoarding for value creation are over. True data champions will focus on cleansing, defining, prioritizing, and separating the 1% of data that truly matters from the 99% more suited to mandatory and compliance-based storage.

7. On a related note, Amalgam Insights believes the practice of data deletion will be greatly formalized by Chief Data Protection Officers in 2020. With the emergence of CCPA along with the continuance of GDPR, data ownership is now potentially risky for organizations holding the wrong data.

8. The accounting world will make progress on defining data as a tangible asset. My expectations: changes to the timeframes of depreciation and guidance on how to value contextually specific data such as customer lists and business transactions. Currently, data cannot be formally capitalized as an asset. Now that companies are generally starting to realize that data may be their greatest asset outside of their talent, accountants will bring up more concerns for FASB Statements 141 and 142.

Posted on 2 Comments

Developing a Practical Model for Ethical AI in the Business World: Stage 2 – Technical Development

In this blog post series, Amalgam Insights is providing a practical model for businesses to plan the ethical governance of their AI projects.

To read the introduction, click here.

To read about Stage 1: Executive Design, click here.

This blog focuses on Technical Development, the second of the Three Keys to Ethical AI described in the introduction.

Figure 1: The Three Keys to Ethical AI

Stage 2: Technical Development

Technical Development is the area of AI that gets the most attention as machine learning and data science start to mature. Understandably, the current focus in this Early Adopter era (which is just starting to move into Early Majority status in 2020) is simply on how to conduct machine learning, data science efforts, and potentially deep learning projects in a rapid, accurate, and potentially repeatable manner. However, as companies conduct their initial proofs of concept and build out AI services and portfolios, the following four questions are important to take into account.

  • Where does the data come from?
  • Who is conducting the analysis?
  • What aspects of bias are being taken into account?
  • What algorithms and toolkits are being used to analyze and optimize?

Figure 2: Technical Development

Where does the data come from?

Garbage In, Garbage Out has been a truism for IT and data projects for many decades. However, the irony is that much of the data that is used for AI projects used to literally be considered “garbage” and archival exhaust up until the practical emergence of the “Big Data” era at the beginning of this decade. As companies use these massive new data sources as a starting point for AI, they must check on the quality, availability, timeliness, and context of the data. It is no longer good enough to just pour all data into a “data lake” and hope that this creates a quality training data sample.

The quality of the data is determined by the completeness, accuracy, and consistency of the data. If the data have a lot of gaps, errors, or significant formatting issues, the AI will need to account for these issues in a way that maintains trust. For instance, a long-standing historical database may be full of null values as the data source has been augmented over time and data collection practices have improved. If those null values are incorrectly accounted for, AI can end up defining or ignoring a “best practice” or recommendation.
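To illustrate how the treatment of null values can change a result, here is a minimal Python sketch. The `response_days` column and its values are invented for illustration, with `None` standing in for years before a field was collected.

```python
from statistics import mean
from typing import List, Optional, Tuple

# Hypothetical column: None marks records from before the field was collected.
response_days: List[Optional[float]] = [None, None, None, 4.0, 6.0, 5.0]


def mean_nulls_as_zero(values: List[Optional[float]]) -> float:
    """Naive approach: silently treat every missing value as zero."""
    return mean([0.0 if v is None else v for v in values])


def mean_observed_only(values: List[Optional[float]]) -> Tuple[float, float]:
    """Average only the observed values and report how much data was present."""
    observed = [v for v in values if v is not None]
    coverage = len(observed) / len(values)
    return mean(observed), coverage


print(mean_nulls_as_zero(response_days))   # 2.5
print(mean_observed_only(response_days))   # (5.0, 0.5)
```

The naive version halves the apparent average, while the second version keeps the estimate honest and reports that only half of the records carried a value, which is the kind of context that helps maintain trust.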

From a practical perspective, consider as an example how Western culture has recently started to formalize non-binary gender or transgender identity. Just because data may not show these identities prior to this decade does not mean that these identities didn’t exist. Amalgam Insights would consider a gap like this to be a systemic data gap that needs to be taken into account to avoid unexpected bias, perhaps through the use of adversarial de-biasing that actively takes the bias into account.

The Availability and Timeliness of the data refers to the accessibility, uptime, and update frequency of the data source. Data sources that may be transient or migratory may serve as a risk for making consistent assumptions from an AI perspective. If an AI project is depending on a data source that may be hand-curated, bespoke in nature, or inconsistently hosted and updated, this variability needs to be taken into account in determining the relative accuracy of the AI project and its ability to consistently meet ethical and compliance standards.

Data context refers to the relevance of the data both for solving the problem and for providing guidance to downstream users. Correlation is not causation, as the hilarious website “Spurious Correlations” run by Tyler Vigen shows us. One of my favorite examples shows how famed actor Nicolas Cage’s movies are “obviously” tied to the number of people who drown in swimming pools.

Figure 3: Drownings as a Function of Nicolas Cage Movies


(Thanks to Spurious Correlations! Buy the book!)

But beyond the humor is a serious issue: what happens if AI assumptions are built on faulty and irrelevant data? And who is checking the hyperparameter settings and the contributors to parameter definitions? Data assumptions need to go through some level of line of business review. This isn’t to say that every business manager is going to suddenly have a Ph.D. level of data science understanding, but business managers will be able to either provide confirmation that data is relevant or provide relevant feedback on why a data source may or may not be relevant.

Who is conducting the analysis?

On this related question, the deification of the unicorn data scientist has been well-documented over the last few years. But just as business intelligence and analytics evolved from the realm of the database master and report builder to a combination of IT management and self-service conducted by data-savvy analysts, data science and AI must also be conducted by a team of roles that include the data analyst, data scientist, business analyst, and business manager. In small companies, an individual may end up holding multiple roles on this team.

But if AI is being developed by a single “unicorn” focused on the technical and mathematical aspects of AI development, companies need to make sure that the data scientist or AI developer is taking sufficient business context into account and fully considering the fundamental biases and assumptions that were made during the Executive Design phase.

What aspects of bias are being taken into account?

Any data scientist with basic statistical training will be familiar with Type I (false positive) and Type II (false negative) errors as a starting point for identifying bias. However, this statistical bias should not be considered the end-all and be-all of defining AI bias.
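As a quick refresher on these two error types, the Python sketch below computes the false positive rate and false negative rate from a set of actual and predicted labels. The function name and the sample labels are illustrative assumptions, not data from any real project.

```python
from typing import Sequence, Tuple


def error_rates(actual: Sequence[bool],
                predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (false_positive_rate, false_negative_rate)."""
    # Type I error: predicted positive when the actual label is negative.
    false_positives = sum(1 for a, p in zip(actual, predicted) if not a and p)
    # Type II error: predicted negative when the actual label is positive.
    false_negatives = sum(1 for a, p in zip(actual, predicted) if a and not p)
    negatives = sum(1 for a in actual if not a)
    positives = sum(1 for a in actual if a)
    fpr = false_positives / negatives if negatives else 0.0
    fnr = false_negatives / positives if positives else 0.0
    return fpr, fnr


# Made-up labels: four actual positives followed by four actual negatives.
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False]
print(error_rates(actual, predicted))  # (0.25, 0.25)
```

These rates describe statistical error only; they say nothing on their own about the organizational, cultural, or contextual bias behind the labels.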

As parameters and outputs become defined, data scientists must also consider organizational bias, cultural bias, and contextual bias. Simply stating that “the data will speak for itself” does not mean that the AI lacks bias; this only means that the AI project is actively ignoring any bias that may be in place. As I said before, the most honest approach to AI is to acknowledge and document bias rather than to simply try to “eliminate” bias. Bias documentation is a sign of understanding both the problem and the methods, not a weakness.

An extreme example is Microsoft’s “Tay” chatbot released in 2016. This bot was released “without bias” to support conversational understanding. The practical aspect of this lack of bias was that the bot lacked the context to filter racist messages and to differentiate between strongly emotional terms and culturally appropriate conversation. In this case, the lack of bias led to the AI’s inability to be practically useful. In a vacuum, the most prevalent signals and inputs will take precedence over the most relevant or appropriate signals.

Unless the goal of the AI is to reflect the data that is most commonly entered, an “unbiased” AI approach is generally going to reflect the “GIGO” aspect of programming that has been understood for decades. This challenge reflects the foundational need to understand the training and distribution of data associated with the building of AI.

What algorithms and toolkits are being used to analyze and optimize?

The good news about AI is that it is easier to access than ever before. Python resources and a plethora of machine learning libraries including PyTorch, scikit-learn, Keras, and, of course, TensorFlow, make machine learning relatively easy to access for developers and quantitatively trained analysts.

The bad news is that it becomes easy for someone to implement an algorithm without fully understanding the consequences. For instance, a current darling in the data science world is XGBoost (Extreme Gradient Boosting), which has been a winning algorithmic approach for recent data science contests because it reduces data to an efficient minimum more quickly than standard gradient boosting. But it also requires expertise in starting with appropriate features, stopping the model training before the algorithm overfits, and appropriately fine-tuning the model for production.
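The idea of stopping training before a model overfits can be sketched in a few lines of plain Python. This is a generic illustration, not XGBoost's own implementation (XGBoost and similar libraries provide built-in early-stopping options); the function name, the `patience` parameter, and the validation losses are assumptions made for the example.

```python
from typing import List


def best_round_with_early_stopping(val_losses: List[float],
                                   patience: int = 2) -> int:
    """Return the index of the training round to keep, stopping once the
    validation loss has failed to improve for `patience` rounds in a row."""
    best_round = 0
    best_loss = float("inf")
    rounds_without_improvement = 0
    for round_index, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss = loss
            best_round = round_index
            rounds_without_improvement = 0
        else:
            rounds_without_improvement += 1
            if rounds_without_improvement >= patience:
                break  # validation loss has stalled; stop training here
    return best_round


# Hypothetical validation loss after each boosting round.
val_losses = [0.60, 0.45, 0.38, 0.36, 0.37, 0.39, 0.35]
print(best_round_with_early_stopping(val_losses))  # 3
```

Note that the final 0.35 is never reached: with a patience of 2, training stops after two rounds without improvement. Choosing the patience value is itself a judgment call that requires the kind of expertise described above.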

So, it is not enough to simply use the right tools or the most “efficient” algorithms. Teams must also effectively fit, stop, and tune models based on the tools being used, both to create models that are appropriate for the real world and to keep AI bias from propagating and gaining outsized influence.

In our next blog, we will explore Operational Deployment with a focus on the line of business concerns that business analysts and managers consider as they actually use the AI application or service and the challenges that occur as the AI logic becomes obsolete or flawed over time.