The biggest US-based travel story at the end of 2022 was the absolute collapse of Southwest Airlines. The United States was hit by a sudden cold snap just as Christmas approached, causing massive delays across almost all modes of travel, including airlines, trains, and road-based transit. After a couple of days, most US domestic airlines seemed to have recovered. The exception was Southwest, which suddenly and unexpectedly canceled nearly all of its flights in the last week of 2022, just as people were traveling for Christmas, Hanukkah, New Year’s Eve, and other holidays. The timing was horrible and inexplicable. And with little to no official explanation, travelers stranded across the country could only guess at the cause. Was this due to an unannounced strike? Were there problems with Southwest’s airplane fleet? Were there problems with a specific airport?
It turns out that the problem was with Southwest’s internal scheduling tool, an in-house software application built in the 1990s and held together over the years as Southwest roughly doubled in size across passengers, planes, trips, employees, and number of destinations supported. This growth was especially challenging because Southwest’s model as a regional airline meant that it did not use a central hub, as most other large airlines in the United States do. Rather, each plane flies from point to point, so the number of possible combinations grows quadratically rather than linearly. Although Southwest does not fly every plane from each location to every other location, the growth of operations from roughly 45 locations in the late 1990s to roughly 100 domestic locations today is not a doubling of complexity but more along the lines of N*(N-1)/2, the number of possible pairs among N locations, as long-time analytic advisor Neil Raden pointed out. This means the complexity increase is more akin to going from (45*44/2) = 990 to (100*99/2) = 4950, roughly a fivefold increase. This level of complexity is compounded by the challenges of organizing the thousands of pilots and flight attendants traveling from point to point every day.
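The pair-count arithmetic above can be checked with a minimal sketch. The function name `route_pairs` is hypothetical and chosen only for illustration; the location counts (45 and 100) are the rough figures cited above, and the formula counts possible location pairs, not routes Southwest actually flies.

```python
def route_pairs(locations: int) -> int:
    """Number of unordered pairs among `locations` points: N*(N-1)/2."""
    return locations * (locations - 1) // 2


# Rough location counts taken from the text above.
late_1990s = route_pairs(45)
today = route_pairs(100)

print(late_1990s)          # 990
print(today)               # 4950
print(today / late_1990s)  # 5.0
```

The number of locations grew by a factor of about 2.2, while the number of possible pairs grew by a factor of 5, which is the gap between linear and quadratic growth that the paragraph describes.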
This scheduling system had already been strained in previous years by the growth in complexity, but it reached a critical breaking point at the end of 2022 after years without sufficient investment and modernization. This failure is a textbook example of the concept of “technical debt.”
Technical debt is often described as a concept that is difficult to articulate for a business audience, but the concept is actually very straightforward from a business perspective. Just as with financial debt, which must be paid back with interest or risk a default that threatens business assets, technical debt is an act of borrowing against the future. Like financial debt, technical debt requires one of two things: either investing in the future (the “interest”) to fix the technology over time, or accepting that the technology will fail (the “default”) and break any processes that depend on it.
The lessons from this breakdown are straightforward but are potentially challenging to follow in 2023, a year where companies will be tempted to cut costs by any means possible.
Ensure that executive stakeholders are clear on both the concept of technical debt and the labor associated with current technical debt. It may not be possible to put an exact dollar amount on the technical debt that currently exists in the organization, but it should be possible to provide some guidance on the current labor and resources assigned to managing outdated technology, as well as the potential points of failure associated with, say, being unable to find a FORTRAN developer quickly or relying on applications no longer supported by a vendor or by in-house developers.
Document every technology associated with each mission-critical process. With the cliché that “every company is a technology company” having been fully realized in today’s web, mobile, and automated world, IT’s job is to provide proactive guidance on the hardware, software, and skills that must either be supported or upgraded. The business value propositions of IT asset and service management are unlocked when assets are specifically aligned to business dependencies, projects, and processes.
Identify technologies where business growth leads to faster-than-linear technology demand. The demands on Southwest’s scheduling system grew far faster than the airline itself, and the system eventually failed because of its legacy design. Look at the mathematics associated with key processes to see if growth is logarithmic, linear, quadratic, exponential, or unpredictable. Simply assuming that a process grows linearly with revenue, employee growth, or business traffic can be a job-ending mistake.
Ensure that legacy technologies have the capacity to support forecasted business complexity or business growth. Any time technology capacity needs to expand faster than overall IT spend or overall operational spend, it should serve as a warning sign to either change the technological approach or invest in the necessary capacity.
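As a rough illustration of the growth check described in the lessons above, the following sketch estimates how fast a technology demand scales relative to business size from two observations. Everything here is an assumption made for illustration: the function names, the 10% tolerance, and the use of a simple log-log slope are not taken from Southwest or from any specific methodology, and two data points cannot distinguish quadratic from exponential growth.

```python
import math


def scaling_exponent(size_before: float, size_after: float,
                     demand_before: float, demand_after: float) -> float:
    """Estimate k in demand ~ size**k from two observations (log-log slope)."""
    return math.log(demand_after / demand_before) / math.log(size_after / size_before)


def classify(exponent: float, tolerance: float = 0.1) -> str:
    """Label the growth pattern; the tolerance is an arbitrary illustrative choice."""
    if exponent < 1 - tolerance:
        return "sublinear"
    if exponent <= 1 + tolerance:
        return "roughly linear"
    return "superlinear"


# Hypothetical inputs: locations as business size, location pairs as demand,
# using the rough figures cited earlier in the article.
k = scaling_exponent(45, 100, 990, 4950)
print(classify(k))  # superlinear
```

A "superlinear" result is the warning sign: demand on the technology is growing faster than the business measure it was assumed to track, so budgets that scale linearly with the business will fall behind.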
We face a challenging year as inflation, foreign currency challenges, geopolitical issues, and supply chain bottlenecks still raise the spectre of recession. But as executives seek to cut costs, Southwest serves as a reminder that businesses must still futureproof their technology approaches, evaluate the scalability of their processes, and invest in service delivery commensurate with their brand promise, or risk lasting revenue and market capitalization losses.