Technology is meant to enhance our lives by making it easier to work and play. But what happens when the best of intentions goes into creating computer efficiencies and everything still goes terribly wrong? From security services to automation algorithms, check out these top fails in computing history that affected companies, people and governments around the world.
System failure: Historic computer glitches
When Algorithms Go Rogue
Computerization of the order flow in financial markets began in the early 1970s with the dream of streamlining communication, data tracking and workflows. Although stock markets can change quickly, losses are generally organic in nature, driven by the market rather than the machines. That was not the case in 2012, when Knight Capital, a firm that specializes in executing trades for retail brokers, reported $440 million in losses after a faulty test of new trading software made it into the live market.
The new (and presumably improved) software was originally set up by the company to work with only a few stocks, with buy/sell points well outside current trading ranges to ensure that nothing would actually affect the live market. As you can guess, the software did not behave as projected, and it turned out that the trading algorithm the program was using was too eccentric for its own good. It essentially bought stocks at the market (ask) price and then sold them seconds later at the bid price, which was a few cents less. A few cents may seem minor, but the exercise ran for 45 minutes, and the rapid trades pushed the market price up, widening the spread between the market price and the bid price and making the problem worse.
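To see how a few cents per share snowballs into hundreds of millions, here is a minimal back-of-the-envelope sketch in Python. All of the numbers (order size, spread, order rate) are hypothetical illustrations, not Knight Capital's actual figures, and the feedback effect of the buying itself pushing prices higher is left out.

```python
# Hypothetical sketch: buying at the market (ask) price and immediately selling at the
# bid price loses the spread on every round trip. Numbers are illustrative only.
def spread_loss(shares_per_order: int, spread: float, orders: int) -> float:
    """Total loss from repeatedly buying at the ask and selling a few cents lower."""
    return shares_per_order * spread * orders

orders_per_minute = 2_000   # assumed order rate
minutes = 45                # how long the runaway algorithm ran
loss = spread_loss(shares_per_order=100, spread=0.03, orders=orders_per_minute * minutes)
print(f"${loss:,.2f}")      # even these modest assumptions burn through $270,000
```

And that is before the rapid buying widens the spread further, as it did in the real event.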
The moral of the story: Test new software in a secure environment before going live.
Big Cities and Big Blackouts
If Thomas Edison had been in New York City on August 14, 2003, he would have been horrified! Affecting around 55 million people in the northeastern United States and Ontario, Canada, the North American blackout was one of the most significant in history. For economists it was a simple case of supply and demand: a power plant along Lake Erie in Ohio went offline due to high usage, which put the rest of the power network under greater stress. From there, like a computer working overtime, the hardware in the power lines heated up, expanded and began to sag, catching on trees and going down. This placed even more pressure on the system and created a cascading effect that eventually reduced the power network to 20% of normal output.
What’s interesting here is the role that the control center’s alarm system played in the situation. Similar to a home security system, its purpose is to let you know when there is a threat to the stability of your environment, but this is where the best of computer efficiencies failed us. The blackout could have been prevented were it not for a software bug in this central system that created a “race condition,” a scenario that occurs when two parts of a system compete over the same resource and cannot resolve the conflict. The bug not only caused the system to freeze and stop processing alerts, it did so “silently,” meaning it could not notify any personnel of the issue. There were no audio or visual alerts available to the control room staff, who at the time relied heavily on such cues for monitoring. The aftermath left millions of citizens without electricity for several days and affected industry, utilities and communication.
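As a concrete illustration of a race condition, here is a minimal Python sketch (hypothetical, and in no way the actual alarm-system code) in which two threads compete over the same shared counter with nothing to coordinate them, so updates are silently lost:

```python
import threading

counter = 0  # shared resource with no lock protecting it

def worker(iterations: int) -> None:
    global counter
    for _ in range(iterations):
        value = counter       # read the shared state...
        counter = value + 1   # ...and write it back; another thread may have changed it in between

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # expected 200000, but lost updates typically leave it short
```

A lock, or any other way of deciding who owns the resource, would resolve the conflict, which is exactly what the alarm software failed to do.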
The moral of the story: Test your alarm systems regularly and make sure there are multiple channels through which notifications can reach you.
Hurry! Update Before The World Ends!
“Cause they say two thousand zero zero, Party over, oops out of time, So tonight I’m gonna party like it’s 1999.” It makes you wonder if Prince knew the havoc his lovable pop song would predict nearly twenty years later. Who would have known that a little lack of planning and two missing digits would cause worldwide panic, with water bottles and other survival kits hoarded in basements everywhere? It sounds silly now, but Y2K scared a lot of people and tech-driven companies.
The situation was grounded in the fact that the majority of computer systems used two digits to represent the year (98 instead of 1998), which seemed perfectly reasonable during their development. But many engineers didn’t think about what would happen when the date passed the year 2000, which could only be represented as 00 – or, to a confused computer without context, 1900. This simple discrepancy would break any calculation involving a range of years that crossed the millennium. The resulting chaos sent software companies running for cover, spending long hours rapidly updating their products, which controlled everything from hospital computers to train ticketing systems.
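A hypothetical sketch of the arithmetic problem (not code from any real system) shows how two-digit years break the moment a range crosses the millennium:

```python
def years_between(start_yy: int, end_yy: int) -> int:
    """Subtract two-digit years the way many legacy systems effectively did."""
    return end_yy - start_yy

print(years_between(98, 99))  # 1   -- an account opened in '98, checked in '99: fine
print(years_between(98, 0))   # -98 -- the same account checked in '00: interest,
                              #        expiry and scheduling logic all go haywire
```

The fix, in thousands of systems, was either to widen the field to four digits or to bolt on "windowing" rules that guess which century a two-digit year belongs to.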
The moral of the story: Don’t be shortsighted when creating new innovations; take a long-term, systems-thinking approach to planning for the future.
The Downfall of Retiring
In early 2014 the financial services industry was faced with an ultimatum: upgrade its ATMs or leave the security gates unlocked. An estimated 95% of American bank ATMs ran Windows XP, and the computer giant Microsoft had announced it would be retiring tech support for the operating system. That meant the company would no longer issue security updates to patch holes in Windows XP, leaving ATMs exposed to new kinds of cyber-attacks.
Companies predicted that replacing the operating systems on ATMs would be a major undertaking. CNN reported that, “In the United States, there are 210,500 bank ATMs, about 200,000 of which run on Windows XP, according to Retail Banking Research in London. In most cases, banks must upgrade the software one ATM at a time, and some will need the entire computer inside replaced too. Labor included, it’s a process that experts in the ATM industry say could cost anywhere between $1,000 and $3,500 apiece.” Banks flipped this lofty feat into an opportunity to add newer card readers to their ATMs that accept more secure chip-and-PIN cards.
The moral of the story: Stay on top of the tech times and upgrade early to avoid potential pitfalls later on.
Whose Rights are You Protecting?
The Recording Industry Association of America (RIAA), along with many artists, consistently campaigns against music piracy across all genres. In the internet age, illegal downloading can be considered an epidemic, with one study by the Institute for Policy Innovation putting the annual harm at $12.5 billion in losses to the U.S. economy, as well as more than 70,000 lost jobs and $2 billion in lost wages for American workers. In 2005, Sony tried to help by introducing a new form of copy protection on some of its audio CDs, but the plan ultimately failed.
When the CDs were played on a Windows computer, the discs would automatically install a piece of software called a “rootkit,” which buries its way deep into the computer and alters fundamental processes. Although not always malicious in nature, software of this kind can stealthily plant hard-to-detect (and hard-to-remove) programs such as viruses and Trojans. In Sony’s case, the goal of the rootkit was to control the way a Windows computer used the Sony CDs, preventing them from being copied or converted to MP3s in order to cut down on piracy. This aim was achieved, but the measures taken to hide the rootkit from the user also let viruses and other malicious software hide along with it. The plan backfired, leading to several lawsuits and a product recall.
The moral of the story: In project management and computing, technology implementation can make or break a project, so plan thoroughly and thoughtfully when innovating new processes.
A $500 Million Miscalculation
In 1996 Europe’s latest unmanned rocket, Ariane 5, was intentionally destroyed only seconds after launch on its maiden flight. Its mission was to carry a cargo of four scientific satellites to study how the Earth’s magnetic field interacts with solar winds. But the dream never came to fruition: when the guidance computer tried to convert the rocket’s sideways velocity from a 64-bit format to a 16-bit format, the number was too large to fit and an overflow error occurred.
Of all the careless lines of code recorded in the annals of computer science, this one may stand out as the most devastatingly efficient. When the guidance system shut down, control passed to an identical redundant unit, which also failed because it was running the same algorithm. Ironically, it was later found that the calculation containing the error, which sealed the fate of the mission, actually served no purpose once the rocket was in the air. Its sole job was to align the system prior to liftoff.
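A rough Python illustration of the failure mode (the real flight software was written in Ada, and the velocity figures below are made up) shows why a 64-bit value cannot safely be squeezed into 16 bits without a guard:

```python
import ctypes

INT16_MIN, INT16_MAX = -2**15, 2**15 - 1   # the range a 16-bit signed integer can hold

def convert_velocity(horizontal_velocity: float) -> int:
    """Squeeze a 64-bit float into a 16-bit signed integer."""
    if not INT16_MIN <= horizontal_velocity <= INT16_MAX:
        # The flight software's unprotected conversion raised an unhandled error at this
        # point, shutting the guidance unit down.
        raise OverflowError(f"{horizontal_velocity} does not fit in 16 bits")
    return ctypes.c_int16(int(horizontal_velocity)).value

print(convert_velocity(1_000.0))    # fits comfortably -- fine for a slower trajectory
print(convert_velocity(40_000.0))   # raises OverflowError: Ariane 5 flew a faster trajectory
```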
The moral of the story: The devil is in the details, and if you’re going to invest millions of dollars in a project, make friends with him.
Your Passport to Nowheresville
If Y2K wasn’t enough to deal with at the turn of the century, the United Kingdom had another complication coming its way. In 1999, the U.K. Passport Agency implemented a new Siemens computer system without adequately testing it or training its staff to operate it properly. At the same time, the government introduced a law requiring all children under the age of 16 traveling abroad to obtain their own passports. It was a perfect storm for the organization: the new legislation brought a huge influx of passport applications that overwhelmed the new system, and the system that was supposed to increase computer efficiencies failed to issue passports on time for half a million British citizens.
The result was mass inconvenience and a £12.6 million bill, which the agency paid out in compensation, staff overtime and even umbrellas for people queuing in the rain for their passports.
The moral of the story: Test, and train, train, train! Don’t launch a product or service prematurely, and spend time investing in your staff members – they are your business’s most important asset.