This chapter covers
- The trade-offs in building an AI algorithm, buying from a provider, or using open source
- Developing a Lean AI Strategy to minimize risk
- Taking advantage of the virtuous cycle of AI
- Learning valuable lessons from AI failures
The previous two chapters gave you all the building blocks you need to define and implement an AI project. It’s now time to put everything together. In this chapter, you’ll learn how to decide which components to build in-house or buy from a third party, and how to manage the implementation phase. You’ll also become familiar with the most typical bumps in the road faced by inexperienced AI builders. By the time you reach the end of the chapter, you’ll feel confident enough to kick-start your project tomorrow.
9.1 Buying or building AI
Paraphrasing what a wise man once said, sometimes the best way to successfully complete a project is never to start it in the first place. As an AI evangelist, one of the most impactful decisions you can make is to rely on products and services offered by AI providers and to avoid building some or all of the technology needed for an AI project. This decision applies to all three of the components of the project: data, model, and infrastructure.
This section will teach you how to think strategically about the build-or-buy question. In fact, we believe that the current state of the market for AI services calls for adding a third option: borrow. As we present each of these three options in the next sections, the fundamental concept to keep in mind is implementation risk. When you decide to buy or lease technology and infrastructure off the shelf, you give up flexibility and intellectual property, but you obviously sidestep the risks that come with investing in the technology yourself.
Table 9.1 sums up the main differences between Buy, Borrow, and Build solutions. It’s here for you to jump back to as needed, and we explain it in detail in the next three sections.
Table 9.1 Comparing Buy, Borrow, and Build strategies

| | Buy (turnkey solutions) | Borrow (ML platforms) | Build (your own models) |
|---|---|---|---|
| Up-front costs | Nearly zero | Data collection and labeling | Highest: data, model, and infrastructure |
| Variable costs | Pay-per-use | Pay-per-use, higher than Buy | Lowest, but you run the infrastructure |
| Risk | Low technological risk, but you’re bound to the provider’s agenda | Same as Buy, with stronger lock-in | Technological risk is on your team, even with open source |
| Flexibility and IP | Generic tasks only, no IP | You own the training data | Full control and ownership |
9.1.1 The Buy option: Turnkey solutions
The simplest option to implement AI technology is to use one of the turnkey solutions offered by tech providers like Google, Microsoft, or Amazon. These Buy solutions are also called Machine Learning as a Service (MLaaS).
With MLaaS, tech companies build general-purpose ML algorithms, taking care of the data collection, training, and industrialization. You use their technology to interpret your data in exchange for money.
As you can see in figure 9.1, all you have to do is send the data you want to analyze, and you get a prediction from their ML model. In technical terms, you’d be using an application programming interface (API) to send your data to a provider and get an answer. For example, almost every tech giant (Microsoft, Google, IBM) offers a computer vision API: you send them an image, and they send you back its content. Some computer vision APIs are particularly fancy and can give you interesting information like the estimation of the mood of a person in an image based on their facial expression.
Figure 9.1 The provider you buy a turnkey solution from takes care of all the technical details: you need only to send requests with your data and get the response back.
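To make this concrete, here’s a minimal sketch of what integrating a turnkey solution looks like from the developer’s chair. The endpoint, request fields, and response format are illustrative assumptions rather than any real provider’s interface; the point is that the whole integration boils down to one HTTP request and one JSON response.

```python
# A minimal sketch of calling a hypothetical turnkey image-labeling API.
# The URL, request fields, and response format are illustrative assumptions,
# not any specific provider's actual interface.
import base64
import requests

API_URL = "https://api.example-ai-provider.com/v1/label-image"  # hypothetical endpoint
API_KEY = "your-api-key-here"

def label_image(path):
    # Read the image and encode it as base64, a common way to ship binary data in JSON.
    with open(path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("utf-8")

    response = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_KEY}"},
        json={"image": image_b64},
        timeout=30,
    )
    response.raise_for_status()
    # A typical response might look like: {"labels": [{"name": "cat", "score": 0.97}, ...]}
    return response.json()["labels"]

if __name__ == "__main__":
    for label in label_image("photo.jpg"):
        print(label["name"], label["score"])
```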
One of our ML-based side projects from a few years ago was a voice transcription service for journalists and students who wanted transcripts of interviews or lectures. After we implemented the web application and confirmed demand for the product, we were spoiled by the number of choices when it came to finding an AI provider that could run the actual transcriptions. Google, Amazon, and Microsoft all offer a voice transcription service in dozens of languages, as do several smaller providers that focus on specific niches.
Unfortunately, when using an MLaaS solution, you’re dealing with the proverbial black box: you have no control over how the model is built, its performance, or the infrastructure that serves it. The good news is that there’s almost nothing left for you to do, as the provider has already done all the hard work of collecting data, training a model, and setting up the infrastructure to serve it (figure 9.2). All that’s left is to integrate the solution into your product, a task that any web developer can sort out. This allows you to get a first quick prototype of your AI application in hours instead of weeks or months.
Figure 9.2 When buying a solution, the provider takes care of all three components (data, model, and infrastructure).
To give you an idea of the types of services offered as turnkey solutions, let’s have a (nonexhaustive) look at Google’s price list at the time of this writing:
- 2 cents per minute for transcribing speech into text
- $1 per document for natural language understanding: extracting structured data from documents and performing sentiment analysis
- $20 per million characters for translation
- $1 per image for detecting logos or faces on pictures
There are two main takeaways here. First of all, all turnkey solutions are focused on solving generic problems. This makes sense, as polishing and marketing these products takes a lot of investment that providers will commit to only if they think there’s going to be a broad interest in the market. For instance, Google is never going to offer “cucumber-sorting AI as a service” because market demand would never be high enough to recoup development costs. Instead, it makes more sense for them to offer a service with broader interest that can recognize everyday objects like cats, dogs, or cars.
Second, from a financial perspective, Buy solutions have nearly zero fixed costs and higher variable costs than other strategies. In fact, with these solutions, there’s no effort in data collection, data cleaning, ML engineering, and deployment. All you have to do is integrate an API: a task that takes the average web developer just a few hours. As you’ve seen in the previous price list, the costs are all pay-per-use.
From a technical standpoint, the level of risk associated with these solutions is extremely low. Because you’re using products built by renowned tech companies, you can be fairly sure that you’ll get state-of-the-art performance. Of course, this is true only as long as your application fits the task for which the service was engineered.
Beware: the fact that you’re outsourcing all the technical risks doesn’t mean that Buy solutions are risk-free. All Buy solutions have a major threat in common: because you don’t own the underlying technology, you’re beholden to someone else’s agenda. They might decide to discontinue the product you’re using and leave you to fend for yourself.
Our favorite cautionary tale involves San Francisco-based startup Smyte. The company developed ML models and a platform to help online communities fight spam, fraud, and harassment by proactively identifying malicious trends in comments and interactions. When Smyte was acquired by Twitter in 2018, customers woke up to find that the platform had been shut down, as Twitter decided it would focus on serving its internal needs only. High-profile customers like Meetup and Zendesk were left scrambling for a replacement as the turnkey solution they had been relying on simply stopped responding to their queries.
The risk of providers shutting down or changing their products creates lock-in toward their services. Lock-in effects also extend to security and privacy, as turnkey solutions are limited to running within the provider’s network and cloud infrastructure, which might not be acceptable in security-minded industries like health care and defense.
However, the incredible speed and low fixed costs of these solutions make a perfect case for using them for quick prototypes that can validate the market demand for your products.
9.1.2 The Borrow option: ML platforms
As you have seen, turnkey solutions all but guarantee state-of-the-art performance on many generic AI tasks. What about situations where your needs are more specific? For example, what about the cucumber classification task described in chapter 4? As we said, you won’t find a “cucumber quality classification” product on Amazon’s AI platform, because the market for it would be vanishingly small.
Instead, you can choose to use an ML platform product. Compared to turnkey solutions, ML platforms allow you to upload your own training data to optimize a ready-made model for your specific needs. In the case of cucumber classification, you would be able to take advantage of Google’s image classification models, and fine-tune them with a labeled dataset of cucumbers. From the AI point of view, ML platforms often take advantage of the strengths of transfer learning to allow customers like you to use the provider’s massive investments in modeling while adapting to niche needs (like cucumber classification).
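To give you a feel for what the provider is doing behind the upload button, here’s a minimal sketch of the transfer-learning idea written with TensorFlow/Keras. The folder of labeled cucumber images and the choice of pretrained network are our own illustrative assumptions; on an ML platform, all of this is hidden behind the service.

```python
# A minimal transfer-learning sketch: reuse a network pretrained on everyday
# images and train only a small new "head" on labeled cucumber photos.
# The dataset path and folder layout (one subfolder per quality grade) are assumptions.
import tensorflow as tf

IMAGE_SIZE = (224, 224)

# Labeled cucumber images, e.g. cucumbers/train/grade_a/..., cucumbers/train/grade_b/...
train_ds = tf.keras.utils.image_dataset_from_directory(
    "cucumbers/train", image_size=IMAGE_SIZE, batch_size=32
)
num_classes = len(train_ds.class_names)

# Pretrained backbone: keeps the general-purpose visual features, frozen.
base = tf.keras.applications.MobileNetV2(
    input_shape=IMAGE_SIZE + (3,), include_top=False, weights="imagenet"
)
base.trainable = False

# Small trainable head that maps those generic features to cucumber quality grades.
model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # scale pixels to [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(num_classes, activation="softmax"),
])

model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(train_ds, epochs=5)
```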
From the integration point of view, ML platforms are just like Buy solutions: the model is hosted in the provider’s cloud, and you just submit requests to it (figure 9.3). Prices for ML platforms are two to four times higher than for turnkey solutions to account for the added complexity, but the pricing model is the same: you pay a small fee for each request, plus a token amount based on how much custom data you want to train on.
While ML platforms require no up-front investments per se, you’ll need to budget for collecting and potentially labeling data, so you can take advantage of their customizability. ML platform products generally solve the same broad classes of problems as turnkey solutions; for example, you can upload your own text database to improve sentiment analysis, or submit a multilanguage corpus to improve translation accuracy.
Figure 9.3 When borrowing a solution, the provider takes care of the model and the infrastructure, and you provide the data.
We created a Borrow category for ML platforms because you’re investing in a “borrowed” platform: you still don’t own the IP for the model. However, you do own the training data that in many cases constitutes your real asset.
That being said, the lock-in effect for ML platforms might be even stronger than for turnkey solutions. The latter are so generic that, even if your provider shuts down, you’re likely to find a similar offering from a competitor. On the other hand, the interactions between the secret sauce in the ML platform and your training data can be harder to replicate.
AI providers often advertise ML platforms as a great fit for projects using the structured business data described in chapters 2 and 3. Because the computational needs of these models are often modest, ML platforms offer automatic ML products that try multiple models on your data and automatically choose the best one. It’s worth mentioning that, if a problem is simple enough to be solved by these automatic tools, the average ML engineer is likely to get similar performance with limited effort.
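To show what those automatic tools are doing under the hood, here’s a minimal hand-rolled sketch of the same idea with scikit-learn: try a handful of standard models on a structured dataset (synthetic here, purely for illustration) and keep the one with the best cross-validated score.

```python
# A hand-rolled version of what "automatic ML" products do on structured data:
# try several standard models and keep the best cross-validated one.
# The dataset here is synthetic, purely for illustration.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
    "gradient boosting": GradientBoostingClassifier(random_state=0),
}

scores = {}
for name, model in candidates.items():
    # 5-fold cross-validation: train on 4/5 of the data, test on the rest, repeat.
    scores[name] = cross_val_score(model, X, y, cv=5).mean()
    print(f"{name}: {scores[name]:.3f}")

best = max(scores, key=scores.get)
print("Best model:", best)
```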
In our time helping companies with their projects, we found that for problems dealing with structured data (typically marketing, sales, or core business data), the biggest effort by far is in cleaning and preparing the data.
In one case, we needed to build an algorithm working with the call center data of a pharmaceutical company. This data had been collected for 14 years, but no one had put any effort into trying to get value out of it: it was a sunk cost used just for reporting. When an enlightened manager proposed building an AI algorithm, the first challenge we faced was trying to go through 14 years of history in order to understand the logic behind the data, and to clean outliers and potential data entry errors. If this wasn’t enough of a challenge, over the years the company used two CRM providers, and therefore had inconsistencies in the formats and fields of data collected. Long story short, getting the data ready to be used took us three months. Once we had the data ready, building a model that performed well took roughly two weeks.
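To give you a flavor of what those three months of preparation looked like, here’s a toy sketch of the kind of harmonization work involved. The file names, column names, and outlier rules are hypothetical; the real job took months precisely because none of these mappings were documented.

```python
# A toy sketch of harmonizing call center exports from two CRM systems.
# File names, column names, and the outlier rules are hypothetical.
import pandas as pd

# The two CRMs used different column names and date formats.
old_crm = pd.read_csv("crm_old.csv").rename(
    columns={"cust_id": "customer_id", "call_dt": "call_date", "dur_min": "duration_min"}
)
new_crm = pd.read_csv("crm_new.csv").rename(
    columns={"CustomerID": "customer_id", "CallDate": "call_date", "DurationMinutes": "duration_min"}
)

calls = pd.concat([old_crm, new_crm], ignore_index=True)
calls["call_date"] = pd.to_datetime(calls["call_date"], errors="coerce")

# Drop obvious data entry errors: missing dates, negative or absurdly long calls.
calls = calls.dropna(subset=["call_date", "duration_min"])
calls = calls[(calls["duration_min"] > 0) & (calls["duration_min"] < 8 * 60)]

print(f"{len(calls)} usable calls spanning "
      f"{calls['call_date'].min():%Y} to {calls['call_date'].max():%Y}")
```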
Finding the right model and tweaking it can be surprisingly quick. ML platforms won’t spare you from the initial effort of getting your data ready to be used for an ML application, and thus provide only minor savings in terms of time and up-front costs. However, they do add variable costs as their business model is pay-per-use.
9.1.3 The Build option: Roll up your sleeves
If neither Buy nor Borrow solutions are a good fit for your project, you’re going to have to go the Build route. By building your own model, you gain complete flexibility and control over your technology. In general, building your own model means you need to take care of all three components: data, model, and infrastructure (figure 9.4).
These are the two most common situations for which it’s impossible to find Buy or Borrow solutions:
- You use domain-specific data --For example, if you’re working on a 3-D MRI or other medical diagnostic tools, you’ll work with three-dimensional black-and-white images. It’s unlikely that you’ll ever find a turnkey or ML platform product that can support them.
- You have very specific needs --For instance, you may want to build an ML algorithm that identifies cancer from X-rays of lungs and highlights where the cancer is. This last bit is an object localization task, a relatively niche application that’s covered by few or no providers.
From the point of view of development risk, the Build choice is the riskiest, because your team has full responsibility for the product. Luckily, they’re often not alone, and can rely on two important accelerators of progress: open source models and hosted infrastructure. Let’s see how each can help you build your project.
Open source models are freely available pieces of software (and, often, training data) that solve a specific problem. As long as the license is compatible with your intended use (for example, commercial use), your engineers can simply copy the code, customize it as needed, and use it in your product. If you’re not familiar with the open source world, you might be wondering why people would offer their work for free. Well, the two biggest contributors of open source models are large tech companies and academics. Tech providers like Google and Microsoft release cutting-edge products to strengthen their reputation as AI leaders and attract people to their turnkey products and ML platforms. Researchers in academia publish their software to increase awareness of their work, attracting funding and recognition. For more niche cases, you might not even find any open source code at all. Your best bet then is to get your researchers to dig up relevant scientific papers from the academic community. Often, the description of the model included in the paper is enough for your ML engineers to replicate it with little effort, and then you’re off to collecting training data.
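As a concrete example of how little code it can take to reuse open source work, here’s a minimal sketch that loads a community-published sentiment model through the open source transformers library. Which model gets downloaded by default, and whether its license fits your use case, are exactly the details your engineers would need to check.

```python
# A minimal sketch of reusing an open source model instead of building one:
# the transformers library downloads a community-published sentiment model
# and runs it locally. License and model choice still need to be checked.
from transformers import pipeline

# Downloads a default pretrained sentiment model on first use.
classifier = pipeline("sentiment-analysis")

reviews = [
    "The pizza arrived cold and two hours late.",
    "Best sushi delivery in town, will order again!",
]
for review, result in zip(reviews, classifier(reviews)):
    print(f"{result['label']:>8}  ({result['score']:.2f})  {review}")
```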
Figure 9.4 When building a solution, your team has to take care of all three components: data, model, and infrastructure.
Open source code is usually a great way to jump-start your development efforts. Your team will be able to start from a working solution and iterate on it to adapt it to your specific problem. Compared to ML platforms, the ability to tweak the model lets the ML engineers on your team debug problems and experiment with changes to improve its accuracy.
Generally speaking, if you build your own model, you’ll also need to provide and maintain the infrastructure for it, often by managing your own cloud resources. Hosted infrastructure frees your software engineers from this burden by letting you run your custom models in the cloud of a provider. As long as your model fits the provider’s technical requirements (for example, in terms of programming language), you can use their hardware and networks to run your models. This works because, as we explained in chapter 8, most of the magic in building AI lies in having the right data and creating a good model. The software to manage the execution has quickly been standardized and thus commoditized in the past few years.
Open source code and hosted infrastructure can do a lot to reduce the investment needed to build your own models, saving your team many man-years of work. Most importantly, they can also give you a confidence boost. At the beginning of the project, knowing that somebody else was able to crack the same problem will clear away many doubts about its feasibility. Toward the end of the project, having the option to offload much of the deployment work to a vendor makes the team run faster toward the finish line.
At the end of the day, which scenario (build, borrow, or buy) you land on depends on the specifics of each AI project you’ll work on. The next section will guide you through this decision.
9.2 Using the Lean Strategy
How can you choose which of the Buy, Borrow, and Build strategies is the best fit for your team and the project you’re working on? Is it even possible to find a single strategy that can fit a project at every stage? This section introduces our answer to these questions: the Lean AI Strategy, a framework we developed to help you build your AI products incrementally.
The word lean comes from lean manufacturing, a philosophy born in the Japanese manufacturing industry (particularly in automotive companies like Toyota) that aims to minimize production waste without sacrificing efficiency. Eric Ries adapted the idea to the world of entrepreneurship in his landmark book The Lean Startup (Crown Business, 2011). Ries introduced a process to manage innovation that encourages testing market and technology assumptions through continuous experiments. The core insight behind this strategy is that even CEOs and founders have little information about what the marketplace truly values, so they’d better quickly build a first iteration of the product, see if customers like it, and then improve it based on their feedback.
The Lean AI Strategy has the same goals: it’s a process to incrementally implement AI projects while minimizing risk and resource waste (time, money, and team morale). Following the Lean AI Strategy, you’ll learn to choose the best implementation plan for a given project (build, buy, or borrow).
As you begin the project, the first step is understanding the technological risks associated with each AI development path. Each option comes with a level of complexity that translates into short-term risk, and a degree of flexibility that determines your long-term liabilities. The relationships between these risks and flexibilities are represented in figure 9.5.
Figure 9.5 Points on the Build versus Buy spectrum stack up when it comes to risk and flexibility.
Now, imagine you’re just starting the implementation of an AI project. This is the moment where uncertainty is highest, mainly because of three factors:
- Business risk --Does my solution hold business value?
- Workflow risk --Can I successfully integrate my solution in the current business?
- Technological risk --Will the technology work well enough?
The methodologies introduced in chapter 7 are meant to minimize business and workflow risks, but nothing compares to real-world testing with real people. To put your new product in front of real users, you’ll often need to build at least some technology. The Lean Startup introduced the concept of minimum viable product (MVP): the smallest product you could possibly build while still delivering value to customers. In the world of AI projects, it is useful to think in terms of minimum viable AI (MVAI): the smallest and least risky technology that could possibly power your first MVP.
9.2.1 Starting from Buy solutions
The most important role of the Lean AI Strategy is getting you on the right path to build the MVAI. Given the high degree of uncertainty, it’s a good idea to focus on a short-term strategy and defer long-term decisions to later, after the viability of the idea has been validated by customers. If you look at figure 9.6, this means starting from Buy solutions.
Figure 9.6 The first step when building the Lean AI Strategy for a new AI project is to start scouting for Buy solutions, and follow this decision tree.
The technical vocabulary we taught you in part 1 is enough to start scouting for Buy solutions on Google. For instance, let’s assume you run a food-delivery startup and want to sort food reviews into positive and negative ones. As you recall from chapter 5, this application is called sentiment analysis. A good query to search for Buy solutions is “Sentiment Analysis API,” which will return multiple results. In many cases, you’ll find turnkey products from both large tech providers (such as Google, Amazon, or Microsoft) and smaller specialized companies. In the world of AI, you really should consider both. Large players have the advantage of scale, but small players might have the edge because their data collection and engineering efforts might be more closely aligned to the specific goals of your project.
If you find what you’re looking for among the constantly growing offerings of AI providers, you’re in luck. Adopting one of these is likely to be your best bet for developing the initial iteration of your AI project.
In other cases, you might not find a solution for the exact AI task your project needs. This isn’t so uncommon; after all, providers invest resources into building only those products that have a large potential market. If this is the case for your project, you have two options:
- Reframe your AI project so it fits into an existing Buy offering.
- Move up the technological risk ladder and look for Borrow solutions.
You should consider option 1 first. Is there any way you can rephrase your AI task so you can use what’s on the market without compromising your value proposition? For instance, let’s assume you are a telco company and want to test a super-fast customer support service: users write a message, and the AI automatically routes the customer to the best call center agent; for instance, “My internet isn’t working” would be a technical issue, and “I’d like to activate the new offer” a commercial request. Aside from specialized AI companies with a full-blown product for your specific project, it’s unlikely that you’ll find a turnkey product for this exact task.
On the other hand, you’ll find several very accurate sentiment analysis algorithms on the market today. Although these algorithms wouldn’t solve 100% of the task, what about using them to route angry customers to customer support directly, and neutral customers to manual sorting by a human operator? Obviously, this won’t be the final solution (we sketch it right after the list below), but it allows you to test basic assumptions:
- Business assumptions --Are users willing to engage with customer support through an automatic system, or do they ignore it completely?
- Workflow assumptions --Is the new system a time-saver for operators?
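Here’s a minimal sketch of that reframed routing logic. The sentiment_score function is a stand-in for whichever off-the-shelf sentiment API you picked, and the keyword stub and threshold are illustrative assumptions to be replaced and tuned on real messages.

```python
# A sketch of the reframed routing logic: an off-the-shelf sentiment score
# decides whether a message goes straight to customer support or to a human
# sorter. sentiment_score stands in for the sentiment API you bought; the
# keyword stub and threshold below are illustrative assumptions.

ANGRY_THRESHOLD = -0.4

def sentiment_score(message: str) -> float:
    # Replace this stub with a call to your provider's sentiment API.
    # Here, a crude keyword check just so the sketch runs end to end.
    angry_words = ["isn't working", "broken", "refund", "terrible"]
    return -1.0 if any(word in message.lower() for word in angry_words) else 0.5

def route(message: str) -> str:
    if sentiment_score(message) <= ANGRY_THRESHOLD:
        # Clearly unhappy customers skip the queue and reach support directly.
        return "customer_support"
    # Everyone else goes to manual sorting by a human operator, for now.
    return "manual_sorting"

for msg in ["My internet isn't working!", "I'd like to activate the new offer"]:
    print(route(msg), "<-", msg)
```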
If you can’t find a way to scale down your problem as we did in this example, it’s time to consider switching to Borrow solutions.
You also might consider moving to a Borrow solution when you’ve tried a Buy solution and it didn’t perform well enough for the project. But first, you should reconsider what it means to be “good enough”: even if the AI isn’t perfect and makes obvious mistakes from the point of view of a human, the project might still be a success for the organization. Even if that’s not the case, take a second and pat yourself on the back. Thanks to the Lean AI Strategy and a focus on turnkey models, you’ve discovered important technical information about your project with little to no up-front investment in technology. Next, engage the ML engineers on your team to try to find out why exactly the Buy product doesn’t work. Keep in mind that Buy products are the result of substantial engineering investments, and thus your team can hope to do better under only a single circumstance: lack of problem-specific training data (for example, cucumber classification, or a niche vocabulary in NLP tasks). For example, if you’re finding that a Buy model has bad performance in translating documents from English to French, it’s unlikely that you would be able to outdo Google and improve things by adding more training data. If you determine that the lack of task-specific training data is killing your project, Borrow models can be your way out of the problem.
9.2.2 Moving up to Borrow solutions
Remember that the power of Borrow models comes from allowing you to augment existing models from AI providers with your own training data. We called these customizable products ML platforms. If you do a good job of collecting (and potentially labeling) data that’s a good fit for your project, you’ll get the best of both worlds: you reuse years of investment in ready-made models while achieving good performance on your specific tasks. As a bonus, Borrow models free you from having to design, deploy, and maintain the infrastructure for your project. By using a Borrow solution, you’ll also start to grow the in-house AI expertise that drives your data collection effort, and build up institutional knowledge about building, comparing, and evaluating the performance of various AI models.
In other words, stepping up from Buy to Borrow allows you to iterate on the AI components of the project while getting answers fast, since ML platforms are essentially a drop-in replacement for turnkey models. To find a suitable Borrow solution, start from the search you’ve done for Buy solutions and take it from there. Keep in mind that the options for ML platforms might be even more limited than for turnkey models, as adding the customization points for transfer learning is more work for the AI provider.
If you can’t even find a Borrow solution for your AI project, you have to make the same decision you had to face when you were evaluating Buy options: Can you reframe your solution so it fits what’s on the market? If not, your only option is to move up the technological risk ladder once again and start building your own model, as you can see in figure 9.7.
Figure 9.7 The same decision tree that we designed for Buy solutions works for Borrow ones, with different starting and end points (we start from a failed Buy scouting and end with moving to Build solutions).
9.2.3 Doing things yourself: Build solutions
If you reach this stage of the Lean AI Strategy, there’s not a lot left to decide. Your project doesn’t fit any Buy or Borrow products on the market, and you can’t reframe it to do so. This means that your team members are going to have to roll up their sleeves and start building the model themselves (likely taking advantage of the open source community and hosted infrastructure as described in chapter 8).
Because Build is the “end of the road” for the Lean AI Strategy, we want to comment on each path that might lead your project here. The most typical situation occurs when the data that your project is using is uncommon or very specific. Think about an audio application that plans to classify bird species by their chirp. No ML platform product is likely to support transfer learning from animal sounds, and no turnkey solution is going to offer that either.
In other cases, we have seen teams with a working Borrow solution that want to try building their own model, expecting a cost reduction or an improvement in performance. Although it’s true that building your own model (and managing all aspects of its operations) might have lower variable costs, many teams make the mistake of ignoring the substantial up-front investment that went into creating the Borrow solution they had been enjoying. Perceived performance benefits are often illusory as well: any Borrow model on the market went through extensive testing to verify its accuracy in a variety of scenarios. It’s easy to get excited about positive early results and underestimate the amount of engineering effort needed to bring a Build solution up to scratch. The best way to approach a migration to a Build solution is to treat it like a side project, and not bet the entire AI strategy on it. This will help reduce the risk of the migration through constant monitoring of the in-house effort compared to the Borrow solution.
Besides these technical reasons, sometimes organizational or business issues make Build the only option. For example, both turnkey models and ML platforms generally run within the cloud resources of the AI provider. This factor makes them a no-no for privacy- or security-conscious industries like health or defense, because the potentially sensitive data must be shared with the provider in order for the system to work. Other organizations might decide to invest in creating a first-party IP even if this choice is not the cheapest or fastest way to implement a given project.
We can’t really argue with a strategic point of view like this, but remember that you’ll have plenty of time to develop your own IP after you’ve made a working prototype. Even if in the long run you’ll want to develop your own model, we still encourage you to start by following the Lean AI Strategy and kick off your efforts with Buy solutions. This may be just a stepping stone, but it will still be invaluable to have a quick working prototype that you can use to test your most pressing business assumptions.
As a final word of caution, we warn you against going too deep into the R&D rabbit hole. After you’ve decided that you have to build your own model, it’s easy to go overboard and become more and more ambitious. Experimenting with cutting-edge AI models is exciting and great for humanity, and as a side bonus, your team will be energized and put in its 110%. However, chances are that, if you’re reading this book, you and your organization are not ready to embark on a multiyear R&D project to validate novel uses of AI or entirely new algorithms. Consider holding off on these “moon-shot” projects and picking them up after the AI culture in your organization is more mature.
Figure 9.8 The Lean AI Strategy decision tree for Build solutions
The decision tree for the Build option is represented in figure 9.8.
In conclusion, the Lean AI Strategy encourages you to minimize implementation risk by delaying commitment to expensive and risky engineering tasks. Instead, it encourages you to kick-start the project by using Buy or Borrow solutions that enable you to take advantage of the economies of scale of large providers. As the project and your organization mature, you’ll have plenty of opportunities to embark on technically more ambitious implementation plans.
As you jump between Buy, Borrow, and Build solutions following the Lean AI Strategy, you might have trouble deciding when to pull the trigger and deploy the project. Chapter 2 took a technical view of this aspect and introduced the important measures of false positives and false negatives. The next section completes this discussion by telling you a more high-level story of how performance evolves through the life cycle of a typical AI project.
9.3 Understanding the virtuous cycle of AI
Let’s rewind to the beginning of this book, when we first put ourselves in the shoes of the managers of a real estate website. We worked on a platform where homeowners could list their property for sale, and prospective buyers could look at the offers, schedule a visit, and finally buy one of the homes. We envisioned building a new feature enabling an AI model to automatically and instantaneously predict the best listing price for a home.
Our assumption was that users would like this feature, and that it would give us a competitive advantage compared to other real estate platforms on the market. If we were right, people would start using our website more than those of our competitors, driving increased revenues, more users, and . . . more data.
What happens to ML algorithms when you have more data? Usually, their performance improves. Your product can become even better, and your predictions more accurate, making users happier and thus attracting more users. The result, again, is even more data.
What we just described is the virtuous cycle of AI: if AI improves your product, bringing more users and more utilization, you’ll generate more data that you can use to feed back into the model, further improving it. It’s a self-reinforcing loop that, potentially, never ends. You can visualize this loop in figure 9.9.
Figure 9.9 The AI virtuous cycle. As AI improves your product, more users want to use it. This leads to more data being generated, which improves your AI, which further improves your product, bringing even more users in.
Let’s see how the virtuous cycle of AI influences real-world products, using our real estate platform example. Before we got started with the first AI project, all we had was a standard website, probably similar to offerings by competitors in the same space. The first AI product we decided to build was a home-price predictor: an ML algorithm that would learn from past property sales and automatically compute the most likely price for each property on the market.
This novel and curious feature generated press buzz and created a new advantage for our platform that differentiated us from the competition. Reasonably, we imagined that new customers would start to use our platform, attracted by our new house-price predictor. New customers don’t just bring additional revenue, but also new data. When these new customers start selling their homes, the additional data we collect can be used to retrain our algorithm, improving its performance. In turn, an even better algorithm can bring us even more data, which brings us even more customers, and so on, as you can see in figure 9.10. The virtuous cycle of AI can carry on forever, or until you start getting diminishing returns from your algorithms and reach a technology plateau.
Figure 9.10 The AI virtuous cycle applied to the real estate example
You may ask, What if I don’t have a lot of data, and my initial model can’t match the standards of quality of my current product? We argue that in this situation, the virtuous cycle of AI is particularly important, and you should ship whatever brings some value to your customers, even if it’s far from perfect.
To understand why, let’s cover some basic concepts of how technological innovation is adopted and marketed. In 1962, communication studies professor Everett Rogers published Diffusion of Innovations, in which he presented a sociological model that describes the adoption or acceptance of a new product or innovation. The main insight is that a market is not made of one homogeneous group of people; it’s divided instead into groups of individuals who share similar characteristics:
- Innovators --These risk-takers have high social status, are social, and have financial liquidity. These are the people who line up in front of an Apple store the day before the release of a new iPhone.
- Early adopters --They are also risk-takers, are social, and have high purchasing power. Their choices are more conservative than innovators, so they may not line up the day before a new iPhone release, but you can be sure that they’ll buy it the next day.
- Early majority --This is a large group of people with a mixed social status and purchasing power, who adopt an innovation after it has been released for some time. They value the improvement that technological advancements bring to their life but don’t want to take risks. These are the people who either wait months after the release of a new phone to see what other people think, or can barely tell the difference from the older model.
- Late majority --Another large group of people, mainly skeptical and risk-averse. They value safety more than innovation, and don’t want to switch to a new technology until they absolutely have no doubt about the return they’ll get. These are the people with smartphones that are two to three generations behind.
- Laggards --The last ones to adopt an innovation. These are the people who still use a mechanical keyboard phone, and replace it with the cheapest model on sale when it breaks.
In 1991, Geoffrey A. Moore published Crossing the Chasm: Marketing and Selling High-Tech Products to Mainstream Customers, which built on the preceding theory and added a simple yet crucial insight: when introducing a technological innovation, passing from early adopters to early majority is not trivial, and many innovations fail at doing so. Moore introduced the concept of a chasm: a tough period of uncertainty when a company has reached all the innovators and early adopters in its market and needs to reach the early majority, as you can see in figure 9.11. This chasm exists because the early market (innovators and early adopters) and the late market (early majority, late majority, and laggards) are driven by completely different goals. Whereas the former value innovation and the worth that it brings, the latter value safety and are risk-averse, and it’s hard to make them see the worth of your innovative product.
Figure 9.11 The market for a technological innovation divided into its five customer groups. After the early adopters, we reach a chasm: a moment of high uncertainty when switching from early to mainstream users.
Now, let’s go back to our real estate platform and see how our ML-based innovation relates to the five groups we’ve introduced. Imagine that your first algorithm is worse than a human broker: let’s say that its predictions have a 10% margin of error, while human brokers have only a 5% error rate. Hurried homeowners won’t be concerned about the increased error rate, and are more interested in getting a number right away without waiting for a visit by a human broker. Or maybe they’re technology enthusiasts rushing to try the new AI-based features that you’re offering. They’re the innovators and early adopters who will kick-start the virtuous cycle.
After these people start using your product, you’ll start getting more data. You can use this new data to train your AI algorithms again. Let’s assume that the performance of your algorithm improves and gets closer to a flesh-and-blood broker, say with a 7% margin of error. Now that you started closing the accuracy gap with human brokers, even less-brave users will choose to trade off that 2% accuracy for increased speed. The result, again, will be more data that you can use to keep improving the model.
Now imagine that thanks to the new data you collected through the primitive ML model, you finally reach the crucial milestone of human-machine parity: your algorithms are as good as an expert broker. Using your product is a no-brainer at this stage; users get the same performance as an expert broker, but it takes just a second to get it, without having to schedule visits. This is the level of performance needed to attract the early majority: they’re OK with using new technology as long as it doesn’t come with any downside compared to the status quo.
As in any other business venture, it’s crucial that you’re able to correctly communicate the performance of your technology to people, and they’re able to understand it. This isn’t trivial, and you shouldn’t take it for granted.
If you’re successful at communicating the potential of your technology, home sellers will want to use your model to have an instant quote for their home. Can you guess what the result would be? You’re correct: even more data. Now, your algorithms can benefit from this additional data to even surpass human accuracy. Congrats, you now have a superhuman AI home-price predictor. Figure 9.12 shows the self-reinforcing relationship between technology performance and adoption.
If you had not shipped the first primitive product, you would never have kicked off the process that led you to a superhuman AI price predictor. For this reason, it’s important to account for the AI virtuous cycle and ship AI products even if they’re not perfect. Ship early, and design your strategy so that each product iteration contributes to improving your algorithms, until your advantage is so big you can’t be ignored.
If you’re confident that acquiring more data is the key to allowing you to gain an advantage over your competition, you can consider unorthodox ways to get those first early adopters. An extreme strategy can be to heavily discount your new service or even give it away for free so you can quickly attract early adopters. Kicking off the AI virtuous cycle is so important that you may even plan to pay people to start using your product. You’ll make up for these initial losses after the AI virtuous cycle kicks in, your technology starts improving rapidly, and the mainstream segment starts using your new service and paying for the value it brings them.
9.4 Managing AI projects
The Lean AI Strategy helps you decide whether to build, borrow, or buy a technological solution. No matter the option, you’ll still have some development work to do. For example, even if you adopt a Borrow solution for the home-price prediction problem, the team will still have to write the plumbing code to connect the outputs of the model into the web application.
One of the worst-kept secrets in the industry is that managing engineers is like herding cats. This section is going to give you some inputs to lead the day-to-day operations during the implementation phase of an AI project. Even if you don’t want to take on this responsibility personally, or decide to outsource the whole effort to consultants, it’s still useful to know what modern software engineering practices look like.
The lean philosophy also extends to the implementation stage: it’s best to break development into chunks that an individual team member can carry out in one or two weeks. Such a rapid iteration pace makes it possible to keep up with changing requirements from the organization, and even feedback from customers and users of the product.
Figure 9.12 Because of the power of data, the performance of an ML-based product depends on market adoption. Early users may be willing to compromise on performance, but as they use the product and data increases, you’ll improve your technology until you can reach the mainstream market.
Lean also means shipping software as soon as possible--in many cases, as soon as an MVP is ready. We have seen many teams endlessly strive for perfection, thus delaying the crucial moment when they can get meaningful information from their users.
The lean approach gives you a conceptual framework that emphasizes quick iteration, continuous improvement, and attention to feedback. But how does this work out in practice throughout the daily activities of your AI task force? This book is not about project management, but we feel it’s useful for you to gain some knowledge about how the sausage is made.
A practice that aligns well with the tenets of the Lean AI Strategy and is used by most Silicon Valley companies today is Scrum, a flavor of Agile development. Think of Scrum as a practical rule book that helps teams put the concepts of the Lean AI Strategy into practice. Quick iteration is achieved by timeboxing work in periods of two weeks, called sprints. Progress toward the overall goal is decomposed in terms of stories, which represent self-contained items of work that deliver a complete feature to the product. At the beginning of each sprint, the team decides which stories they’re going to be working on during the next two weeks.
Because tasks are decided and allocated every two weeks, there’s plenty of opportunity to adapt to changing requirements, or even to failed attempts at implementing models or a data collection strategy. In many fields of AI, it’s not uncommon to spend a few weeks developing a model, cleaning up the data, and doing the training, only to find out that the approach doesn’t perform as well as expected. In those cases, keeping a quick iteration pace is critical to ensure that the team will eventually converge on a solution.
Lean is a general concept that works great for many types of software engineering projects, especially when technical uncertainty or changing requirements are involved. This also applies to AI projects, where in many cases you won’t know how well your model performs until you’ve tried it on your task. However, for some aspects, AI changes the rules of the game, and this section tells you how. We have decided to focus on the two main aspects that are guilty of pushing AI projects off track:
- The chicken-and-egg dependency between data collection and model development
- The fact that AI projects are never really “done”
Let’s talk about the first. Data collection (and cleaning) and model development are often carried out by different people within your team, just because the skill sets required are different. This creates a kind of chicken-and-egg problem: the ML engineers can’t experiment until they have data on hand, and the software engineers collecting that data depend on the ML engineers’ feedback to know what to collect and improve.
Following the principles of incremental iteration, a good way to break this vicious cycle is to timebox multiple cycles of data collection and model development. As soon as the software engineers have completed even a minimal portion of the training dataset, they can hand it off to the ML engineers, who can start developing and trying models. In the meantime, the software engineers can keep improving and extending their data collection code, while receiving feedback from the ML people about the most important areas for improvement. The process starts over at the next cycle, giving you a good way to measure progress over time.
Let’s cover the second problem now: AI projects are never really “done.” If you’re working on conventional software projects, clear requirements help you figure out when the product is “done.” If you’re leading a team that’s making a home alarm system, you definitely need to make sure that opening a window will trigger the siren. As an AI evangelist, you don’t have this luxury. This is because modern AI is based on machine learning, and thus will never reach the 100% perfection that we associate with conventional engineering.
In many projects, it’s tricky to tie an organization-wide metric, like top-line growth, to the accuracy of the model. You’ll know you’re in one of those situations when you struggle to tell your team what level of performance you require out of the model. Once again, incremental iteration is the key to successfully get you out of these situations. The earlier you get a working prototype of the project out the door, the sooner you can get the organization to use it, and thus see how the accuracy of the model correlates with improvement in the organization (or happiness of the users). A great example that we have already discussed is churn prediction, as we can immediately see how improvements to the model increase retention. Even low-accuracy prototypes can be a great win for the organization and will undoubtedly help you get your message across.
We made sure to include several examples of failed AI projects and misunderstood technical points throughout the book. However, being the conscientious writers that we are, we couldn’t end this chapter without a more complete discussion about what to do when things start to go awry.
9.5 When AI fails
Working in technology has taught us that if everything was working as advertised, many people would be out of a job. Building and maintaining software is hard, and AI is no exception. If anything, AI adds more challenges of its own. This entire section is devoted to stories of AI project failures. We have two good reasons for doing this:
- Failures teach lessons that success stories can’t.
- They let you see how the strategies we introduced in this book could have changed the outcome.
The goal is not to scare you away from tackling your projects, but rather to discover how some of the strategies we introduced in this book might have saved the protagonists of these stories. We decided to leave them for the end of part 2 so you can see how everything falls into place.
9.5.1 Anki
The first example is about Anki, a robotics startup that shut down in 2019 after receiving more than $200 million in funding. Back in 2013, Anki was so promising that it enjoyed the rare privilege of being invited onstage at the yearly Apple keynote. There, Apple CEO Tim Cook told thousands of technology enthusiasts that the budding company was going to “bring AI and robotics into our daily lives.”
Fast-forward six years, and a company spokesperson told Recode this:
Despite our past successes, we pursued every financial avenue to fund our future product development and expand on our platforms. We were left without significant funding to support a hardware and software business and bridge to our long-term product roadmap.
As the company was shutting down, its most advanced product on the market was Vector, a $250 robotic toy that pioneered Anki’s flavor of “emotional intelligence,” the ability to perceive the environment and express emotions to nearby humans. And yet, Anki was already advertising that it had more ambitious plans in the works, going as far as humanoid maids.
We don’t want to indulge in armchair critique, but we do want to point out the potential imbalance we saw between short-term wins and long-term strategy. In this book, we’ve emphasized the importance of running timeboxed AI projects with quick returns for the organization. Instead, Anki was diverting resources from product development into “tomorrow’s AI,” the master plan for world domination that never actually came true. The AI-based vision of the “everyday robot” was in place, but it had never been broken into smaller bits that could be implemented over time.
9.5.2 Lighthouse AI
Another failure story comes from Lighthouse AI, a company that raised $17 million in funding to build an AI-powered home-security camera. The camera used AI to extract useful information from recorded video, and allowed you to ask about what happened while you were away, in natural language. For instance, you could ask, “What time did the kids get home yesterday?” and get video footage as an answer.
Sounds useful, right? The market didn’t agree. As CEO Alex Teichman wrote on the company’s website:
I am incredibly proud of the groundbreaking work the Lighthouse team accomplished--delivering useful and accessible intelligence for our homes via advanced AI and 3-D sensing. . . . Unfortunately, we did not achieve the commercial success we were looking for and will be shutting down operations in the near future.
Although the product was definitely impressive from a technical standpoint, it looks like the company didn’t do enough market research and experimentation before jumping into the venture. You can use different strategies to test market interest before committing millions of dollars to R&D; for instance, running a crowdfunding campaign or collecting preorders. It’s hard to say whether this strategy would have been a definitive fix, but you surely can learn from this example that even the coolest technology won’t be enough to create a market. Instead, identify the biggest business threats to your project (first of all, people not bothering to buy it) and design creative strategies to manage them.
9.5.3 IBM Watson for Oncology
Now let’s talk about another grandiose case of AI failure: the debacle of IBM Watson for Oncology. In 2013, IBM issued a press release touting its involvement with one of the leading medical centers in the world:
The University of Texas MD Anderson Cancer Center and IBM today announced that MD Anderson is using the IBM Watson cognitive computing system for its mission to eradicate cancer.
Just five short years later, STAT (a news outlet focused on the health industry) reviewed IBM’s internal documents about the MD Anderson project and shared a quote from one of the doctors who took part in the pilot:
This product is a piece of s**t. We bought it for marketing and with hopes that you would achieve the vision. We can’t use it for most cases.
What happened here? This time, market demand was definitely strong: everyone on earth wants to see cancer eradicated. The first issue we see with this case is an unjustified focus on long-term AI vision. “Eradicating cancer” is as desirable as it is unlikely to be solved single-handedly by one tech company just because it closed a deal with a hospital.
On top of this, you’ll be able to spot a whole series of technical mistakes as we shed some light on the inner workings of IBM Watson for Oncology. The medical technology was built on top of the system IBM had developed to play the US game show Jeopardy, which attracted attention after beating two human champions in 2011.
Jeopardy is a quiz show in which participants receive answers that they need to guess the questions for (basically, a reverse quiz). How did IBM transform the Jeopardy model to help oncologists? It started by finding suitable datasets. One of them was a set of 5,000 medical questions from the American College of Physicians (ACP). The following are examples of question/answer pairs in the dataset:
Q: The colorectal cancer screening test associated with highest patient adherence
A: Fecal immunochemical testing
Q: Non-Hodgkin lymphoma characterized by extranodal involvement and overexpression of cyclin D1
A: Mantle cell lymphoma
Q: Mechanism of acute varicocele associated with renal carcinoma
A: Obstruction of the testicular vein
Basically, IBM Watson was using text data as both the input and the output of the model. Since you learned in chapter 4 that text data is still the trickiest kind of data for AI, let’s see what specific issues IBM had to face when transferring its technology into hospitals. At MD Anderson, both “acute lymphoblastic leukemia,” a kind of blood cell cancer, and “allergy” were often referred to with the acronym “ALL”. Obviously, cancer and allergies are two very different medical conditions, but Watson couldn’t distinguish between the two because they used the same acronym.
Another issue is that Watson struggled to consider multiple aspects of a patient, leading to potential disasters. For example, it suggested that a 65-year-old man with diagnosed lung cancer and evidence of severe bleeding should be treated with a combination of chemotherapy and a drug called bevacizumab, which can cause “severe or fatal hemorrhage” and shouldn’t be administered to patients experiencing serious bleeding.
What can we learn from this? We already knew some of the things that emerge from this story: text data is hard to deal with, especially when you’re working in an extremely complicated context. In the width-depth framework we introduced to describe the complexity of an AI application (chapter 4), Watson would score extremely high in both width and depth. Width is high because the model needs to understand a vast spectrum of words coming from different branches of the medical field, plus all the potential acronyms and abbreviations. Depth is also extremely high, as we’re asking the algorithm to come up with a sentence that describes a potentially complex combination of drugs.
The challenges in working with this kind of data were recognized by Dr. Amy Abernethy, chief medical officer at Flatiron Health and former director of cancer research at the Duke Cancer Institute. In a 2017 interview for a paper published in the Journal of the National Cancer Institute, she stated, “The MD Anderson experience is telling us that solving data quality problems in unstructured data is a much bigger challenge for artificial intelligence than was first anticipated.”
One way to simplify things is to reduce depth by posing the problem as a classification task dealing with structured data. Another is to reduce width by limiting the application to a specific kind of cancer. By applying both measures, instead of feeding the algorithm wordy sentences filled with medical jargon, IBM would have fed it a table with patient information expressed as values (for example, “blood pressure = 110 mm Hg”). The algorithm would have responded by choosing one out of a number of suggested therapies known in advance. A project like this is certainly less cool than an all-knowing AI that responds in natural language, but we argue that it’s better to have something simple that works than something complex that doesn’t.
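To make the simpler framing concrete, here’s a toy sketch of what classification on structured data means in this context: a table of patient values goes in, and one therapy from a fixed list comes out. Every feature, label, and number here is synthetic and purely illustrative; it carries no clinical meaning.

```python
# A toy sketch of the simpler framing: structured patient values in,
# one therapy chosen from a fixed list out. All data here is synthetic
# and purely illustrative; it has no clinical meaning whatsoever.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n = 500

# Structured inputs: one row per patient, one column per measurement.
X = np.column_stack([
    rng.integers(40, 90, n),   # age
    rng.normal(120, 15, n),    # blood pressure (mm Hg)
    rng.integers(0, 2, n),     # severe bleeding? (0/1)
])

therapies = ["therapy_a", "therapy_b", "therapy_c"]
# Synthetic labels standing in for the therapy chosen by expert oncologists.
y = rng.integers(0, len(therapies), n)

model = RandomForestClassifier(random_state=0).fit(X, y)

patient = [[72, 110, 1]]  # age 72, blood pressure 110 mm Hg, severe bleeding
print("Suggested therapy:", therapies[model.predict(patient)[0]])
```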
9.5.4 Emotional diary
Finally, let’s cover two examples from our consulting work that you won’t find in the news. We changed some details to protect the privacy of our clients and respect our confidentiality agreements, but the lessons are still valid.
The first example is actually more of an averted disaster. A health-care company wanted to build an app to give emotional support to couples undergoing fertility treatment. Not only are these couples stressed out because they have difficulty conceiving, but the woman is also prescribed hormones that can have a strong impact on her mood.
The original idea was to build an “emotional diary” app for the couple. Every day, it would ping them to upload a selfie with a smile or a sad face. An algorithm would recognize their mood and answer with an appropriate motivating sentence. At the end of the therapy, the app would have created a “diary” of the emotional journey that the couple went through, hopefully ending with the beautiful gift of a child.
The company had already started to collect quotes from various technological providers and had come up with a marketing plan to promote the app across treatment centers. One morning, we discovered that Microsoft offers an “emotional detection API”: a simple service that allows you to send a selfie to Microsoft’s servers, which run an AI algorithm and return their evaluation of the emotion expressed by the person. We asked to pause the strategy for a day. The day after, we came back with a prototype app linked to Microsoft’s service, which already included a few sentences to encourage the couple.
We took our bare-bones prototype and went to a fertility center to talk to some users. The verdict was unanimous: the women wanted to throw that phone at a wall. What we thought would be a supportive tool was instead seen as a disrespectful invasion of their privacy and intimacy. They were already stressed and doubtful they would ever become mothers; the last thing they wanted was an app that wanted selfies to send cheeky motivational quotes.
We canceled the program and went back to the drawing board. Fast prototyping and the use of APIs allowed us to find out early in the process that we were about to go full speed right into a wall.
9.5.5 Angry phone calls
Another example is a company that needed an AI algorithm that could identify emotions from voice recordings. During a phone call, the app should indicate when either speaker is angry, sad, happy, or neutral and provide suggestions to deal with the negative feelings. Our client came to us claiming that they had already found a reliable technology provider, and we dove headfirst into execution without questioning the technology. Together, we drafted a comprehensive strategy and a business plan, and designed a minimum viable product (MVP) that integrated the third-party technology. Once we shipped the MVP, we found out that the performance of the tech provider was nowhere close to what they claimed (and what we expected).
Apparently, the software was designed around a dataset of German speakers. We needed to apply it to the Italian market, and it turns out that an angry German and an angry Italian sound quite different. After acknowledging that the vendor was underperforming, we dug into the scientific literature to learn about the state-of-the-art performance in this task. We found out that even the state of the art was far away from the 90%-plus accuracy achieved in other AI tasks. Lesson learned? Test, test, test. Always question and validate what you read. Even if a technology provider claims to have a great product, spend one or two days to research the state of the art and to test whether it can reach the performance you need.
9.5.6 Underperforming sales
In another example, a large corporation asked us to cluster its sales data to spot underperforming customer accounts. The request came from the IT department, which managed large amounts of data and wanted to prove the value of that data collection effort. We ran our analysis and found a bunch of underperforming stores through which the company could have made $8.5 million more per year by changing the sales strategy! Everyone was excited, so we crafted a beautiful presentation for the CEO. He interrupted us after just a few minutes saying: “Guys, you’re not telling me anything new. We made a special deal 10 years ago with these stores you call underperforming; I can tell you that the profit is very high with them. You’re looking at the wrong metrics.”
It turned out that the stores in question had custom volume discounts that were booked differently, and therefore didn’t show up in the database we used. What do we learn from this story? Always include the business perspective when designing an AI project. Don’t fall into the trap of designing AI projects with a tech-first approach: make sure the business value is there before thinking about algorithms.
You probably noticed that none of the solutions to these struggling projects was to invest in better technology. The hero they needed is not a super-skilled data scientist, but an enlightened leader who understands the principles we covered in this book and has the domain knowledge, critical spirit, and drive to bring AI into an organization.
Congratulations! You’ve (almost) made it to the end of the book! Part 2 covered a lot of material with real-life situations and stories that will help you find and complete impactful AI projects in your organization. Together with the groundwork that we laid down in part 1, you’re now ready to think critically about how AI can benefit your organization. Take a deep breath and get ready for the last chapter, where we take a step back and discuss how the AI tools you’ve been learning about will shape society in the future.
Summary
- When building an AI project, you must decide whether to build technology, buy it from tech providers, or borrow it by using a mixture of third-party technology and your data.
- The Lean AI Strategy helps you minimize implementation risk by guiding you in the build/buy/borrow decision-making process.
- The AI virtuous cycle allows you to continuously improve your AI models by exploiting the new data you’re collecting with your product.
- AI isn’t a silver bullet. Countless companies have failed at AI, often because of poor strategy or lack of market interest in their AI-based products.