Four steps enterprises can take to avoid their next earnings call becoming a retrospective on AI gone awry
AI can power phenomenal revenue growth – until it doesn’t. That lesson is being learned the hard way at a growing number of companies where issues with AI systems are not caught and remedied before materially impacting revenue.
The latest example is Unity Software, a platform for creating and operating interactive and real-time 3D (RT3D) content. On its most recent earnings call, Unity revealed that it missed top line expectations and lowered its revenue guidance for the rest of the year due in part to a “self-inflicted wound” in AI.
Specifically, the company’s CEO and Executive Chairman John Riccitiello cited several issues related to machine learning (ML) models that caused an estimated impact to the business of approximately $110 million in 2022:
- Performance Degradation: The first problem “was a fault in our platform that resulted in reduced accuracy for our Audience Pinpointer tool, a revenue expensive issue given that our Pinpointer tool experienced significant growth” in the wake of privacy changes by Apple. For context, Audience Pinpointer is an ML-powered ad targeting tool that leverages Unity’s first party data to help marketers better reach specific audiences.
- Data Quality Issues: The company also “lost the value of a portion of…training data due in part to us ingesting bad data from a large customer.”
- The Need for Robust, Real-Time Model Monitoring: As part of a solution, the company is “deploying monitoring, alerting and recovery systems and processes to promptly mitigate future complex data issues.”
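To make the data quality point concrete, here is a minimal sketch of a gate that checks an incoming batch before it reaches training data. It is not Unity's system; the function name `check_batch`, the `BatchReport` type, and the thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class BatchReport:
    null_rate: float          # share of missing values in the batch
    out_of_range_rate: float  # share of present values outside [lo, hi]
    ok: bool                  # True if the batch passes both checks


def check_batch(
    values: Sequence[Optional[float]],
    lo: float,
    hi: float,
    max_null_rate: float = 0.05,          # illustrative threshold (assumption)
    max_out_of_range_rate: float = 0.01,  # illustrative threshold (assumption)
) -> BatchReport:
    """Flag a batch of one numeric feature that looks unlike what is expected."""
    n = len(values)
    if n == 0:
        # An empty batch is treated as a failure rather than silently accepted.
        return BatchReport(null_rate=1.0, out_of_range_rate=0.0, ok=False)

    nulls = sum(1 for v in values if v is None)
    present = n - nulls
    out_of_range = sum(1 for v in values if v is not None and not (lo <= v <= hi))

    null_rate = nulls / n
    out_of_range_rate = out_of_range / present if present else 0.0
    ok = null_rate <= max_null_rate and out_of_range_rate <= max_out_of_range_rate
    return BatchReport(null_rate, out_of_range_rate, ok)


# Example: a batch from one customer with a missing value and an impossible value.
report = check_batch([0.1, None, 5.0, 0.2], lo=0.0, hi=1.0)
if not report.ok:
    print(f"Quarantine batch: {report}")
```

A failing report would typically trigger an alert and hold the batch out of the training set until someone reviews it, instead of letting it flow through silently.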
A Call For Unity
When AI fails on the public stage like this, the temptation to pile onto whatever company is on the chopping block is sometimes irresistible (see: Zillow). Data scientists and machine learning engineers reflexively talk about why company X’s approach to infrastructure or building models is subpar. Vendors publish “I told you so” pieces arguing their platform could have single-handedly prevented the problem. And journalists dissect every way the company went wrong. While this cycle is mostly healthy – an industry examining itself and learning valuable lessons – it sometimes glosses over the bigger story.
Here’s the truth: this could happen to hundreds of companies. It isn’t just about Unity or Zillow or any other company in the spotlight; problems with models are likely lurking undetected across every industry, waiting to be uncovered. A growing number of enterprises are even disclosing as much in their annual reports. According to a recent paper, 47 companies – around one in ten (9.4%) of the Fortune 500 – cite AI and machine learning as a risk factor in their most recent annual financial reports, an increase of 20.5% year-over-year. This probably understates the risk, given the large number of companies leveraging AI in production.
The good news is that there are best practices for preventing common issues with ML before they materially impact revenue. Here are four steps every company can take to better manage AI risk from an organizational and technical perspective.
1) Know What Can (and Will) Go Wrong
When it comes to deployed AI, it is a matter of when – not if – models will encounter issues in production. Unlike the largely rules-based system of software development, successful outcomes in machine learning are dependent on not just system health but also the various complexities of models and underlying data layers. Concept and feature drift, training-production skew, cascading model failures, data pipeline issues, and outliers challenge even the most sophisticated machine learning teams deploying models that perform flawlessly in training.
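One of the failure modes above, feature drift, can be illustrated with a short sketch. The example below computes the population stability index (PSI), one common drift measure among several, between a training sample and a production sample of a single numeric feature. The function name `psi`, the bin count, and the smoothing constant are assumptions made for this illustration.

```python
import math
from typing import List, Sequence


def psi(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """Population stability index between a baseline and a production sample.

    Bins are equal-width over the baseline's range; production values outside
    that range are clamped into the first or last bin.
    """
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def bucket_fractions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # eps keeps empty buckets from producing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e = bucket_fractions(expected)
    a = bucket_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Example: production values shifted well away from the training distribution.
training = [i / 100 for i in range(1000)]
production = [x + 5 for x in training]
print(f"PSI = {psi(training, production):.2f}")
```

A widely used rule of thumb treats a PSI above roughly 0.25 as a significant shift worth investigating, though the right threshold depends on the feature and the model.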
2) Implement ML Observability
Of course, knowing there is a problem is only half the battle; teams also need to figure out why. Some 84.3% of data scientists and ML engineers cite the time it takes to detect and fix issues with their machine learning models as a pain point today, with over one in four saying it takes them a week or more. Full-stack machine learning observability with ML performance tracing can help close this gap by helping teams automatically pinpoint the source of model performance problems. Leveraging a platform like Arize, teams can automatically surface the cohorts where the impact on performance or drift is highest and adjust accordingly.
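The idea of surfacing the worst-performing cohorts can be sketched in a few lines. This is a simplified illustration, not the API of Arize or any other platform; the function name `accuracy_by_cohort` and the record fields (`prediction`, `label`, `region`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple


def accuracy_by_cohort(
    records: Iterable[Dict[str, object]],
    cohort_key: str,
) -> List[Tuple[Hashable, float, int]]:
    """Return (cohort, accuracy, count) rows sorted from worst to best accuracy."""
    totals = defaultdict(lambda: [0, 0])  # cohort -> [correct, total]
    for r in records:
        bucket = totals[r[cohort_key]]
        bucket[0] += int(r["prediction"] == r["label"])
        bucket[1] += 1
    rows = [(cohort, correct / total, total) for cohort, (correct, total) in totals.items()]
    return sorted(rows, key=lambda row: row[1])


# Example: predictions logged in production, sliced by a hypothetical "region" field.
logged = [
    {"region": "us", "prediction": 1, "label": 1},
    {"region": "us", "prediction": 0, "label": 0},
    {"region": "eu", "prediction": 1, "label": 0},
    {"region": "eu", "prediction": 1, "label": 1},
    {"region": "eu", "prediction": 0, "label": 1},
]
for cohort, acc, n in accuracy_by_cohort(logged, "region"):
    print(f"{cohort}: accuracy={acc:.2f} over {n} predictions")
```

Sorting from worst to best puts the cohorts most in need of attention at the top. In practice the same slicing would be repeated across many features and combined with drift measures.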
3) Invest In the Right People
In its most recent annual financial report, Unity Software flagged the company’s ability to compete for talent as a potential risk factor, noting that “competition is intense” for both “engineers experienced in designing and developing cloud-based platform products” and “data scientists with experience in machine learning and artificial intelligence.”
While this is an industry-wide problem that eludes easy answers, companies can avoid unforced errors. One common mistake that companies make when investing in AI is hiring too many data scientists and not enough machine learning engineers. While data scientists are essential for building an organization’s first models, they often lack the technical skills for getting a model into production and maintaining it once there. Machine learning engineers specialize in such tasks and can help ensure an organization has an ML stack to scale AI efforts before the first model is ever deployed.
4) Ensure ML Teams Are Closer to the Businesses They Serve
Great things happen when model builders and product owners are in alignment; the opposite is also true. If businesses add centralized machine learning teams or AI centers of excellence to their technical organizations, one thing that should not get lost in the process but often does (more on this in a future piece) is tight collaboration with internal clients. Technology can help. Leveraging ML observability, data scientists and ML engineers can tie model metrics to business results – sharing the results with product teams and business executives.
While the public spotlight can be unforgiving and at times unfair, Unity Software deserves a lot of credit for its transparency and for pointing a way forward for the whole industry. AI risk is much bigger than just one company, and leveraging tools like ML observability to better detect and troubleshoot problems as they arise should be table stakes.