The AI Plateau Is Real — How We Jump To The Next Breakthrough

Imagine asking an LLM for advice on making the perfect pizza, only to have it suggest using glue to help your cheese stick — or watching it fumble basic arithmetic that wouldn’t trip up your average middle school student. Such are the limitations and quirks of generative AI models today.

History tells us that technological advancement never happens in a straight line. An almost undetectable buildup of knowledge and craft meets a spark, ignites an explosion of innovation, and eventually reaches a plateau. These innovations, across centuries, share a common pattern: the S-Curve. For example:

  • TCP/IP synthesized several innovations dating to the 1960s. After the initial 1973 release, development significantly accelerated, eventually stabilizing with v4 in 1981, still in use on most of the Internet today.
  • During the Browser Wars of the late 90s, browser technology experienced significant improvements. A passive terminal became a fast, interactive, and fully programmable platform. Transformation among browsers since then is incremental by comparison.
  • The launch of the App Store led to an explosion of innovation in mobile apps in the early 2010s. Today, novel mobile products are few and far between.

The AI Plateau

We have just witnessed this exact pattern occur in the AI revolution. In his 1950 paper, Alan Turing was one of the first computing pioneers to explore how a thinking machine might actually be built.

Over seventy years later, OpenAI seized on decades of sporadic progress to create a Large Language Model that arguably passes the Turing Test, answering questions in a way that is indistinguishable from a human (albeit still far from perfect).

When ChatGPT was first released in November 2022, it created a global shockwave. For a while, each subsequent release, along with model releases from other companies like Anthropic, Google, and Meta, offered drastic improvements.

These days, each new LLM release offers only incremental progress. Consider this chart of performance increases across OpenAI’s flagship models:

MMLU is a benchmark designed to measure knowledge across 57 subjects spanning STEM, the humanities, the social sciences, and more

Although every benchmarking system has shortcomings, clearly the pace of change is no longer setting the world on fire. What’s needed now, and what we hope is coming, is the jump to the next S-Curve.

We believe we understand what’s caused AI to plateau, and what is needed to make the next jump: Access to the next frontier of data.

The Next Curve: Proprietary Business Data

Today’s large language models are trained on public data from the internet. But the internet’s public textual training data has long since been harvested (think GitHub, Reddit, WordPress, and other public web pages). This has forced AI companies to scavenge other sources. OpenAI, for example, used its Whisper speech-to-text model to transcribe a million hours of YouTube videos as training data for GPT-4. Yet another tactic is to employ low-cost, offshored human “labelers” through services like Scale AI.

Model providers can keep following this path (there are an estimated 150 million hours of YouTube videos, after all), but they won’t escape the flattening S-Curve this way. Improvements are likely to be marginal; returns, declining. Synthetic data is another pathway, but it has its own limitations and weaknesses.

We believe the real breakthrough that will allow humanity to jump to the next S-Curve is data produced at work. Workplace data is of far higher quality than what’s left of public data for training purposes, especially compared to running the dregs of the internet through the transformer mill. (Those dregs may be why so much AI-generated content is already being dismissed as “slop.”)

A product spec, a sales deck, or a medical study produced in a work context is many times more valuable than an unverified Wikipedia page or Reddit post. Even better is when this work comes from an expert at the top of their field.

Startups that unlock the world’s business data will be poised to create multiples more value. As a proxy, we compared the average revenue per user (ARPU) of top consumer apps against the per-seat price of select B2B apps. Even the most “consumer-oriented” business apps, such as Notion, still earn much more revenue per user than consumer tech companies.

The math is simple. The value proposition of AI for B2B is vast, and, we believe, still largely untapped.
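To make that math concrete, here is a minimal sketch with purely hypothetical numbers (not the actual figures from our comparison): annualize a per-seat B2B price and divide by a consumer app’s yearly ARPU.

```python
# Illustrative only: both figures below are hypothetical, not our chart's data.
consumer_arpu_per_year = 40.0     # e.g., a large consumer app ($/user/year)
b2b_price_per_seat_month = 10.0   # e.g., a per-seat B2B subscription ($/month)

b2b_arpu_per_year = b2b_price_per_seat_month * 12  # $120/user/year
ratio = b2b_arpu_per_year / consumer_arpu_per_year
print(f"B2B earns {ratio:.0f}x more revenue per user")  # -> 3x in this example
```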

Meanwhile, knowledge workers continuously produce business data at an incredible cadence:

  • In 2020, Zoom captured 3.3 trillion meeting minutes (55 billion hours) of dynamic human engagement. This dwarfs the estimated 150 million hours of total YouTube content; a quick unit check follows this list.
  • Ironclad processes more than 1 billion documents annually.
  • Slack delivers over a billion messages per week.
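As a quick sanity check on the Zoom figure above, here is the minutes-to-hours conversion and the comparison against the estimated YouTube corpus:

```python
# Unit check on the figures cited above.
zoom_minutes_2020 = 3.3e12           # 3.3 trillion meeting minutes (Zoom, 2020)
zoom_hours = zoom_minutes_2020 / 60  # -> 5.5e10, i.e., 55 billion hours
youtube_hours = 150e6                # estimated total YouTube content

print(f"Zoom 2020: {zoom_hours / 1e9:.0f} billion hours, "
      f"about {zoom_hours / youtube_hours:,.0f}x the estimated YouTube corpus")
```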

Data produced in a work context will drive the next S-Curve.

A Slippery Slope

As LLM providers start to tackle business use cases (see OpenAI’s Rockset acquisition and Anthropic’s recent launch), enterprises are right to be wary. OpenAI and Anthropic today claim that they do not train models on data from business-tier subscriptions. History tells us that the pressures of growing their businesses might force them to backtrack.

Take Facebook as an example. Meta long claimed to ignore users’ activity on partner websites while they were logged out. $725M in privacy-related settlements later, it’s still gobbling up consumer data at a massive rate. As a cloud software pioneer, Salesforce originally committed that customer data would never be shared with third parties. Its current privacy policy negates this.

History repeats itself, but this time, the stakes are higher. With the rise of cloud, SaaS applications were primarily used for “non-core” processes; anything absolutely core to a business would be built in-house. With AI, the data being fed into closed-source models could include everything from a company’s knowledge bases to its internal processes, contracts, PII, and a host of other proprietary, sensitive data.

All this rich context constitutes a sustainable competitive advantage for businesses. To protect what’s theirs, we believe businesses need to own their own proprietary models.

Just as the New York Times is fighting to protect its IP, businesses should resist the big AI companies’ appetite to harvest their proprietary data in the manner they did with public data.

To fully leverage the brilliance within their organizations, businesses should own their models. Owning their models allows them to continuously improve while sustaining their competitive advantage. We believe this is the right way to make the jump to the next S-Curve. 

Big AI companies are rapidly becoming incumbents, but all is not lost. We have identified four areas of opportunity for new startups to break through the AI plateau in a way that is compatible with the needs and imperatives companies face.

Four Key Opportunities

These are the four key areas of opportunity emerging for new startups. We’re already seeing strong market pull in each of these areas, making them fertile ground for new disruptors.

1. Engage Experts

There is a large opportunity to create novel ways to source AI training data. The highest-quality data will come from experts in each field, rather than from today’s services, which primarily rely on crowdsourced human labelers.

Opportunities

  • Build community
    Sourcing expert knowledge requires startups to embed themselves in communities with top-tier talent, vs traditional offshore teams. Centaur Labs has built a network of thousands of doctors, researchers, and medical professionals. Turing, which started off as a staffing agency, now uses their network of 3 million software engineers to support model providers in data labeling and RLHF.
  • Explore novel incentive structures, such as gamification
    Datacurve collects high-quality programming data by turning customer data requests into gamified “quests,” recruiting engineers graduating from top universities to earn rewards as they solve them.

2. Leverage Latent Data

A treasure trove of data already exists in an organization’s business apps (think Salesforce, Notion and Slack). There is an opportunity to help enterprises prepare this data for model training or inference. OpenAI’s recent acquisition of Rockset, which will power the retrieval infrastructure behind ChatGPT’s enterprise products, speaks to increasing investment in this area.

Opportunities

  • Help businesses prepare internal data for AI use cases
    For example, Unstructured and Reducto help ingest complex, unstructured documents for use with LLMs.
  • Create next-generation data frameworks
    Connect data in business apps (Salesforce, Notion, Google Drive, etc.) for use in models. For example, LlamaIndex allows enterprises to load in 160+ data sources and formats for processing; a minimal sketch of this pattern follows this list.
  • Help companies identify risks, gaps, and contradictions
    For example, Shelf helps identify inaccuracies and risks within a company’s knowledge assets.
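As a minimal sketch of the loading pattern mentioned above, here is what ingesting a folder of internal documents with LlamaIndex can look like. It assumes pip install llama-index, a hypothetical ./internal_docs folder, and an OpenAI API key in the environment for the default embedding and LLM backends.

```python
# Minimal sketch: make a folder of internal documents queryable with an LLM.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

# Ingest a directory of mixed-format files (PDF, DOCX, Markdown, ...).
documents = SimpleDirectoryReader("./internal_docs").load_data()

# Chunk, embed, and index the documents for retrieval.
index = VectorStoreIndex.from_documents(documents)

# Ask a question grounded in the company's own data.
response = index.as_query_engine().query("What does our refund policy say?")
print(response)
```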

3. Capture in Context

Allow businesses to capture the net-new data they generate each day. Do this within employees’ normal workflows, rather than through out-of-context data tagging initiatives.

Opportunities

  • Capture human brilliance
    Our 2017 thesis on AI-based Coaching Networks speaks to why collecting in context is so important. Apps like Chorus (acquired by ZoomInfo) and Textio guide workers toward doing their jobs more effectively and making better decisions across the organization.
  • Create a “universal capture layer” that works across applications
    Microsoft Recall was one attempt at this, but the winner in this space will need to build with privacy and security top of mind.
  • Move beyond text
    Most paradigms around training data are still text-based, losing the full richness of our work. We believe there is a large opportunity to help companies understand and make sense of various types of multimodal content. For example, Superlinked is building a platform that transforms any type of data into a vector embedding. Laminar is building tooling that captures the full richness of data within AI+human co-creation experiences. A generic sketch of the embedding idea follows this list.
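To illustrate the underlying idea (a generic sketch, not Superlinked’s or Laminar’s actual API), here is how content becomes vector embeddings that can be searched by meaning, using the open-source sentence-transformers library. Multimodal systems extend the same pattern with image or audio encoders.

```python
# Generic embedding sketch; assumes: pip install sentence-transformers numpy
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # small open-source text encoder

docs = ["Q3 sales deck: pipeline grew 40%",
        "Incident postmortem: API outage on June 3"]
doc_vecs = model.encode(docs, normalize_embeddings=True)

query_vec = model.encode(["what happened during the outage?"],
                         normalize_embeddings=True)

# Cosine similarity (dot product of normalized vectors) ranks relevant content.
scores = doc_vecs @ query_vec.T
print(docs[int(np.argmax(scores))])  # -> the postmortem document
```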

4. Secure the Secret Sauce

Help enterprises create and deploy their own custom models, letting them stay in control and protect proprietary IP.

Opportunities

  • Help enterprises build and deploy their own custom models
    Do this quickly and cost-effectively, including by leveraging open-source models. Our investment in Together AI is one example of our conviction in this area.
  • Ensure model alignment
    Businesses need models to be aligned with their goals and values. For example, Credo and Holistic are working on tools for governance and visibility into all of a company’s AI models.
  • Protect individual users through on-device models
    Techniques such as federated learning (which Apple is heavily investing in to protect consumer privacy) allow models to be trained without sensitive data leaving a user’s device. Frameworks developed by Flower and FedML are helping organizations productionize this technique; a toy sketch of the core idea follows.
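To show the core idea rather than any one framework’s API, here is a toy federated-averaging (FedAvg) sketch in plain NumPy. Each simulated client trains on private data that never leaves it; the server only ever averages model weights.

```python
# Toy FedAvg sketch illustrating federated learning; not production code.
import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([1.0, -2.0, 0.5])

def make_client():
    """Each client holds private data that is never transmitted."""
    X = rng.normal(size=(50, 3))
    y = X @ true_w + rng.normal(scale=0.1, size=50)
    return X, y

clients = [make_client() for _ in range(4)]

def local_update(w, X, y, lr=0.1, steps=10):
    """One client's local training: a few gradient steps on linear regression."""
    w = w.copy()
    for _ in range(steps):
        w -= lr * 2 * X.T @ (X @ w - y) / len(y)
    return w

# The server sees only weights, never raw data.
global_w = np.zeros(3)
for _ in range(20):
    local_ws = [local_update(global_w, X, y) for X, y in clients]
    global_w = np.mean(local_ws, axis=0)  # federated averaging step

print("learned:", global_w.round(2), "target:", true_w)
```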

This is just the tip of the iceberg: There are myriad opportunities to break through the AI plateau and leap to the next S-Curve in AI performance. It is the latest chapter in the ongoing story of human technological advancement.

It’s crucial that this next wave of technology looks like those that came before it. Advancement in AI should be based on human discovery and knowledge, crafted with human-centric attributes of privacy and quality in mind. In the dance of co-creation with AI, humans need to lead.

If you’re building an enterprise-focused AI tool that is attempting to solve these problems, we’d love to hear from you. Send a note anytime to gordon@emcap.com or wendy@emcap.com.