Why Building On LLMs is a Gold Rush

May 2023

People are excited about LLMs. AI has hit the top of the hype cycle again, peaking higher than ever before. Unlike a couple of previous hypes (*cough* *cough* metaverse, web3), the new capabilities have already been built into all sorts of valuable tools. LLMs have a lot of limitations, too, but surprisingly it's their limitations that are largely driving the opportunity for startups.

I think what has excited us about LLMs is their emergent capabilities, which don't seem memorizable or even obviously generalizable from the input corpus of digitized human language. LLMs have absorbed proficient levels of knowledge across many fields. Even better, large-scale neural networks like GPT-4 have generalized the ability to critique, refine, guide, and synthesize. This has enormous value. These machines have emerged in our world as quick thinkers we can use as advisors. Oddly impersonal, memoryless, invoking human culture, like some weird sort of summoned disincarnate intelligence. Now real in our world.

LLMs may be stochastic parrots, but from one point of view, so are we. Our brains learn from exposure to language, we learn priors, and these priors inform the words we generate (and our internal dialogue). Linguistics based on hypothetical models of the inherent structure of language seems like so many castles built on sand. Learning, it seems, is a fundamental process, not dependent on substrate, and like other processes in the universe it is stochastic by nature. Randomness in learning may begin with the environment itself. The pejorative that LLMs merely model conditional probabilities rather than understand some elegant model of language ignores the much more important nature of learning. More pragmatically, though, LLM architecture does place limits on what they can do.

And I think it is their limitations that are really powering a gold rush for developers, similar to the one for iPhone apps. Without these limitations, one single app would rule everything and there wouldn't be as much room to play in building on top of LLMs. And there are a lot of things LLMs can't do! Do you trust their math? Do you trust them not to harm people? Are they consistent? Factual? Correct? Can they reason? Can they generate randomness when needed? A lot of effort goes into figuring out how to make LLMs more capable. But we already know how to program computers, or use human judgment, to do many of these things. Wolfram Alpha, for instance, is an incredible pair for an LLM.
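To make the pairing concrete, here's a minimal sketch of routing between a deterministic tool and a model. Everything here is hypothetical: `call_llm` is a placeholder for any chat-model API, and the "tool" is just Python arithmetic standing in for something like Wolfram Alpha.

```python
import ast
import operator

# The deterministic "tool": an exact calculator for plain arithmetic.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate a plain arithmetic expression exactly, or raise ValueError."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not plain arithmetic")
    return walk(ast.parse(expr, mode="eval"))

def call_llm(prompt: str) -> str:
    # Placeholder: imagine a real chat-model API call here.
    return f"[LLM answer to: {prompt}]"

def answer(prompt: str) -> str:
    """Route math to exact code; route everything else to the model."""
    try:
        return str(safe_eval(prompt))   # trust the calculator for math
    except (ValueError, SyntaxError):
        return call_llm(prompt)         # trust the model for language
```

The point isn't the toy router itself but the division of labor: the parts an LLM is bad at (exact math here) go to ordinary code, and the LLM handles what code can't.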

The gold rush lies in designing ways to combine traditional computing with LLMs, getting the user interface and business model right, and then bringing that to market in a space where the NFT grifters have moved in. This is much more than a simple wrapper around an LLM API, complete with - in my opinion - defensibility.
