The Value of Information at the Edges of AI Knowledge
A view from Tokyo Skytree, where you can see the denser areas of the city give way to less dense areas as you move further out
The current setup
Large language models can be thought of as housing a map of information on many layers of a high-dimensional space.
This is a simplification, but imagine modeling all the information in the world, and everything in it, as points in a single 3-dimensional space where everything is related to everything else.
The well-known factual information, or let’s just say frequently repeated “known” information (i.e. it can be objectively wrong, but many people reference and believe it), would sit towards the center of this 3-d space. It would end up looking like a semantic cloud.
If you imagine this as a density cloud, the well-known information would be in the darker, denser parts of the cloud.
The less well-known information would be in the periphery.
Again, that’s a simplification, but it’s a useful model for thinking about information today.
Personally, I have a hard time visualizing any space beyond 3 dimensions, and I suspect that if you are a human reading this, you have the same issue.
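To make the density-cloud picture a little more concrete, here is a minimal Python sketch. It is a toy and not how any real LLM stores information: it assumes each piece of information can be treated as a point drawn from a 3-d Gaussian, and the function names (`sample_cloud`, `neighbor_count`), the cloud size, and the neighbor radius are all made up for illustration. It simply checks that points near the center have many close neighbors while points at the edge have few.

```python
import math
import random

ORIGIN = (0.0, 0.0, 0.0)

def sample_cloud(n, seed=0):
    """Toy assumption: each 'fact' is a point drawn from a 3-d Gaussian."""
    rng = random.Random(seed)
    return [(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n)]

def neighbor_count(point, cloud, radius=0.5):
    """Local density proxy: how many other points lie within `radius`."""
    return sum(
        1 for other in cloud
        if other is not point and math.dist(point, other) <= radius
    )

cloud = sample_cloud(500)

# Sort facts by distance from the center of the cloud.
by_distance = sorted(cloud, key=lambda p: math.dist(p, ORIGIN))
core = by_distance[:50]    # stand-in for "well-known" information
edge = by_distance[-50:]   # stand-in for information in the periphery

core_density = sum(neighbor_count(p, cloud) for p in core) / len(core)
edge_density = sum(neighbor_count(p, cloud) for p in edge) / len(edge)

print(f"average neighbors near the center: {core_density:.1f}")
print(f"average neighbors at the edge:     {edge_density:.1f}")
```

In this toy version the center is crowded and the edge is sparse, which is all the "density cloud" metaphor is meant to convey.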
The well-known information
The well-known information is easily accessible, and because today every LLM (ChatGPT, Gemini, Grok, etc.) seems to be building essentially the same information map, we can assume that information like:
- The number of standard amino acids.
- The location of Lake Superior and basic facts about it.
- The albums released by The Beatles.
is all quite “well known”.
Therefore, while it is valuable, it’s priced at a very low, roughly fixed cost.
Let’s say it’s worth something like $20 a month ^_^
Non-well-known information
Conversely, non-well-known information, the information at the edge of this map, is more of an unknown.
This information - which I would call… “edge facts”? Actually let’s call it “Esoteric Data”.
I think this term is good because esoteric data accurately conveys that the data is obscure but:
- It does not imply value - it could be worthless or priceless.
- It does not imply veracity - it could be objectively factual or completely fictional.
Esoteric data is where all value lies
The important thing is that the periphery of the global knowledge graph, this fog of esoteric data, is now where all the action is.
If you are doing anything in business, a lot of what you are doing is solving problems for people. But in many cases, part of the reason you are the one solving those problems is that you have some special combination of information or training that makes the problem easy for you to solve.
For example - you are selling coffee online. OK, simple. You have some really specific experience related to a special way to brew coffee with a specific kind of bean. It’s very niche. But part of the way in which you win is this:
- You love coffee yourself
- You know how to appeal emotionally to coffee drinkers
- You know that some special subset of coffee drinkers likes the special way it’s brewed.
- Generally you know where this niche customer base lives, what else they like, and you know how much they are willing to pay, etc.
- You can probably even guess what specific copy will hyper appeal to this specific niche because you also share life experiences with them on average. E.g. you like the same causes, bands, hobbies that overlap.
Do you see how points 1, 2, and 3 are probably all well-known information?
But as you get down to points 4 and 5, it gets harder. That information is hard to get from an LLM in its current form. It’s hard to get unless you know it personally through life experience.
This is the esoteric data that gives you an edge in business. A mega-corporation cannot compete with you because they cannot just hire a random person off the street who knows what marketing copy to write. They would need to do market research and A/B testing, and even then the market would be too small and they would never get it as right as you would on the first try, because you yourself are the target customer.
Lots of data does not imply connecting it well
A couple of pieces of conjecture:
- In the era of “data is the new oil” (2012 to maybe 2021?) everyone wanted to collect data and “gain valuable insights”. But generally speaking, unless you went into that with a specific plan, 99% of the time the data was never used and amounted to nothing. So just collecting data does not magically make your business more efficient.
- It’s been remarked upon that LLMs seem to have all of this information in one place, but they are not great at generating novel insights from it. E.g. if you were both a physicist and an actuary, you would probably have some unique experience from each field that you could apply to the other, whether that is research or insurance. But the LLMs never seem to be able to come up with Eureka moments like this, even though you would expect they could.
So given this, it still seems that there is something about humans going out into the world, having these experiences, and then being forced to connect them that generates genuinely unique information in a way you cannot reproduce synthetically in a purely digital system.
This is my thesis:
This esoteric data that humans generate physically in the real world will continue to be valuable, and will in fact become more valuable in the future as the well-known information becomes more ubiquitous and connected.
Gargoyles
You have probably read or heard of Neal Stephenson’s Snow Crash and may remember the concept of “Gargoyles”: people who strapped a bunch of data-recording devices to themselves to record data and sell it.
Smartphones are one step in this direction. Arguably we are all already gargoyles. But add in things like smart watches, health trackers, and AI devices like Meta AI Glasses, which give you way more opportunities to record data, and this entire phenomenon will become more intense as time goes on.
The demand for new data, this esoteric data from the periphery of the world model, will increase, and some of it will be valuable.
Side note - I just took a trip overseas and spent some time playing around with the Meta AI Glasses and they were super fun. I took way more photos with them on and I did use the AI assistant sometimes. But to be honest mostly I used them for taking photos, taking a couple of FPV shots, and listening to podcasts while walking around.
Nara deer through the Meta AI glasses (FPV)