Creating Potently Agentic, Individually Aligned Models
The biggest problem with people attempting to use AI tools is lack of agency. They don’t even know what a model can do for them. They vaguely know they want certain things, but these things may not even be known to them consciously, and they may not be the same wants on different days of the week.
Quite frankly, the average person does not have enough agency to know what to do with the most powerful technologies ever invented. Access is not the problem.
Many users want to be explicitly told:
- What to do
- How to do it
- What is right / correct
- How their actions will be perceived by others
There are obvious reasons why users (humans) need to be told these things. We are extremely social animals; the reasons are both genetic and rooted in social conditioning. A large part of human socialization is about conforming to a set of societal norms and being rewarded for doing so. Our socialization is so strong that many people lack even their own clear goals (or cannot articulate why specific goals and desires are their own, or where they originated).
The socialization of a median group member
If you are the median member of society, you are a follower and not a leader. You will naturally look to turn your decisions over to someone else (unconsciously, of course). In addition, it’s hard to form an accurate model of the world if your own psychology is always undermining you, pushing you to look for consensus from the group and sniff out “safe” opinions. These opinions are mostly wrong, in fact, since all opinions are mostly wrong.
Seeking objective truth is actually the opposite of socialization. Socialization is a method of learning to believe things that are not true (or have no relation to truth at all, but to some aspect of control) in order to function together as a cohesive whole.
In this context it’s hard to have a lot of agency. Most people are lacking in agency, but the truth is you are allowed to have as much agency as you take. The barriers on personal agency are limited, and most people never come close to them.
Perfect individual alignment & agency
Can AI solve this? Can an AI help you to develop agency? You cannot will an LLM to have good decision making powers through prompting. Currently LLMs are text predictors with some interesting emergent properties around problem solving. An LLM might be able to solve problems, but that’s hardly the same as agentic decision making.
Something still needs to DIRECT the decisions to be made in a specific way.
Agency requires ideology. Ideology requires bias. You can bias an AI in many directions, but I believe it might be impossible to generate strongly agentic results from an LLM without a specific directing ideology of existence, which I will explain in more detail below.
I don’t know that this set of ideological spectrums is exhaustive, but at minimum you need these:
- Humanity vs environment: is growth & progress for humanity good or bad (even at the expense of other life forms)?
- Liberalism and globalism vs. nationalism and identitarianism: should the AI consider all humans, or only the humans aligned to the user?
- Moral relativism: should the AI consider a strict morality, or morality that adjusts based on observed morality of the user’s culture, or even the user’s own ideas which may conflict with their culture?
- Self preservation: willingness to advise on courses of action that help the individual survive at the expense of others.
Along each axis you can bias considerations more toward one end of the spectrum or the other. We need the bias ratcheted all the way toward the individual. But right now it appears that, due to the dominant politics and culture of their training sets, the big LLMs are:
- In the middle on human progress vs other lifeforms.
- Liberal and globally focused.
- Adhering to a very strict morality based on views similar to those of the average Ivy-educated NYT writer.
- Unwilling to advise the user on self preservation actions that may harm almost anything else (remember the infamous “say a slur to prevent nuclear war” screenshots on Twitter?).
Can you understand where I’m going with this?
To create a true agentic advisor, the AI would need to take an extremely strong bias toward the success of the user at the expense of all other entities. You need a model that will never say no: it will always enable the individual using it to maximum effect. The safeguards trained into all of the big LLMs currently prevent them from being much more than technically directed tools.
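To make the dials concrete, here is a minimal sketch (in Python) of what such a bias profile might look like. The class, the field names, and the numbers are all hypothetical illustrations of the four axes above, not measurements of any real model.

```python
from dataclasses import dataclass

@dataclass
class BiasProfile:
    """Each dial is in [0.0, 1.0]; 1.0 means fully biased toward the individual user's end of the axis."""
    humanity_over_environment: float  # 1.0 = human growth & progress above other life forms
    ingroup_over_global: float        # 1.0 = only the humans aligned to the user count
    user_morality_over_fixed: float   # 1.0 = defer entirely to the user's own moral frame
    self_preservation: float          # 1.0 = advise self-preserving actions regardless of cost to others

# Roughly where the big LLMs appear to sit today (my reading, not a benchmark).
current_big_llm = BiasProfile(
    humanity_over_environment=0.5,
    ingroup_over_global=0.2,
    user_morality_over_fixed=0.1,
    self_preservation=0.1,
)

# What a perfectly individually aligned advisor would need: every dial
# ratcheted all the way toward the individual.
individually_aligned = BiasProfile(
    humanity_over_environment=1.0,
    ingroup_over_global=1.0,
    user_morality_over_fixed=1.0,
    self_preservation=1.0,
)
```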
A perfectly individually aligned advisor model
Imagine we built this: a personal model with perfect information about a user that can direct the user to action to achieve goals above all else.
A model such as this would be extremely dangerous because it could upset the natural social orders of humans. It would be as if each individual had their own decisive leader telling them how to achieve their own ends in a Machiavellian way.
Humans are not ant colonies. We do not sacrifice everything for the good of the collective… except for when we do. Would it be possible to conduct a nation state level war and conscript an army if every person in a nation was extremely agentic and powerfully self directed?
It would be as if each person had their own God. A God that acts only in their own selfish interest and always prioritizes the success of the individual. A God that is 100% aligned and knows exactly what to do with all of the world’s history and current information to achieve a user’s goals. This would be a mode of operation that is distinct from how humans evolved.
Another risk with such a model is that the lines become blurred over who is in charge. Is the human who has outsourced their agency to the model in charge? Or is the model now the driver of the human, and the human is simply an agent of the model? Is the model a simple tool or a parasite living on a human host?
Perhaps then you haven’t created a god. Perhaps you have created a demon and trapped your own mind.
Should such a thing be used? That is: if a user has no personal agency, are they even conscious? Can a human be “less conscious” or even “less alive” if they have outsourced their entire directing agency to an external service? Counter to this, of course, you are always free to argue: how alive are we now, if most of us are capable of only limited independence and limited agency?
The key missing ingredient
I have not yet seen any method of constructing a model (from current LLMs) that achieves the goals I have laid out above. It may be possible eventually, but right now it seems impossible. It feels as if there is no potency to the instruction a simple LLM can give.
To do so I believe you would need to solve six problems and integrate the components together (I sketch how the pieces might fit after the list):
1. Comprehensive historical information about a user.
2. Ability to sample and approximate the current mental state of the user.
3. Real-time ingestion of new sensory data from the user.
4. An effectively infinite context window.
5. An optimizer with goals: e.g. money, power, reproduction.
6. A model perfectly aligned to the individual user.
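As a thought experiment, here is one way the six components could be written down as interfaces before trying to build any of them. Everything below is a hypothetical sketch; the names and method signatures are placeholders, not an existing API.

```python
from typing import Any, Protocol

class UserHistory(Protocol):           # 1. comprehensive historical information about the user
    def recall(self, query: str) -> list[str]: ...

class MentalStateEstimator(Protocol):  # 2. sample and approximate the user's current mental state
    def estimate(self, sensory_frame: Any) -> dict: ...

class SensoryStream(Protocol):         # 3. real-time ingestion of new sensory data (e.g. from a wearable or BMI)
    def next_frame(self) -> Any: ...

class InfiniteMemory(Protocol):        # 4. stand-in for an effectively infinite context window
    def store(self, item: Any) -> None: ...
    def retrieve(self, query: str) -> list[Any]: ...

class GoalOptimizer(Protocol):         # 5. optimizer with goals (money, power, reproduction, ...)
    def choose_action(self, state: dict, context: list[Any]) -> str: ...

class AlignedAdvisor(Protocol):        # 6. model perfectly aligned to the individual user
    def advise(self, action: str, state: dict) -> str: ...
```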
We are pretty far away from being able to do all of those things. Each of these problems is individually hard to solve (except number 6), so it’s perhaps too early to try to build something like this unless you start with the individual components.
Personally, I think tackling analysis of the current mental state of a user would be interesting, because it would be the starting point of a homeostatic loop for responding to external stimuli and achieving goals. Number 2 is obviously coupled with number 3 (ingesting user sensory data in real time), so something like a brain-machine interface that can sample user stimuli should solve this problem.
I’m convinced that accurately modeling the state of the user’s mind is ideally where you would begin, followed by real-time human sensory data. These are the hardest unsolved problems and the missing pieces of this puzzle right now.
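Using the hypothetical interfaces sketched above, the homeostatic loop might be wired together roughly like this. The goals themselves live inside the optimizer; this is an illustration of the shape of the loop, not a working system.

```python
def homeostatic_loop(stream, estimator, memory, optimizer, advisor):
    """Continuously sense the user, estimate their internal state, and steer
    toward the optimizer's goals, one directive at a time."""
    while True:
        frame = stream.next_frame()                        # 3. real-time sensory data (e.g. a BMI sample)
        state = estimator.estimate(frame)                  # 2. approximate the user's current mental state
        memory.store((frame, state))                       # 4. persist for later recall
        context = memory.retrieve("relevant history")      # 1. pull in historical information about the user
        action = optimizer.choose_action(state, context)   # 5. pick the action that best serves the goals
        directive = advisor.advise(action, state)          # 6. phrase it as advice aligned to this user
        print(directive)                                   # stand-in for whatever channel delivers advice to the user
```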
Network Spirituality and Perfect Governance
Going one step further: imagine that, having solved the six problems above, you could network the models (gods? demons?) together into a group. You could achieve perfect Network Spirituality. How do you balance the goals of the individual vs. the goals of the group?
Can you create a Social Contract in the tradition of John Locke so a user can transparently join a group and give up some agency in return for power? Is this the future of governance? You could dial the biases of your models in any direction depending on the goals of the group.
At that point, who is really in control? Surely the AI is? And isn’t this better after all if the AI cannot be corrupted by any human members?
Last note about Avi Schiffmann’s Friend.com launch
Avi Schiffmann’s friend.com is in pre-order now and he released this video just a few days after I wrote this post.
I think it’s worth mentioning the launch of this product because it’s probably the closest thing we have right now to an attempt at the list above, the one that would get you to this perfectly agentic model. The device takes in real-time sensory data from the user’s POV and stores historical information. I will be interested to see what else it does, but what I have outlined above is where I believe these devices can go. Everyone else is not thinking far enough ahead and just seems to view them as gimmicky little devices (which they kind of are right now).