What they basically say:
* The temporal nature of the world (things happening in real time, continuously) and humans' constant interaction with it are critical to the functioning of human intelligence.
* Humans constantly rely on routines to function in the world, which saves them tons of computational energy.
* Humans use mental routines when they achieve goals, solve problems, and work through puzzles.
* You cannot model good intelligence without a mechanism for forming routines and using them.
Why do they say that good AI cannot be created unless it is in constant contact with a continuous-time, real-time environment? Because this constant interaction with the environment removes the need to make predictions in 95% of cases. It lets you use much simpler routines that still achieve highly accurate results, saving tons of energy, computation, and memory. It also removes the need for 95% of memories.
Example:
Let's say you want to dive into a pool, but then realize the water might be very cold.
There are 2 things you can do:
1. Make a prediction about the probability of the pool being cold from previously known information; make plans and predictions, and then decide on the spot to jump in, cannonball style.
2. Just put your finger in the water. If it's cold, you decide not to dive in.
For 95% of the tasks that humans constantly encounter, the second way of doing things is sufficient. Truly, if you used your full cognition for literally every single micro-decision you have to make, your brain would get fried. It simply wouldn't be able to keep up with real time. By the time you had made a prediction, formed a plan and a goal, and decided on an action, the moment would already have passed, and ten more tasks would be waiting to be urgently finished.
In this particular instance, the second solution is a routine for automatic error-correction, for self-correction. Sure, your finger is now wet. But that is not a tragedy; it is a trivial loss. Yet it let you avoid having to plan, predict, define goals, and so on in this scenario, saving tons of brain energy.
There are hundreds of such error-correction, self-correction routines in the human brain that let you avoid having to make predictions and plans, saving tons of brain power and time.
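To make the contrast concrete, here is a tiny toy sketch (my own illustration, not from the original argument; every function name here is made up). The point is only that the "probe" strategy needs no stored history and no model:

```python
# Two strategies for the "cold pool" decision (hypothetical toy example).

import random

def strategy_predict(history: list[float]) -> bool:
    """Strategy 1: model, predict, plan, then commit.
    Needs stored memories (past pool temperatures) and an explicit
    probability estimate before any action is taken."""
    if not history:
        return False                              # no data -> refuse to jump
    p_warm = sum(t > 20.0 for t in history) / len(history)
    return p_warm > 0.8                           # commit to the cannonball based on a prediction

def strategy_probe(read_water_temp) -> bool:
    """Strategy 2: a cheap interaction with the environment plus a trivial
    error-correction rule. No memories, no model, no prediction --
    the worst case is a wet finger."""
    finger_sample = read_water_temp()             # dip a finger in
    return finger_sample > 20.0                   # jump only if it feels warm

# The probe strategy needs zero stored history to work:
print(strategy_probe(lambda: random.uniform(10, 30)))
```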
Second example:
You probably have a PC or a laptop. Well, you don't need to plan every day to sit in front of it. What happens is that you see your PC, and that sight activates a habit/routine in your brain that makes you turn it on and scroll Reddit. Planning is unnecessary here, because the environment itself serves as the trigger for the appropriate action, at the appropriate time and place.
Now it becomes more obvious why LLMs are very problematic for achieving general intelligence: they are cut off from constant interaction with the world. That makes them hugely reliant on planning, prediction-making, and goal-driven behavior, because they cannot leverage interaction with the real world to develop simple routines that course-correct their behavior along the way.
By this analogy, language models really do use 100% of their cognition for every micro-decision they have to make, unlike us humans.
Fun fact: a "disadvantage" of liquid neural networks is that they can only be trained on temporal, continuous-time data, like video and audio, and not on text. Constant interaction with the world is the lifeblood of a liquid neural network! It literally cannot function without it, just like real human cognition.
(To clarify, there are liquid-network-based language models, so it is possible to work around this limitation. But by default, liquid networks cannot be trained on non-temporal data.)
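To make the "continuous time" point concrete, here is a rough sketch of the state update inside a liquid time-constant (LTC) layer, in the ODE form described in Hasani et al.'s LTC paper. This is my own toy re-implementation for illustration; the parameter values and the tiny network are made up, not anyone's released code. Notice that the update is defined over a time step dt, which is why the model wants a stream of timestamped input rather than plain text:

```python
# Minimal LTC-style update: dx/dt = -[1/tau + f(x, I)] * x + f(x, I) * A,
# integrated with the fused (semi-implicit Euler) step from the LTC paper.

import numpy as np

def ltc_step(x, inputs, dt, tau, A, W_in, W_rec, b):
    """One fused update of the hidden state x over a time step dt."""
    f = np.tanh(W_in @ inputs + W_rec @ x + b)      # input-dependent gate
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Toy run over a continuous signal (e.g. a stream of audio/video features).
rng = np.random.default_rng(0)
n_hidden, n_in = 19, 4                              # "19 neurons", as in the post
x = np.zeros(n_hidden)
tau = np.ones(n_hidden) * 0.5
A = rng.normal(size=n_hidden)
W_in = rng.normal(size=(n_hidden, n_in)) * 0.1
W_rec = rng.normal(size=(n_hidden, n_hidden)) * 0.1
b = np.zeros(n_hidden)

for t in range(100):                                # stream of timesteps
    u = rng.normal(size=n_in)                       # sensory input at time t
    x = ltc_step(x, u, dt=0.05, tau=tau, A=A, W_in=W_in, W_rec=W_rec, b=b)
print(x.shape)                                      # (19,)
```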
What is a routine? Let me give you examples of the mental routines we use when we solve problems and puzzles.
* When you ride a bicycle, do you constantly predict the position of your body, its inertia, etc. from the laws of physics, using formulas, and then adjust your actions based on that prediction, then make a new prediction, again and again? No, you just ride the bicycle, with no awareness of any such calculations, because such calculations are not happening. Such predictions are not happening. What happens in actuality is that you have simply developed routines for self-correcting your center of mass. When you lean slightly further right than you should, that simply triggers a routine in your brain that makes you tilt slightly to the opposite side (a toy sketch of this kind of reactive correction follows after this list).
* We use the same invisible routines when we solve problems. Example: when you have an object in hand, you can instantly see how far you can throw it, what trajectory it will follow, and roughly where it will land. This is problem solving, yet you perform it constantly without using any kind of physics formula, because humans have developed effortless mental routines for throwing things correctly.
And there are hundreds or more such routines that we use for problem solving, that we are simply not aware of, and that we cannot explicitly write into an AI model. The only way an AI can learn those routines is by learning them itself.
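Here is the promised toy contrast for the bicycle case (again my own illustration; everything in it is made up). The "routine" is just a reflex: leaning right triggers a proportional correction to the left, with no physics model or prediction anywhere:

```python
# Two ways of staying balanced on a bicycle (hypothetical toy example).

def physics_based_correction(lean, lean_rate, mass, height, speed, g=9.81):
    """What riding a bike would look like if we really solved the
    inverted-pendulum dynamics at every instant: predict future lean from
    the equations of motion, plan a steering trajectory, re-predict,
    re-plan... Deliberately left unimplemented here."""
    raise NotImplementedError

def routine_correction(lean: float, gain: float = 2.0) -> float:
    """The 'self-correction routine': leaning right simply triggers a
    proportional tilt/steer to the left. One multiplication, no prediction."""
    return -gain * lean

# Tiny simulation: a destabilizing nudge, then the routine pulls lean back.
lean = 0.1                                         # radians, slightly to the right
for _ in range(20):
    lean += 0.05 * routine_correction(lean)        # apply correction over a small dt
print(round(lean, 4))                              # close to 0.0: balanced, no physics model used
```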
LLMs cannot solve ARC-AGI puzzles that average humans can easily solve, because they have no knowledge of the process of problem solving, only of its description. Current top LLMs are able to infer only a small number of the implicit, hidden mental routines that humans use for problem solving from the texts available on the internet.
LLMs are good at math and coding because the problem-solving routines for those tasks are explicit and extensively described in texts, with formulas and so on. There are no textbooks describing the formulas of the implicit routines inside the human brain.
This is where my previously described neural network model comes in.
It is my belief that Liquid Time-Constant Networks work based on routines, just like humans. That is what allows them to perform a task that would take a traditional neural network thousands of neurons using just 19 neurons. They don't need to make any predictions. They are able to encode just a handful of routines in those 19 neurons, and those routines let them do the same tasks without making any kind of prediction.
If my proposed neural network is better, surely it could solve an ARC-AGI puzzle then, right? I believe so. Here is how this AI model could solve the ARC-AGI puzzles:
* Record many videos of people solving ARC-AGI puzzles, working through the public dataset problems.
* Put eye trackers on those people, so it is visible where they are looking.
* Record brain scans of the people solving those puzzles. Certain mental routines will activate certain brain regions in certain sequences, giving the AI more clues for reverse-engineering those routines.
* Train the liquid neural network on this data (a rough sketch of what one training sample might look like follows below).
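For what it's worth, here is a rough, entirely hypothetical sketch of how one time-aligned training sample from such recordings might be laid out. No such dataset exists; all field names and shapes are my assumptions:

```python
# Hypothetical layout of one recording of a person solving an ARC-AGI task.

from dataclasses import dataclass
import numpy as np

@dataclass
class SolveRecording:
    """One person solving one public ARC-AGI task, recorded over T timesteps."""
    video: np.ndarray         # (T, H, W, 3)  screen + hands, e.g. 30 fps
    gaze_xy: np.ndarray       # (T, 2)        eye-tracker fixation coordinates
    brain_signal: np.ndarray  # (T, C)        e.g. EEG/fMRI channels, resampled to T
    actions: np.ndarray       # (T, 2)        grid cell edited + colour chosen, or similar
    task_id: str              # which public ARC-AGI task was being solved

def to_training_stream(rec: SolveRecording) -> np.ndarray:
    """Flatten each timestep into one input vector for a continuous-time
    (e.g. liquid) network: the model sees what the person sees, where they
    look, and what their brain is doing, step by step."""
    T = rec.video.shape[0]
    frames = rec.video.reshape(T, -1) / 255.0
    return np.concatenate([frames, rec.gaze_xy, rec.brain_signal], axis=1)
```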
Here is the result I expect:
* The liquid neural network will be able to reverse-engineer the problem-solving routines people use, and will be able to use them itself.
Then just ask it to solve a new ARC-AGI problem, and it will solve it.
This post is all over the place, but yeah, I hope you got the general idea behind this AGI architecture.
TL/DR: Listen to the audio podcast version of this post. It explains what I tried to convey much better than I do, in just 6 minutes (if you use 2x speed). https://notebooklm.google.com/notebook/ec78988a-b2d3-42ca-ace6-48e49bdb56cf/audio