Here, I aim to lay out the Artificial Learning Intelligence System (ALIS), from its concept and principles through to its basic design and development methods.
Concept
Current generative AI, primarily large language models, is trained through supervised learning on neural networks.
We position this neural network training process as innate learning.
ALIS incorporates an acquired learning process separate from this innate learning, enabling comprehensive inference that integrates both innate and acquired learning.
In this acquired learning, learned knowledge is stored externally to the neural network and utilized during inference.
The technical core of ALIS therefore lies in the extraction and storage of reusable knowledge, and in the selection and utilization of that knowledge during inference.
Furthermore, ALIS is not a single elemental technology but a system technology that combines innate and acquired learning.
Elements of a Learning Intelligence System
ALIS treats both existing innate learning and future acquired learning as operating under the same principles within the framework of learning and inference.
To explain the principles of learning in ALIS, we define five elements of a learning intelligence system:
The first is the Intelligent Processor. This refers to a processing system that performs inference using knowledge and extracts knowledge for learning.
Representative examples of intelligent processors include LLMs and parts of the human brain.
The second is the Knowledge Store. This refers to a storage location where extracted knowledge is saved and can be retrieved as needed.
In LLMs, the knowledge store is the parameters of the neural network. In humans, it corresponds to long-term memory in the brain.
The third is the World. This refers to the external environment as perceived by learning intelligence systems such as humans or ALIS.
For humans, the world is reality itself. In the case of LLMs, the mechanism that receives output from the LLM and provides feedback to it is considered to be the equivalent of the world.
The fourth is the State Memory. This refers to an internal temporary memory, like a scratchpad, that a learning intelligence system uses during inference.
In LLMs, this is the memory space used during inference, known as hidden states. In humans, it corresponds to short-term memory.
The fifth is the Framework. This is the so-called framework of thought. In the terminology of a learning intelligence system, it refers to the criteria for selecting necessary knowledge during inference and the logical state space structure for organizing the state memory.
In LLMs, it is the semantic structure of hidden states, and generally, its content is vague and incomprehensible to humans. Furthermore, knowledge selection is integrated into the attention mechanism, which selects which existing tokens to reference for each token being processed.
For humans, as mentioned above, it is the framework of thought. When thinking using a specific framework of thought, certain sets of know-how are recalled from long-term memory and loaded into short-term memory. Then, the currently perceived information is organized according to the framework of thought to understand the situation.
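To make these five elements concrete, here is a minimal sketch of how they might be modeled as interfaces. All names and method signatures are my own illustrative assumptions, not a fixed ALIS specification.

```python
# Minimal sketch of the five elements as Python protocols.
# All names here are illustrative assumptions, not part of any ALIS API.
from typing import List, Protocol


class World(Protocol):
    """The external environment the system acts upon."""
    def respond(self, action: str) -> str: ...


class KnowledgeStore(Protocol):
    """Where extracted knowledge is saved and later retrieved."""
    def save(self, knowledge: str) -> None: ...
    def select(self, query: str) -> List[str]: ...


class StateMemory(Protocol):
    """Scratchpad used internally during a single inference."""
    def write(self, note: str) -> None: ...
    def read(self) -> str: ...


class Framework(Protocol):
    """Criteria for selecting knowledge and structuring state memory."""
    def build_query(self, observation: str) -> str: ...
    def organize(self, observation: str, memory: StateMemory) -> None: ...


class IntelligentProcessor(Protocol):
    """Performs inference with knowledge and extracts knowledge for learning."""
    def infer(self, observation: str, knowledge: List[str], memory: StateMemory) -> str: ...
    def extract_knowledge(self, result: str) -> List[str]: ...
```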
Principles of a Learning Intelligence System
A learning intelligence system operates as follows:
The intelligent processor acts upon the world. The world responds with results based on that action.
The intelligent processor extracts reusable knowledge from these results and stores it in the knowledge store.
When the intelligent processor acts repeatedly on the world, it selects knowledge from the knowledge store and uses it to modify its mode of action.
This is the basic mechanism.
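As a minimal sketch, assuming objects with the roles described above, this basic mechanism could be written as a loop like the following.

```python
# Minimal sketch of the basic learning loop (illustrative only).
def learning_loop(processor, world, store, memory, observation: str, steps: int = 3) -> str:
    """Act on the world, learn from the result, and act again using stored knowledge."""
    for _ in range(steps):
        # Select knowledge relevant to the current observation.
        knowledge = store.select(observation)

        # Infer an action using the selected knowledge and the state memory.
        action = processor.infer(observation, knowledge, memory)

        # The world responds with a result based on that action.
        result = world.respond(action)

        # Extract reusable knowledge from the result and store it.
        for item in processor.extract_knowledge(result):
            store.save(item)

        # The result becomes the next observation.
        observation = result
    return observation
```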
However, fundamentally, the methods for knowledge extraction, storage, selection, and utilization determine whether the system can achieve meaningful learning.
Humans possess mechanisms that enable effective knowledge extraction, storage, selection, and utilization, which allows them to learn.
Neural networks, including LLMs, have mechanisms for storage, selection, and utilization, although the extraction part is handled by an external teacher. This allows them to learn as long as a teacher provides the input.
Furthermore, a learning intelligence system can achieve more complex learning by also learning the extraction, storage, and selection of frameworks, and their utilization in state memory, as knowledge.
Types of Knowledge
Based on this principle, when designing acquired learning, it is necessary to clarify what form of information acquired knowledge will take.
It is conceivable to learn acquired knowledge separately as parameters of a neural network.
However, acquired knowledge does not need to be limited solely to neural network parameters. A realistic candidate is knowledge textualized in natural language.
If knowledge is textualized in natural language, it can be extracted and utilized by leveraging the natural language processing capabilities of LLMs. Furthermore, it can be treated as data in a regular IT system, making storage and selection easy.
Moreover, knowledge textualized in natural language is easy for humans and other LLMs to check, understand, and, in some cases, edit.
It can also be shared with other learning intelligence systems, and merged or split.
For these reasons, the acquired knowledge in the ALIS concept will initially be designed to target knowledge textualized in natural language.
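As an illustration of what one piece of natural-language textualized knowledge could look like as ordinary data in an IT system, here is a hedged sketch; the field names and the example sentence are invented for this purpose.

```python
# Illustrative sketch of one piece of natural-language knowledge stored as plain data.
# Field names and the example content are assumptions, not a fixed ALIS schema.
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class KnowledgeItem:
    text: str                      # the knowledge itself, in natural language
    source: str                    # where it was extracted from (e.g., a chat thread id)
    tags: List[str] = field(default_factory=list)
    created_at: str = field(default_factory=lambda: datetime.now(timezone.utc).isoformat())


item = KnowledgeItem(
    text="The external API's v2 endpoint expects ISO 8601 timestamps, not Unix epochs.",
    source="chat-thread-001",
    tags=["api", "timestamps"],
)
```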
Acquired State Memory and Framework
I have explained the advantages of choosing natural language textualized knowledge as acquired knowledge.
Similarly, natural language text can also be used for the state memory and framework for inference.
The framework, which is a conceptual structure, can also be stored and utilized in the knowledge store as natural language textualized knowledge.
When initializing or updating states based on the structure defined by that framework, text-based state memory can be used.
By designing ALIS to use text format not only for acquired knowledge but also for frameworks and state memory, ALIS can leverage the natural language processing capabilities of LLMs for both acquired learning and general inference.
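For instance, a framework and its state memory could both be held as plain text along the following lines; the wording of the framework and its sections is invented for illustration.

```python
# Illustrative sketch: a "framework" kept as plain text, used to initialize a text state memory.
# The framework wording and the section names are invented for this example.
framework_text = """Troubleshooting framework:
1. Symptom: what exactly is failing?
2. Expectation: what should happen instead?
3. Hypotheses: plausible causes, most likely first.
4. Next check: the cheapest experiment that discriminates between hypotheses.
"""

# State memory starts as the framework's empty structure and is filled in during inference.
state_memory = {
    "Symptom": "",
    "Expectation": "",
    "Hypotheses": [],
    "Next check": "",
}
```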
Formal Knowledge
Acquired knowledge, frameworks, and state memory can be represented not only by natural language text but also by more rigorous formal languages or formal models.
While I wrote "select," the goal for ALIS is to incorporate multiple acquired knowledge learning mechanisms to allow for a hybrid utilization of innate and acquired learning.
Knowledge represented by formal languages or formal models can be more rigorous and free from ambiguity.
Furthermore, if a framework is expressed in a formal language or formal model and an initial state is laid out in state memory according to it, that formal model can be processed by an intelligent processor other than an LLM to perform rigorous simulation and logical reasoning.
A prime example of such formal languages and formal models is programming languages.
As the system learns about the world, if it can express the underlying laws and concepts as programs within a framework, these can then be simulated by a computer.
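As a toy illustration, suppose the system has learned that some external service enforces a rate limit; expressed as a program, that law can be simulated exactly by a computer rather than approximated by an LLM. The specific rule below is invented for the example.

```python
# Toy example: a learned regularity expressed as a program so a computer can simulate it.
# The specific rule (10 requests per 60-second sliding window) is invented for illustration.
def simulate_rate_limit(request_times: list[float], limit: int = 10, window: float = 60.0) -> list[bool]:
    """Return, for each request timestamp (in seconds), whether it would be accepted."""
    accepted: list[float] = []
    results: list[bool] = []
    for t in request_times:
        # Keep only accepted requests that are still inside the sliding window.
        accepted = [a for a in accepted if t - a < window]
        ok = len(accepted) < limit
        if ok:
            accepted.append(t)
        results.append(ok)
    return results


# Simulating 12 requests in quick succession: the 11th and 12th are rejected.
print(simulate_rate_limit([float(i) for i in range(12)]))
```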
Column 1: Types of Knowledge
As we organize the knowledge handled by a learning intelligence system, it becomes clear that such knowledge can be broadly categorized into three systems and two types.
The three systems are: network parameter knowledge handled by neural networks, natural knowledge in natural language, and formal knowledge in formal languages.
The two types are stateless and stateful.
Stateless network parameter knowledge is intuitive knowledge, like that found in deep learning AI. Features that distinguish cats from dogs, which cannot be articulated or identified verbally, can be learned as stateless network parameter knowledge.
Stateful network parameter knowledge is fuzzy knowledge of iterative processes, like that found in generative AI.
Stateless natural knowledge is knowledge like the meaning associated with a word.
Stateful natural knowledge is knowledge including the context found within a sentence.
Some natural knowledge is inherently included in stateful network parameter knowledge, but there is also knowledge that can be acquired later from natural language text.
Stateless formal knowledge is knowledge that can be expressed by mathematical formulas that do not include iteration. Stateful formal knowledge is knowledge that can be expressed by programs.
The brain's own short-term memory can also be used as state memory for natural and formal knowledge.
However, being short-term memory, it has trouble maintaining a state stably, and it is poor at holding knowledge in a formalized, unambiguous form.
On the other hand, paper, computers, or smartphones can be used as state memory for writing down and editing natural language text, formal languages, or formal models.
Data on paper or computers is generally thought of as a knowledge store for keeping knowledge, but it can also serve as state memory for organizing thoughts.
Thus, it is evident that humans perform intellectual activities by skillfully utilizing these three systems and two types of knowledge.
ALIS also has the potential to dramatically improve its capabilities by enabling and enhancing intellectual activities that leverage these same three systems and two types of knowledge.
In particular, ALIS has the strength of being able to utilize vast knowledge stores and state memory. Furthermore, it can easily prepare multiple instances of each and perform intellectual tasks by switching or combining them.
Column 2: Intellectual Orchestration
Being able to store a large amount of knowledge in the knowledge store is a strength, but sheer quantity is not necessarily an advantage for intellectual activity: a generative AI can only use a limited number of tokens at once, and irrelevant knowledge becomes noise.
On the other hand, by appropriately segmenting the knowledge store and creating high-density, specialized knowledge stores that gather knowledge necessary for specific intellectual tasks, the problems of token limits and noise can be mitigated.
The trade-off is that such specialized knowledge stores would only be usable for those specific intellectual tasks.
Many intellectual activities are complex combinations of various intellectual tasks. Therefore, by dividing knowledge into specialized knowledge stores according to the type of intellectual task and subdividing intellectual activity into intellectual tasks, ALIS can execute the entire intellectual activity while appropriately switching between specialized knowledge stores.
This is like an orchestra composed of professional musicians playing different instruments and a conductor leading the whole.
Through this system technology, "intellectual orchestration," ALIS will be able to organize its intellectual activities.
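A minimal sketch of this orchestration idea follows; the store names and the routing rule are assumptions for illustration.

```python
# Minimal sketch of "intellectual orchestration": route each sub-task to a specialized
# knowledge store. Store names and the routing rule are assumptions for illustration.
from typing import Callable, Dict, List

SPECIALIZED_STORES: Dict[str, List[str]] = {
    "api_usage": [],       # knowledge about external API specifications
    "testing": [],         # knowledge about writing and fixing tests
    "deployment": [],      # knowledge about build and release procedures
}


def orchestrate(tasks: List[dict], run_task: Callable[[dict, List[str]], str]) -> List[str]:
    """Run each sub-task with only its specialized knowledge store, like sections of an orchestra."""
    results = []
    for task in tasks:
        store = SPECIALIZED_STORES.get(task["kind"], [])
        results.append(run_task(task, store))
    return results
```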
ALIS Basic Design and Development Method
From here, I will organize the development approach for ALIS.
As already stated in the principles and columns, ALIS is inherently designed to easily extend its functions and resources. This is because the essence of ALIS lies not in specific functions, but in the processes of knowledge extraction, storage, selection, and utilization.
For example, multiple types of knowledge extraction mechanisms can be prepared, and then selected from or used simultaneously, depending on the system design.
Furthermore, ALIS can be made to perform this selection itself.
Storage, selection, and utilization can similarly be freely selected or parallelized.
Therefore, ALIS can be developed incrementally and agilely, without the need to design the entire functionality in a waterfall manner.
The Beginning of ALIS
Now, let's design a very simple ALIS.
The basic UI will be the familiar chat AI. Initially, user input will be passed directly to the LLM. The LLM's response will then be displayed on the UI, and the system will await the next user input.
When the next input arrives, the LLM will receive not only the new input but also the entire chat history between the user and the LLM up to that point.
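A minimal sketch of this basic loop might look as follows; call_llm is a placeholder for whichever LLM client is actually used, not a real API.

```python
# Minimal sketch of the basic chat loop. `call_llm` is a placeholder for whatever
# LLM API is used; it is not a real library call.
from typing import Dict, List


def call_llm(messages: List[Dict[str, str]]) -> str:
    raise NotImplementedError("plug in your LLM client here")


def chat_loop() -> None:
    history: List[Dict[str, str]] = []
    while True:
        user_input = input("you> ")
        history.append({"role": "user", "content": user_input})
        # The LLM always receives the whole chat history, not just the newest message.
        reply = call_llm(history)
        history.append({"role": "assistant", "content": reply})
        print("alis>", reply)
```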
Behind this chat AI UI, we will prepare a mechanism to extract reusable knowledge from the chat history.
This can be added to the chat AI system as a process executed when a conversation ends or at regular intervals. Of course, an LLM will be used for knowledge extraction.
This LLM will be given the ALIS concept and principles, along with knowledge extraction know-how, as system prompts. If knowledge is not extracted as intended, the system prompts should be refined through trial and error.
The knowledge extracted from the chat history will be stored directly in a knowledge lake. A knowledge lake is a mechanism for simply storing knowledge in a flat, unstructured state before it is structured.
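A hedged sketch of this extraction step might look as follows; the system prompt wording and the choice of a JSON-lines file for the knowledge lake are assumptions.

```python
# Sketch of background knowledge extraction into a flat knowledge lake.
# The system prompt wording and the JSON-lines file format are assumptions.
import json
from typing import Dict, List

EXTRACTION_PROMPT = (
    "You are the knowledge-extraction step of ALIS. From the chat history below, "
    "list each piece of reusable knowledge as a single self-contained sentence, one per line."
)


def extract_knowledge(call_llm, history: List[Dict[str, str]]) -> List[str]:
    transcript = "\n".join(f"{m['role']}: {m['content']}" for m in history)
    answer = call_llm([{"role": "system", "content": EXTRACTION_PROMPT},
                       {"role": "user", "content": transcript}])
    return [line.strip() for line in answer.splitlines() if line.strip()]


def append_to_knowledge_lake(items: List[str], path: str = "knowledge_lake.jsonl") -> None:
    # The lake is deliberately flat and unstructured: one JSON object per line.
    with open(path, "a", encoding="utf-8") as f:
        for text in items:
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")
```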
Next, we will prepare a structuring mechanism to make it easier to select knowledge from the knowledge lake.
This means providing embedding vector stores for semantic search, as typically used in RAG, and keyword indexes, among other things.
More advanced options include generating a knowledge graph or performing category classification.
This collection of structured information built on the knowledge lake will be called the knowledge base. Together, the knowledge base and the knowledge lake constitute the knowledge store.
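As one possible sketch of such structuring, here is a simple keyword index over the knowledge lake; an embedding index would be built analogously with whatever embedding model the system adopts.

```python
# Sketch of structuring the knowledge lake into a knowledge base.
# Only a simple keyword index is shown; an embedding index for semantic search
# would be built the same way with an embedding model (left abstract here).
import json
import re
from collections import defaultdict
from typing import Dict, List, Set


def load_lake(path: str = "knowledge_lake.jsonl") -> List[str]:
    with open(path, encoding="utf-8") as f:
        return [json.loads(line)["text"] for line in f if line.strip()]


def build_keyword_index(items: List[str]) -> Dict[str, Set[int]]:
    index: Dict[str, Set[int]] = defaultdict(set)
    for i, text in enumerate(items):
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(i)
    return index


def keyword_select(query: str, items: List[str], index: Dict[str, Set[int]], top_k: int = 5) -> List[str]:
    hits: Dict[int, int] = defaultdict(int)
    for word in re.findall(r"\w+", query.lower()):
        for i in index.get(word, set()):
            hits[i] += 1
    ranked = sorted(hits, key=hits.get, reverse=True)
    return [items[i] for i in ranked[:top_k]]
```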
Next, we will integrate the knowledge store into the chat UI processing.
This is basically the same as a general RAG mechanism. For user input, relevant knowledge is selected from the knowledge store and passed to the LLM along with the user input.
This allows the LLM to automatically utilize knowledge when processing user input.
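A sketch of this integration, assuming a select_knowledge function provided by the knowledge base, might look like this.

```python
# Sketch of the RAG-style integration: selected knowledge is passed to the LLM
# together with the user input. `select_knowledge` stands in for whatever
# selection mechanism the knowledge base provides.
from typing import Callable, Dict, List


def chat_turn(call_llm: Callable, select_knowledge: Callable[[str], List[str]],
              history: List[Dict[str, str]], user_input: str) -> str:
    knowledge = select_knowledge(user_input)
    messages = [{"role": "system",
                 "content": "Relevant accumulated knowledge:\n" + "\n".join(f"- {k}" for k in knowledge)}]
    messages += history + [{"role": "user", "content": user_input}]
    reply = call_llm(messages)
    history.append({"role": "user", "content": user_input})
    history.append({"role": "assistant", "content": reply})
    return reply
```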
This way, knowledge will accumulate with each conversation with the user, realizing a simple ALIS that uses knowledge accumulated from past conversations.
Simple Scenario
For example, imagine a user developing a web application using this simple ALIS.
The user reports that the code proposed by the LLM resulted in an error. After the user and LLM collaborate to troubleshoot, they discover that the external API specification known to the LLM was outdated, and the program works correctly after being adapted to the latest API specification.
From this chat thread, ALIS could then accumulate knowledge in its knowledge store: specifically, that the API specification known by the LLM is old, and what the latest API specification is.
Then, the next time a program using the same API is created, ALIS would be able to leverage this knowledge to generate a program based on the latest API specification from the outset.
Improvements to the Initial ALIS
However, for this to happen, that knowledge must be selected in response to the user's input, and it may not be directly linked to it: the problematic API name might not appear in the input at all.
In that case, the API name would only emerge during the LLM's response.
Therefore, we will slightly extend the simple ALIS by adding mechanisms for pre-analysis and post-checking.
Pre-analysis is similar to the "thinking mode" of recent LLMs. A memory capable of holding text as state memory will be prepared, and the system prompt will instruct the LLM to perform a pre-analysis upon receiving user input.
The LLM's pre-analysis result will be stored in the state memory. Based on this pre-analysis result, knowledge will be selected from the knowledge store.
Then, the chat history, pre-analysis result, knowledge corresponding to user input, and knowledge corresponding to the pre-analysis result will be passed to the LLM to receive a response.
Furthermore, the result returned by the LLM will also be used to search for knowledge from the knowledge store. Including the knowledge found there, the LLM will be asked to perform a post-check.
If any issues are found, the problematic points and the reasons they were flagged will be passed back to the chat LLM.
By providing opportunities to select knowledge during pre-analysis and post-checking, we can increase the chances of utilizing accumulated knowledge.
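Putting the pieces together, a hedged sketch of the extended turn might look as follows; the prompt wording and the simplistic check on the review result are placeholders.

```python
# Sketch of the extended turn with pre-analysis and post-check.
# Prompt wording and the plain-text state memory are assumptions for illustration.
from typing import Callable, Dict, List

Message = Dict[str, str]


def extended_turn(call_llm: Callable, select_knowledge: Callable[[str], List[str]],
                  history: List[Message], user_input: str) -> str:
    # 1. Pre-analysis: ask the LLM to analyze the request; keep the result in state memory.
    state_memory = call_llm(history + [
        {"role": "system", "content": "Before answering, analyze what the user needs and which APIs or components are involved."},
        {"role": "user", "content": user_input},
    ])

    # 2. Select knowledge for both the raw input and the pre-analysis result.
    knowledge = select_knowledge(user_input) + select_knowledge(state_memory)

    # 3. Main response, given history, pre-analysis result, and selected knowledge.
    draft = call_llm(history + [
        {"role": "system", "content": "Pre-analysis:\n" + state_memory +
                                      "\n\nRelevant knowledge:\n" + "\n".join(knowledge)},
        {"role": "user", "content": user_input},
    ])

    # 4. Post-check: search knowledge again using the draft and ask the LLM to review it.
    review = call_llm([
        {"role": "system", "content": "Check the draft answer against this knowledge; list any problems and why:\n"
                                      + "\n".join(select_knowledge(draft))},
        {"role": "user", "content": draft},
    ])

    # 5. If problems were found, pass them back to the chat LLM for a revised answer.
    # (Simplistic check; a real implementation would have the post-check return a structured verdict.)
    if "no problems" not in review.lower():
        draft = call_llm(history + [
            {"role": "system", "content": "Revise your answer to address these issues:\n" + review},
            {"role": "user", "content": user_input},
        ])
    return draft
```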
Outlook
This approach of building an initial ALIS and then adding improvements to address its weaknesses perfectly illustrates agile development and the incremental improvement of ALIS.
Furthermore, as the example suggests, the initial ALIS is most suitable for use in software development, because it is a high-demand field and one where knowledge can easily be accumulated in a clear-cut form.
It is a domain where outcomes are clearly black or white, yet also one where iterative, trial-and-error knowledge accumulation is necessary and important.
In addition, since ALIS development itself is software development, the fact that ALIS developers can be ALIS users themselves is also appealing.
And, along with the ALIS system, the knowledge lake can also be openly shared on platforms like GitHub.
This would allow many people to collaborate on ALIS system improvements and knowledge accumulation, with everyone benefiting from the results, further accelerating ALIS development.
Of course, knowledge sharing is not limited to ALIS developers but can be gathered from all software developers using ALIS.
The fact that knowledge is in natural language offers two further advantages:
The first advantage is that knowledge can be leveraged even when the LLM model changes or is updated.
The second advantage is that the vast accumulated knowledge lake can be used as training data for LLMs, either for fine-tuning or for pre-training itself.
In either case, if LLMs that have innately learned the knowledge accumulated in the knowledge lake become available, software development will become even more efficient.
Furthermore, within software development, there are various processes such as requirements analysis, design, implementation, testing, operation, and maintenance, and specialized knowledge exists for each software domain and platform. If a mechanism is created to segment the vast accumulated knowledge from these perspectives, an ALIS orchestra can also be formed.
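As a small illustration of such segmentation, specialized stores could be defined by tag-based rules like the following; the phase names follow the text, while the tags and routing rule are assumptions.

```python
# Illustrative segmentation of one knowledge lake into specialized stores.
# The phase names follow the text; the tag-based routing rule is an assumption.
SEGMENTS = {
    "requirements":   {"tags": ["requirements", "user-story"]},
    "design":         {"tags": ["architecture", "design"]},
    "implementation": {"tags": ["api", "library", "bugfix"]},
    "testing":        {"tags": ["test", "ci"]},
    "operation":      {"tags": ["deployment", "monitoring"]},
}


def segments_of(item_tags: list[str]) -> list[str]:
    """Return which specialized stores a knowledge item belongs to."""
    return [name for name, rule in SEGMENTS.items() if set(rule["tags"]) & set(item_tags)]
```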
Thus, the elemental technologies for ALIS are in place. The key now is to practically try various methods—such as knowledge extraction know-how, appropriate knowledge selection, specialized knowledge segmentation, and how to utilize state memory—to discover effective approaches. Also, as complexity increases, processing time and LLM usage costs will rise, necessitating optimization.
These trial-and-error and optimization processes can be pursued adaptively through the development and improvement of frameworks.
Initially, the developers, as users, will likely incorporate frameworks into ALIS through trial and error. However, even then, the LLM itself can be made to generate framework ideas.
And by incorporating frameworks into ALIS that improve or discover frameworks based on the results received from the world and extracted knowledge, ALIS itself will perform trial-and-error and optimization adaptively.
ALIS in the Real World
Once ALIS has been refined to this stage, it should be capable of learning knowledge not only in the world of software development but broadly across various domains.
Similar to software development, ALIS is expected to expand its scope to various intellectual activities that humans perform using computers.
Even in such purely intellectual activities, ALIS has a character akin to that of an embodied AI with respect to the world it targets.
This is because it recognizes the boundary between itself and the world, acts upon the world through that boundary, and can perceive information received from the world.
What we generally refer to as a "body" is a boundary with the world that is physically visible and localized in one place.
However, even if the boundary is invisible and spatially distributed, the structure of perception and action through a boundary is the same as having a physical body.
In that sense, ALIS, when performing intellectual activities, can be considered to possess the nature of a virtually embodied AI.
And once ALIS is refined to a stage where it can appropriately learn even in new, unknown worlds, there is a possibility that ALIS can be integrated as part of a real embodied AI that possesses a physical body.
In this way, ALIS will eventually be applied to the real world and will begin to learn from it.