This article is in the Public Domain (CC0). Feel free to use it freely. CC0 1.0 Universal

Artificial Learning Intelligence System: The ALIS Concept

Here, I would like to lay out the Artificial Learning Intelligence System (ALIS): its concept, principles, basic design, and development methodology.

Concept

Current generative AI, primarily large language models, is trained through neural-network-based (largely self-supervised) learning.

As a learning process, we define this neural network learning as innate learning.

ALIS integrates an acquired learning process, separate from innate learning, to enable inference that combines both learning processes.

In this acquired learning, the learned knowledge is accumulated externally to the neural network and utilized during inference.

Therefore, the technical core of ALIS lies in extracting and storing reusable knowledge, and in selecting and utilizing it during inference.

Furthermore, ALIS is not merely a single elemental technology, but a system technology that combines innate learning and acquired learning.

Elements of a Learning Intelligence System

ALIS operates under the principle that both the existing innate learning and the acquired learning considered here follow the same framework of learning and inference.

To explain the principles of learning in ALIS, we define five elements of a learning intelligence system.

The first is the intellectual processor. This refers to a processing system that performs inference using knowledge and extracts knowledge for learning.

Large Language Models (LLMs) and parts of the human brain are prime examples of intellectual processors.

The second is the knowledge store. This refers to a storage location where extracted knowledge can be saved and retrieved as needed.

In LLMs, the knowledge store consists of the parameters of the neural network. In humans, it corresponds to long-term memory in the brain.

The third is the world. This refers to the external environment as perceived by a learning intelligence system, such as humans or ALIS.

For humans, the world is reality itself. In the case of LLMs, a mechanism that receives output from the LLM and provides feedback to it can be considered equivalent to the world.

The fourth is state memory. This refers to an internal temporary memory-like component used by a learning intelligence system during inference.

In LLMs, this is the memory space used during inference, known as hidden states. In humans, it corresponds to short-term memory.

The fifth is the framework. This is, so to speak, a thinking structure. In the terminology of learning intelligence systems, it refers to the criteria for selecting the knowledge needed during inference, together with the logical structure of the state space used to organize state memory.

In LLMs, it is the semantic structure of the hidden states, and its contents are generally ambiguous and incomprehensible to humans. Furthermore, knowledge selection is embedded in the attention mechanism, which selects which existing tokens to refer to for each token being processed.

In humans, as mentioned above, it is a thinking structure. When thinking using a specific framework, a particular set of know-how is recalled from long-term memory and loaded into short-term memory. Then, currently perceived information is organized according to the thinking framework to understand the situation.

Principles of a Learning Intelligence System

A learning intelligence system operates as follows:

An intellectual processor acts upon the world. The world, in response to this action, returns results.

The intellectual processor extracts reusable knowledge from these results and stores it in the knowledge store.

When acting upon the world iteratively, the intellectual processor selects knowledge from the knowledge store and uses it to modify its actions.

This is the basic mechanism.
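
As a minimal sketch of this loop, the following Python code (my own illustration, with toy stand-ins for every component) names the elements defined above and runs the extract, store, select, and use cycle:

```python
from dataclasses import dataclass, field


@dataclass
class KnowledgeStore:
    """Holds reusable knowledge extracted from past interactions."""
    items: list[str] = field(default_factory=list)

    def store(self, knowledge: str) -> None:
        self.items.append(knowledge)

    def select(self, situation: str) -> list[str]:
        # Naive selection: keep knowledge sharing a word with the situation.
        words = set(situation.lower().split())
        return [k for k in self.items if words & set(k.lower().split())]


class IntellectualProcessor:
    """Acts on the world, extracts knowledge from results, and reuses it."""

    def act(self, situation: str, knowledge: list[str]) -> str:
        hint = f" using {len(knowledge)} known item(s)" if knowledge else ""
        return f"action for {situation}{hint}"

    def extract(self, situation: str, result: str) -> str:
        # A real ALIS would ask an LLM to distill a reusable lesson here.
        return f"in situation {situation} the world responded {result}"


def world(action: str) -> str:
    """Toy stand-in for the external environment."""
    return f"outcome of [{action}]"


store = KnowledgeStore()
processor = IntellectualProcessor()
state_memory: list[str] = []  # temporary working state held during inference

for situation in ["deploy service", "deploy service again"]:
    knowledge = store.select(situation)            # select knowledge
    action = processor.act(situation, knowledge)   # use it to act on the world
    result = world(action)                         # the world returns a result
    state_memory.append(result)                    # keep the result in state memory
    store.store(processor.extract(situation, result))  # extract and store knowledge
    print(action)
```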

However, fundamentally, the methods of knowledge extraction, storage, selection, and utilization determine whether the system can perform meaningful learning.

Humans possess mechanisms that effectively handle this knowledge extraction, storage, selection, and utilization, enabling them to learn.

In neural networks, including LLMs, knowledge extraction is handled by an external teacher, while the mechanisms for storage, selection, and utilization are built in. This allows them to learn as long as a teacher is provided.

Furthermore, a learning intelligence system can also treat frameworks themselves as knowledge: it can learn how to extract, store, and select them, and how to use them within state memory, thereby enabling more complex learning.

Types of Knowledge

Based on these principles, when designing acquired learning, it is necessary to clarify what form the acquired knowledge will take.

One could consider a method where acquired knowledge is also learned separately as neural network parameters.

However, acquired knowledge does not have to be limited solely to neural network parameters. A practical candidate is knowledge textualized in natural language.

Knowledge textualized in natural language can be extracted and utilized by leveraging the natural language processing capabilities of LLMs. Furthermore, since it can be handled as data in standard IT systems, storage and selection are also easy.

Moreover, knowledge textualized in natural language is easy for humans and other LLMs to check, understand, and in some cases even edit.

It can also be shared with other learning intelligence systems, and merged or split as needed.

For these reasons, the acquired knowledge in the ALIS concept will initially be designed to target knowledge textualized in natural language.

Acquired State Memory and Frameworks

We have explained the advantages of selecting natural language text as the format for acquired knowledge.

Similarly, natural language text can also be used for the state memory and frameworks for inference.

Frameworks, as conceptual structures, can be stored and utilized in the knowledge store as knowledge textualized in natural language.

Even when initializing or updating states based on the structure defined by a framework, text-format state memory can be used.

By designing not only acquired knowledge but also frameworks and state memory to be in text format, ALIS can leverage the natural language processing capabilities of LLMs for acquired learning and inference in general.
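
As a small illustration of this all-text approach (the framework wording and helper below are my own examples, not part of any ALIS specification), a framework can be a plain-language checklist held in the knowledge store, and state memory can be text slots initialized from it:

```python
# A framework stored in the knowledge store as text: a thinking structure
# that names which aspects of a situation should be filled in.
framework_text = """Troubleshooting framework:
- Symptom: what is observed
- Hypothesis: suspected cause
- Evidence: facts supporting or refuting the hypothesis
- Next action: what to try next"""


def init_state_memory(framework: str) -> dict[str, str]:
    """Initialize text-format state memory with the slots the framework names."""
    slots = [line.split(":")[0].lstrip("- ").strip()
             for line in framework.splitlines() if line.startswith("-")]
    return {slot: "" for slot in slots}


state_memory = init_state_memory(framework_text)
state_memory["Symptom"] = "API call returns 404"  # filled in during inference

# Both the framework and the state are readable, editable text, so an LLM
# (or a human) can update them directly as the reasoning proceeds.
print(state_memory)
```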

Formal Knowledge

Acquired knowledge, frameworks, and state memory can be expressed not only in natural language text but also in more rigorous formal languages or formal models.

Although I wrote "select" above, the aim for ALIS is ultimately to incorporate multiple distinct acquired-learning mechanisms and to use innate and acquired learning in a hybrid fashion.

Knowledge represented by formal languages or formal models can be made more precise and unambiguous.

Furthermore, if a framework is expressed in a formal language or model and an initial state is expanded into state memory, a simulation or rigorous logical development can be performed by an intellectual processor capable of handling formal models, rather than by an LLM.

A prime example of such formal languages or formal models is programming languages.

As the system learns about the world, if it can express the laws and concepts found therein as a program in a framework, then it can simulate them on a computer.
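
For instance, a simple law of motion, once expressed as executable formal knowledge, can be run directly rather than reasoned about only in prose (a toy illustration of my own):

```python
def fall_distance(t: float, g: float = 9.8) -> float:
    """Free-fall law d = 1/2 * g * t^2, expressed as formal, executable knowledge."""
    return 0.5 * g * t * t


# Rigorous logical development by simulation rather than by an LLM:
for t in (1.0, 2.0, 3.0):
    print(f"after {t:.0f}s an object has fallen about {fall_distance(t):.1f} m")
```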

Column 1: Types of Knowledge

When organizing the knowledge within a learning intelligence system, it becomes clear that it can be broadly categorized into three types of knowledge systems and two types of state.

The three knowledge systems are: network parameter knowledge, handled by neural networks; natural knowledge, expressed in natural language; and formal knowledge, expressed in formal languages.

The two types of state are stateless and stateful.

Stateless network parameter knowledge is intuitive knowledge, like that found in deep learning AI. The features of cats and dogs, which cannot be explicitly thought about or verbally identified, can be learned as stateless network parameter knowledge.

Stateful network parameter knowledge is knowledge that emerges through fuzzy, iterative processes, such as in generative AI.

Stateless natural knowledge is knowledge like meanings tied to individual words.

Stateful natural knowledge is knowledge that includes context within sentences.

Some natural knowledge is innately included in stateful network parameter knowledge, but there is also knowledge that can be acquired from natural language text.

Stateless formal knowledge is knowledge that can be expressed in mathematical formulas without iteration. Stateful formal knowledge is knowledge that can be expressed as a program.

Humans can also use their own short-term memory as state memory for natural knowledge and formal knowledge.

However, as it is short-term memory, there is a problem that it is difficult to stably maintain a state. Furthermore, it is not adept at holding formalized, unambiguous states.

On the other hand, paper, computers, and smartphones can be used as state memory to write down or edit natural language text, formal languages, or formal models.

Generally, data on paper or computers is often perceived as a knowledge store for memorizing knowledge, but it can also be used as state memory for organizing thoughts.

Thus, it is evident that humans perform intellectual activities by making full use of these three knowledge systems and two types of state.

ALIS, too, holds the potential to dramatically enhance its capabilities by enabling and strengthening intellectual activities that leverage these same three knowledge systems and two types of state.

In particular, ALIS has the strength of being able to utilize vast knowledge stores and state memory. Furthermore, it can easily perform intellectual tasks by preparing many of each and switching or combining them.

Column 2: Intelligent Orchestration

While being able to accumulate a vast amount of knowledge in a knowledge store is an advantage, sheer quantity does not directly translate into better intellectual performance: a generative AI can only process a limited number of tokens at once, and irrelevant knowledge introduces noise.

Conversely, by appropriately dividing the knowledge store and transforming it into high-density specialized knowledge stores, each containing knowledge necessary for a specific intellectual task, the problems of token limits and noise can be mitigated.

In exchange, each specialized knowledge store becomes usable only for its designated intellectual task.

Many intellectual activities are complex composites of various intellectual tasks. Therefore, by dividing knowledge into specialized knowledge stores according to the type of intellectual task and subdividing intellectual activity into individual tasks, ALIS can execute the entire intellectual activity by appropriately switching between these specialized knowledge stores.

This is analogous to an orchestra, composed of professional musicians playing different instruments and a conductor leading the ensemble.

Through this system technology, which we call intelligent orchestration, ALIS will be able to organize its intellectual activities.
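
A minimal sketch of such a router, with hypothetical store names and naive matching of my own invention, might look like this:

```python
# The knowledge store is split into small, high-density specialized stores,
# and a router picks the store matching the current intellectual task before
# the task is handed to the LLM.
specialized_stores: dict[str, list[str]] = {
    "api_usage":   ["the payments API v2 requires an Idempotency-Key header"],
    "code_review": ["prefer early returns over deeply nested conditionals"],
    "testing":     ["reproduce a bug with a failing test before fixing it"],
}


def route(task_description: str) -> list[str]:
    """Pick the specialized store whose name best matches the task (naive matching)."""
    text = task_description.lower()
    for name, store in specialized_stores.items():
        if name.replace("_", " ") in text or name.split("_")[0] in text:
            return store
    return []  # fall back to no extra knowledge rather than injecting noise


# Only the relevant, high-density knowledge is passed along, which keeps the
# prompt small and avoids noise from unrelated stores.
print(route("write a testing plan for the checkout flow"))
```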

ALIS Basic Design and Development Method

From here, we will lay out how ALIS is designed and developed.

As already discussed in the principles and columns, ALIS is inherently designed for easy expansion of functions and resources. This is because the essence of ALIS lies not in specific functions, but in the processes of knowledge extraction, storage, selection, and utilization.

For example, multiple types of knowledge extraction mechanisms can be provided, and the system design allows for free choice to select from them or use them simultaneously.

Furthermore, ALIS itself can be made to perform this selection.

Similarly, storage, selection, and utilization can also be freely chosen or parallelized.

Therefore, ALIS can be developed incrementally and agilely, without needing to design the entire functionality in a waterfall manner.

The Beginning of ALIS

Now, let's design a very simple ALIS.

The basic UI will be a familiar chat AI. Initially, user input is passed directly to the LLM. The LLM's response is displayed on the UI, and the system waits for the next user input.

Upon receiving the next input, the LLM is provided with not only the new input but also the entire chat history between the user and the LLM.
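
A minimal sketch of this loop follows; `llm_complete` is a hypothetical placeholder for whatever LLM API is actually used, and the point is only that every turn passes the full history back to the model:

```python
def llm_complete(messages: list[dict[str, str]]) -> str:
    """Placeholder for a real LLM call (e.g. a chat-completion API)."""
    return f"(response to: {messages[-1]['content']})"


history: list[dict[str, str]] = []


def chat_turn(user_input: str) -> str:
    history.append({"role": "user", "content": user_input})
    reply = llm_complete(history)  # the full history, not just the new input
    history.append({"role": "assistant", "content": reply})
    return reply


print(chat_turn("Hello"))
print(chat_turn("What did I just say?"))  # the model sees the whole history
```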

Behind the UI of this chat AI, a mechanism is prepared to extract reusable knowledge from the chat history.

This mechanism can be added to the chat AI system as a process that runs when a conversation ends or at regular intervals. Of course, an LLM is used for knowledge extraction.

This LLM is provided with the ALIS concept and principles, along with knowledge extraction know-how, as a system prompt. If knowledge is not extracted as intended, the system prompt should be improved through trial and error.

The knowledge extracted from the chat history is stored directly in a knowledge lake. A knowledge lake is simply a mechanism for storing knowledge in a flat state before it is structured.
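
A sketch of this extraction step might look as follows, assuming an LLM call (`llm_complete` is again a placeholder, and the prompt wording is my own) and a flat JSONL file as the knowledge lake:

```python
import json

EXTRACTION_SYSTEM_PROMPT = (
    "You are the knowledge-extraction component of ALIS. "
    "From the following chat history, list each piece of reusable knowledge "
    "as one plain-language sentence per line. Output nothing else."
)


def llm_complete(system: str, user: str) -> str:
    """Placeholder for a real LLM call."""
    return "The payments API v2 requires an Idempotency-Key header."


def extract_and_store(chat_history: str, lake_path: str = "knowledge_lake.jsonl") -> None:
    """Extract knowledge from a chat thread and append it, unstructured, to the lake."""
    raw = llm_complete(EXTRACTION_SYSTEM_PROMPT, chat_history)
    with open(lake_path, "a", encoding="utf-8") as lake:
        for line in filter(None, (l.strip() for l in raw.splitlines())):
            lake.write(json.dumps({"knowledge": line, "source": "chat"}) + "\n")


extract_and_store("user: the old endpoint 404s\nassistant: v2 needs an Idempotency-Key header")
```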

Next, a structuring mechanism is prepared to make it easier to select knowledge from the knowledge lake.

This involves providing an embedding vector store for semantic search, as used in typical RAG, and keyword indexes.

Other possibilities include generating more advanced knowledge graphs or performing category classification.

This collection of structured information over the knowledge lake will be called the knowledge base. Together, the knowledge base and the knowledge lake constitute the knowledge store.
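
One simple structuring mechanism, sketched below with my own helper names and assuming the lake file written in the previous sketch, is an inverted keyword index over the lake; a real knowledge base would likely also hold an embedding vector index for semantic search, which is omitted here to keep the example dependency-free:

```python
import json
from collections import defaultdict


def build_keyword_index(lake_path: str = "knowledge_lake.jsonl") -> dict[str, set[int]]:
    """Map each lowercase word to the set of lake entries (by line number) containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    with open(lake_path, encoding="utf-8") as lake:
        for i, line in enumerate(lake):
            text = json.loads(line)["knowledge"].lower()
            for word in text.split():
                index[word.strip(".,:;")].add(i)
    return index


# The knowledge lake plus structures like this index form the knowledge store.
index = build_keyword_index()
print(sorted(index.get("idempotency-key", set())))
```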

Next, the knowledge store is integrated into the chat UI's processing.

This is basically the same as a general RAG mechanism. For user input, relevant knowledge is selected from the knowledge store and passed to the LLM along with the user input.

This allows the LLM to automatically utilize knowledge when processing user input.
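
A sketch of this selection-and-augmentation step, using the same naive keyword matching as above (illustrative only), shows how the selected knowledge is prepended to the user input before the LLM call:

```python
def build_augmented_prompt(user_input: str, lake_entries: list[str], limit: int = 5) -> str:
    """Select lake entries sharing a keyword with the input and prepend them to it."""
    words = {w.strip(".,:;") for w in user_input.lower().split()}
    relevant = [k for k in lake_entries
                if words & {w.strip(".,:;") for w in k.lower().split()}][:limit]
    knowledge_block = "\n".join(f"- {k}" for k in relevant) or "- (none found)"
    return f"Relevant knowledge:\n{knowledge_block}\n\nUser input:\n{user_input}"


lake_entries = ["The payments API v2 requires an Idempotency-Key header."]
print(build_augmented_prompt("call the payments API", lake_entries))
# The resulting prompt is passed to the LLM together with the chat history,
# exactly as in the plain chat loop above.
```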

This way, knowledge increases with each conversation with the user, enabling a simple ALIS that utilizes accumulated knowledge from past conversations.

Simple Scenario

For example, imagine a scenario where a user is developing a web application using this simple ALIS.

The user would report that the code proposed by the LLM resulted in an error. Then, the user and the LLM would collaborate to troubleshoot the problem. Let's say they discover that the external API specification the LLM was aware of was outdated, and adapting to the latest API specification resolved the issue.

In this case, knowledge that the LLM's API specification was old and what the latest API specification is could be accumulated in the knowledge store from this chat thread.

Then, when creating a program that uses the same API next time, ALIS could leverage this knowledge to generate a program based on the latest API specification from the outset.

Improving the Initial ALIS

However, for this to happen, that knowledge needs to be selected in response to the user's input, and it may not be directly linked to that input: the name of the problematic API is unlikely to appear in the initial request.

In such a case, the API name would only emerge for the first time in the LLM's response.

Therefore, we will slightly extend the simple ALIS by adding a mechanism for pre-check comments and post-check comments.

Pre-check comments are similar to the recent "thought mode" in LLMs. We prepare a memory that can hold text as state memory, and instruct the LLM via a system prompt to perform pre-check comments upon receiving user input.

The LLM's pre-check comment result is then placed in state memory, and based on this result, knowledge is selected from the knowledge store.

Then, the chat history, pre-check comment result, knowledge corresponding to user input, and knowledge corresponding to the pre-check comment result are passed to the LLM to receive its output.

Furthermore, the knowledge store is searched again using the result returned by the LLM, and the LLM is asked to perform a post-check that takes any knowledge found into account.

If any issues are found, they are passed back to the chat LLM together with the specific problem points and the reasons for flagging them.

By providing opportunities to select knowledge during both pre-check comments and post-check comments, we can increase the chances of utilizing the accumulated knowledge.
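
Putting the extension together, a sketch of one extended turn (prompts and helper names are hypothetical) might run the pre-check, select knowledge twice, answer, and then post-check:

```python
def llm(prompt: str) -> str:
    """Placeholder for a real LLM call."""
    return f"(llm output for: {prompt[:40]}...)"


def select_knowledge(text: str) -> list[str]:
    """Placeholder for a knowledge-store lookup keyed on arbitrary text."""
    return []


def extended_turn(user_input: str, chat_history: str) -> str:
    # 1. Pre-check comment: let the LLM think about the input first, and keep
    #    the result in text-format state memory.
    state_memory = llm(f"Before answering, note what this request will involve:\n{user_input}")

    # 2. Select knowledge both for the raw input and for the pre-check result,
    #    so knowledge keyed on terms that only surface during thinking (such as
    #    an API name) can still be found.
    knowledge = select_knowledge(user_input) + select_knowledge(state_memory)

    # 3. Main answer, given history, pre-check result, and selected knowledge.
    answer = llm(f"{chat_history}\n{state_memory}\n{knowledge}\n{user_input}")

    # 4. Post-check: search the store again using the answer itself, then ask
    #    the LLM to review the answer with that knowledge in hand.
    review = llm(f"Check this answer against: {select_knowledge(answer)}\n{answer}")
    if "issue" in review.lower():
        answer = llm(f"Revise, addressing these points and reasons:\n{review}\n{answer}")
    return answer


print(extended_turn("add a payment endpoint", "(previous chat history)"))
```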

Outlook

The process of creating the initial ALIS and adding improvements to address its weaknesses is precisely agile development, demonstrating that ALIS can be incrementally enhanced.

Furthermore, as exemplified, the initial ALIS is most suitable for use in software development. This is because it is a high-demand field and one where knowledge can be clearly accumulated.

It is a domain where outcomes are unambiguous, yet it necessitates and benefits significantly from trial-and-error, iterative knowledge accumulation.

Additionally, since ALIS development itself is software development, the fact that ALIS developers can also be ALIS users is an attractive aspect.

Moreover, along with the ALIS system, the knowledge lake can be openly shared on platforms like GitHub.

This would allow many individuals to contribute to the improvement of the ALIS system and the accumulation of knowledge, with everyone enjoying the benefits and further accelerating ALIS development efficiently.

Of course, knowledge sharing is not limited to ALIS developers; it can be gathered from all software developers using ALIS.

The natural language nature of knowledge offers two additional advantages.

The first advantage is that knowledge can still be utilized even when LLM models change or are updated.

The second advantage is that the vast accumulated knowledge lake can be used as a training dataset for LLMs. There are two ways to use it: for fine-tuning, or for LLM pre-training itself.

In any case, if an LLM that has innately learned from the knowledge accumulated in the knowledge lake can be utilized, software development will become even more efficient.

Furthermore, software development involves various processes such as requirements analysis, design, implementation, testing, operation, and maintenance. Specialized knowledge also exists for each software domain and platform. By creating a mechanism to divide the vast amount of accumulated knowledge from these perspectives, an ALIS orchestra can be formed.

Thus, the elemental technologies for ALIS are in place. The remaining crucial step is to practically experiment with various methods—such as knowledge extraction know-how, appropriate knowledge selection, specialized knowledge segmentation, and state memory utilization—to discover effective approaches. As complexity increases, processing time and LLM usage costs will also rise, necessitating optimization.

These trial-and-error processes and optimizations can be advanced in a learning-oriented manner through the development and refinement of frameworks.

Initially, developers, as users, will likely integrate frameworks into ALIS through trial and error. However, even then, the LLM itself can be tasked with generating framework ideas.

Then, by incorporating a framework for improving and discovering frameworks into ALIS, based on results received from the world and extracted knowledge, ALIS itself will perform trial-and-error and optimization in a learning-driven manner.

ALIS in the Real World

Once ALIS has been refined to this stage, it should be capable of acquiring knowledge in a wide variety of domains, not just limited to the world of software development.

Similar to software development, ALIS is expected to expand its scope of application to various intellectual activities that humans perform using computers.

Even in such purely intellectual activities, ALIS will possess a quality akin to an embodied AI in relation to its target world.

This is because it recognizes the boundary between itself and the world, acts upon the world through that boundary, and can perceive information received from the world.

When this boundary with the world is physically visible and localized in one place, we generally refer to it as a body.

However, even if the boundary is invisible and spatially distributed, the structure of perception and action through a boundary remains the same as when possessing a physical body.

In this sense, an ALIS performing intellectual activities can be considered to virtually possess the characteristics of an embodied AI.

And, if ALIS is refined to a stage where it can appropriately learn even in new, unknown worlds, there is a possibility that ALIS could be incorporated as part of a real embodied AI that possesses an actual physical body.

In this way, ALIS will eventually be applied to the real world and begin to learn from it.