Today's generative AI is a technology that stands out because of the invention of the Transformer. It was a major breakthrough.
What makes the Transformer special is what is called the Attention Mechanism. This is clear from the title of the paper that introduced the Transformer: "Attention Is All You Need."
The reason is that, at the time, AI researchers were doing their best through trial and error to make AI handle natural language as well as humans do. Methods that worked well were given names and published in papers.
Many researchers believed that by combining the various mechanisms that each worked well in its own way, they could gradually build AI that handles natural language like a human. So the focus was on finding new mechanisms that could work alongside the existing ones, and on finding the best combinations of all these mechanisms.
The Transformer overturned this way of thinking. The message in the paper's title is clear: there is no need to combine many mechanisms; the Attention Mechanism alone is all you need.
Of course, the Transformer itself contains several mechanisms, but there is no doubt that among them the Attention Mechanism is the groundbreaking one that stands out.
What the Attention Mechanism Is
The Attention Mechanism is a system that lets an AI learn which of the many words in the preceding text it should focus on when processing a specific word in natural language.
This helps the AI understand what a word refers to. For example, it handles words like "this," "that," or "the aforementioned" (which point back to a word in an earlier sentence), and positional phrases like "the opening sentence," "the second example listed," or "the preceding paragraph."
Beyond that, it can correctly relate modifying words even when they are far apart in a sentence. And even in long texts, it can interpret the current word without losing the context that word refers to, so it does not get lost among the other sentences.
This is the usefulness of "attention."
Conversely, this means that when the word currently being processed is interpreted, words that are not necessary are hidden and excluded from the interpretation.
By keeping only the words essential for understanding a particular word and discarding the irrelevant ones, the set of words to be considered stays small no matter how long the text is. This keeps the understanding from becoming too shallow.
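As a rough illustration, here is a minimal NumPy sketch of the scaled dot-product attention described in "Attention Is All You Need"; the toy example and variable names are my own, not any particular model's exact implementation:

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention, as in "Attention Is All You Need".

    Q, K: (seq_len, d_k) query/key vectors; V: (seq_len, d_v) values.
    Each output row is a weighted mix of V, where the weights say how
    strongly the current word attends to every word in the context.
    """
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                 # pairwise relevance
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the context
    return weights @ V                              # focus on high-weight words

# Toy example: 3 "words" with 4-dimensional embeddings, attending to themselves.
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
print(attention(x, x, x))
```

Irrelevant words end up with weights near zero, which is exactly the "hiding" described above: no matter how long the context is, only a handful of words dominate each weighted mix.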
Virtual Intelligence
Now, let me change the subject a little. I have been thinking about an idea I call virtual intelligence.
At present, when generative AI is used in business, if you gather all of a company's information into a single knowledge base and hand it to the AI, the sheer amount of knowledge can confuse it. The AI then cannot process the knowledge properly.
Because of this, it is better to separate the knowledge by task. You can prepare an AI chat for each task, or build AI tools that specialize in specific operations. This approach works better.
So, for complex tasks, you need to combine these AI chats or AI tools, since each one holds its own separate knowledge.
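As an entirely hypothetical sketch of this split, each tool below sees only its own small knowledge base, and a complex job chains the specialized tools; the task names, knowledge strings, and `run_task` helper are illustrative assumptions, not a real API:

```python
# Hypothetical sketch: each task gets its own small knowledge base,
# and a complex job chains the specialized tools one step at a time.
KNOWLEDGE_BASES = {
    "billing": "Invoices are issued on the 1st; refunds take 5 days.",
    "support": "First ask the user to restart, then collect logs.",
}

def run_task(task: str, question: str) -> str:
    """Build a prompt from only the knowledge this task needs."""
    knowledge = KNOWLEDGE_BASES[task]       # small, task-specific context
    return f"Knowledge:\n{knowledge}\n\nQuestion: {question}"

# A complex request is handled by combining the specialized tools.
for task, question in [("billing", "When is my invoice issued?"),
                       ("support", "The app crashes on start.")]:
    print(run_task(task, question))         # would be sent to the model
```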
Although this is a limitation of current generative AI, fundamentally, even future generative AI will be more accurate if it focuses only on the knowledge needed for a specific task.
Rather, I believe future generative AI will internally distinguish and use the necessary knowledge according to the situation, without humans having to partition that knowledge for it.
This ability is what I call virtual intelligence. It is like a virtual machine, which can run different operating systems on a single computer: within a single intelligence, multiple virtual intelligences with different specializations can operate.
Even today's generative AI can already simulate discussions among several people or generate stories with many characters. So virtual intelligence is not some special ability; it is just an extension of current generative AI.
Micro Virtual Intelligence
The way virtual intelligence works, selecting only the knowledge needed for a particular task, is similar to what the Attention Mechanism does.
To put it another way, it resembles the Attention Mechanism in that it focuses on and processes only the knowledge relevant to the task at hand.
Conversely, you could say the Attention Mechanism is one way of achieving something like virtual intelligence. But while the virtual intelligence I have in mind selects relevant knowledge from a large body of knowledge, the Attention Mechanism operates on groups of words.
For this reason, we might call the Attention Mechanism a Micro Virtual Intelligence.
Explicit Attention Mechanism
If we see the Attention Mechanism as a miniature virtual intelligence, then the virtual intelligence I described earlier could be realized by building a large-scale attention mechanism.
And this large-scale attention mechanism need not be added to the internal structure of large language models or involve neural-network training.
It can simply be an explicit statement written in ordinary language, such as: "When performing Task A, consult Knowledge B and Knowledge C."
This makes explicit which knowledge Task A requires. The statement itself is a piece of knowledge.
We can call this an Explicit Attention Mechanism. The statement is then Attention Knowledge: knowledge that states explicitly which knowledge to focus on when performing Task A.
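A minimal sketch of what such Attention Knowledge might look like in practice; the dictionary, task names, and `knowledge_for` helper are hypothetical illustrations:

```python
# Hypothetical sketch: attention knowledge as plain, inspectable data.
# The statement "When performing Task A, consult Knowledge B and C"
# becomes an entry the system reads before it starts the task.
attention_knowledge = {
    "Task A": ["Knowledge B", "Knowledge C"],
}

def knowledge_for(task: str) -> list[str]:
    """Return the knowledge sources to focus on for this task."""
    return attention_knowledge.get(task, [])

print(knowledge_for("Task A"))   # ['Knowledge B', 'Knowledge C']
```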
Moreover, generative AI itself can create or update this Attention Knowledge.
If a task fails because of insufficient knowledge, the Attention Knowledge can be updated, based on what happened, so that additional knowledge is consulted for that task.
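Extending the hypothetical sketch above, the update step might look like this; `widen_focus` and "Knowledge D" are illustrative assumptions:

```python
# Extending the sketch above: Task A failed because "Knowledge D" was
# never consulted, so the attention knowledge is widened for next time.
attention_knowledge = {"Task A": ["Knowledge B", "Knowledge C"]}

def widen_focus(task: str, missing_source: str) -> None:
    """Record that this task should also consult `missing_source`."""
    sources = attention_knowledge.setdefault(task, [])
    if missing_source not in sources:
        sources.append(missing_source)

widen_focus("Task A", "Knowledge D")   # hypothetical missing knowledge
print(attention_knowledge["Task A"])   # now includes Knowledge D
```

Because the entry is plain text, either a human or the model itself can make this edit, which is what makes the mechanism "explicit."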
Conclusion
The Attention Mechanism has greatly increased the power of generative AI.
It is not merely a mechanism that happened to work well. Rather, as we have seen here, its way of dynamically narrowing the information that must be consulted in each situation may be the very core of advanced intelligence.
And, as with virtual intelligence and explicit Attention Knowledge, the Attention Mechanism may also be the key to continually improving intelligence at many different levels.