Another obstacle for the foundation model of artificial intelligence

This is the second column I am writing about the limitations of the ‘Foundation Model’. A few months ago, research around Artificial Intelligence (AI) models took a new turn with the long-awaited release of foundation models such as GPT-3, DALL-E and BERT. These models try to map and include almost every document available on the Internet, in what can only be described as a brute-force attempt to provide a global repository on which future AI programs can base their own ‘training’ data.

The idea behind such foundation models, funded by the world’s largest technology companies such as Microsoft and Google, is that they can form the basis for all kinds of AI applications. Most importantly, they differ from other cognitive models that use smaller data sets to train AI systems: foundation models are built by scouring almost every piece of information available on the web, a data store that is already vast and doubling in size every two years or so.

In AI models that do not use such foundation models, the program is trained on data that is very specific to the task at hand – such as the analysis of an electrocardiogram (ECG) for evidence of a heart attack. In these cases, pattern recognition is what matters. Training an AI program to look for specific patterns in a data-set that certainly contains examples of such patterns is an easier task than training it on a global data-set that contains all kinds of information.

Simply put, if I were to write a computer program to read an ECG and identify a potential heart attack, I would not feed its training model every bit of data available on the Internet. Instead, I would feed it as many ECG readings as I could, and only these. I would not add stock-market charts or Shakespeare’s sonnets to the feed. As the number of ECG reports in my training model increases, I should in theory be able to improve the accuracy of my pattern-recognition model to the point where it is better than an experienced doctor looking at the same ECG data and drawing conclusions. Using a widely fed foundation model for this important (but still limited) cognitive task would be a clear case of overkill.
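To make the distinction concrete, here is a minimal, purely illustrative sketch of that narrow approach: a classifier trained on ECG-like data and nothing else. The data below is synthetic and the variable names are hypothetical stand-ins; a real system would use curated, labelled ECG recordings and far more careful feature engineering.

```python
# Illustrative sketch only: a task-specific pattern-recognition model
# trained exclusively on ECG-like data, as described in the column.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Stand-in for digitised ECG traces: 1,000 "readings" of 500 samples each,
# with a binary label (1 = features consistent with a heart attack).
X = rng.normal(size=(1000, 500))
y = rng.integers(0, 2, size=1000)

# The training feed contains ECG readings and only ECG readings --
# no stock-market charts, no Shakespeare.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

print("held-out accuracy:", model.score(X_test, y_test))
```

Because the data here is random noise, the printed accuracy hovers around chance; the point of the sketch is only the shape of the pipeline: a narrow, task-specific feed rather than the entire Internet.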

I wrote in the last instalment of IT Matters that these foundation models for AI still have new peaks to conquer, and the ECG scenario above provides one example: foundation models built on essentially all the knowledge available on the Internet will struggle to deliver pinpoint accuracy at the repetitive but important task of reading ECGs correctly. One of those peaks is contextual awareness.

Contextual awareness embodies all the subtle nuances of human learning – it is the ‘who’, ‘why’, ‘when’ and ‘what’ that inform human decisions and behaviour.

For example, if an ECG is taken soon after a patient has entered the hospital complaining of severe chest pain, the context is very different from that of a reading taken during a routine health check-up of a perfectly healthy person. Foundation models face a challenge in getting this context right.

Apart from context, these large models also lack ‘world views’. The key to obtaining a world-view model is a decoupling between building the building blocks of the world model and their subsequent use in simulating possible outcomes. In simple terms, a foundation model will have access to all kinds of graphs and charts – from stock-market charts and scientific and engineering graphs to sine-wave amplitude plots and so on – but all those graphs and charts cannot simply be thrown into a specific simulation that looks for possible outcomes, such as a crash in the stock market.

Humans and truly intelligent machines need to use world models or ‘world views’ to make sense of observations and to assess possible futures in order to select the best course of action. In the example above, we would exclude all charts that are not related to the stock market.
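The sketch below is a toy illustration of that ‘down-scaling’ step, under invented assumptions: a hypothetical corpus of mixed time series tagged by domain, from which only the stock-market series are kept before running a crude simulation. The tags, names and the drawdown measure are stand-ins for illustration, not a description of how any production foundation model works.

```python
# Toy illustration of down-scaling a 'world' of mixed data to the task at hand.
import numpy as np

rng = np.random.default_rng(1)

# A mixed pile of time series, each tagged with the domain it came from.
corpus = {
    "acme_stock_prices": ("stock-market", 100 + np.cumsum(rng.normal(0, 1, 250))),
    "patient_ecg_trace": ("ecg", rng.normal(0, 1, 250)),
    "sine_wave_demo": ("signal", np.sin(np.linspace(0, 20, 250))),
}

# Down-scale: keep only the series relevant to the stock-market question.
relevant = {
    name: series
    for name, (tag, series) in corpus.items()
    if tag == "stock-market"
}

def max_drawdown(prices: np.ndarray) -> float:
    """Largest peak-to-trough fall, as a fraction of the running peak."""
    running_peak = np.maximum.accumulate(prices)
    return float(np.max((running_peak - prices) / running_peak))

# A very rough stand-in for 'simulating a crash': measure the worst drawdown
# in the down-scaled data only, ignoring the ECGs and sine waves entirely.
for name, series in relevant.items():
    print(name, "worst drawdown:", round(max_drawdown(series), 3))
```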

Direct interaction in a specialised setting that involves multiple actors effectively down-scales the world model: it must be shifted away from a general, large-scale setting (such as answering web-search queries or looking up the meaning of a word in a dictionary) and adapted to the task at hand. Such a differentiated, modular (meaning smaller-scale) and optimised approach is logically distinct from – and far less complex than – an architecture that tries to simulate and reason about everything in a single ‘input-output function’ pass, which is where a foundation model would have to operate.

As human beings, you and I do not answer questions without reference to this world view. In both our cases, our responses are informed by many years of specialised schooling as well as real-world experience, and we deal with a question or a situation by taking small, specific components from this body of learning and applying them precisely to the task at hand. We don’t trawl through every single piece of information our brains carry; instead, we use our world views to refine (or, in computer-programming speak, down-scale) what we know into a response to the question or situation before us.

An AI program designed for a specific task on top of a foundation model would be useless unless it is given the ability to refine its ‘thinking’, or to use a world view to carve out the relevant components, as we humans do. Throwing zettabytes of training data at it won’t work.

Siddharth Pai is the co-founder of Siana Capital and the author of ‘Techproof Me’.
