MIT researchers have developed a new method to help artificial intelligence (AI) systems conduct complex reasoning tasks in three areas: coding, strategic planning and robotics.
Large language models (LLMs), which include ChatGPT and Claude 3 Opus, process and generate text based on human input, known as “prompts.” These technologies have improved greatly in the last 18 months, but are constrained by their inability to understand context as well as humans or perform well in reasoning tasks, the researchers said.
But MIT scientists now claim to have cracked this problem by creating “a treasure trove” of natural language “abstractions” that could lead to more powerful AI models. Abstractions turn complex subjects into high-level characterizations and omit unimportant information, which could help chatbots reason, learn, perceive, and represent knowledge just like humans.
The scientists argue that LLMs currently have difficulty abstracting information in a human-like way. To address this, they have organized natural language abstractions into three libraries in the hope that LLMs will gain greater contextual awareness and give more human-like responses.
The scientists detailed their findings in three papers published on the arXiv pre-print server on Oct. 30, 2023; Dec. 13, 2023; and Feb. 28. The first library, called “Library Induction from Language Observations” (LILO), synthesizes, compresses, and documents computer code. The second, named “Action Domain Acquisition” (Ada), covers AI sequential decision making. The final framework, dubbed “Language-Guided Abstraction” (LGA), helps robots better understand environments and plan their movements.
These papers explore how language can give AI systems important context so they can handle more complex tasks. They were presented May 11 at the International Conference on Learning Representations in Vienna, Austria.
“Library learning represents one of the most exciting frontiers in artificial intelligence, offering a path towards discovering and reasoning over compositional abstractions,” said Robert Hawkins, assistant professor of psychology at the University of Wisconsin-Madison, in a statement. Hawkins, who was not involved with the research, added that similar attempts in the past were too computationally expensive to use at scale.
The scientists said all three library frameworks use neurosymbolic methods, an AI architecture combining neural networks (collections of machine learning algorithms arranged to mimic the structure of the human brain) with classical program-like logical approaches.
Smarter AI-driven coding
LLM-powered tools such as GitHub Copilot have emerged as powerful aids for human software engineers, but they cannot be used to create full-scale software libraries, the scientists said. To do this, they must be able to sort and integrate code into smaller programs that are easier to read and reuse, which is where LILO comes in.
The scientists combined a previously developed algorithm that can detect abstractions, known as “Stitch,” with LLMs to form the LILO neurosymbolic framework. Under this regime, an LLM first writes code and is then paired with Stitch to locate abstractions within the library.
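As a rough illustration of what "locating an abstraction" can mean, the Python sketch below counts sub-expressions that recur across a few toy programs and picks the most reused one as a candidate library function. It is not Stitch or LILO: the tuple-based program format, the function names and the ranking rule are all assumptions made for this example.

```python
from collections import Counter


def subtrees(expr):
    """Yield every compound sub-expression of a tuple-based expression tree.

    Assumed toy format: (operator, child, child, ...), with strings as leaves.
    """
    if isinstance(expr, tuple):
        yield expr
        for child in expr[1:]:
            yield from subtrees(child)


def most_reused_subtree(programs):
    """Return the compound sub-expression that recurs most often, or None."""
    counts = Counter(t for p in programs for t in subtrees(p))
    # Only sub-expressions used more than once are worth naming.
    repeated = {t: n for t, n in counts.items() if n > 1}
    if not repeated:
        return None
    # Rank by frequency, then by size: a crude stand-in for compression gain.
    return max(repeated, key=lambda t: (repeated[t], len(str(t))))


# Three hypothetical programs that all strip vowels before doing something else.
programs = [
    ("upper", ("filter", "not_vowel", "s")),
    ("reverse", ("filter", "not_vowel", "s")),
    ("length", ("filter", "not_vowel", "s")),
]

candidate = most_reused_subtree(programs)
print(candidate)
```

Here the shared vowel-stripping step, `("filter", "not_vowel", "s")`, is the only sub-expression that appears in every program, so it is the one the sketch would propose naming and reusing.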
Because LILO can understand natural language, it can detect and omit vowels from strings of code and draw snowflakes — just like a human software engineer could by leveraging their common sense. By better understanding the words used in prompts, LLMs could one day draw 2D graphics, answer questions related to visuals, manipulate Excel documents, and more.
Using AI to plan and strategize
LLMs cannot currently use reasoning skills to create flexible plans — like the steps involved in cooking breakfast, the researchers said. But the Ada framework, named after the English mathematician Ada Lovelace, might be one way to let them adapt and plan when given these types of assignments in, say, virtual environments.
The framework provided libraries of cooking and gaming plans by using an LLM to find abstractions from natural language datasets related to these tasks, with the best ones scored, filtered and added to a library by a human operator. By combining OpenAI’s GPT-4 with the framework, the scientists beat the AI decision-making baseline “Code as Policies” at performing kitchen simulation and gaming tasks.
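The score-and-filter step can be pictured with a short Python sketch. Everything in it is hypothetical: the `CandidateAction` class, the success-rate score, the 0.5 threshold and the example actions are illustrative assumptions, not details taken from the Ada paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateAction:
    """A proposed action abstraction plus how it fared in trial plans (assumed format)."""
    name: str
    successes: int
    attempts: int

    @property
    def score(self) -> float:
        # Success rate; an action that was never tried scores zero.
        return self.successes / self.attempts if self.attempts else 0.0


def build_library(candidates, min_score=0.5):
    """Keep candidates at or above min_score, best first."""
    kept = [c for c in candidates if c.score >= min_score]
    return sorted(kept, key=lambda c: c.score, reverse=True)


# Hypothetical candidates, as if proposed by a language model.
candidates = [
    CandidateAction("open_cupboard", successes=3, attempts=4),
    CandidateAction("stack_plates_in_sink", successes=1, attempts=5),
    CandidateAction("chill_wine", successes=4, attempts=5),
    CandidateAction("unused_action", successes=0, attempts=0),
]

library = build_library(candidates)
print([c.name for c in library])
```

With these made-up numbers, only `chill_wine` and `open_cupboard` clear the threshold and enter the library; the low-scoring and untested candidates are dropped.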
By finding hidden natural language information, the model understood tasks like putting chilled wine in a kitchen cupboard and building a bed — with accuracy improvements of 59% and 89%, respectively, compared to carrying out the same tasks without Ada’s influence. The researchers hope to find other domestic uses for Ada in the foreseeable future.
Giving robots an AI-assisted leg up
The LGA framework also allows robots to understand their environments more as humans do, removing unnecessary details from their surroundings and finding better abstractions so they can perform tasks more effectively.
LGA finds task abstractions in natural language prompts like “bring me my hat,” with robots then performing actions based on training footage.
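The idea of stripping a scene down to what a task needs can be shown with a toy Python sketch. Simple keyword matching stands in for a language model's judgment of relevance here, and the scene layout, coordinates and function name are assumptions made for illustration rather than LGA's actual method.

```python
def abstract_state(scene, task):
    """Keep only the scene objects whose name is mentioned in the task.

    Keyword matching is a deliberately simple stand-in for a language
    model's judgment of which details matter.
    """
    words = set(task.lower().replace(",", " ").split())
    return {name: position for name, position in scene.items() if name in words}


# Hypothetical scene: object name -> (x, y) position.
scene = {"hat": (2, 3), "sofa": (0, 1), "lamp": (4, 4), "mug": (1, 5)}
task = "bring me my hat"

relevant = abstract_state(scene, task)
print(relevant)
```

Given the prompt “bring me my hat,” the sketch keeps only the hat and its position, leaving a planner with a much smaller description of the room to reason over.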
The researchers demonstrated the effectiveness of LGA by using Spot, Boston Dynamics’ canine-like quadruped robot, to fetch fruits and recycle drinks. The experiments showed robots could effectively scan the world and develop plans in chaotic environments.
The researchers believe neurosymbolic frameworks like LILO, Ada and LGA will pave the way for “more human-like” AI models by giving them problem-solving skills and allowing them to navigate their environments better.