Feb 24 (Reuters) – Meta Platforms Inc (META.O) said on Friday it is releasing a new large language model, the core software of a new artificial intelligence system, to researchers, amid an AI arms race in which Big Tech companies are scrambling to integrate the technology into their products and impress investors.
The public battle to dominate the AI technology space began late last year with the launch of Microsoft-backed OpenAI’s ChatGPT and prompted tech heavyweights from Alphabet Inc (GOOGL.O) to China’s Baidu Inc (9888.HK) to trumpet their own offerings.
Meta’s LLaMA, short for Large Language Model Meta AI, will be available under a non-commercial license to researchers and entities affiliated with government, civil society and academia, the company said in a blog post.
Large language models mine massive amounts of text to summarize information and generate content. For example, they can answer questions with sentences that can read as if they were written by humans.
The model, which Meta said requires “much less” computing power than previous offerings, was trained on text in 20 languages, with a focus on those written in Latin and Cyrillic alphabets.
“Meta’s announcement today appears to be a step in testing their generative AI capabilities so they can implement it in their products in the future,” said Gil Luria, senior software analyst at DA Davidson.
“Generative AI is a new application of AI that Meta has less experience with, but is clearly important for the future of their business.”
AI has emerged as a bright spot for investment in the tech industry, whose slowing growth has led to widespread layoffs and a cut in experimental bets.
Meta said LLaMA could outperform competing models that use far more parameters, or variables that the algorithm takes into account.
Specifically, it said a version of LLaMA with 13 billion parameters can outperform GPT-3, a predecessor of the model ChatGPT is built on.
It described its LLaMA model with 65 billion parameters as “competitive” with Google’s Chinchilla-70B and PaLM-540B, which are even larger than the model Google used to show off its Bard chat-driven search.
A spokeswoman for Meta attributed the achievement to a greater amount of “cleaner” data and “architecture improvements” in the model that improved training stability.
Meta released the large language model OPT-175B in May last year, also aimed at researchers, which formed the basis of a new iteration of its chatbot BlenderBot.
It later introduced a model called Galactica, which could write scientific papers and solve math problems, but quickly withdrew the demo after it generated authoritative-sounding false answers.
Reporting by Yuvraj Malik and Eva Mathews in Bengaluru and Katie Paul in New York; Edited by Shailesh Kuber and Grant McCool