What are LLMs

What are LLMs, and how are they used in generative AI?
Large language models are the algorithmic basis for chatbots like OpenAI's
ChatGPT and Google's Bard. The technology is tied back to billions — even
trillions — of parameters that can make them both inaccurate and non-specific for
vertical industry use. Here's what LLMs are and how they work.
Senior Reporter, Computerworld | FEB 7, 2024 2:17 PM PST
When ChatGPT arrived in November 2022, it made mainstream the idea
that generative artificial intelligence (AI) could be used by companies and
consumers to automate tasks, help with creative ideas, and even code software.
If you need to boil down an email or chat thread into a concise summary, a chatbot
such as OpenAI’s ChatGPT or Google’s Bard can do that. If you need to spruce up
your resume with more eloquent language and impressive bullet points, AI can
help. Want some ideas for a new marketing or ad campaign? Generative AI to the
rescue.
ChatGPT stands for chatbot generative pre-trained transformer. The chatbot’s
foundation is the GPT large language model (LLM), a computer algorithm that
processes natural language inputs and predicts the next word based on what it’s
already seen. Then it predicts the next word, and the next word, and so on until its
answer is complete.
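The predict-then-repeat loop described above can be illustrated with a deliberately tiny stand-in. This is a sketch only, not how GPT works internally: it swaps the neural network for simple word-pair counts, and the names `train_bigram_model`, `generate`, and the sample corpus are all made up for illustration.

```python
from collections import Counter, defaultdict


def train_bigram_model(text):
    """Count, for each word, which words follow it in the training text."""
    words = text.lower().split()
    follow_counts = defaultdict(Counter)
    for current_word, next_word in zip(words, words[1:]):
        follow_counts[current_word][next_word] += 1
    return follow_counts


def generate(follow_counts, prompt, max_new_words=5):
    """Append the most frequent follower of the last word, one word at a time."""
    output = prompt.lower().split()
    for _ in range(max_new_words):
        candidates = follow_counts.get(output[-1])
        if not candidates:
            break  # the model has never seen this word, so it stops
        next_word, _count = candidates.most_common(1)[0]
        output.append(next_word)
    return " ".join(output)


# Hypothetical toy corpus; a real LLM is trained on vastly more text.
corpus = "the cat sat on the mat . the cat sat on the rug ."
model = train_bigram_model(corpus)
print(generate(model, "the", max_new_words=3))  # prints "the cat sat on"
```

A real LLM conditions on the whole preceding text rather than only the last word, and usually samples from a probability distribution instead of always taking the top choice, but the outer loop has the same shape.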
Along with OpenAI’s GPT-3 and GPT-4 LLMs, popular LLMs include open models
such as Google’s LaMDA and PaLM LLM (the basis for Bard), Hugging Face’s
BLOOM and XLM-RoBERTa, Nvidia’s NeMo LLM, XLNet, Co:here, and GLM-130B.
Open-source LLMs, in particular, are gaining traction, enabling a cadre of
developers to create more customizable models at a lower cost. Meta’s February
launch of LLaMA (Large Language Model Meta AI) kicked off an explosion
among developers looking to build on top of open-source LLMs.
LLMs are a type of AI that are currently trained on a massive trove of articles,
Wikipedia entries, books, internet-based resources and other input to produce
human-like responses to natural language queries. That's an immense amount of
data. But LLMs are poised to shrink, not grow, as vendors seek to customize them
for specific uses that don’t need the massive data sets used by today’s most popular
models.
For example, Google’s new PaLM 2 LLM, announced earlier this month, uses
almost five times more training data than its predecessor of just a year ago — 3.6
trillion tokens or strings of words, according to one report. The additional datasets
allow PaLM 2 to perform more advanced coding, math, and creative writing tasks.
Training an LLM properly requires massive server farms, or supercomputers, with
enough compute power to handle billions of parameters.
So, what is an LLM?
“Hallucinations happen because LLMs, in their most vanilla form, don’t have an
internal state representation of the world," said Jonathan Siddharth, CEO of
Turing, a Palo Alto, California company that uses AI to find, hire, and onboard
software engineers remotely. "There’s no concept of fact. They’re predicting the
next word based on what they’ve seen so far — it’s a statistical estimate."
Because some LLMs also train themselves on internet-based data, they can move
well beyond what their initial developers created them to do. For example,
Microsoft’s Bing uses GPT-3 as its basis, but it’s also querying a search engine
and analyzing the first 20 results or so. It uses both an LLM and the internet to
offer responses.
“We see things like a model being trained on one programming language and these
models then automatically generate code in another programming language it has
never seen,” Siddharth said. “Even natural language; it’s not trained on French, but
it’s able to generate sentences in French.”
“It’s almost like there’s some emergent behavior. We don’t quite know how
these neural networks work,” he added. “It’s both scary and exciting at the same
time.”
Another problem with LLMs and their parameters is the unintended biases that can
be introduced by LLM developers and self-supervised data collection from the
internet.
How will LLMs become smaller, faster, and cheaper?
Today, chatbots based on LLMs are most commonly used "out of the box" as a
text-based, web-chat interface. They’re used in search engines such as Google’s
Bard and Microsoft’s Bing (based on ChatGPT) and for automated online customer
assistance. Companies can ingest their own datasets to make the chatbots more
customized for their particular business, but accuracy can suffer because of the
massive trove of data already ingested.
“What we’re discovering more and more is that with small models that you train
on more data longer…, they can do what large models used to do,” Thomas Wolf,
co-founder and CSO at Hugging Face, said while attending an MIT
conference earlier this month. “I think we’re maturing basically in how we
understand what’s happening there.
“There’s this first step where you try everything to get this first part of something
working, and then you’re in the phase where you’re trying to…be efficient and less
costly to run,” Wolf said. “It’s not enough to just scrub the whole web, which is
what everyone has been doing. It’s much more important to have quality data.”
LLMs can cost from a couple of million dollars to $10 million to train for specific
use cases, depending on their size and purpose.
When LLMs focus their AI and compute power on smaller datasets, however, they
perform as well as or better than the enormous LLMs that rely on massive,
amorphous data sets. They can also be more accurate in creating the content users
seek, and they're much cheaper to train.
Eric Boyd, corporate vice president of AI Platforms at Microsoft, recently spoke at
the MIT EmTech conference and said when his company first began working on
AI image models with OpenAI four years ago, performance would plateau as the
datasets grew in size. Language models, however, had far more capacity to ingest
data without a performance slowdown.
Microsoft, the largest financial backer of OpenAI and ChatGPT, invested in the
infrastructure to build larger LLMs. “So, we’re figuring out now how to get similar
performance without having to have such a large model,” Boyd said. “Given more
data, compute and training time, you are still able to find more performance, but
there are also a lot of techniques we’re now learning for how we don’t have to
make them quite so large and are able to manage them more efficiently.
“That’s super important because…these things are very expensive. If we want to
have broad adoption for them, we’re going to have to figure out the costs of both
training them and serving them,” Boyd said.
For example, when a user submits a prompt to GPT-3, it must access all 175
billion of its parameters to deliver an answer. One method for creating smaller
LLMs, known as sparse expert models, is expected to reduce the training and
computational costs for LLMs, “resulting in massive models with a better accuracy
than their dense counterparts,” he said.
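To make the contrast with a dense model concrete, here is a minimal sketch of one common way sparse expert models are built: a small router scores several "experts" and only the top-scoring ones run for a given input. The article does not describe the mechanism, so the top-k routing shown here is an assumption, and `Expert`, `sparse_forward`, and all the weights are hypothetical values chosen for illustration.

```python
import math


def softmax(scores):
    """Turn raw router scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class Expert:
    """A tiny stand-in for an expert sub-network: one weight per input feature."""

    def __init__(self, weights):
        self.weights = weights

    def __call__(self, features):
        return sum(w * x for w, x in zip(self.weights, features))


def sparse_forward(features, experts, router_weights, top_k=1):
    """Run only the top_k experts the router picks; the rest stay idle."""
    # One router score per expert: dot product of that expert's router row and the input.
    scores = [sum(w * x for w, x in zip(row, features)) for row in router_weights]
    gates = softmax(scores)
    chosen = sorted(range(len(experts)), key=lambda i: gates[i], reverse=True)[:top_k]
    gate_total = sum(gates[i] for i in chosen)
    # Mix the chosen experts' outputs, weighted by their renormalized gate values.
    output = sum(gates[i] / gate_total * experts[i](features) for i in chosen)
    return output, chosen


# Hypothetical weights for four experts and their router.
experts = [Expert([1.0, 0.0]), Expert([0.0, 1.0]), Expert([1.0, 1.0]), Expert([-1.0, 1.0])]
router_weights = [[2.0, 0.0], [0.0, 2.0], [1.0, 1.0], [-1.0, -1.0]]

output, chosen = sparse_forward([3.0, 1.0], experts, router_weights, top_k=1)
print(f"ran {len(chosen)} of {len(experts)} experts: {chosen}, output={output}")
```

In this toy run only one of the four experts does any work for the input, which is the source of the compute savings: the model can hold many parameters in total while touching only a fraction of them per prompt.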
Researchers from Meta Platforms (formerly Facebook) believe sparse models can
achieve performance similar to that of ChatGPT and other massive LLMs using “a
fraction of the compute.”
“For models with relatively modest compute budgets, a sparse model can perform
on par with a dense model that requires almost four times as much compute,” Meta
said in an October 2022 research paper.
Smaller models are already being released by companies such as Aleph
Alpha, Databricks, Fixie, LightOn, Stability AI, and even OpenAI. The more agile
LLMs have between a few billion and 100 billion parameters.
Privacy, security issues still abound
While many users marvel at the remarkable capabilities of LLM-based chatbots,
governments and consumers cannot turn a blind eye to the potential privacy issues
lurking within, according to Gabriele Kaveckyte, privacy counsel at cybersecurity
company Surfshark.
For example, earlier this year, Italy became the first Western nation to ban further
development of ChatGPT over privacy concerns. It later reversed that decision, but
the initial ban occurred after the natural language processing app experienced a
data breach involving user conversations and payment information.
“While some improvements have been made by ChatGPT following Italy’s
temporary ban, there is still room for improvement," Kaveckyte said. "Addressing
these potential privacy issues is crucial to ensure the responsible and ethical use of
data, fostering trust, and safeguarding user privacy in AI interactions."
Kaveckyte analyzed ChatGPT's data collection practices, for instance, and
developed a list of potential flaws: it collected a massive amount of personal data
to train its models, but may have had no legal basis for doing so; it didn’t notify all
of the people whose data was used to train the AI model; it’s not always accurate;
and it lacks effective age verification tools to prevent children under 13 from using
it.
Along with those issues, other experts are concerned there are more basic problems
LLMs have yet to overcome — namely the security of data collected and stored by
the AI, intellectual property theft, and data confidentiality.
“For a hospital or a bank to be able to use LLMs, we’re going to have to solve
[intellectual property], security, [and] confidentiality issues,” Turing’s Siddharth
said. “There are good engineering solutions for some of these. And I think those
will get solved, but those need to be solved in order for them to be used in
enterprises. Companies don’t want to use an LLM in a context where it uses the
company’s data to help deliver better results to a competitor.”
Not surprisingly, a number of nations and government agencies around the globe
have launched efforts to deal with AI tools, with China being the most proactive so
far. Among those efforts:
• China has already rolled out several initiatives for AI governance, though
most of those initiatives relate to citizen privacy and not necessarily safety.
• The Biden administration in the US unveiled AI rules to address safety and
privacy built on previous attempts to promote some form of responsible
innovation, though to date Congress has not advanced any laws that would
regulate AI. In October 2022, the administration unveiled a blueprint for an
“AI Bill of Rights” and an AI Risk Management Framework and more
recently pushed for a National AI Research Resource.
• The Group of Seven (G7) nations recently called for the creation of
technical standards to keep AI in check, saying its evolution has outpaced
oversight for safety and security.
• And the European Union is putting the finishing touches on legislation that
would hold accountable companies that create generative AI platforms like
ChatGPT that can take the content they generate from unnamed sources.
Link to the article: https://www.computerworld.com/article/3697649/whatare-large-language-models-and-how-are-they-used-in-generativeai.html