What are LLMs, and how are they used in generative AI?

Large language models are the algorithmic basis for chatbots like OpenAI's ChatGPT and Google's Bard. The technology is tied back to billions, even trillions, of parameters that can make them both inaccurate and non-specific for vertical industry use. Here's what LLMs are and how they work.

Senior Reporter, Computerworld | FEB 7, 2024 2:17 PM PST

When ChatGPT arrived in November 2022, it made mainstream the idea that generative artificial intelligence (AI) could be used by companies and consumers to automate tasks, help with creative ideas, and even write software. If you need to boil down an email or chat thread into a concise summary, a chatbot such as OpenAI's ChatGPT or Google's Bard can do that. If you need to spruce up your resume with more eloquent language and impressive bullet points, AI can help. Want ideas for a new marketing or ad campaign? Generative AI to the rescue.

ChatGPT stands for Chat Generative Pre-trained Transformer. The chatbot's foundation is the GPT large language model (LLM), a computer algorithm that processes natural language inputs and predicts the next word based on what it's already seen. Then it predicts the next word, and the next, and so on until its answer is complete.

Along with OpenAI's GPT-3 and GPT-4, popular LLMs include Google's LaMDA and PaLM (the basis for Bard), Hugging Face's BLOOM and XLM-RoBERTa, Nvidia's NeMo LLM, XLNet, Cohere's models, and GLM-130B. Open-source LLMs in particular are gaining traction, enabling a cadre of developers to create more customizable models at lower cost. Meta's February launch of LLaMA (Large Language Model Meta AI) kicked off an explosion of developers looking to build on top of open-source LLMs.
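That predict-append-repeat loop can be illustrated with a toy sketch. This is not any vendor's actual model, just a minimal stand-in: a hypothetical lookup table of next-word probabilities plays the role of the trained network, and generation simply samples one word at a time until the "model" runs out of predictions.

```python
# Toy illustration of autoregressive generation: score candidate next
# words given the context, pick one, append it, and repeat.
import random

# Hypothetical stand-in for a trained model: next-word probabilities.
NEXT_WORD = {
    "the": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"down": 1.0},
}

def generate(prompt, max_tokens=5):
    words = prompt.split()
    for _ in range(max_tokens):
        dist = NEXT_WORD.get(words[-1])
        if dist is None:  # the toy model has nothing left to predict
            break
        choices, weights = zip(*dist.items())
        # Sample the next word in proportion to its probability.
        words.append(random.choices(choices, weights=weights)[0])
    return " ".join(words)

print(generate("the"))  # e.g. "the cat sat down"
```

A real LLM does the same thing at vastly larger scale: instead of a lookup table, billions of learned parameters produce the probability distribution over the next token.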
LLMs are a type of AI trained on a massive trove of articles, Wikipedia entries, books, internet-based resources, and other input to produce human-like responses to natural language queries. That's an immense amount of data. But LLMs are poised to shrink, not grow, as vendors seek to customize them for specific uses that don't need the massive datasets used by today's most popular models.

Not every model is shrinking yet, though. Google's new PaLM 2 LLM, announced earlier this month, uses almost five times more training data than its predecessor of just a year ago: 3.6 trillion tokens, the word fragments that models process, according to one report. The additional data allows PaLM 2 to perform more advanced coding, math, and creative writing tasks.

Training an LLM properly requires massive server farms, or supercomputers, with enough compute power to tackle billions of parameters.

So, what is an LLM?

"Hallucinations happen because LLMs, in their most vanilla form, don't have an internal state representation of the world," said Jonathan Siddharth, CEO of Turing, a Palo Alto, Calif., company that uses AI to find, hire, and onboard software engineers remotely. "There's no concept of fact. They're predicting the next word based on what they've seen so far; it's a statistical estimate."

Because some LLMs also train themselves on internet-based data, they can move well beyond what their initial developers created them to do. For example, Microsoft's Bing uses GPT-3 as its basis, but it also queries a search engine and analyzes the first 20 or so results. It uses both an LLM and the internet to offer responses.

"We see things like a model being trained on one programming language and these models then automatically generate code in another programming language it has never seen," Siddharth said. "Even natural language; it's not trained on French, but it's able to generate sentences in French."

"It's almost like there's some emergent behavior.
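A "token" is not exactly a word: LLMs split text into subword pieces, so a count like PaLM 2's 3.6 trillion tokens measures those fragments, not whole words. The sketch below mimics the idea with a tiny hypothetical vocabulary and a greedy longest-match split; real tokenizers (such as byte-pair encoding) are learned from data, but the effect, common words staying whole while rarer words break into pieces, is the same.

```python
# Toy subword tokenizer: a crude stand-in for BPE-style tokenization.
# The vocabulary here is an invented example, not any real model's.
VOCAB = {"train", "ing", "data", "set", "s", "model"}

def tokenize(word):
    """Greedy longest-match split of a word into known subword pieces."""
    tokens, i = [], 0
    while i < len(word):
        for j in range(len(word), i, -1):
            if word[i:j] in VOCAB:  # take the longest piece that matches
                tokens.append(word[i:j])
                i = j
                break
        else:
            tokens.append(word[i])  # fall back to a single character
            i += 1
    return tokens

print(tokenize("training"))  # ['train', 'ing']
print(tokenize("datasets"))  # ['data', 'set', 's']
```

In practice one word averages out to a bit more than one token, which is why token counts run higher than word counts for the same text.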
We don't quite know how these neural networks work," he added. "It's both scary and exciting at the same time."

Another problem with LLMs is the unintended bias that can be introduced by their developers and by self-supervised data collection from the internet.

How will LLMs become smaller, faster, and cheaper?

Today, chatbots based on LLMs are most commonly used "out of the box" as a text-based web-chat interface. They're used in search engines such as Google's Bard and Microsoft's Bing (based on ChatGPT) and for automated online customer assistance. Companies can ingest their own datasets to make the chatbots more customized for their particular business, but accuracy can suffer because of the massive trove of data already ingested.

"What we're discovering more and more is that with small models that you train on more data longer…, they can do what large models used to do," Thomas Wolf, co-founder and CSO at Hugging Face, said while attending an MIT conference earlier this month. "I think we're maturing basically in how we understand what's happening there.

"There's this first step where you try everything to get this first part of something working, and then you're in the phase where you're trying to…be efficient and less costly to run," Wolf said. "It's not enough to just scrub the whole web, which is what everyone has been doing. It's much more important to have quality data."

LLMs can cost from a couple of million dollars to $10 million to train for specific use cases, depending on their size and purpose. When LLMs focus their AI and compute power on smaller datasets, however, they can perform as well as or better than enormous LLMs that rely on massive, amorphous datasets. They can also be more accurate in creating the content users seek, and they're much cheaper to train.
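Those multimillion-dollar training figures can be sanity-checked with a back-of-envelope calculation, using the widely cited rule of thumb that training takes roughly 6 floating-point operations per parameter per token. The GPU throughput and hourly price below are illustrative assumptions, not figures from the article.

```python
# Back-of-envelope LLM training cost using the ~6 * params * tokens
# FLOPs rule of thumb. Throughput and price are assumed, not quoted:
# ~3e14 effective FLOP/s per GPU, ~$2 per GPU-hour.
def training_cost(params, tokens, flops_per_sec=3e14, usd_per_gpu_hour=2.0):
    total_flops = 6 * params * tokens          # total training compute
    gpu_hours = total_flops / flops_per_sec / 3600
    return gpu_hours * usd_per_gpu_hour

# Example: a 70B-parameter model trained on 2 trillion tokens.
cost = training_cost(70e9, 2e12)
print(f"~${cost:,.0f}")  # roughly $1.5 million under these assumptions
```

Under these assumptions a 70-billion-parameter model lands in the low millions of dollars, consistent with the range quoted above; larger models, longer training runs, or pricier hardware push the figure toward the upper end.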
Eric Boyd, corporate vice president of AI Platforms at Microsoft, recently spoke at the MIT EmTech conference and said that when his company first began working on AI image models with OpenAI four years ago, performance would plateau as the datasets grew. Language models, however, had far more capacity to ingest data without a performance slowdown. Microsoft, the largest financial backer of OpenAI and ChatGPT, invested in the infrastructure to build ever-larger LLMs.

"So, we're figuring out now how to get similar performance without having to have such a large model," Boyd said. "Given more data, compute and training time, you are still able to find more performance, but there are also a lot of techniques we're now learning for how we don't have to make them quite so large and are able to manage them more efficiently.

"That's super important because…these things are very expensive. If we want to have broad adoption for them, we're going to have to figure [out] how [to lower] the costs of both training them and serving them," Boyd said.

For example, when a user submits a prompt to GPT-3, it must access all 175 billion of its parameters to deliver an answer. One method for creating smaller LLMs, known as sparse expert models, is expected to reduce training and computational costs, "resulting in massive models with a better accuracy than their dense counterparts," Boyd said.

Researchers at Meta Platforms (formerly Facebook) believe sparse models can achieve performance similar to that of ChatGPT and other massive LLMs using "a fraction of the compute." "For models with relatively modest compute budgets, a sparse model can perform on par with a dense model that requires almost four times as much compute," Meta said in an October 2022 research paper.

Smaller models are already being released by companies such as Aleph Alpha, Databricks, Fixie, LightOn, Stability AI, and even OpenAI. These more agile LLMs have between a few billion and 100 billion parameters.
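The sparse-expert idea can be sketched in a few lines. The key contrast with a dense model like GPT-3, which touches all 175 billion parameters for every prediction, is that a router sends each token to only a few "expert" sub-networks, so most parameters sit idle on any given token. The numbers and routing below are invented for illustration; no real model's architecture is shown.

```python
# Minimal sketch of sparse-expert (mixture-of-experts) routing:
# each token activates only TOP_K of NUM_EXPERTS experts, so only a
# fraction of the model's parameters do work per token.
import random

NUM_EXPERTS = 8   # illustrative expert count
TOP_K = 2         # experts consulted per token

def route(token):
    """Pretend router: deterministically pick TOP_K experts per token."""
    rng = random.Random(token)  # seeded by the token, so it's repeatable
    return rng.sample(range(NUM_EXPERTS), TOP_K)

def active_fraction():
    """Fraction of experts (and roughly of parameters) used per token."""
    return TOP_K / NUM_EXPERTS

for tok in ["large", "language", "model"]:
    print(tok, "->", sorted(route(tok)))

print(f"active experts per token: {active_fraction():.0%}")  # 25%
```

In a real mixture-of-experts model the router is itself learned, but the payoff is the same as in this sketch: total parameter count can grow large while per-token compute stays small.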
Privacy, security issues still abound

While many users marvel at the remarkable capabilities of LLM-based chatbots, governments and consumers cannot turn a blind eye to the potential privacy issues lurking within, according to Gabriele Kaveckyte, privacy counsel at cybersecurity company Surfshark.

For example, earlier this year Italy became the first Western nation to ban further development of ChatGPT over privacy concerns. It later reversed that decision, but the initial ban came after the natural language processing app suffered a data breach involving user conversations and payment information.

"While some improvements have been made by ChatGPT following Italy's temporary ban, there is still room for improvement," Kaveckyte said. "Addressing these potential privacy issues is crucial to ensure the responsible and ethical use of data, fostering trust, and safeguarding user privacy in AI interactions."

Kaveckyte analyzed ChatGPT's data collection practices and developed a list of potential flaws: it collected a massive amount of personal data to train its models but may have had no legal basis for doing so; it didn't notify all of the people whose data was used to train the AI model; it's not always accurate; and it lacks effective age verification tools to prevent children under 13 from using it.

Along with those issues, other experts are concerned about more basic problems LLMs have yet to overcome: the security of data collected and stored by the AI, intellectual property theft, and data confidentiality.

"For a hospital or a bank to be able to use LLMs, we're going to have to solve [intellectual property], security, [and] confidentiality issues," Turing's Siddharth said. "There are good engineering solutions for some of these. And I think those will get solved, but those need to be solved in order for them to be used in enterprises.
Companies don't want to use an LLM in a context where it uses the company's data to help deliver better results to a competitor."

Not surprisingly, a number of nations and government agencies around the globe have launched efforts to deal with AI tools, with China being the most proactive so far. Among those efforts:

- China has already rolled out several initiatives for AI governance, though most relate to citizen privacy and not necessarily safety.
- The Biden administration in the US unveiled AI rules to address safety and privacy, built on previous attempts to promote some form of responsible innovation, though to date Congress has not advanced any laws that would regulate AI. In October 2022, the administration unveiled a blueprint for an "AI Bill of Rights" and an AI Risk Management Framework, and more recently pushed for a National AI Research Resource.
- The Group of Seven (G7) nations recently called for the creation of technical standards to keep AI in check, saying its evolution has outpaced oversight for safety and security.
- The European Union is putting the finishing touches on legislation that would hold accountable companies that create generative AI platforms like ChatGPT, which can take the content they generate from unnamed sources.