This is the third installment of what will be a four-part series on how generative AI works. I began my explanation with the Gutenberg printing press and how it created an explosion of content, similar to what generative AI is likely to create. I walked through the basics of how full-text searching works and how the legal industry was at the forefront of search. And last month, I walked through concepts like large language models (LLMs), vectors and parameters, and how they enable ChatGPT to answer questions. ChatGPT takes a question and can predict the sequence of words in a response with remarkable accuracy, on a massive scale.
This month, I will cover the concepts of how training works for generative AI tools like ChatGPT. Similar to last month’s article, the intent here is to develop an understanding of concepts. A well-trained data scientist should be consulted for more detailed explanations or advice for specific usage of generative AI applications.
Training Generative AI: Supervised Learning, Unsupervised Learning, And Reinforcement Learning
There are three steps that are typically taken to train an LLM.
1. Supervised Learning. In the first approach, humans develop good examples with highly structured data. A good example for attorneys is contract data. Humans can tag the various clauses in a contract — Term, Termination, Indemnification, Warranty — along with representative values for each clause. A Term, for example, is a duration that can be expressed as a length of time or an end date. This process is called Supervised Learning and can be applied by any organization. It is part of the discipline of machine learning and is part of almost any AI solution, including contract analytics at a firm or in a law department.
Supervised Learning can be applied to smaller language models and LLMs alike. Generative AI solutions like ChatGPT just do this at massive scale across all sorts of relationships in language. Last month, I walked through how LLMs catalog exhaustive relationships about words, including grammar, how words are used, word meaning, context, proper names, and many more attributes. To use the word “green” as an example, there is a catalog of information about the word. Green can represent a color, it can represent inexperience, and it can represent environmental friendliness.
2. Unsupervised Learning. The second approach finds structure and relationships in documents without human-labeled examples. This is known as Unsupervised Learning. Using a smaller language model focused on contracts again as an example, a representative sample of a corporation’s customer contracts can be analyzed to identify anomalies across all of the corporation’s customer agreements, like contracts that have been highly negotiated or that were written on an older contract template. The unsupervised learning step makes tagging of data elements more accurate in the final AI application. Extending the example, generative AI solutions perform Unsupervised Learning across massive data sets during LLM training.
3. Reinforcement Learning. The third approach to training an LLM involves feedback. When humans are involved, it is called Reinforcement Learning with Human Feedback (RLHF). In the contract example, RLHF can help identify how to address new terms or unique language in a clause. The process can be as simple as a “thumbs up” or “thumbs down.” One of the great advantages that ChatGPT has had over other language models is the significant amount of RLHF performed to remove hateful and offensive material from its training sets.
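The three approaches above can be sketched in miniature. The following Python uses scikit-learn on toy contract data; the clause texts, labels, similarity threshold, and feedback votes are all invented for illustration, and real LLM training operates at vastly larger scale with different tooling.

```python
# Toy sketch of the three training approaches, not a real LLM pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics.pairwise import cosine_similarity

# 1. Supervised Learning: humans label example clauses by type.
clauses = [
    "This Agreement shall remain in effect for two (2) years.",
    "Either party may terminate this Agreement upon 30 days notice.",
    "Supplier shall indemnify Customer against third-party claims.",
    "The term of this lease is five (5) years from the start date.",
]
labels = ["Term", "Termination", "Indemnification", "Term"]

clause_vectorizer = TfidfVectorizer()
X = clause_vectorizer.fit_transform(clauses)
classifier = LogisticRegression().fit(X, labels)

# The trained model can now tag a clause from an unseen contract.
new_clause = "This contract is effective for a period of three years."
predicted = classifier.predict(clause_vectorizer.transform([new_clause]))[0]

# 2. Unsupervised Learning: no labels -- flag contracts that differ
# sharply from the firm's standard template (e.g., heavily negotiated).
template = "Standard services agreement with a two year term and 30 day termination notice."
contracts = [
    "Standard services agreement, two year term, 30 day termination notice.",
    "Bespoke licensing deal with custom royalties, escrow, and audit rights.",
]
doc_vectorizer = TfidfVectorizer()
docs = doc_vectorizer.fit_transform([template] + contracts)
similarity = cosine_similarity(docs[0], docs[1:]).flatten()
anomalies = [c for c, s in zip(contracts, similarity) if s < 0.2]

# 3. RLHF (simplified): keep only the drafts reviewers approved
# ("thumbs up") as examples that reinforce future behavior.
feedback = [("Draft A", "thumbs up"), ("Draft B", "thumbs down")]
approved = [draft for draft, vote in feedback if vote == "thumbs up"]
```

In this sketch, the supervised step learns from human labels, the unsupervised step surfaces the off-template contract with no labels at all, and the feedback step filters outputs the way a thumbs-up signal would.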
Currently, Training An LLM Is Expensive
LLMs are expensive to create at the scale of ChatGPT. Scraping the internet and the initial training of the models requires massive computing power. It is estimated that the training process for GPT-3 might cost up to $4.6 million every time the model is trained, and GPT-4 would cost much more. LLMs require specialized hardware to run efficiently, and there is a shortage of that hardware right now. Companies like Nvidia that create graphics processing units (GPUs) are scrambling to manufacture more hardware that is specialized to process LLMs and generative AI applications. Nvidia now has a valuation of over $1 trillion, and its stock price has nearly tripled since the beginning of the year.
This is why companies like OpenAI/Microsoft, Google (BERT and Bard), Meta (LLaMA), and Amazon are leading the way with generative AI products. It was initially too expensive for anyone but the largest tech companies to create LLMs — but that is changing, and will continue to change very quickly.
Fine-Tuning An LLM
Generative AI like ChatGPT learns. Every time a user interacts with ChatGPT in a conversation, it adapts its responses based upon the new information that user has provided. This is sometimes called In-Context Learning: the underlying model is not retrained; it draws on the conversation itself as context.
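In-context learning can be pictured as nothing more than a growing message list that is resent with every request. Here is a minimal sketch, assuming the role/content message convention used by chat-style APIs; the conversation content is invented:

```python
# Each API call carries the full conversation so far; earlier turns
# are the only "memory" the model has of this user.
conversation = [
    {"role": "system", "content": "You are a helpful legal assistant."}
]

def add_turn(history, role, content):
    """Append one turn; the whole history is resent with each request."""
    history.append({"role": role, "content": content})
    return history

add_turn(conversation, "user", "Our client is ACME Corp, a Delaware company.")
add_turn(conversation, "assistant", "Understood. How can I help with ACME Corp?")
add_turn(conversation, "user", "Draft an NDA for the client mentioned above.")

# "the client mentioned above" is resolvable only because the earlier
# turns travel along with the final request.
```

Nothing in the model's weights changes here; delete the history and the "learning" disappears with it, which is why in-context learning is distinct from fine-tuning.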
Fine-tuning is the process of providing information that helps generative AI perform better in a specific task or domain. It can vary based upon the LLM and the degree of access to the underlying model. For law firms using ChatGPT, it is best to use the Microsoft-provided version of the technology hosted in Azure. This allows the firm to better control what happens with uploaded documents or information that would be confidential. Every document that gets uploaded into the public version of ChatGPT can be used to train the core model.
Fine-tuning in ChatGPT specifically is changing even as this article is being written. OpenAI is deprecating the fine-tuning capabilities of its older models and actively working to support fine-tuning in GPT-3.5 and GPT-4.
That being said, here is a quick example of fine-tuning based upon the OpenAI GPT-3 Davinci model. Let’s say a law firm’s Intellectual Property practice wants to leverage a series of templates used for various patent filings. Those templates could be uploaded via an application programming interface (API) into the law firm’s private instance of ChatGPT. Once this is completed, a law firm user can ask ChatGPT questions that will leverage those templates. An attorney could prompt ChatGPT to “Please draft the following patent filing using the firm’s patent filing template for new pharmaceutical drugs.” There are better ways to approach this, including better prompts, but this explanation illustrates the idea.
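To make the mechanics a bit more concrete, here is a hedged sketch of the data-preparation side of that workflow. Under the legacy Davinci fine-tuning approach, training data was packaged as a JSONL file of prompt/completion pairs before being uploaded through the API. The file name, template text, and separator conventions below are illustrative assumptions, not a firm's actual data or a current OpenAI interface:

```python
# Package hypothetical patent-filing templates as prompt/completion
# pairs in JSONL, the format the legacy fine-tuning endpoint expected.
import json

patent_templates = [
    {
        "prompt": "Draft a patent filing for a new pharmaceutical drug.\n\n###\n\n",
        "completion": " [Firm pharma patent template text] END",
    },
    {
        "prompt": "Draft a patent filing for a software invention.\n\n###\n\n",
        "completion": " [Firm software patent template text] END",
    },
]

# Write one JSON record per line (JSONL).
with open("patent_templates.jsonl", "w") as f:
    for record in patent_templates:
        f.write(json.dumps(record) + "\n")

# The file would then be uploaded and a fine-tune job started against
# the base model -- shown as comments only, since these legacy calls
# require credentials and are being deprecated:
#   openai.File.create(file=open("patent_templates.jsonl", "rb"), purpose="fine-tune")
#   openai.FineTune.create(training_file=uploaded_file_id, model="davinci")
```

The `###` prompt separator and `END` stop marker follow conventions from OpenAI's legacy fine-tuning guidance; a firm adopting the newer GPT-3.5/GPT-4 fine-tuning would use a different, chat-style format.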
What Is Changing?
Important news comes out all the time on generative AI. Acquisitions, new product announcements, startups, and new language model developments have all occurred in the last month. There are techniques for reducing hallucinations. Organizations are beginning to use more than one LLM in applications. APIs exist to integrate generative AI into other applications. New examples of ethical considerations, congressional hearings, and potential regulation of the technology are evolving.
There is indeed something new every day. Next month, I’ll close out this series with a set of updates regarding what is changing, the economics of AI, and what lawyers should keep in mind about the future of this technology.
Ken Crutchfield is Vice President and General Manager of Legal Markets at Wolters Kluwer Legal & Regulatory U.S., a leading provider of information, business intelligence, regulatory and legal workflow solutions. Ken has more than three decades of experience as a leader in information and software solutions across industries. He can be reached at firstname.lastname@example.org.