Generative AI and LLM

Introduction to Generative AI and LLM

What is Generative Artificial Intelligence or GenAI?

Generative AI is a type of AI that creates new content based on content it has learned from. Generative AI can be used to create text, images, videos, code and a lot more.

Generative AI was popularized by the introduction of ChatGPT, created by OpenAI and launched on November 30, 2022. GPT stands for Generative Pre-trained Transformer. GPT-3.5 is a large language model, or LLM for short, and ChatGPT is the interface into the GPT-3.5 LLM created by OpenAI. The initial ChatGPT release was based on the GPT-3.5 version of the LLM.

GPUs, or Graphics Processing Units, are one of the key innovations used in developing Generative AI models. Nvidia is the leading company in the field of GPUs. The advent of AI, and now Generative AI, has made Nvidia a trillion-dollar company due to its leading GPU market share.

Different types of Generative AI can be categorized based on the type of data they generate and their intended applications. Here are some common types.

Generative AI

Text Generation AI

Language Models: These AI systems generate human-like text and are often used for tasks like natural language understanding, chatbots, content generation, and more. The most popular LLM-powered products are ChatGPT by OpenAI and Bard by Google.

Image Generation AI

GANs (Generative Adversarial Networks): GANs consist of two neural networks, a generator and a discriminator, which work together to generate high-quality images, often used in art generation, image enhancement, and deepfake generation. Midjourney and DALL·E 3 are two of the more well-known image generation products.

Audio Generation AI

Text-to-Speech (TTS) Systems: These systems convert text input into spoken language and are used in voice assistants, accessibility tools, and more. Murf.ai can convert text to audio, which can be used to generate lessons, presentations, voiceovers, and more.

Music Generation Models: AI models that can generate music, such as lyrics, melodies, or even entire compositions. MusicLM from Google can generate music from text inputs.

Video Generation AI

Video Synthesis Models: These AI systems can create video content, such as deepfake videos and video generation for special effects. Pictory, Veed.io, and Visla are products that help convert text to video.

Large Language Models (LLM)

In this blog I will cover Large Language Models in more detail. LLMs, or Large Language Models, are a subset of deep learning. They are trained on petabytes (1 PB = 1,000 terabytes, and 1 TB = 1,000 gigabytes) of text data from online resources, books, and other sources, and they generate text when given a prompt. A prompt is a piece of text given as input to get an output. It is a way of asking the LLM to create new content. LLMs are powered by deep neural networks, which consist of multiple layers of interconnected nodes. These networks are designed to process sequential data, making them well-suited for language tasks.

Neural Networks and Deep Learning

Neural Networks

These two terms are sometimes used interchangeably, but they are not the same.
Neural networks try to imitate how the neurons in the human brain work. Neurons in the brain collect information and communicate it to the other neurons they are connected to. The human brain has billions of interconnected neurons.

In the simplest neural network, there are 3 layers. An input layer that collects information and passes it to a hidden layer that processes it and delivers it to the output layer.
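As a toy sketch, those three layers can be written as a single forward pass in NumPy. The weights below are random and untrained, purely for illustration of how information flows from input to hidden to output:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def forward(x, w_hidden, w_output):
    """One pass through a 3-layer network: input -> hidden -> output."""
    hidden = np.tanh(x @ w_hidden)                    # hidden layer processes the input
    output = 1 / (1 + np.exp(-(hidden @ w_output)))   # output layer squashes to (0, 1)
    return output

# 4 input features, 5 hidden nodes, 1 output node; weights are random (untrained)
w_hidden = rng.normal(size=(4, 5))
w_output = rng.normal(size=(5, 1))

x = rng.normal(size=(1, 4))   # one example with 4 input features
y = forward(x, w_hidden, w_output)
print(y.shape)                # (1, 1) -- a single prediction between 0 and 1
```

Training would adjust `w_hidden` and `w_output` so the outputs match known answers; here they simply show the layer-to-layer flow.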

Deep learning uses neural networks to run complex algorithms that process huge amounts of labeled and unlabeled data. Deep learning uses a neural network with at least three hidden layers, and generally more. The word deep comes from the depth of the hidden layers in the neural network.

Intro to Large Language Models (LLM)

LLMs are foundational models. Only a few companies can create foundational models, as they require huge amounts of data and compute resources. The number of parameters used in these models is also very large; the GPT-3 model uses 175 billion parameters. Parameters are used by machine learning models in conjunction with input data. The goal of the machine learning algorithm is to find the parameter values that produce the most accurate predictions for new input data fed to the model. In contrast, a simple linear regression model used in supervised machine learning can use just a handful of parameters to predict the output.
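To make the contrast concrete, here is a linear regression with just two parameters, a slope and an intercept, fitted with NumPy's least-squares `polyfit`. The data points are made up for illustration:

```python
import numpy as np

# Noisy data generated roughly from y = 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([1.1, 2.9, 5.2, 6.8, 9.1])

# Fit the model's two parameters (slope and intercept) by least squares
slope, intercept = np.polyfit(x, y, deg=1)
print(round(slope, 1), round(intercept, 1))  # roughly 2.0 and 1.0
```

Two learned numbers versus 175 billion: the mechanics are the same in spirit (find parameter values that best predict the data), but the scale is wildly different.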

The process of training and using LLMs can be broken down into several key steps:

Pre-training

LLMs are initially pre-trained on massive datasets containing a vast amount of text from the internet. During this phase, the model learns to predict the next word in a sentence. It develops an understanding of grammar, context, and various linguistic patterns. This pre-training phase often takes several weeks and requires substantial computational resources.
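The next-word objective can be illustrated with a toy bigram model that simply counts which word follows each word in a tiny made-up corpus. Real pre-training learns far richer statistics with neural networks, but the prediction target is the same idea:

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat . the cat ate the fish ."
words = corpus.split()

# Count which word follows each word in the training text
following = defaultdict(Counter)
for current, nxt in zip(words, words[1:]):
    following[current][nxt] += 1

def predict_next(word):
    """Return the most frequent word seen after `word` in training."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # "cat" -- the word most often seen after "the"
```

An LLM does conceptually the same thing at massive scale, predicting the next token from everything that came before rather than from just one preceding word.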

Fine-tuning

After pre-training, LLMs can be fine-tuned on specific tasks or domains. Fine-tuning involves training the model on a narrower dataset tailored to a particular application, such as text summarization, translation, or question-answering. Fine-tuning helps adapt the LLM to perform well on specific tasks.

Inference

Once trained and fine-tuned, LLMs can be used for various natural language processing tasks. During inference, users input text or prompts, and the model generates responses, summaries, translations, or any other relevant output.

A prompt is a piece of text given as input to the LLM to get an output, which in this case is generated text content. It is a way of asking the LLM to create new content.

What is prompt design?

Prompt Design

Prompt design is the process of crafting input instructions or queries that are given to Large Language Models (LLMs) to guide their behavior and produce desired outputs. Effective prompt design is a crucial aspect of working with LLMs, as it influences the model’s responses, performance, and the relevance of the generated content. The process of prompt design works as follows.

Understanding the Task

To design a prompt, you must first have a clear understanding of the task you want the LLM to perform. Whether it’s text generation, question answering, translation, summarization, or any other task, you need to define the task’s objectives.

Clarity and Specificity

Prompts should be clear, specific, and unambiguous. Avoid vague or open-ended prompts. Clearly define what you want the model to do or provide.

Bias and Fairness

Be conscious of potential biases in prompts. Avoid using biased language or biased assumptions. Prompts should promote fairness and responsible AI use.

Examples and Few-Shot Learning

When using few-shot learning, include a few examples or demonstrations of the desired output within your prompt. These examples provide the model with explicit instances to follow. For example, when asking for a text summary, you could provide a few lines of the summary you want to see.

Error Handling

Anticipate potential errors or undesired responses and incorporate error-handling instructions into your prompts. For example, you might add a clause like, “Avoid generating offensive or inappropriate content.”

Iterative Design

The design of prompts requires iteration and experimentation. It’s often helpful to start with simple prompts and gradually make them more complex or specific as needed.
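The elements above (a clear task, a few examples, and an error-handling clause) can be combined programmatically. The sketch below assembles a prompt string; the `build_prompt` helper and its example inputs are hypothetical, for illustration only:

```python
def build_prompt(task, examples, guardrail=None):
    """Assemble a prompt: clear instruction, few-shot examples, optional guardrail."""
    lines = [task]
    for source, target in examples:      # few-shot demonstrations
        lines.append(f"Input: {source}")
        lines.append(f"Output: {target}")
    if guardrail:                        # error-handling instruction
        lines.append(guardrail)
    lines.append("Input:")               # the model completes from here
    return "\n".join(lines)

prompt = build_prompt(
    task="Summarize the input in one sentence.",
    examples=[("Electric vehicles are selling briskly worldwide.",
               "EV sales are growing globally.")],
    guardrail="Avoid generating offensive or inappropriate content.",
)
print(prompt)
```

Iterating on prompts then becomes a matter of editing the task wording, swapping examples, or tightening the guardrail, and re-running.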

Types of prompts with examples

Prompt Types

The choice of prompt plays a crucial role in guiding the model’s behavior. Here are a few different types of prompts used with LLMs:

Question prompts:

What is the capital of California?

Comparative prompts:

Compare the advantages and disadvantages of Tesla electric vehicles to Mercedes electric vehicles.

Instructional prompts:

Create a 500-word essay on electric vehicles.

Translation prompts:

Convert the following text from English to French. “What is your name?”

Named Entity Recognition (NER) Prompts:

When training a model for NER tasks, you can provide prompts with text containing entities you want the model to identify and classify, such as “Find all the dates and locations in the following text.”

Mathematical or Computational Prompts:

For solving math problems or performing computations, you can give mathematical prompts like “Calculate the square root of 144.”

Sentiment Analysis Prompts:

To assess sentiment, you can ask the model to analyze the sentiment of a piece of text, such as “Analyze the sentiment of this customer review: The food was great.”

A full example of a response from an LLM for sentiment analysis is below.

The customer review “The food was great” expresses a positive sentiment. Here’s a brief sentiment analysis:

Sentiment: Positive

The statement “The food was great” conveys a positive sentiment, as the word “great” is a positive descriptor that indicates the customer had a positive experience with the food.

Context:

The review is concise and does not provide specific details or reasons for the positive sentiment. However, it’s clear that the customer enjoyed the food.

Subjectivity:

The statement is quite objective and straightforward. It states a fact about the customer’s experience without delving into personal opinions or additional context.

In summary, the sentiment analysis of this customer review is positive, indicating that the customer had a favorable dining experience.
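A crude approximation of such a sentiment classifier can be sketched with small word lists. An LLM is vastly more nuanced than this, and the word lists here are made up for illustration:

```python
# Tiny illustrative sentiment lexicons (an LLM learns these cues from data)
POSITIVE = {"great", "good", "excellent", "enjoyed", "favorable"}
NEGATIVE = {"bad", "terrible", "awful", "poor", "disappointing"}

def sentiment(text):
    """Classify text by counting positive vs. negative words."""
    words = text.lower().strip(".").split()
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"

print(sentiment("The food was great."))  # Positive
```

Unlike this word-counting toy, an LLM also picks up on negation, sarcasm, and context, which is why its response above can discuss subjectivity and reasoning rather than just a score.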

LLM Use Cases

LLM changes Industries

A few use cases for LLMs are below. This is just a small set of use cases, but it should give you an idea of the power of LLMs. LLMs will change all types of businesses in some way, shape, or form.

Text Generation: LLMs can generate human-like text, making them invaluable for content generation, including news articles, creative writing, and marketing copy.

Question Answering: They can answer questions based on a given text, making them useful for information retrieval and knowledge bases.

Chatbots and Virtual Assistants: Virtual assistants like Siri, Alexa, and chatbots for customer support utilize LLMs to understand and respond to natural language queries, improving user interaction.

Sentiment Analysis: LLMs can analyze social media posts and reviews to determine the sentiment behind them, helping businesses gather valuable insights into customer opinions.

Text Summarization: These models can generate concise and coherent summaries of lengthy documents, aiding in information retrieval and content curation.

Language Understanding and Generation: LLMs can answer questions, complete sentences, and perform language understanding and generation tasks with human-like fluency.

Medical Diagnosis and Research: LLMs are used to analyze medical records, assist in diagnosis, and conduct medical research, potentially improving patient care and outcomes.

Education: They can provide personalized learning experiences by generating practice questions, explanations, and educational content tailored to individual student needs.

Creativity and Art: LLMs have been used to generate poetry, artwork descriptions, and even assist in the creation of music and video game content.

Zero-shot training and Few-shot training

Zero-shot training refers to the process where you provide only an input prompt to the LLM to get an output response. No additional information is provided.

In few-shot training, the LLM is provided with a small number of example prompts and sample outputs so it can learn from those samples. Once that is done, a new prompt similar to the training examples will elicit a correct response from the LLM. Few-shot training enables models to generalize from a small amount of data and adapt to specific tasks with minimal training. It is valuable for a wide range of applications, including:

Question Answering:

Example: A few-shot-trained model can answer questions based on a few examples provided.

Historical Events:

Example Questions:

“Who was the first president of the United States?”

“When did World War II end?”

Example Answers:

“The first president of the United States was George Washington.”

“World War II ended on September 2, 1945.”

Language Translation:

Example: A model can be trained to translate between language pairs with only a few examples of translated sentences.

Summarization:

Example: Few-shot training can be used for text summarization tasks, where the model summarizes long articles with just a few examples of desired summaries.

Text Classification:

Example: A model can be trained to classify text into categories with limited labeled examples, such as classifying customer reviews as positive or negative.

Sentiment Analysis:

Example: A sentiment analysis model can be fine-tuned with only a small set of example positive and negative sentiment sentences.

Content Generation:

Example: A few-shot-trained model can generate content in specific styles or genres based on minimal examples, such as generating poems, stories, or jokes.

Personal Assistants:

Example: Developing virtual personal assistants that can assist users in specific tasks based on a few provided examples and instructions.

Content Tagging and Categorization:

Example: Tagging content or categorizing items with a few-shot-trained model based on a small set of labeled examples.

Few-shot training is particularly valuable when you have limited labeled data for a specific task, or when you need to adapt a pre-trained model to a new context quickly. It allows machine learning models to generalize and perform effectively with minimal supervision, making it a versatile approach across a variety of natural language processing and machine learning applications.
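One minimal (non-LLM) way to see the few-shot idea is a nearest-neighbor classifier that labels new text by its most similar labeled example. The `similarity` function and the example data below are illustrative assumptions, not how an LLM works internally:

```python
def similarity(a, b):
    """Word-overlap (Jaccard) similarity between two texts."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb)

# A handful of labeled examples -- the "few shots"
examples = [
    ("the food was great and the staff friendly", "positive"),
    ("amazing flavors, would come again", "positive"),
    ("cold food and rude service", "negative"),
    ("a terrible waste of money", "negative"),
]

def classify(text):
    """Label new text with the label of its most similar example."""
    best_example, label = max(examples, key=lambda ex: similarity(text, ex[0]))
    return label

print(classify("the staff was friendly and the food great"))  # positive
```

Four labeled sentences are enough for this toy to generalize to nearby inputs; few-shot prompting gives an LLM the same kind of small example set inside the prompt itself.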

Conclusion

Gen AI Innovations

The tectonic shift that Generative AI brings cannot be overstated. Like any disruptive technology, it will create huge opportunities but also cause significant collateral damage in terms of displaced jobs, reinforced biases, misinformation, and other unintended consequences.

Like the internet and cloud computing that spawned innovative businesses like Netflix, Uber, Airbnb, Google, Amazon AWS, Microsoft Azure and social media companies, Gen AI will do the same in as yet unknown ways. As the saying goes, we continue to live in interesting times.

LLMs are prone to biases and hallucinations, where they provide information that is incorrect or skewed by some factor. For these reasons, LLMs should be used with care and human oversight to make sure the reputation of your business is not tarnished. Every organization should have a set of responsible-AI rules that is followed when creating Generative AI applications.