In this short post, we will talk about Semantic Kernel from Microsoft: what exactly it is and what its key components are.
So, let’s begin!
What is Semantic Kernel?
Simply put, Semantic Kernel is an open-source SDK developed by Microsoft that helps you build AI apps. It is an AI orchestration engine that enables you to combine different components (also known as plugins), Large Language Models (LLMs) and other services to build AI apps.
You can learn more about Semantic Kernel here: https://learn.microsoft.com/en-us/semantic-kernel/overview/.
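To give you a feel for what this looks like in code, here is a minimal sketch of creating a kernel and registering an OpenAI chat model with the Python SDK. This is based on a pre-1.0 version of the semantic-kernel package; class and method names have changed across releases, and the model name and API key are placeholders, so treat it as illustrative rather than definitive.

```python
import semantic_kernel as sk
from semantic_kernel.connectors.ai.open_ai import OpenAIChatCompletion

# The kernel is the orchestrator that plugins, planners and memory plug into.
kernel = sk.Kernel()

# Register a chat model with the kernel (model name and key are placeholders).
kernel.add_chat_service(
    "chat",
    OpenAIChatCompletion("gpt-3.5-turbo", "YOUR_OPENAI_API_KEY"),
)
```

The snippets later in this post assume this kernel object already exists.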
Key Components
Following are the key components of Semantic Kernel:
Plugins
In simplest terms, a plugin is a function. Just like functions in any other SDK, a plugin in Semantic Kernel does something. That something could be sending a user request (prompt) to an LLM to generate a response, or executing functionality that is not possible with an LLM alone (like sending an email).
You can learn more about the plugins here: https://learn.microsoft.com/en-us/semantic-kernel/ai-orchestration/plugins.
There are two kinds of plugins currently supported by Semantic Kernel:
Semantic Plugin
A semantic plugin is a function that interacts with an LLM. It takes a user prompt as input, sends it to the LLM, and returns the LLM's response to that prompt.
Creating a semantic plugin is really simple. All you need are two files: a text file containing the prompt with placeholders for user input (and optionally history and options), and a configuration file in JSON format that describes the parameters affecting the behavior of the underlying LLM, such as the maximum number of output tokens. You can also create a semantic plugin inline in code.
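As a concrete (and entirely made-up) illustration, a summarization plugin on disk could look like the listing below: one folder per function with a prompt template and its configuration. The folder and plugin names are invented for this example, and the exact file layout may differ depending on your Semantic Kernel version.

```
plugins/
  SummarizePlugin/
    Summarize/
      skprompt.txt    <- the prompt template
      config.json     <- the LLM parameters

--- skprompt.txt ---
Summarize the following text in one sentence:
{{$input}}

--- config.json ---
{
  "schema": 1,
  "type": "completion",
  "description": "Summarizes the given text in one sentence",
  "completion": {
    "max_tokens": 256,
    "temperature": 0.0
  }
}
```

The inline alternative looks something like this with the pre-1.0 Python SDK (reusing the kernel created earlier):

```python
# A sketch of building the same semantic function directly in code.
summarize = kernel.create_semantic_function(
    "Summarize the following text in one sentence:\n{{$input}}",
    max_tokens=256,
    temperature=0.0,
)
```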
You can learn more about the semantic plugins here: https://learn.microsoft.com/en-us/semantic-kernel/ai-orchestration/semantic-functions.
Native Plugin
LLMs are extremely good at performing semantic tasks like text generation, text comprehension, and entity recognition, but when you are building an app there are many other things you need to do that are beyond the capabilities and scope of LLMs. For example, you cannot ask an LLM to send out an email or call a protected API. This is where native plugins come into the picture.
A native plugin is a function that can do things a semantic plugin cannot. As mentioned above, you can write a native plugin that sends an email, works with the file system to read/write files, or invokes an API.
You write native plugins in high-level languages (currently only .NET and Python are supported). Writing them is no different from writing functions for regular coding tasks.
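Below is a rough sketch of what a native plugin could look like in Python with the pre-1.0 SDK (the sk_function decorator has since been renamed in newer versions, so adjust to whatever your version uses). The EmailPlugin class and its behavior are made up purely for illustration, and it reuses the kernel created earlier.

```python
from semantic_kernel.skill_definition import sk_function


class EmailPlugin:
    """A hypothetical native plugin that 'sends' an email."""

    @sk_function(description="Sends an email with the given body", name="SendEmail")
    def send_email(self, input: str) -> str:
        # In a real app you would call your email provider here.
        print(f"Sending email: {input}")
        return "Email sent"


# Register the plugin with the kernel so chains and planners can discover it.
email_plugin = kernel.import_skill(EmailPlugin(), skill_name="EmailPlugin")
```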
You can learn more about native plugins here: https://learn.microsoft.com/en-us/semantic-kernel/ai-orchestration/native-functions.
Chains
When building an application, satisfying a user's ask usually requires invoking multiple functions. This is what a chain does in Semantic Kernel.
Simply put, a chain is a collection of plugins (functions) that together accomplish a user's ask. In a chain, plugins are automatically invoked one after the other, with the output of one plugin fed as input to the next plugin in the chain. This is where the "orchestration" part of the definition comes into play.
To explain, let's say you have an app that goes over a chat between a customer service bot and a customer, finds the action that needs to be taken (e.g. send an email, call back) and then takes that action. You create a semantic plugin that analyzes the conversation and extracts the intent (let's call it the "Get Intent" plugin). Once the intent is identified, you need to act on it, so you create a native plugin (let's call it the "Take Action" plugin) that takes the appropriate action (sending the email, etc.). You can chain these two plugins together so that the "Get Intent" plugin is called first with the chat contents as input and produces the intent as output. That intent is automatically fed into the "Take Action" plugin, which takes the appropriate action based on the intent.
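Here is a hedged sketch of what that chain could look like with the pre-1.0 Python SDK, where kernel.run_async invokes the functions in order and pipes each output into the next input. The prompt, plugin names and transcript are invented for this example, and it reuses the kernel and EmailPlugin from the earlier snippets.

```python
import asyncio

# A hypothetical semantic function that extracts the action from a chat transcript.
get_intent = kernel.create_semantic_function(
    "Read the following conversation and reply with the single action "
    "the agent should take (e.g. 'send email', 'call back'):\n{{$input}}",
    max_tokens=32,
    temperature=0.0,
)

# The native "Take Action" step; here we simply reuse EmailPlugin's SendEmail.
take_action = email_plugin["SendEmail"]


async def handle_chat(chat_transcript: str) -> str:
    # run_async invokes the functions in order; the output of get_intent
    # becomes the input of take_action.
    result = await kernel.run_async(get_intent, take_action, input_str=chat_transcript)
    return str(result)


print(asyncio.run(handle_chat("Customer: please email me a copy of my invoice.")))
```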
You can learn more about chains here: https://learn.microsoft.com/en-us/semantic-kernel/ai-orchestration/chaining-functions.
Planners
You can create a chain manually by placing the plugins in the right order. However, at times it is not possible to create a chain by hand, and that's where planners come into play.
In very simple terms, a planner is a function that takes a user's ask as input and creates an orchestration plan to fulfill that ask. The planner itself uses AI to come up with this plan.
Let's understand this with an example. Let's say you are building an AI app that helps users perform basic arithmetic operations (add, subtract, multiply and divide). The user provides the input in natural language (e.g. "What is the sum of 10 and 2?").
To accomplish this, you first create a semantic plugin to find the user's intent (GetIntentPlugin) and then four native plugins to add (AddPlugin), subtract (SubtractPlugin), multiply (MultiplyPlugin) and divide (DividePlugin).
Since a user can ask the app to perform any of these operations, it is not possible for you to create a chain by hand. This is where planners come into play: the planner takes the user's input and creates a chain of plugins (also called a plan) to fulfill the user's request.
For example, if the user's input is "What is the sum of 10 and 2", the planner will create a chain comprising "GetIntentPlugin" and "AddPlugin". Similarly, if the user's input is "How much is 10 divided by 5", the planner will create a chain comprising "GetIntentPlugin" and "DividePlugin", and so on.
Semantic Kernel performs a semantic search over the metadata of all the plugins registered in the kernel to find the best plugins, and the order in which to invoke them, to fulfill the user's ask.
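As a rough sketch, this is what using a planner for the arithmetic example could look like with the BasicPlanner that shipped in the pre-1.0 Python SDK (newer versions use different planner types, so the names below may not match yours). It assumes the GetIntentPlugin and arithmetic plugins described above have already been created and registered on the kernel from the earlier snippets.

```python
import asyncio

from semantic_kernel.planning.basic_planner import BasicPlanner


async def answer(ask: str):
    planner = BasicPlanner()
    # The planner asks the LLM to pick and order registered plugins based on
    # their descriptions, e.g. GetIntentPlugin followed by AddPlugin.
    plan = await planner.create_plan_async(ask, kernel)
    print(plan.generated_plan)  # inspect the plan the LLM produced
    return await planner.execute_plan_async(plan, kernel)


result = asyncio.run(answer("What is the sum of 10 and 2?"))
print(result)
```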
You can learn more about planners here: https://learn.microsoft.com/en-us/semantic-kernel/ai-orchestration/planner.
Memory
LLMs are stateless in nature. What that means is that if you send a request, the LLM serves it and then completely forgets about it. For example, consider the following conversation:
User: Hi! My name is John.
Bot: Hello John! How can I help you today?
Now here’s the next conversation:
User: What is my name?
Bot: I’m sorry, but as an AI assistant, I don’t have access to your personal information, including your name.
In this conversation, the LLM does not remember that the user had provided their name in the previous conversation.
Memory helps solve this problem.
In simple terms, the memory component of your AI app keeps the conversation history, and that history gets plugged into the prompt automatically so that the LLM knows about the previous conversation you have had with it.
With memory enabled, the second conversation would look something like the following:
History:
User: Hi! My name is John.
Bot: Hello John! How can I help you today?
—————————————–
User: What is my name?
Bot: Your name is John. How may I assist you with anything else?
There are different ways to implement memory in your AI app. You can store the entire conversation history, only part of it (e.g. the last "n" exchanges), or just the semantic meaning of the conversation history. You can also choose to keep the memory volatile or persistent. We will go over the details of memory in a subsequent blog post.
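As a minimal illustration of the "history gets plugged into the prompt" idea, the sketch below keeps the whole conversation in a {{$history}} variable and passes it into a chat prompt on every turn. It uses the pre-1.0 Python SDK's context objects and reuses the kernel created earlier; dedicated memory stores (volatile or backed by a vector database) are the more robust option and are covered in the link below.

```python
import asyncio

# The prompt template carries the running conversation plus the new message.
chat_prompt = """{{$history}}
User: {{$user_input}}
Bot:"""

chat_func = kernel.create_semantic_function(chat_prompt, max_tokens=200, temperature=0.7)


async def chat():
    context = kernel.create_new_context()
    context["history"] = ""
    for user_input in ["Hi! My name is John.", "What is my name?"]:
        context["user_input"] = user_input
        answer = await chat_func.invoke_async(context=context)
        # Append this turn to the history so the next prompt remembers it.
        context["history"] += f"\nUser: {user_input}\nBot: {answer}"
        print(f"User: {user_input}\nBot: {answer}\n")


asyncio.run(chat())
```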
You can learn more about memory in Semantic Kernel here: https://learn.microsoft.com/en-us/semantic-kernel/memories/.
LangChain
If you have been building AI apps, I am pretty sure that you have either used or heard of LangChain. I don't think I would be wrong in saying that Semantic Kernel is heavily inspired by it.
In my limited experience working with both of them, I would say that LangChain is much more feature-rich than Semantic Kernel at the moment.
Summary
That’s it for this post folks! I hope you have found it useful. Please do share your thoughts by providing comments. Let me know if you find any mistakes in my post and I will correct them ASAP.
Cheers!