
Microsoft Semantic Kernel – Some Tips & Tricks To Get Rendered Prompts

When you start building a new AI application, you most likely begin with a single, very simple prompt that contains everything you want the model to do.

However, as the application grows, you write more prompts, and that’s when you start templatizing them: extracting the parts that are common across all your prompts and passing them in as template variables.

This makes the prompts easier to manage, but you lose clarity about what actually gets sent to the LLM.

Because the prompt is now a template with a number of template parameters, simply looking at the prompt text will not tell you what is actually being sent to the LLM. Furthermore, you may want to log the prompts (and the responses from the LLM) somewhere in your system so that you can analyze how these prompts perform. For these reasons, you want access to the exact prompts that are being sent to the LLM.

In this post, I will show you three ways in Microsoft Semantic Kernel to find out the exact prompt that is being sent to the LLM for processing.

Sample Prompt

Let’s consider the following prompt that I wrote for an application I am building. BTW, I wrote this prompt with the help of an LLM 🙂 (you can read all about it here). I am writing my prompts in YAML format.

name: Rephrase
description: Use this function to reword an unclear question, considering previous context, for better comprehension.
template_format: handlebars
template: |
  <message role="system">
  Grounding Rules:
  ================
  {{#each grounding_rules}}
  - {{this}}
  {{/each}}
  
  The user has asked a question that may not be clear in its current form and may rely on the context of multiple 
  previous questions and answers. Your task is to rephrase or reword the question, taking into account the conversation history if available, 
  to improve its clarity for a Language Model (LLM) to answer it.
  
  Conversation History:
  ====================
  {{#each chat_history}}
  Question: {{Question}}
  Answer: {{Answer}}
  {{/each}}
  </message>
  
  Current Question:
  =================
  <message role="user">{{question}}</message>
  
  <message role="system">
  Considering the information provided to you, please rephrase or reword the current question to increase its clarity and 
  specificity for a language model. Consider identifying the key elements or concepts within the question, ensuring the 
  language is precise, and avoiding any ambiguity or overly complex language. Remember to incorporate the context provided 
  by the previous questions and answers. Your goal is to create a revised question that maintains the original intent, 
  but is more easily understood by an LLM when considering the conversation history.
  </message>
input_variables:
  - name: question
    description: user question
    is_required: true
  - name: grounding_rules
    description: grounding rules for AI model to behave
    is_required: true
  - name: chat_history
    description: chat history
    is_required: true
execution_settings:
  default:
    temperature: 0

As you can see, my prompt template contains template variables like question, grounding_rules, and chat_history. Semantic Kernel parses the YAML, replaces these template variables with the values I pass in, and then sends the rendered prompt to the LLM.
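
For reference, here is a minimal sketch of how I load this YAML file into a KernelFunction and supply those variables. It assumes you already have a Kernel instance; CreateFunctionFromPromptYaml comes from the Microsoft.SemanticKernel.Yaml package and HandlebarsPromptTemplateFactory from the handlebars prompt templates package, and the file path and values below are just placeholders:

// Load the YAML prompt and turn it into a KernelFunction.
// The handlebars factory is needed because template_format is "handlebars".
var promptYaml = await File.ReadAllTextAsync("Prompts/Rephrase.yaml"); // hypothetical path
var rephraseFunction = kernel.CreateFunctionFromPromptYaml(promptYaml, new HandlebarsPromptTemplateFactory());

// Supply the template variables; these values are placeholders for whatever your application uses.
var kernelArguments = new KernelArguments
{
    ["question"] = "what about the second one?",
    ["grounding_rules"] = new[] { "Answer only from the provided context.", "Be concise." },
    ["chat_history"] = new[]
    {
        new { Question = "What services does Azure Storage offer?", Answer = "Blobs, Queues, Tables and Files." }
    }
};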

Solution

So, how do we get the prompts? As mentioned above, I will show you three ways to get this information.

1. Hooking into Kernel Events (Obsolete, Not Recommended)

The first way is by hooking into (Semantic) Kernel events. The Kernel class exposes a PromptRendered event that is fired when a prompt is rendered, and you can subscribe to this event to get the rendered prompt.

Your code would be something like the following:

private Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder();
    var deploymentId = "your-azure-openai-deployment-id";
    AzureOpenAIClient client = GetAzureOpenAIClientSomehow();
    kernelBuilder.AddAzureOpenAIChatCompletion(deploymentId, client);
    var kernel = kernelBuilder.Build();
    // Subscribe to the PromptRendered event to capture the rendered prompt.
    // Note: this event is marked obsolete in recent versions of Semantic Kernel.
    kernel.PromptRendered += (sender, args) =>
    {
        Console.WriteLine($"Rendered prompt: {args.RenderedPrompt}");
    };
    return kernel;
}

However, you should not be using this approach, as it has been marked as obsolete in the latest version. In fact, if you use it with version 1.3.0 of Semantic Kernel (the most current version at the time of writing this post), you will get a compiler warning telling you not to use it.

2. Use Filters (Experimental)

This is another approach that you can take. I believe this feature was introduced recently, and it is recommended over using kernel events.

Using filters is really easy. You create a custom filter class that implements the IPromptFilter interface and then implement the OnPromptRendering and OnPromptRendered methods to suit your requirements. For example, I could simply write the rendered prompt to the console.

So my code would be something like:

private class PromptFilter : IPromptFilter
{
    public void OnPromptRendering(PromptRenderingContext context)
    {
        // Nothing to do before the prompt is rendered in this example.
    }

    public void OnPromptRendered(PromptRenderedContext context)
    {
        var prompt = context.RenderedPrompt;
        Console.WriteLine($"Rendered prompt: ${prompt}");
    }
}

And this is how I would wire up the filter in the kernel:

kernel.PromptFilters.Add(new PromptFilter());

My complete code for creating the kernel would be:

private Kernel GetKernel()
{
    var kernelBuilder = Kernel.CreateBuilder();
    var deploymentId = "your-azure-openai-deployment-id";
    AzureOpenAIClient client = GetAzureOpenAIClientSomehow();        
    kernelBuilder.AddAzureOpenAIChatCompletion(deploymentId, client);
    var kernel = kernelBuilder.Build();
    // Register the prompt filter so it runs for every prompt rendered by this kernel.
    kernel.PromptFilters.Add(new PromptFilter());
    return kernel;
}

Please note that this feature is still experimental and may change (or even be removed) in future versions.
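
To see the filter in action, you invoke a prompt function as you normally would; the filter runs as part of the invocation. A minimal sketch, assuming the Rephrase function and arguments from the earlier snippet:

// Invoking the function renders the prompt (which triggers OnPromptRendered in the filter)
// and then sends it to the LLM.
var result = await kernel.InvokeAsync(rephraseFunction, kernelArguments);
Console.WriteLine(result.GetValue<string>());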

3. Manual Way

The above two approaches work great (though you should only use the second one, not the first), but at times you may want to get the prompt inline in your application flow rather than outside of it.

For example, the application I am building required me to calculate the prompt and completion token counts and send those back to the user as part of the response.

If your application has this kind of requirement, you can manually create the prompt from the prompt template by passing the arguments.

Here’s the code to do so:

// Read the prompt YAML and turn it into a prompt template configuration.
var promptFileContents = await File.ReadAllTextAsync(promptFilePath);
var promptTemplateConfig = KernelFunctionYaml.ToPromptTemplateConfig(promptFileContents);

// The template uses handlebars, so use the handlebars template factory to build the template.
var factory = new HandlebarsPromptTemplateFactory();
if (!factory.TryCreate(promptTemplateConfig, out var promptTemplate)) throw new InvalidOperationException("Unable to create prompt template.");

// Render the template with the kernel and the arguments to get the final prompt text.
var prompt = await promptTemplate.RenderAsync(kernel, kernelArguments);

Here, I first read the entire prompt template YAML file and create a PromptTemplateConfig from it. Because my prompt template uses handlebars templating, I create a HandlebarsPromptTemplateFactory and extract the prompt template from the prompt template configuration. I then render the prompt by passing in the kernel and the arguments.

The advantage of this approach is that I can get my prompt inline with my code flow and then use it any way I see fit.
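
For example, to satisfy the token-count requirement I mentioned earlier, I can run the rendered prompt through a tokenizer. This is just a sketch assuming the SharpToken NuGet package and the cl100k_base encoding; use whatever tokenizer matches your model:

// Count prompt tokens for the rendered prompt (requires the SharpToken package: using SharpToken;).
var encoding = GptEncoding.GetEncoding("cl100k_base");
var promptTokenCount = encoding.Encode(prompt).Count;
Console.WriteLine($"Prompt tokens: {promptTokenCount}");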

However, because this approach parses the raw YAML file, it will not work if your prompt template calls other functions (for example, a native function) inside it. So please use this approach cautiously.

Summary

That’s it for this post. I hope you have found the information useful. Semantic Kernel (and AI tools in general) is changing very rapidly (quite evident from the fact that kernel events were deprecated within a few minor releases), so I would highly recommend consulting the official documentation for the most current functionality.

Happy Coding!

