Chatbot: From Zero to Hero

Skelia is not all about augmenting teams with top-class developers. Sometimes, we just like to experiment with technologies a bit. A couple of our senior .NET developers had some free time and decided to investigate the area around creating chatbots, and here’s their experience of building a chatbot in a couple of weeks. We think that anyone who wants to start developing a chatbot but doesn’t know where to start would appreciate reading this article.

How it All Started

The goal we initially strived for was building a chatbot to assist users in meeting room booking. The chatbot was supposed to function based on an internal booking system.

We started working on the project as a team of three .NET developers, and none of us had any real prior experience with building a chatbot. We didn’t know where to start and only vaguely understood how it should work. But what we had was determination and endless curiosity to implement the project anyway.

What Was the Plan?

First things first: we intended to use Microsoft services for bot development to get familiar with their infrastructure and availability. Google helped us here and pointed toward the Azure Bot Service. The service immediately directs you to the chatbot ecosystem. So, we went from there and dived deeper into the mind of our chatbot.

What is a Chatbot Anyway?

Before we go into the details of how to build a chatbot, let’s recall the basics. A chatbot is an application users interact with in a conversational way using text, graphics or voice. Each chatbot consists of several essential elements:

  • Each interaction between the chatbot and the user generates an activity.
  • The Bot Framework Service, a component of the Azure Bot Service, sends information from a user’s bot-connected app or channel (such as Facebook or Skype) to the bot.
  • A turn consists of the user’s input and any immediate response of the bot to the user (in a conversation, people often speak one at a time, taking turns; since bots are supposed to imitate human conversation, they generally react to user input).
  • The turn context object provides information about the activity, such as the sender and receiver, the channel and other data needed to process the activity.

Keep in mind that the turn context is one of the most important abstractions in the Bot Framework SDK. It carries the inbound activity to the middleware components and the application logic, and it also provides the mechanism through which the middleware components and the application logic can send outbound activities.
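To make the idea more tangible, here is a minimal sketch of a turn in plain C#. It does not use the SDK: `Activity`, `SimpleTurnContext` and `TurnDemo` are hypothetical stand-ins we made up for illustration, and the real types carry far more data.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical, simplified stand-in for the SDK's activity type.
public record Activity(string Type, string From, string Recipient, string ChannelId, string Text);

// Hypothetical, simplified stand-in for the turn context.
public class SimpleTurnContext
{
    public Activity Inbound { get; }
    public List<Activity> Sent { get; } = new();

    public SimpleTurnContext(Activity inbound) => Inbound = inbound;

    // Outbound activities go through the context and are addressed back to the sender.
    public Task SendAsync(string text)
    {
        Sent.Add(new Activity("message", Inbound.Recipient, Inbound.From, Inbound.ChannelId, text));
        return Task.CompletedTask;
    }
}

public static class TurnDemo
{
    // Middleware sees the context before the bot logic and decides when to call next().
    public static async Task LoggingMiddleware(SimpleTurnContext ctx, Func<Task> next)
    {
        Console.WriteLine($"[{ctx.Inbound.ChannelId}] {ctx.Inbound.From}: {ctx.Inbound.Text}");
        await next();
    }

    // The application logic replies through the same context object.
    public static Task BotLogic(SimpleTurnContext ctx) =>
        ctx.SendAsync($"You said: {ctx.Inbound.Text}");

    // One turn: the inbound activity passes through the middleware, then the bot logic.
    public static async Task<SimpleTurnContext> RunTurnAsync(Activity inbound)
    {
        var ctx = new SimpleTurnContext(inbound);
        await LoggingMiddleware(ctx, () => BotLogic(ctx));
        return ctx;
    }
}
```

Calling `TurnDemo.RunTurnAsync(new Activity("message", "alice", "bot", "skype", "hi"))` leaves a single outbound activity in `Sent`, addressed back to alice on the same channel.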

Inside the Chatbot Ecosystem

Here’s what the most common commercial chatbot architecture looks like:

[Image: Chatbot ecosystem]

Let’s take a look at each step in detail:

  1. The customer uses your mobile app.
  2. The user authenticates through Azure AD B2C.
  3. Using the custom Application Bot, the user requests information.
  4. Cognitive Services help process the natural-language request.
  5. The customer reviews the response and refines the question using natural conversation.
  6. When the user is happy with the results, the Application Bot executes the required functionality.
  7. Application Insights gathers runtime telemetry to track and adjust bot performance and usage.

Having figured this out, we dug deeper into the topic, exploring ways to develop an actual bot application.

Creating a New Bot Service

Microsoft has several projects you can use as a starting point for your bot, such as this repository. As for us, we used the handy Azure Bot Service documentation page to start.
To create a new bot service, you have to take the following steps:

    1. Log into the Azure portal.
    2. Create a new resource, the Web App Bot.
      [Image: Web App Bot]
    3. Fill in the information on location, resource groups, the application plan and other aspects in the bot service popup.

Note: We recommend starting with West US as the location so that LUIS (the natural language processing tool) is set up in a location with free endpoints.

    4. Click Create to create the service and deploy the bot to the cloud (this may take several minutes).
    5. Test your bot in the web chat straight in Azure.
    6. Download the code.
    7. Continue developing the bot locally on your device.

Now, let’s talk about the prerequisites for your local environment. Here are the essentials:

  1. Visual Studio
  2. Bot Builder SDK
  3. Bot Framework Emulator
  4. .NET Core SDK
  5. LUISGen

With all that installed, you can easily open your downloaded code, launch the emulator, and connect your bot.

The Issue We Faced

While the outlined process seems pretty straightforward, we did stumble upon a small issue. Microsoft uses appsettings.json to hold the configuration, but the Bot Framework Emulator requires a .bot configuration file to work. Luckily, it’s easy to create one with an emulator and a bot solution on hand.

You can interact with your bot using the emulator interface, which looks like this:

[Image: Emulator interface]

Bot Builder SDK

Bot Builder probably provides the most comprehensive experience for building conversational applications. That’s why we used the Bot Builder Framework v.4. One of the best things it offers is the ability to use the .NET Core framework, which allowed us to deploy to Linux at some point.

Now, it was time to start learning!

Application Flow

Just like apps and websites, bots have UI, too. But their UI consists of messages rather than screens. Messages may contain buttons, text and other elements or be entirely speech-based.

While a traditional application or website can request multiple pieces of information on a screen all at once, a bot gathers all the required information using multiple messages. This makes info gathering an active experience since the user is having an active conversation with the bot every time he or she shares any information.

A conversation generally follows a procedural flow: the bot asks the user a series of questions to gather all the information it needs before processing a task. You define the question order, so you can organize the questions into logical modules to keep the code centralized.

For example, you can design one module to contain the logic that helps the user browse for products and another to help create a new order.

The Role of Dialogs in Building a Chatbot

Dialogs are immensely important. They provide a convenient way to manage a conversation with the user. Dialogs are bot structures that act like functions in your chatbot’s program. Every dialog performs a specific task in a particular order.

Dialogs can be thought of as a programmatic stack, which we call the dialog stack. The turn handler directs it and serves as the fallback if the stack is empty. The topmost stack item is the active dialog, to which the dialog context directs all the input.

When a dialog begins, it is pushed onto the stack and becomes the active dialog. It remains active until it ends, is removed by the replace dialog method, or another dialog is pushed onto the stack (either by the turn handler or by the active dialog itself). When that new dialog ends, it is popped off the stack and the next dialog down becomes the active dialog again.
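The push, pop and replace behaviour can be sketched with an ordinary stack. This is a toy model only: `DialogStackModel` and its methods are hypothetical names, not the SDK’s dialog context, and dialogs are reduced to their IDs.

```csharp
using System.Collections.Generic;

// Hypothetical toy model of the dialog stack; dialogs are represented by their IDs only.
public class DialogStackModel
{
    private readonly Stack<string> _stack = new();

    // Beginning a dialog pushes it onto the stack and makes it the active dialog.
    public void Begin(string dialogId) => _stack.Push(dialogId);

    // Ending pops the active dialog; the next one down becomes active again.
    public void End()
    {
        if (_stack.Count > 0)
        {
            _stack.Pop();
        }
    }

    // Replacing swaps the active dialog for another one at the same stack position.
    public void Replace(string dialogId)
    {
        End();
        Begin(dialogId);
    }

    // Input goes to the topmost (active) dialog, or to the turn handler if the stack is empty.
    public string Route(string input) =>
        _stack.Count == 0
            ? $"turn handler handles '{input}'"
            : $"{_stack.Peek()} handles '{input}'";
}
```

With an empty stack, `Route("hi")` falls back to the turn handler. After `Begin("MainDialog")` and `Begin("BookingDialog")`, input goes to BookingDialog, and after `End()` it goes to MainDialog again.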

The traditional dialog flow looks something like this:

[Image: Dialog flow]

The Dialog Class Hierarchy

Dialogs in the Bot Framework v.4 come in several different types: prompts, waterfall dialogs and component dialogs. Here’s the hierarchy:

[Image: Bot Framework dialog class hierarchy]

  • Prompts
    Prompts make it simple to ask the user for information and evaluate their responses. For example, you specify the question or information you’re asking with a number prompt, and the prompt checks whether it has received a valid number response or not. If yes, the conversation continues. Otherwise, the user gets re-prompted for a valid answer.
  • Component dialogs
    Component dialogs provide a strategy for creating independent dialogs that handle specific scenarios, breaking a large dialog set into more manageable pieces. Each of these pieces has its own dialog set and avoids any name collisions with the dialog set that contains it. In this way, you can write a reusable dialog suitable for different scenarios, such as an address dialog requesting values for the street name, city and zip code.
  • Waterfall dialogs
    A waterfall dialog collects information from the user or guides the user through a series of tasks. Each step of the conversation is implemented as an asynchronous function that takes a waterfall step context (step) parameter. At each step, the bot prompts the user for input (or can begin a child dialog), waits for a response and then passes the result to the next step. The result of the first function is passed as an argument to the next function, and so on. Here’s how waterfall dialogs work:

    [Image: Waterfall dialogs]
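The way each waterfall step hands its result to the next one can be sketched in a few lines of plain C#. This is a simplified model under our own assumptions: `WaterfallModel`, `Step` and `BookingSteps` are hypothetical, the user’s replies are hard-coded, and the real SDK pauses between steps while it waits for the user.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical toy waterfall: every step receives the result of the previous step.
public static class WaterfallModel
{
    public delegate Task<object> Step(object previousResult);

    // Runs the steps in order, threading each result into the next step.
    public static async Task<object> RunAsync(IEnumerable<Step> steps, object initial = null)
    {
        var result = initial;
        foreach (var step in steps)
        {
            result = await step(result);
        }
        return result;
    }

    // Example steps for a booking conversation; the replies are hard-coded for illustration.
    public static Step[] BookingSteps() => new Step[]
    {
        _ => Task.FromResult<object>("Room 1"),                          // asks for the room
        room => Task.FromResult<object>($"{room}, 10:00"),               // asks for the start time
        partial => Task.FromResult<object>($"Booked {partial}-11:00"),   // asks for the end time
    };
}
```

Running `WaterfallModel.RunAsync(WaterfallModel.BookingSteps())` yields the string built up across all three steps, which is the same hand-off the waterfall step context gives you through its result.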

The basic bot template we downloaded from Azure contained examples of all three dialog types, so it was easy to learn about each type in the code.

Then, we had to decide what our dialog structure would look like and what dialog types we had to use for our bot.

Waterfall Issue and Alternative

As mentioned, our goal was to build a room booking bot. Our first solution was to use a mixture of Waterfall and Component dialogs to capture the duration of a booking, defined by the start and the end time. Naturally, these dialogs contained prompts for the date and time and for the meeting room number.

However, we noticed that handling Waterfall dialogs required much more care regarding all user inputs and different corner or error cases. So, we found an alternative – the FormFlow Dialog library. There was an issue though: the library was compatible with Bot Framework v.3 only and not really available for v.4, which was the one we used.

Lucky for us, the Bot Builder community shared some free packages of the library suitable for our framework version, which you can find here. Using these packages was not a silver bullet for us, but we managed to write some additional .NET code that used reflection to change some internal states of the dialogs, which helped.

We hope this issue will be solved soon since the community releases new versions pretty often. So, be sure to check the community to stay tuned for updates.

FormFlow

FormFlow automatically generates dialogs when they are needed based on the guidelines you set. These dialogs manage a guided conversation. Designing a guided conversation using FormFlow can significantly reduce the time it takes to develop a bot. But don’t forget that using the library means giving up some flexibility in creating and managing dialogs on your own.

For our bot, we used FormFlow for two dialogs, Availability and Booking. The library helps prompt, validate and control the flow of these dialogs.

Here’s the FormFlow code for our booking model:

 
    [Serializable]
    public class BookingModel
    {
        // {||} renders the enum values as selectable choices.
        [Prompt("Please enter room number.{||}")]
        public RoomNumber? RoomNumber { get; set; }

        [Prompt("Please enter from date")]
        [Describe("From")]
        public DateTime? FromDateTime { get; set; }

        [Prompt("Please enter to date")]
        [Describe("To")]
        public DateTime? ToDateTime { get; set; }

        // Defines the field order, the confirmation step and the completion handler.
        public IForm<BookingModel> BuildForm()
        {
            return new FormBuilder<BookingModel>()
                .Field(nameof(RoomNumber))
                .Field(nameof(FromDateTime))
                .Field(nameof(ToDateTime))
                .Confirm(string.Format(BookingStrings.CONFIRM_BOOKING, "{RoomNumber}", "{FromDateTime}", "{ToDateTime}"))
                .OnCompletion(FormCompleted)
                .Build();
        }

        private async Task FormCompleted(DialogContext context, BookingModel state)
        {
            ...
        }
    }
 

The RoomNumber property uses an enum to hold all the available meeting rooms. Here’s how the prompt looks in Skype, for example:

[Image: Room Number prompt in Skype]

Additionally, any FormFlow dialog contains the following useful commands:

  1. Help: see the kinds of responses you can enter
  2. Quit: quit the form without completing it
  3. Reset: start filling in the form over (with defaults from your previous entries)
  4. Back: go back to the previous question
  5. Status: see your progress in filling in the form so far

One of the benefits of using the library is that these features are already there; if you use the Waterfall dialog flow, for instance, you have to implement them manually. In addition, you can switch to a different field by entering its name (Room Number, From or To, in our case).

Natural Language Processing

While still working on the dialog flow, we moved on to natural language processing. Microsoft lets you use its natural language processing service, LUIS, which is designed to help determine user intent. The bot can then use the LUIS response to choose which dialog should process the user request.

LUIS

Microsoft describes LUIS as a “cloud-based API service that applies custom machine-learning intelligence to a user’s natural language text to predict overall meaning and pull out relevant, detailed information.”
Once the LUIS app is published, a client application sends utterances to the LUIS natural language processing endpoint API. The results are received as JSON responses.

The LUIS model begins with intents. Each intent needs examples of user utterances. Each utterance can provide a variety of data that needs to be extracted with entities. We used LUIS to determine user intentions and parse user input into a machine-ready JSON formatted response.
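As a rough illustration of such a JSON response, the sketch below parses a sample with System.Text.Json. The sample is made up by us and only shaped like a LUIS v2 prediction response; the intent and entity names in it are assumptions, and `LuisResponseSketch` is a hypothetical helper, not part of any SDK.

```csharp
using System.Text.Json;

public static class LuisResponseSketch
{
    // Made-up sample, shaped like a LUIS v2 prediction response (names are assumptions).
    public const string Sample = @"{
      ""query"": ""book room 3 tomorrow"",
      ""topScoringIntent"": { ""intent"": ""Book_meeting_room"", ""score"": 0.93 },
      ""entities"": [
        { ""entity"": ""3"", ""type"": ""Meeting_room_number"", ""startIndex"": 10, ""endIndex"": 10 }
      ]
    }";

    // Extracts the top intent, its score and the type of the first entity (null if there is none).
    public static (string Intent, double Score, string FirstEntityType) Parse(string json)
    {
        using var doc = JsonDocument.Parse(json);
        var root = doc.RootElement;
        var top = root.GetProperty("topScoringIntent");
        var entities = root.GetProperty("entities");

        var firstEntityType = entities.GetArrayLength() > 0
            ? entities[0].GetProperty("type").GetString()
            : null;

        return (top.GetProperty("intent").GetString(), top.GetProperty("score").GetDouble(), firstEntityType);
    }
}
```

In our bot we did not parse the JSON by hand; the LUISGen-generated model class did that for us. The sketch is only meant to show what kind of data comes back.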

Here’s the model we got:

  • Intents
    1. Book meeting room
    2. Check availability
    3. Greeting
    4. Help
    5. Change name
  • Entities
    1. Booking time (From, To, Duration)
    2. Meeting room number
    3. Datetime

Here’s how we used LUIS for our bot:

[Image: LUIS]

Book meeting room is an intent with a list of utterances and entities that the bot processes per request.

We also used the LUISGen tool to generate a model file that would contain enums and information on working with LUIS responses.

Here’s the code in our main dialog using LUIS to determine which child dialog to pick:

 
protected override async Task RouteAsync(DialogContext innerDc, CancellationToken cancellationToken = default(CancellationToken))
{
    var intent = await LuisHelper.ExecuteLuisQuery(_logger, innerDc.Context, cancellationToken);

    // Read the top intent once and reuse both the intent and its score.
    var (topIntent, score) = intent.TopIntent();

    // Only route when LUIS is confident enough about the intent.
    if (score > MinIntentScore)
    {
        switch (topIntent)
        {
            case Luis.BotModel.Intent.Greeting:
                // Pass the token by name: the second positional parameter is the dialog options object.
                await innerDc.BeginDialogAsync(nameof(GreetingDialog), cancellationToken: cancellationToken);
                break;

            case Luis.BotModel.Intent.Check_Availability:
                var bookingAvailabilityModel = _bookingAvailabilityModel.ProcessCheckAvailabilityIntent(intent);
                await ResetAndShowFormDialog(innerDc, cancellationToken, nameof(BookingAvailabilityModel), bookingAvailabilityModel);
                break;

            case Luis.BotModel.Intent.Rename_user:
                await OnRename(innerDc, cancellationToken);
                break;
            ...
        }
    }
}

We call the LUIS endpoint to get the top intent and, based on it, decide which dialog to start.
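The routing decision itself boils down to "take the highest-scoring intent, check it against a threshold, map it to a dialog". Here is a self-contained sketch of that logic. The threshold value, the fallback to a help dialog and all dialog names except GreetingDialog are our assumptions for illustration, and `IntentRouter` is a hypothetical helper.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class IntentRouter
{
    // Assumed threshold; tune it to how confident the model needs to be.
    public const double MinIntentScore = 0.5;

    // Picks a dialog ID for the highest-scoring intent, or a fallback when confidence is low.
    public static string ChooseDialog(IReadOnlyDictionary<string, double> intentScores)
    {
        if (intentScores.Count == 0)
        {
            return "HelpDialog";
        }

        var top = intentScores.OrderByDescending(pair => pair.Value).First();
        if (top.Value <= MinIntentScore)
        {
            return "HelpDialog"; // assumed fallback for low-confidence predictions
        }

        return top.Key switch
        {
            "Greeting" => "GreetingDialog",
            "Check_Availability" => "BookingAvailabilityDialog",
            "Book_meeting_room" => "BookingDialog",
            _ => "HelpDialog",
        };
    }
}
```

Keeping this decision in one small function makes it easy to unit test the routing separately from the LUIS call.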

With this in mind, here’s a diagram on how communication works within a chat dialog:

[Image: Communication within a chat dialog]

Then, we worked on the additions to different dialogs and trained our LUIS model to understand the requests. Eventually, we got the Greeting and Help Waterfall dialogs and Check availability and Booking FormFlow dialogs.

We also added a repository layer that would do all the dirty work, such as lots of authorizing, checking for available meeting rooms and booking them per request.

State Management and CosmosDB

Now it’s time to ask yourself why you need state in the first place.

Maintaining state lets your bot have meaningful conversations because it remembers certain things about a user throughout the conversation. For example, if the bot has already talked to a specific user, it can save information about that user so that it won’t have to ask for it again. State also stores data for longer than the current turn, so your bot keeps the information throughout a multi-turn conversation.

With this said, one additional thing that we wanted to have was a personalized approach to the user. The idea was to make the bot recognize the user and assign a personal name to the bookings conducted by the same person. At that point, we needed some kind of storage to work with, so we used CosmosDB. It turned out to be pretty easy to set up on our test Azure account.

State management automates the reading and writing of your bot’s state to the underlying storage layer. State is stored as state properties, which are effectively key-value pairs that your bot reads and writes through the state management object with no worries regarding a specific underlying implementation.

State properties are lumped into scoped ‘buckets’ that help organize the properties. The SDK includes three such ‘buckets’:

  • user state
  • conversation state
  • private conversation state

These predefined buckets are scoped to certain visibility, depending on the bucket:

  • user state is available in any turn regardless of the conversation
  • conversation state is available in any turn in a specific conversation, irrespective of the user (such as group conversations)
  • private conversation state is scoped to both a particular conversation and to that particular user
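One way to picture these scopes is through the storage keys each bucket could use. The sketch below is modeled on the key layout we understand the SDK to use, but treat the exact format as an assumption; `StateKeys` and `BucketStore` are hypothetical names for illustration.

```csharp
using System.Collections.Generic;

// Hypothetical sketch of how the three state buckets can be keyed in storage.
public static class StateKeys
{
    // User state: no conversation ID in the key, so every conversation sees the same bucket.
    public static string User(string channelId, string userId) =>
        $"{channelId}/users/{userId}";

    // Conversation state: no user ID in the key, so all participants share the bucket.
    public static string Conversation(string channelId, string conversationId) =>
        $"{channelId}/conversations/{conversationId}";

    // Private conversation state: scoped to one user within one conversation.
    public static string PrivateConversation(string channelId, string conversationId, string userId) =>
        $"{channelId}/conversations/{conversationId}/users/{userId}";
}

// A tiny in-memory key-value store standing in for the storage layer.
public class BucketStore
{
    private readonly Dictionary<string, Dictionary<string, object>> _buckets = new();

    public void Set(string bucketKey, string property, object value)
    {
        if (!_buckets.TryGetValue(bucketKey, out var bucket))
        {
            bucket = new Dictionary<string, object>();
            _buckets[bucketKey] = bucket;
        }
        bucket[property] = value;
    }

    public bool TryGet(string bucketKey, string property, out object value)
    {
        value = null;
        return _buckets.TryGetValue(bucketKey, out var bucket) && bucket.TryGetValue(property, out value);
    }
}
```

A name saved under `StateKeys.User("skype", "u1")` can be read back in any conversation with that user, while a value saved under `StateKeys.Conversation("skype", "c1")` is not visible from conversation c2.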

When Should You Use Each Type of State?

Conversation state is good for tracking the context of the conversation, such as:

  • identifying the type of question the user asks the bot
  • determining the topic of the current and previous conversation

User state is good for tracking information about the user, such as:

  • non-critical user information (name and preferences, an alarm setting or an alert preference)
  • information from the last conversation with the bot

Private conversation state is useful for channels that support group conversations to track both user- and conversation-specific information. For example, if you have a classroom clicker bot, the bot can:

  • aggregate and display student responses for a given question
  • aggregate each student’s performance and privately relay that back to them at the end of the session

The backend that stores the state information is our storage layer. Think of it as the physical storage, such as in-memory storage, Azure or a third-party server. The Bot Framework SDK includes some implementations for the storage layer, and we decided to use them in our case.

Here’s the code that sets up the storage place in Startup.cs:

// Create the storage we'll be using for Conversation state.
IStorage memoryStore = new MemoryStorage();
services.AddSingleton(memoryStore);

// Initialize bot service clients and add a singleton your bot can access through dependency injection.
// The list's element type is assumed to be ConnectedService, the base class of CosmosDbService.
var botServices = new List<ConnectedService>
{
    new CosmosDbService
    {
        Id = ServiceTypes.CosmosDB,
        Key = Configuration[CosmosSubscriptionKey],
        Endpoint = Configuration[CosmosEndpoint],
        Database = Configuration[CosmosDatabase],
        Collection = Configuration[CosmosCollection]
    }
};
var connectedServices = new BotServices(botServices);
services.AddSingleton(sp => connectedServices);

connectedServices.StorageServices.TryGetValue(ServiceTypes.CosmosDB, out var dataStore);

// User state is persisted in CosmosDB; conversation state stays in memory.
var userState = new UserState(dataStore);
var conversationState = new ConversationState(memoryStore);

// Create the User state. (Used in this bot's Dialog implementation.)
services.AddSingleton(userState);
// Create the Conversation state. (Used by the Dialog system itself.)
services.AddSingleton(conversationState);

services.AddSingleton(new BotStateSet(userState, conversationState));

You can see that we store the user state in CosmosDB, while the conversation state is kept in memory.

Here’s how you can manage your state properties:

  • MainDialog.cs
    private UserState _userState;
    _state = _userState.CreateProperty<GreetingState>(nameof(GreetingState));

  • GreetingState.cs
    public class GreetingState
    {
        public string Name { get; set; }
    }

  • GreetingDialog.cs
    public class GreetingDialog : InterruptableDialog
    {
        private IStatePropertyAccessor<GreetingState> _accessor;
        private GreetingState _state;

        ...

        private async Task<DialogTurnResult> AskNextStepAsync(
            WaterfallStepContext step,
            CancellationToken cancellationToken = default(CancellationToken))
        {
            // Load the saved state, or create a fresh one for a new user.
            _state = await _accessor.GetAsync(step.Context, () => new GreetingState());
            _state.Name = (string)step.Result;

            …
        }
    }

Having trained LUIS and put all that functionality in place, we proceeded to the next phase.

Using Voice with Microsoft Framework Bots

Making a bot understand a user’s voice makes it more welcoming. Here are some ways of using voice for both input and output:

  • use the voice input functionality when testing your bot, in particular through the Web Chat client, which is available directly on the Azure portal after clicking the microphone button:

[Image: Web Chat client]

Note: This can replace the need to enter text by hand. However, we’ve seen this feature on the portal only, meaning that it’s not available on other channels like Skype or Microsoft Teams.

  • develop an IVR (Interactive Voice Response) bot with a separate endpoint

Note: We decided not to spend extra effort implementing this since IVR bots are used mainly for dial-response customer service and simple information lookup.

Deploy and Publish

The next two steps, deploying and publishing the bot, are easy to set up in Azure:

[Image: Chatbot deploy]

In our project, we moved the bot directly on-premise because our meeting resources are only available in the local network. We chose Skype and Microsoft Teams as our channels.

Here’s what the bot architecture looked like when hosted on-premise:

[Image: On-premise chatbot architecture]

But Azure offers continuous delivery out of the box. You can set up a repository for continuous deployment and channels to which the bot will be deployed.

Since we wanted to take advantage of using .NET Core for our development, we decided to put our bot application in a Docker container. To create and prepare the container for shipping, we used the following commands:

docker build -t kelisa .

docker save kelisa:latest > kelisa.tar

(run in the root folder of the project; Kelisa is the name of our bot)

The image can now be shipped to an environment with Docker installed and started on a specific port using the following commands:

docker load < kelisa.tar
docker run -d -p 8080:80 --name kelisa kelisa

(you can use any available port instead of :8080)

Next, just configure your web server of choice to map your selected port to an external IP/hostname. Then point your Azure bot service to this external address in the Settings section. It can take up to a couple of minutes for the bot to be ready to respond.
If you want to test the bot locally without connecting it to a web server first, go for ngrok.

Logging

We used Serilog to log and store information and errors. If an error occurs, use the following command to export the container’s contents:

docker export kelisa > kelisa-export.tar

Then, open the archive, go to app/Logs/{fileNameWithDate}.txt and see the available logs. But keep in mind that the export feature is currently available for Linux containers only.
You can also set up other loggers, such as Application Insights, if you need to.

Wrapping Up

And that’s how we built our bot without knowing where to start! Now, we have a fully working Skype bot to assist users in booking meeting rooms in a conversational manner. For us, the main point of this experiment was not the destination but the actual journey we took to create a chatbot.
Thanks for getting through that whole lot of text with us! Stay tuned for our next adventures.
