A chatbot is an artificial intelligence powered piece of software in a device, application, web site or alternative networks that try to complete consumer’s needs and then assist them to perform a selected task. Now a days almost every company has a chatbot deployed to interact with the users.
Chatbots are often used in many departments, businesses and every environment. They are artificial narrow intelligence (ANI). Chatbots only do a restricted quantity of task i.e. as per their design. However, these Chatbots make our lives easier and convenient. The trend of Chatbots is growing rapidly between businesses and entrepreneurs, and are willing to bring chatbots to their sites. You might also produce it yourself using Python.
There are broadly two variants of chatbots: Rule-Based and Self learning.
The field of study that focuses on the interactions between human language and computers is called Natural Language Processing. NLP is a way for computers to analyze, understand, and derive meaning from human language in a smart and useful way. However, if you are new to NLP, you can read Natural Language Processing in Python.
NLTK (Natural Language Toolkit) is a leading platform for building Python programs to work with human language data. It provides easy-to-use lexical resources such as WordNet, along with a suite of text processing libraries.
Importing necessary libraries
import nltk
import numpy as np
import random
import string # to process standard python strings
Copy the content in text file named ‘chatbot.txt’, read in the text file and convert the entire file content into a list of sentences and a list of words for further pre-processing.
f=open('chatbot.txt','r',errors = 'ignore')
raw=f.read()
raw=raw.lower()# converts to lowercase
nltk.download('punkt') # first-time use only
nltk.download('wordnet') # first-time use only
sent_tokens = nltk.sent_tokenize(raw)# converts to list of sentences
word_tokens = nltk.word_tokenize(raw)# converts to list of words
We shall now define a function called LemTokens which will take as input the tokens and return normalized tokens.
lemmer = nltk.stem.WordNetLemmatizer()
#WordNet is a semantically-oriented dictionary of English included in NLTK.
def LemTokens(tokens):
return [lemmer.lemmatize(token) for token in tokens]
remove_punct_dict = dict((ord(punct), None) for punct in string.punctuation)
def LemNormalize(text):
return LemTokens(nltk.word_tokenize(text.lower().translate(remove_punct_dict)))
Define a function for greeting by bot i.e. if user’s input is greeting, the bot shall return a greeting response.
GREETING_INPUTS = ("hello", "hi", "greetings", "sup", "what's up","hey",)
GREETING_RESPONSES = ["hi", "hey", "*nods*", "hi there", "hello", "I am glad! You are talking to me"]
def greeting(sentence):
for word in sentence.split():
if word.lower() in GREETING_INPUTS:
return random.choice(GREETING_RESPONSES)
To generate a response from our bot for input queries, the concept of document similarity is used. Therefore, we start by importing necessary modules.
From scikit learn library, import the TFidf vector to convert a collection of raw documents to a matrix of TF-IDF features
from sklearn.feature_extraction.text import TfidfVectorizer
Also, import cosine similarity module from scikit learn library
from sklearn.metrics.pairwise import cosine_similarity
This will be used to find the similarity between words entered by the user and therefore the words within the corpus. This can be the simplest possible implementation of a chatbot.
Define a function response that searches the user’s vocalization for one or more known keywords and returns one of several possible responses. If it doesn’t find the input matching any of the keywords, it returns a response: “I’m sorry! I don’t understand you”
def response(user_response):
robo_response=''
sent_tokens.append(user_response)
TfidfVec = TfidfVectorizer(tokenizer=LemNormalize, stop_words='english')
tfidf = TfidfVec.fit_transform(sent_tokens)
vals = cosine_similarity(tfidf[-1], tfidf)
idx=vals.argsort()[0][-2]
flat = vals.flatten()
flat.sort()
req_tfidf = flat[-2]
if(req_tfidf==0):
robo_response=robo_response+"I am sorry! I don't understand you"
return robo_response
else: robo_response = robo_response+sent_tokens[idx]
return robo_response
I have tried to explain in simple steps how you can build your own chatbot using NLTK and of course it’s not an intelligent one.
I hope you guys have enjoyed reading.
Happy Learning!!!