A Neural Conversational Model


TLDR; Using a sequence-to-sequence (seq2seq) model, the authors train a network to map input sequences (conversation context) to output sequences (responses). The objective is to hold a conversation whose responses make sense and help accomplish whatever end goal the context implies.

Detailed Notes:

  • Given an input sequence, we want to predict an appropriate output sequence using a sequence-to-sequence (seq2seq) RNN model.


  • The greedy approach during inference feeds each predicted token back in as the next input and always takes the single most likely token; a less greedy alternative is beam search, which keeps the top few partial hypotheses at each step and then picks the one whose tokens have the highest product of probabilities.
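The greedy-versus-beam-search contrast above can be sketched with a toy next-token probability table standing in for the paper's trained LSTM (the table, token names, and function names here are made up for illustration; a real seq2seq decoder exposes the same step-by-step next-token distribution):

```python
import math

# Hypothetical next-token model: P(next | previous token). In the paper this
# distribution comes from the decoder RNN; here it is a hand-crafted table.
PROBS = {
    "<s>":   {"hi": 0.6, "hello": 0.4},
    "hi":    {"there": 0.4, "</s>": 0.6},
    "hello": {"world": 0.95, "</s>": 0.05},
    "there": {"</s>": 1.0},
    "world": {"</s>": 1.0},
}

def greedy_decode(max_len=5):
    """Always take the single most likely next token, feeding it back in."""
    seq, tok = [], "<s>"
    for _ in range(max_len):
        tok = max(PROBS[tok], key=PROBS[tok].get)
        if tok == "</s>":
            break
        seq.append(tok)
    return seq

def beam_decode(beam_width=2, max_len=5):
    """Keep the top-k partial hypotheses, ranked by the product of their
    token probabilities (tracked as a sum of log-probs for stability)."""
    beams = [(0.0, ["<s>"])]              # (log-prob, tokens)
    finished = []
    for _ in range(max_len):
        candidates = []
        for logp, toks in beams:
            for nxt, p in PROBS[toks[-1]].items():
                cand = (logp + math.log(p), toks + [nxt])
                (finished if nxt == "</s>" else candidates).append(cand)
        beams = sorted(candidates, reverse=True)[:beam_width]
        if not beams:
            break
    best = max(finished)                  # highest total log-prob
    return best[1][1:-1]                  # strip <s> and </s>
```

Greedy stops early at the locally best "hi" (whose continuation is most likely to end immediately), while the beam keeps "hello" alive long enough to find "hello world", the sequence with the higher overall probability (0.4 × 0.95 = 0.38 vs 0.6 × 0.6 = 0.36).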

Training Points:

  • Used a closed-domain IT helpdesk troubleshooting Q/A dataset and an open-domain movie script dataset.
  • Both datasets required quite a bit of cleaning, and in the movie dataset each sentence appears twice: once as an input and once as the output for some other input.
  • Evaluation was done by human raters comparing this model's responses against those of the popular rule-based CleverBot (CB) on 200 questions.
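The movie-script pairing scheme described above can be sketched in a few lines (the utterances and the helper name are made up for illustration; the assumed input format is one dialogue line per list element):

```python
def make_pairs(lines):
    """Turn consecutive dialogue lines into (input, output) training pairs.
    Every interior sentence appears twice: as the output for the line before
    it, and as the input for the line after it."""
    return list(zip(lines, lines[1:]))

dialogue = ["How are you?", "Fine, thanks.", "Glad to hear it."]
pairs = make_pairs(dialogue)
# "Fine, thanks." shows up once as an output and once as an input.
```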

Unique Points:

  • The Neural Conversational Model (NCM) performs significantly better than CB in the human evaluation.
