Author unknown

[Note: Amit Patel converted this to HTML but did not write the original document.]

Did I hear a request to discuss conversation engine techniques? That means an opportunity to repost for discussion my 26-screen paper on the best way to talk in a game. Booo-ha-ha-ha-ha-ha... :)

Seriously, I’m quite interested in how to work a solid, intuitive conversation, and any suggestions would be welcome.

1.0 Introduction

I’ve noticed a preponderance of posts all looking for various graphics tricks and techniques to use in their games. While I have no problem with this (in fact, I’ve learned quite a few things), there’s been another programming problem on my mind lately. I haven’t really thought too heavily about it, but I’d like to throw a few ideas out in this forum and see where it goes. Results of open discussion will be incorporated into this paper over time. Source code will inevitably follow, and the resulting monolith (as it seems to be taking on a life of its own) will be donated to the Rec.Games.Programmer Programming Encyclopedia if it is deemed useful enough.

1.0.1 Statement of Purpose

What I’m interested in creating is a conversation engine. I want to find a method of allowing communication between players and the computer characters (Non Player Characters or NPCs). Ideally it would allow the following:

  1. Structured interaction between the player(s) and NPCs. This would allow conversations to take place, and would be the normal method of communication. It would allow player input, as well as the “standard greeting” given by an NPC from whom nothing else can be obtained. For example: it would allow players to negotiate with a shop owner, but also allow players to get information from a pedestrian (“Welcome to Port Nowhere”. <end of conversation>).
  2. Flexible dialogues based on external flags. This would allow NPCs to change conversations depending upon events in the storyline. For example: everyone in town will talk of nothing other than the slimy alien who ate the mayor. Once the alien is disposed of, they would be free to talk about other things.
  3. It should be (fairly) easy for the players to determine how to ask certain questions, while not being merely a matter of walking through every branch of a conversation tree. For example: the venerable Sierra engine vs. Ultima Underworld (more detail on this later).
  4. It should be fairly easy to program and maintain (including efficient storage) to allow many large conversations to be created, tested, and maintained without inordinate amounts of effort.
  5. It should be flexible enough to provide better than 3rd grade levels of conversation, but extensible enough to handle modifications without undue difficulty.
  6. It should be easy to use for novice players, but convenient for advanced players. This means it should also allow both keyboard and mouse input. If it can incorporate both seamlessly, then the player can decide what combination works best.

One other item to keep in mind is that the conversation engine should be flexible enough to be incorporated into the rest of the interface. It must function smoothly with the movement engine and any other control routines the program may require. What this list of requirements shows is that this problem will be more difficult to solve effectively than the graphics portion of my program, because the approach is not terribly well defined.
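To make requirement 2 a little more concrete, here is a minimal sketch in Python of dialogue that depends on external flags. Everything in it is an assumption made for illustration: the function name pick_line, the flag name alien_dead, and the rule layout are hypothetical, not part of any existing engine.

```python
# Hypothetical sketch: an NPC speaks the first line whose required
# story flags are all currently set. All names are illustrative.

def pick_line(rules, flags):
    """Return the text of the first rule whose required flags are all set."""
    for required, text in rules:
        if required <= flags:  # subset test: every required flag is present
            return text
    return None  # no rule matched, so the NPC has nothing to say


# Rules are listed most-specific first; the empty set always matches.
PEDESTRIAN = [
    ({"alien_dead"}, "Welcome to Port Nowhere."),
    (set(), "Did you hear? A slimy alien ate the mayor!"),
]

print(pick_line(PEDESTRIAN, set()))
print(pick_line(PEDESTRIAN, {"alien_dead"}))
```

Ordering the rules from most to least specific keeps the storage small: one list per NPC, and the storyline only ever has to add a flag to the shared set.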

2.0 Common Approaches

There are three main approaches I have seen. They are:

  1. Text Parsing
  2. Menu-Driven Conversations
  3. Hybrid Conversations

Each has distinct advantages and disadvantages:

2.1 Text Parsing

Text Parsing Engine. A prompt is given to the player, nothing more. Input is typed in the following syntax: <VERB> <NOUN>. This is a very old and refined method of input for a game, allowing all manner of action to be effected upon the environment. Generally, it has been used as the primary means of interacting with the game (as in the case of the old text adventure games), but has also been seen used as a primitive vehicle for interaction with NPCs. Examples of this sort of input would be: TADS (the Text ADventure game System), old Sierra games, Zork (really old people might remember this one : ) ), etc.

Advantages:

“Correct” actions are concealed from the player. The player needs to think about problems, rather than just walk through all the possibilities. Generally very good for puzzle creation, where the trick is to determine what to use X for to effect Y (for example: “put the key in the lock”, “open the door”).

The fact that it is the oldest form of input means that it has been resolved to a respectable level of complexity, and can simulate natural conversation fairly well. For example:

ASK THE MAN ABOUT THE WHALE

"Oh, the whale.  He's a real killer alright.  Nary a man sails
 the sea without keeping one eye out for the whale."

ASK THE MAN ABOUT A BOAT

"A boat?  Surely you're not thinking of sailing?  You're
 crazier than I thought!"
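The exchange above could be handled by something as small as the following Python sketch. It is only an assumption about how such a parser might look: parse_ask, RESPONSES, and the fallback message are hypothetical names, and a real parser (TADS, for instance) does far more than this.

```python
# Hypothetical sketch of an "ASK <npc> ABOUT <topic>" parser.
ARTICLES = {"THE", "A", "AN"}


def parse_ask(command):
    """Return (npc, topic) for an ASK ... ABOUT ... command, else None."""
    words = [w for w in command.upper().split() if w not in ARTICLES]
    if len(words) < 4 or words[0] != "ASK" or "ABOUT" not in words:
        return None
    split = words.index("ABOUT")
    npc = " ".join(words[1:split])
    topic = " ".join(words[split + 1:])
    if not npc or not topic:
        return None
    return npc, topic


# Replies are keyed by (npc, topic); the text is abbreviated from the example.
RESPONSES = {
    ("MAN", "WHALE"): "Oh, the whale.  He's a real killer alright.",
    ("MAN", "BOAT"): "A boat?  Surely you're not thinking of sailing?",
}


def ask(command):
    """Look up the reply for a typed command, with a generic fallback."""
    return RESPONSES.get(parse_ask(command), "I don't understand.")


print(ask("ASK THE MAN ABOUT THE WHALE"))
```

Stripping the articles before matching means "ASK MAN ABOUT WHALE" and "ASK THE MAN ABOUT THE WHALE" land on the same table entry.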

Disadvantages:

“Correct” actions are often extremely difficult to determine, leading to frustration on the part of the user, especially if the puzzles are difficult, ill-conceived, or multi-part. This is especially difficult when a nice illustration or snazzy description does not match the name by which an object is referred to internally. Anyone who has ever played one of these games has seen the following:

TAKE THE WINE

"I don't know what a WINE is"

TAKE THE FLASK

"I don't know what a FLASK is"

TAKE THE DAMN BOTTLE

"I don't know what a DAMN is"

TAKE THE BOTTLE

"You have taken the bottle."

This can quickly lead to frustration, as the player has difficulty in manipulating the interface. The largest problem is that user thought patterns are always a mystery, as we have all seen (especially from beta testers: “What? You mean you tried to put the bathtub in your backpack and the game crashed? Gee... I never thought of that one...”). Players, especially good ones, are notorious for trying the bizarre just to see what happens. (In fact, I’ve always found that the most interesting part of these games is finding the little bizarre cases that the designers actually put in just to see if anyone tries them.) Regardless, this form of communication is user-unfriendly, despite the fact that it’s relatively straightforward to program.
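One common way to soften the “guess the noun” problem is a synonym table plus a list of words to ignore. The Python sketch below is an illustration under my own assumptions: the names SYNONYMS, NOISE, and take are hypothetical, and the table only covers the bottle from the transcript above.

```python
# Hypothetical sketch: map several player words onto one internal object name.
SYNONYMS = {"WINE": "BOTTLE", "FLASK": "BOTTLE", "BOTTLE": "BOTTLE"}
NOISE = {"THE", "A", "AN", "DAMN"}  # words dropped before matching


def take(command, room_items, inventory):
    """Handle TAKE <noun>, moving the item from the room to the inventory."""
    words = [w for w in command.upper().split() if w not in NOISE]
    if len(words) != 2 or words[0] != "TAKE":
        return "I don't understand."
    item = SYNONYMS.get(words[1])
    if item is None:
        return f"I don't know what a {words[1]} is"
    if item not in room_items:
        return f"There is no {item.lower()} here."
    room_items.remove(item)
    inventory.append(item)
    return f"You have taken the {item.lower()}."


room, pack = {"BOTTLE"}, []
print(take("TAKE THE DAMN FLASK", room, pack))
```

The table still has to be written by hand, so it only helps with the words the designer thought of; the bathtub-in-the-backpack cases remain.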

2.2 Menu-Driven Conversations

Menu Driven Conversations: These are conversations where the NPC says something and the player is presented with a multiple choice answer. Ultima Underworld II used this form of communication:

Angry Orc: "Hey! Nobody gets in here without the password."

A. "Yes.  The password.  I know it."

B. "The password?  Oh, I forgot it.  It's downstairs.  I'll be
    right back."

C. "Password?  I don't need no stinkin' password!"
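A menu-driven conversation like this one is usually stored as a tree of nodes. The Python sketch below is one possible layout, not how Ultima Underworld II actually stores its dialogue. The node names, the run function, and the Orc's follow-up lines are all made up for illustration.

```python
# Hypothetical sketch: each node holds the NPC's line and a list of
# (player text, next node) choices. A next node of None ends the talk.
TREE = {
    "start": {
        "npc": "Hey! Nobody gets in here without the password.",
        "choices": [
            ("Yes.  The password.  I know it.", "ask_password"),
            ("The password?  Oh, I forgot it.  It's downstairs.  "
             "I'll be right back.", None),
            ("Password?  I don't need no stinkin' password!", "fight"),
        ],
    },
    # The two replies below are invented placeholders.
    "ask_password": {"npc": "Well? What is it?", "choices": []},
    "fight": {"npc": "Then you'll need a sword!", "choices": []},
}


def run(tree, picks, node="start"):
    """Walk the tree using a list of choice indices; return the transcript."""
    transcript = []
    picks = iter(picks)
    while node is not None:
        entry = tree[node]
        transcript.append(("npc", entry["npc"]))
        if not entry["choices"]:
            break  # leaf node: the NPC has nothing more to offer
        try:
            text, node = entry["choices"][next(picks)]
        except StopIteration:
            break  # the player has made no further choice yet
        transcript.append(("player", text))
    return transcript


for speaker, line in run(TREE, [2]):
    print(f"{speaker}: {line}")
```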

Advantages:

The choices are obvious, allowing the player to concentrate on their course of action, not “guess the right question”. An interface should intrude upon the game as little as possible, allowing the player to involve themselves with your game, not entangle themselves in their hardware.

The conversation is a little more colorful (I’ve always loved choice “C”.) and allows you to texture the conversation and use consistent slang to keep the player in character:

B. "Yea, the scourge of the east approaches."

Disadvantages:

The choices are a little too obvious. A player can merely save their game, and then try all the different branches of the conversation tree until they find the desired path. One possible solution is to allow save games only in certain locations, making it more trouble than it’s worth to try all the options at will. On the other hand, this inconveniences the player and complicates the interface. (Personally, I don’t mind the limited saves concept. It has the extra advantage of making combat a little more stressful than infinite save points do.)

Large numbers of choices are difficult to implement. If your game is a detective-type game, where you have large numbers of questions to ask large numbers of people, then the engine becomes unwieldy. It can also take a long time to traverse enough branches to get the desired result. This also detracts from the feel of the game.

2.3 Hybrid Engine

The hybrid engine is a combination of both previous systems. On the one hand, it presents the player with a list of actions that may be tried, allowing them to decide which is appropriate. It also obscures the painful obviousness of multiple-choice answers. From this basis, there are two major variations: the full-keyword version, and the basic choices plus optional keyword version. Both systems traverse conversations in the same way.

Typical conversation:

(Approach Fred, initiate conversation by selecting GREETING
 from list)

you: "Greetings friend."

response: "Greetings yourself, traveler.  Nice day today."

(select TRADE)

you: "I have traveled many lands and have acquired many
      curiosities.  Perhaps you might have as well.  Would you
      care to barter a bit?"

response: "Ho there!  Is my profession so obvious that you can
           spot me as a trader a mile off?  Of course!  
           Let us see what we have..."

(Enter bartering routines)

2.3.1 The All-Keyword System

This engine presents the player with a list of keywords. The player picks a keyword, and the appropriate conversation proceeds from there. As the game progresses, more keywords are added to the list, allowing the player to revisit certain NPCs to see if they know anything new. An example of this type of game would be Bad Blood, a post-apocalyptic game. As you progressed in the game, new keywords would be added to your list of options. You would know which parts of conversations were important, and could then revisit key characters and try the new keywords.
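The growing keyword list can be sketched in Python as a set that each reply may add to. This is an assumption about the general idea only, not a description of how Bad Blood works; the names NPCS and talk and the sample replies are hypothetical.

```python
# Hypothetical sketch: each reply may unlock new keywords for the player.
NPCS = {
    "Fred": {
        # keyword: (reply text, keywords this reply unlocks)
        "GREETING": ("Greetings yourself, traveler.  Nice day today.", {"TRADE"}),
        "TRADE": ("Of course!  Let us see what we have...", set()),
    },
}


def talk(known, npc, keyword):
    """Return (reply, newly_learned); `known` is updated in place."""
    if keyword not in known:
        raise ValueError(f"{keyword} is not on the player's list yet")
    reply, unlocks = NPCS[npc].get(keyword, ("I know nothing of that.", set()))
    learned = unlocks - known  # only genuinely new keywords trigger the 'beep'
    known.update(learned)
    return reply, learned


known = {"GREETING"}
reply, learned = talk(known, "Fred", "GREETING")
print(reply, sorted(learned))
```

Returning the newly learned keywords separately is what would let the interface play the 'beep' the first time, and stay quiet on repeat visits.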

Disadvantages:

The all-keyword system allows the player to talk to everyone about everything, and wait for the important keywords to show up. (Similar to the way you’d get points in the old Sierra games for doing correct things.) This could take some of the thought out of the game, as a player can simply run through conversations only looking for the ‘beep’ that indicates a new keyword. On the other hand, it allows some feedback to the player on their progress.

Another disadvantage is that all intonation is predetermined. If you wish to ask a character about a certain item, like a bottle, you choose bottle and the computer phrases your question for you. It takes away a player’s ability to decide how they wish to phrase their question. Phrasing is especially useful when giving NPCs their own motivations. For example, you might want to lean on a weak-willed Orc to get information, but you would probably affect a more fawning tone in the presence of royalty, lest you find yourself shorter by a head. Intonation and interpretation are a vital part of communication (as seen by the net usage of : ) and similar icons to ensure that the intended meaning is not misinterpreted by the reader) and eliminating that aspect of conversation can detract from the player’s association with their character.

2.3.2 The Standard Choices With Optional Keyword

This is similar to the first option, only this time, there are only basic choices presented to the player. There is also a blank space, where the player may try any other keywords they might like to try. This allows for some subtlety on the part of the designer. Key hints could be given and the keyword not added to the list (which would really annoy some players, who would accuse you of design flaws) or the list of choices could always be the same, relying on the character to keep track of what might be important. An example of this type of game would be The Summoning, a single-person dungeon game.

Disadvantages:

This system might seem to present some of the same limited choices that commonly have players trying vaguely related verbs. For example, the ubiquitous “use” verb that also stands for: twist, read, peel, pry, flick, turn on, etc. The other possible twist is that a player may find out a keyword before the plot would normally present it to them. This might be more common than initially thought, with stumped players frequently requesting this sort of information at the wrong time, or for someone who might be playing a second time. The solution would be to implement the EVENT flag concept, locking out certain keywords and conversations until certain events have happened. This system also suffers from the same lack of intonation as the previous flavor of this engine.
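The EVENT flag lockout could look something like the following Python sketch. It is only one way to do it, under my own assumptions: the names TOPICS, respond, and met_locksmith are hypothetical, as are the sample replies.

```python
# Hypothetical sketch: a typed keyword only works once its event has happened.
DEFAULT = "I know nothing of that."

TOPICS = {
    # keyword: (event required before it works, or None; reply text)
    "GREETING": (None, "Greetings yourself, traveler."),
    "KEY": ("met_locksmith", "Ah, the locksmith sent you.  Here it is."),
}


def respond(topics, events, typed):
    """Answer a standard choice or a keyword typed into the blank space."""
    keyword = typed.strip().upper()
    entry = topics.get(keyword)
    if entry is None:
        return DEFAULT
    required_event, reply = entry
    if required_event is not None and required_event not in events:
        return DEFAULT  # locked: indistinguishable from an unknown keyword
    return reply


print(respond(TOPICS, set(), "key"))
print(respond(TOPICS, {"met_locksmith"}, "key"))
```

Giving a locked keyword the same reply as an unknown one is a deliberate choice here: a player on a second run learns nothing by typing it early.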

3.0 Conclusions

To be honest, I’m not quite sure. If I had to say that I was leaning towards any particular engine, I’d have to say it was type 2.3.2: standard action words with optional keyword. It seems to offer the strongest balance between player assistance and mystery. Unfortunately, it does nothing to solve the intonation issue. We would seem to need another option for our list of choices: one which allows some flair in conversation, allows the player to choose emotional context for their statements, yet obscures the choices somewhat, making the conversation less of a “try all the choices in order” process.
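As a purely speculative illustration of “keyword plus emotional context”, replies could be keyed by both the keyword and a tone the player picks. The Python sketch below is not a proposal for a finished design; the tone names, the ORC table, and its lines are all invented.

```python
# Hypothetical sketch: the same keyword gets a different reply per tone.
ORC = {
    "PASSWORD": {
        "neutral": "I'm not telling you the password.",
        "threaten": "All right, all right!  Don't hurt me!  I'll tell you.",
        "fawn": "Flattery?  From you?  Get lost.",
    },
}


def reply(npc_lines, keyword, tone):
    """Pick the NPC's line for a keyword, falling back to the neutral tone."""
    by_tone = npc_lines.get(keyword.upper())
    if by_tone is None:
        return "I know nothing of that."
    return by_tone.get(tone, by_tone["neutral"])


print(reply(ORC, "password", "threaten"))
```

A weak-willed Orc and a king would simply carry different tables, so the same tone pays off with one and backfires with the other.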

What I’d like to know is if anyone else has thought about this problem, if I might have missed another way, or if there is a better way to implement one of these choices.

4.0 Coding Issues

As for the coding issues discussed initially, I’d like to keep those in mind when discussing the possibilities (e.g. how could they be implemented) without clouding the issue with coding techniques just yet. (Because we all completely think through our ideas before we start programming them, right? : ) ) Even so, we need to remember that the system should be extensible. If it works well, I intend to adapt the engine for use with examining objects and other sorts of interaction with the environment around the players. Flexibility is the key - we are looking to support both short conversations and longer ones, and we will need to be able to maintain and dry-run our conversations to ensure that they flow. This section will be expanded along with the others as the topic matures.

Bibliography

  1. The Journal of Computer Game Design, Volume 6, Number 3. Estvanik, Steve. Designing a Mouse/Command Line Interface. February, 1993. p.10-11.
  2. The Journal of Computer Game Design, Volume 6, Number 3. Em, Michele. How to write Interactive Characters and Dialogue. February, 1993. p. 14-5. (Before you ask: “The Journal of Computer Game Design is published six times a year. To subscribe to the Journal, send a check or money order for $36 ($50 outside North America) to:
         The Journal of Computer Game Design
         5251 Sierra Road
         San Jose, CA 95132”
    Although I wouldn’t if I were you. Not that the effort isn’t appreciated, but it’s hard to get in-depth articles from starving programmers who are already past deadlines. Articles are 1 - 2 pages and very abstract. The Journal has moved away from code-oriented articles in favor of “interactive storytelling” types. I also cannot vouch for the accuracy of this information. I stopped getting it a year ago after the first (and only) year of my subscription.)
  3. Hartnell, Tim. Creating Adventure Games on Your Computer. Ballantine Books. New York, New York. 1984.