Wednesday 8 June 2011

Designing a Dialogue System From Scratch Part 2: Structure

'Part 1: What is Dialogue' is available here.

Now we're ready to consider how to model conversation on a structural level.

The first question is what underpins the fundamental structure of our proposed dialogue system in terms of progress, input and output. In each case I'll present the usual approach and at least one more left field angle.

     - Branching: This is the approach adopted by most interactive systems. They allow for backtracking of differing degrees (Fahrenheit moves inexorably forward, most RPGs work from a central hub with certain paths that can't be rolled back on, while adventure games are usually very forgiving - you can't go wrong). It seems logical not to allow too much backtracking, to reflect the natural flow of conversation.

     - Linear: It might seem strange to propose a linear system as an interactive dialogue. Providing the player no control over the topic or direction of conversation isn't necessarily realistic, but it does grant us some unique freedoms: the conversation remains flowing, and it allows us to focus the player's attention on how he's conversing, rather than what he's conversing about. Usually in games we already force certain topics and goals - you may be free to ask about different things in different orders, but if you're talking to a key NPC you're going to get to the game-crucial topic sooner or later; everything else is simply context. Fahrenheit or Alpha Protocol do provide decision points attached to story branches, but for the most part there's only really one direction to go in. Instead of providing a bad illusion of freedom, they force the topic of conversation and encourage the player to focus on the degree of success he has within that topic.

     - Words / speech: Obviously we usually present dialogue in this fashion. It provides complexity and realism. These are good things.

     - Pictorial / other: However, there's another way to do things. We're already familiar with dialogue being presented without any actual words: think The Sims' Simlish, or emoticons. We can represent topics, emotions and decisions entirely visually. Naturally this limits the complexity and depth of what we're doing, but it also provides us a degree of emergent potential that simply cannot be achieved with words. Computers aren't smart enough to construct sentences on the fly, so written dialogue will always require a human author, and therefore a prescribed route and set of options. In The Sims, it's possible to interact on more fundamental levels that nonetheless we all understand: humour, romance, physical expressions. Our basic inputs are interpreted by the AI Sims, compared to their personality statistics, and appropriate responses output in the same syntax. It's a system whose complexity could be scaled to a far greater degree, and could allow for far truer narrative freedom.

     - List of options: The usual dialogue tree approach, but it's worth noting this is also how we'd select our emoticons and topics if using that sort of representation. Obviously using a predefined list limits massively the possible approaches we can provide, but by using elements less specific than whole sentences (ie images or keywords) we can provide greater flexibility.

     - Mini-game: Any mini-game (eg Theme Park's negotiation game detailed previously) is necessarily going to be quite an abstraction to the degree that I'd not recommend it be the central input mechanic. As demonstrated in Theme Park, though, mini-games can make for useful tools in representing more specific elements of conversation.

     - Keywords: This really interests me. What if we allowed the player to type a word to reflect an emotion, or topic, or observation? Obviously interactive fiction has been doing this for years, and it would still require a predefined dictionary set. At the very least, though, it provides a greater sense of freedom, and can handle far more options than a traditional dialogue tree. It also allows us to hide from the player the options available to him, requiring a greater depth of consideration than simply browsing a list.

Now let's consider how to fit and test the conversational traits we've identified in the structures available. 

     - Simplified implementation: The LA Noire system. Assume animation and voice performance are sufficiently detailed for the player to employ his ability of perception entirely naturally. Requires our input method to allow him to leverage that perception appropriately. This works fine in LA Noire where what is perceived is as simple as telling the truth or lying, and the input method follows the same options, but does it scale to more complex observations? Without providing a large set of red herrings it seems like it would struggle in any context where what the player was perceiving was more specific than truth/lie, because if we give him the option to, say, accuse the merchant of having ulterior motives, he's learnt that it's important from us rather than his own observation of the underlying meaning of the dialogue.

     - Keyword implementation: Pre-define certain words in the dialogue and allow the player to either click on them or type them in, which will then lead the conversation in that direction. This wouldn't be used for selecting a topic of conversation necessarily; more so it would allow the player to identify and leverage subtext. Stupid example: "My wife has gone missing, she was wearing her best jewellery, please find her." Player inputs "jewellery motive" and opens up a quest branch where we come to understand the speaker is more concerned about the gold wedding ring than the wife. If this was a trad dialogue tree it'd be an obvious dialogue option; using keywords it becomes a question of the player's insight. Effectively you're asking the player to demonstrate his understanding of the subtext - what is this person really talking about?
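To make the keyword idea concrete, here's a rough sketch of how the jewellery example might be wired up. Everything in it (the `LINE` structure, the branch names, the `resolve_keywords` function) is made up for illustration; it just assumes each authored line carries a hidden set of subtext keywords that the player has to spot unprompted.

```python
# Hypothetical authored line: the visible text plus hidden subtext keywords.
# Branch names and the data layout are illustrative assumptions.
LINE = {
    "text": ("My wife has gone missing, she was wearing her best "
             "jewellery, please find her."),
    "subtext": {
        frozenset({"jewellery", "motive"}): "branch_ring_over_wife",
    },
    "default": "branch_search_for_wife",
}


def resolve_keywords(line, player_input):
    """Return the branch unlocked by the words the player typed.

    A subtext branch opens only if every one of its keywords appears
    in the input; otherwise the conversation takes the default branch.
    """
    words = set(player_input.lower().split())
    for required, branch in line["subtext"].items():
        if required <= words:  # all required keywords were typed
            return branch
    return line["default"]
```

The point of the design is that the options are never listed: the player who types "jewellery motive" has shown insight, while everyone else quietly follows the default path.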

Knowledge: In trad dialogue trees this is usually represented by a variable: if the player pursued dialogue option X previously then provide new dialogue option Y. Perhaps the more challenging approach is to provide the player a bank of collected information: facts or topics he's discovered previously which must be selected specifically at key points. It's still pre-authored, and will be indescribably annoying when it doesn't work (yes you, LA Noire) but again it does put more emphasis on the player's knowledge, rather than his character's.
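The difference between the two approaches is easier to see side by side. This is only a sketch under my own assumptions: the flag name, the option strings and both function names are invented for the example.

```python
def options_from_flags(flags):
    """Traditional tree: the game checks a variable and reveals option Y."""
    options = ["Ask about the weather"]
    if "pursued_option_x" in flags:
        options.append("Option Y")  # handed to the player automatically
    return options


def present_fact(fact_bank, chosen_fact, expected_fact):
    """Fact bank: the player must choose the relevant fact himself.

    Returns True only when the fact he picked is the one this point in
    the conversation calls for. He can only pick facts he has found.
    """
    if chosen_fact not in fact_bank:
        raise ValueError("player has not discovered this fact")
    return chosen_fact == expected_fact
```

In the first function the character's history does the work; in the second the player has to remember what matters and when to bring it up.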

Eloquence / Timing: It's hard to allow the player to express eloquence - the natural skills we employ every day (to varying degrees of success) are too complex and numerous to really model (though one could argue the sum of a successful dialogue system would itself be representative of eloquence). At any rate, we certainly don't want to model it as a statistic (+5 charisma) because that's unnatural and unsatisfying. This seems, to me, like a great place to use a mini-game. We allow the player to select his topic or tone, but we apply a modifier based on a mini-game, which will affect the tone, relationship or information presented in the response. What this catches, for me, is that basic buzz of successfully pulling off the perfect one liner; or, more interestingly, knowing exactly what you need to say and entirely failing to communicate it.
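As a very rough sketch of the modifier, suppose the mini-game hands back a score between 0 and 1 and we fold that into the response. The thresholds, the `deliver` function and the shape of the result are all assumptions of mine, not a worked-out design.

```python
def deliver(tone, minigame_score):
    """Combine the player's chosen tone with how well he pulled it off.

    minigame_score is assumed to lie in [0.0, 1.0]; values outside that
    range are clamped. The thresholds below are arbitrary placeholders.
    """
    score = max(0.0, min(1.0, minigame_score))
    if score >= 0.8:
        delivery = "perfect"   # the one-liner lands
    elif score >= 0.4:
        delivery = "adequate"
    else:
        delivery = "fumbled"   # knew what to say, failed to say it
    # Map the score onto a relationship change between -1 and +1.
    relationship_delta = 2 * score - 1
    return {
        "tone": tone,
        "delivery": delivery,
        "relationship_delta": relationship_delta,
    }
```

The choice of what to say stays with the player; the mini-game only decides how well it comes out, which is where the "perfect one liner" buzz (or the fumble) would live.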

Group Formation: This is the only trait which makes sense to model as a straight set of statistics. Social standing is something that's affected by conversations and actions previously undertaken, and which can only be affected by the same in the future. It's commonly modelled very simplistically in RPGs - eg if character X likes you more than 50% then dialogue option Y appears. It could, naturally, be extended. If you're using threats, is the character aware of previous instances where you've backed down? If you're trying to lead a conversation, are there enough people in the group who already respect you as a leader?
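Here's a sketch of what the extended version might track. The `Standing` class, its field names and the "more than half the group" rule for leadership are my own placeholders; only the 50% liking threshold comes from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Standing:
    """Hypothetical record of the player's social standing."""
    liking: dict = field(default_factory=dict)            # npc -> 0..100
    saw_back_down: set = field(default_factory=set)       # npcs who saw a threat fail
    respects_as_leader: set = field(default_factory=set)  # npcs who follow the player


def option_y_available(standing, npc):
    """The simplistic RPG check: option Y appears above 50% liking."""
    return standing.liking.get(npc, 0) > 50


def threat_is_credible(standing, npc):
    """A threat only works on someone who hasn't seen the player back down."""
    return npc not in standing.saw_back_down


def can_lead(standing, group):
    """Assumed rule: the player can lead if over half the group respects him."""
    if not group:
        return False
    supporters = sum(1 for npc in group if npc in standing.respects_as_leader)
    return supporters * 2 > len(group)
```

Each check reads only from history, which is the point: standing can't be argued into existence mid-conversation, it has to have been earned (or lost) earlier.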

In 'Part 3: The System' we'll complete our overview of the traits by looking at self-control vs personal expression, and finally tie everything together into something vaguely resembling a coherent system. Maybe. Eyes on the prize this time next week.


  1. I appreciate the irony of having an in-depth discussion of complex narrative mechanics punctuated by a write up on Adidas miCoach ;-)

  2. I would love to see someone create a deep AAA rpg style game that forgoes the repetitive grind (combat) which often is just filler between NPC plot-enhancing dialogue and dives in with pure conversation as the RPG mechanic itself. Imagine earning "intuition" experience points or fighting a "debate" boss battle!

  3. It's not AAA, but otherwise you've just word for word described Winter Voices.

    Before you get too excited, it turns out simply replacing the words 'fireball' and 'chain lightning' with 'avoidance' and 'self-expression' doesn't NECESSARILY make for a more engaging experience. You're also constantly at war with the pacing (of plotting, combat and navigation).

    However, there are some really neat ideas and metaphors at work in the game. Combat is with personal demons, with the only success scenario being to flee them. The more you choose to remember these encounters, the more you learn but the harder it becomes; forgetting about them makes things easier. Your character choices are between three women - the first seems to be some kind of wizard (as expected); the other two are a seamstress and something else equally mundane. The plot centres around your character's grief over her father's death.

    So. If you're at all interested, check out the steam demo.

  4. I don't think Winter Voices is that close to what Breakdance McFunkypants is talking about - "pure conversation" isn't really a core mechanic in WV. It's still "just" fighting. The big difference, as you mentioned, is that the battles are metaphorical. That and the skills, and how both aspects interweave with the theme of dealing with grief.

    That said, it's an interesting game worth checking out.

  5. This is all pretty interesting...I'm not sure I agree with all of it, but it's certainly thought-provoking.

    One thing that particularly bothered me was this:
    "It's hard to allow the player the express eloquence...At any rate, we certainly don't want to model it as a statistic (+5 charisma) because that's unnatural and unsatisfying."

    That, to me, seems very misguided. The thing is, a lot of people play games as escapism. If you're someone who has trouble talking to people in real life (if job interviews stress you out, for example), then sometimes you want to play a game that lets you be someone else. Sometimes you want to pump up your Charisma stat and let the game give you the feeling of being able to manipulate anyone you want.

    To put it another way: The more you make conversation "natural", the more you shut out people who aren't "naturally" good at conversation.

  6. Thanks for your comments, John. I think misguided is a bit strong; it's a considered design decision - one primary goal of this project has been to emulate dialogue as closely as the format allows.

    Personally I'd be ecstatic if there was a game that so closely mapped dialogue that a person could carry over their natural strengths & weaknesses. Escapism and power fantasies are fine, but not the only option.

    That this might seem exclusive doesn't bother me any more than that Bejeweled excludes poor pattern matchers or CoD excludes those with slow reactions.

    Considered as a commercial concern? Difficulty levels.