Monday, December 5, 2016

Why I don't like the term "AI"

Content note: I replicate some ableist language in this post for the sake of calling it out as ableist.

In games research, some people take pains to distinguish artificial intelligence from computational intelligence (Wikipedia summary), with the primary issue being that AI cares more about replicating human behavior, while CI is "human-behavior-inspired" approaches to solving concrete problems. I don't strongly identify with one of these sub-areas more than the other; the extent to which I hold an opinion is mainly that I find the distinction a bit silly, given that the practical effects seem mainly to be that there are two conferences (CIG and AIIDE) that attract the same people, and a journal (TCIAIG - Transactions on Computational Intelligence and Artificial Intelligence in Games) that seems to resolve the problem by replacing instances of "AI" with "CI/AI."

I have a vague, un-citeable memory of hearing another argument from people who dislike the term "artificial" because it presupposes that biological matter is fundamentally different from digital. I'm a bit more sympathetic to this argument. "Computational" seems like a fine substitute, except that apparently it means something else. ;) I suppose I'm partial to "synthetic" (vs. "organic").

But ultimately, it's not the first word in "AI" that bothers me, that makes me hesitant to adopt it as a field I identify with -- it's the second one, intelligence. My issue is not just that "intelligence" is poorly defined and hard to measure, but actually that it highlights everything I find culturally wrong with computer science as a field: a false dichotomy and prioritization of the "smart" over the "dumb", the "rational" over the "emotional", and a supposition that these qualities are immutable and acontextual.

Fundamentally, the language of "intelligence" is ableist, as Tim Chevalier explains well.
Popular culture seems to like the "innate intelligence" idea, as evinced by movies such as "Good Will Hunting". In that movie, a guy who's had no social interaction with a mathematical community bursts into a university and dazzles everyone with his innate brilliance at math, which he presumably was born with (for the most part) and put the finishing touches on by studying alone. The media seem to be full of stories about a bright young person being discovered, a passive process that -- for the bright young person -- seems to involve nothing except sitting there glowing.
Computer scientists can at times be obsessed with the idea of the loner auteur, the one person (usually white guy) who just has to work alone at a blackboard for long enough until he understands what no one else can and instantly changes the world with his discovery. In this narrative, a natural order emerges in which people without such gifts have a responsibility to simply identify intelligent people and bring them into their rightful privileges.

Meanwhile, in the real world, intellectual efforts succeed by collaboration and social support, and by the work one puts into communicating and disseminating one's ideas -- which can completely overshadow the time spent actually programming, proving, or measuring -- and subsequently refining them upon critique and peer review. And believing intelligence is immutable leads to poorer performance on intelligence tasks. And I could link to a wealth of research discussing the role of social cues and support in academic success. The evidence is overwhelming that "intelligence," under all definitions by which it's measurable, is neither a static nor an inherent property of an individual.

Lindsey Kuper also makes the argument that we should say "experts" instead of "smart people" to highlight the fact that expertise is relative, and privileging expertise in one specific thing over another often isn't productive when the point is to communicate well between different domains of expertise.

So why does this matter for AI? Mainly because I see it as a flawed research goal. To the extent that AI is about re-implementing human-like abilities through algorithms, why is intelligence the one we want to focus on?

I have no interest in creating Synthetic Idealized Bill Gates or Robot Idealized John Nash, at least not for the sake of their "intelligence." We have to be asking what AI is for. It seems like if you want a synthetic intellectual, you probably want it to be a good collaborator. And I have no evidence that "intelligence" as typically defined is a good predictor for collaborative skill.

On the other hand, I like working with proof assistants and automated deduction systems (e.g. Twelf, Agda, Coq, Prolog, ASP, Ceptre, STRIPS planners). It's not because these things are intelligent but because they do something (a) predictably and (b) quickly. I like generative methods because they, like young human children, repeat the things I tell them in novel and amusing configurations. I wouldn't call that intelligence, but I might call it creativity or whimsy.

I'm finding it an interesting exercise to go through Tim's list of alternatives to intelligence and identify which "synthetic" versions exist or don't:

  • Synthetically curious: web crawling, stochastic search, query generation
  • Synthetically hard-working: most automation systems
  • Synthetically well-read: text processing, natural language processing (NLP), narrative analysis
  • Synthetically knowledgeable: expert systems, commonsense reasoning
  • Synthetically thoughtful: automated deduction? What about other kinds of thoughtfulness (e.g. consideration for diplomacy or emotional context?)
  • Synthetically open-minded: // TODO
  • Synthetically creative: See: ICCC conference
  • Synthetically attentive to detail: most AI
  • Synthetically analytical: static/dynamic analysis; NLP; image/sound processing...
  • Synthetically careful: software verification
  • Synthetically collaborative: mixed-initiative creativity tools; interactive theorem provers
  • Synthetically empathetic: Affective computing? // TODO
  • Synthetically articulate: narrative summarization; natural language generation?
  • Synthetically good at listening: chatbots/dialogue systems, ideally // TODO

A secretly great thing about being more "politically correct" (as in: more considerate of the language we use) is that it's really about being more precise and concrete, which in turn is a great mechanism for generating research ideas.

Edit to add: I also think it would be rather interesting, if someone isn't doing it already, to try to model human cognition that we label as *less* intelligent, or neuro-atypical. What would a computational model of autism or dissociative identity disorder look like? How might we represent a computational agent experiencing anxiety, depression, or trauma? Answers to these questions could also lead towards new answers to some of the questions above, like how synthetic empathy or social skills might work.

Sunday, November 20, 2016

Paper and Game of the Week

This past week I attended (and co-chaired, organized a workshop for, and presented at) the International Conference on Interactive Digital Storytelling (ICIDS). In celebration of a successful ICIDS, I'll share a Paper and a Game of the Week, each of which I discovered there.

Paper of the Week: "Using BDI to Model Players Behaviour in an Interactive Fiction Game"

By Jessica Rivera-Villicana, Fabio Zambetta, James Harland, and Marsha Berry; available for download on ResearchGate. Disclaimer: I attended the talk about this paper, but have only skimmed the text of the paper itself.

Player modeling is a sub-area of game AI concerned with representing and tracking players' mental states and experiences while playing a game. This is the first paper I've seen addressing the problem in an interactive narrative context. BDI stands for Belief, Desire, and Intention, a philosophical framework for agent modeling (from 1987) that supposes all actions are driven by those three things, which relate to one another in the following way:
  • Agents are inherently assumed to have goals, which create desires;
  • Agents observe things about the world, which form beliefs;
  • Desires and beliefs inform intentions, which drive action toward specific goals.
The hypothesis stated in this work is that BDI suffices as a model of "folk psychology," or "how I think you think," as Rivera-Villicana put it in her talk -- so in turn, it can be used to model a human player's traversal of a constructed, text-rendered space with a short puzzle and a win condition. Having a good model of how humans play this kind of game, i.e. how they traverse the space to find a solution, can inform better design of interactive stories.
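
To make the three bullet points above concrete, here is a minimal sketch in Python of a BDI-style loop for a text-game player. It is not the model from the paper; the `Agent` class, the chest-and-key puzzle, and the rule inside `deliberate` are all hypothetical, chosen only to show observations forming beliefs and beliefs plus desires yielding an intention.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    """Hypothetical BDI-style player model (illustration only)."""
    desires: set                                  # goals the agent wants to achieve
    beliefs: dict = field(default_factory=dict)   # what the agent thinks is true
    intention: Optional[str] = None               # the action currently committed to

    def observe(self, percept: dict) -> None:
        # Observations about the world update the agent's beliefs.
        self.beliefs.update(percept)

    def deliberate(self) -> str:
        # Desires and beliefs together select an intention (a made-up rule).
        if "open_chest" in self.desires:
            if self.beliefs.get("has_key"):
                self.intention = "open chest"
            elif self.beliefs.get("key_location"):
                self.intention = f"go to {self.beliefs['key_location']}"
            else:
                self.intention = "explore"
        else:
            self.intention = "wait"
        return self.intention


player = Agent(desires={"open_chest"})
print(player.deliberate())                  # explore
player.observe({"key_location": "cellar"})
print(player.deliberate())                  # go to cellar
player.observe({"has_key": True})
print(player.deliberate())                  # open chest
```

Note that the agent explores before it knows where the key is, which is the sort of non-optimal but plausible trace a player model wants to capture.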

My favorite slide of the talk was a comparison of example play traces from the optimal playthrough and from more cognitively realistic playthroughs:

In a sense, I see player modeling in general, and this study in particular, as a kind of Wason selection task of game AI: it shows that theoretically optimal or rational reasoning makes a poor model for human reasoning, and in this case demonstrates it in the context of reasoning about what actions to take in an unknown environment.

Game of the Week: Fiasco

I finally got a chance to play Fiasco, a decentralized improvisational roleplaying game, with a few other conference attendees. I'd been hearing excellent things about it since Ian Horswill presented a system for generating Fiasco playsets at EXAG a year ago.

Game Summary

The premise is somewhere between freeform improvisation and tabletop roleplaying. The players (up to 5; my group had 4) sit in a circle, and the story is set up by forming relationships between adjacent pairs of people. For each pair, one member chooses a general category appropriate to a genre (e.g. family, work, crime, romance) and the other member chooses a specific relationship within that category (e.g. grandma, boss, drug dealer/client, ex-lover). They also choose a need, location, or object that relates to this relationship. Finally, players flesh out their characters with names and backgrounds and the details of the relationships they've outlined.

Play then proceeds cyclically with each player opting to either establish or resolve a scene, with the other players collaboratively filling in the complementary task (resolving or establishing). Midway through the game's 4 rounds, something called The Tilt happens, a crucial turning point in the action selected similarly to the character relationships, through an agreement between two players determined by a die roll. At the very end, the Aftermath scenes wrap up each character's storyline.

My Experience

I'm pretty terrible at Improv -- I'm much more reflective than action-oriented, so it's hard for me to decisively manufacture narrative content under time pressure. That makes me pretty terrible at Fiasco, too. Luckily, the storytelling is collaborative, so there were few situations wherein my indecision actually held up the game -- often many people would suggest ideas that I could pick from among.

Occasionally I wished the game had more structure, or perhaps just that I were more practiced at it. I sort of enjoy the idea that the game's fun can scale with the players' creativity, but on the other hand, I wonder if the game could do more to support player creativity, e.g. a deck of cards with prompts to "unstick" hard-to-resolve situations, or an additional constraint to resolve the paralysis of a blank page (actually worse: friends staring at you while you think!). (Alcohol helps, but we ran out of bourbon too quickly.)

Collaborative storymaking

I loved having played this game in the evening preceding an amazing keynote talk by Semi Chellas, who spoke about her experience writing for Mad Men. She had a lot to say about the process of writing collaboratively, as a collective brain consisting of many humans: sharing an experience of a piece of music or a movie or a poetry reading to put everyone in the same mindset, then churning out ten stories per person in the space of two weeks. She frequently spoke of how story outcomes were often overdetermined, as in, "we all knew that X had to happen but there were too many reasons it could happen," and how the "multiplicity of voices & perspectives" in the writers' room sorted each other out and influenced each other to come into harmony ("we dream each other's dreams"). On the other hand, there are also constraints: a TV show can only be 47m03s long and this event has to happen on this timeline and this character needs to appear in these scenes... she called this the "crossword puzzle" aspect of TV writing. An absolutely stunning example of fitting an overdetermined story into an overconstrained crossword puzzle was the central plotline of the episode The Other Woman around the pursuit of the Jaguar account, which I recommend watching if you have the ability (e.g. Netflix).

Anyway, the experience she described sounded nothing like playing Fiasco, but I appreciated the context it gave me.

Bonus Game of the Week: The Room

First-day ICIDS keynote speaker Kevin Bruner of Telltale (and, less recently, well-loved games like Grim Fandango) discussed puzzle games in response to a Q&A question about why Telltale doesn't do them. He offhandedly mentioned the iOS game The Room, which I then promptly downloaded (having believed I basically knew about all popular small-scale room-escape-style games from the last decade, but apparently being wrong).

It's kind of like Monument Valley but a tad less obvious. Has a "puzzle box" premise with controls that are pleasantly physical, if occasionally clumsy. Very pretty, narrative-light, just the right amount of mentally taxing to recharge one's introvert battery with at the end of a long day of conference socializing (for example).

Saturday, November 12, 2016

Paper and Game of the Week

Paper of the Week: Imaginative Recall with Story Intention Graphs

By Sarah Harmon and Arnav Jhala. Bias disclaimer: both Sarah and Arnav are folks I've worked with and consider colleagues.

Imaginative recall is the process of generalizing and extrapolating previously-seen narrative examples to create new ones. Harmon and Jhala present an automated system for carrying out this process based on a narrative representation scheme called story intention graphs (SIGs).

This paper was a bit hard for me to tease apart initially because there are really two things going on:
  1. The system of imaginative recall originally embodied by the Minstrel system, later adapted to the modern rewrite Skald, which uses case-based reasoning
  2. The translation between Skald story representation and SIGs
(1) is largely prior work, but lays the groundwork and motivation for their project. Ultimately they are interested in the problem of carrying out processes on narratives such as adaptation, transformation, and measurement of complexity. However, although Minstrel/Skald can do something interpretable as these processes, they need a means of evaluating exactly what it can do and how it compares with other systems, which motivates contribution (2).

The problem they face, as I understand it, is a lack of standards: Skald doesn't have an agreed-upon semantics that permits mapping from internal representation to natural language, and furthermore, its output cannot be compared to that of similar systems, because there is no common format for comparison.

They thus propose story intention graphs as an answer to these problems. Story intention graphs began as an informal, analytical paradigm for dissecting narratives into many kinds of relationships between events, characters, and underlying motivations. The canonical example is the Aesop fable "The Wily Lion":
A Lion watched a fat Bull feeding in a meadow, and his mouth watered when he thought of the royal feast he would make, but he did not dare to attack him, for he was afraid of his sharp horns. Hunger, however, presently compelled him to do something: and as the use of force did not promise success, he determined to resort to artifice. Going up to the Bull in friendly fashion, he said to him, “I cannot help saying how much I admire your magnificent figure. What a fine head! What powerful shoulders and thighs! But, my dear friend, what in the world makes you wear those ugly horns? You must find them as awkward as they are unsightly. Believe me, you would do much better without them.” The Bull was foolish enough to be persuaded by this flattery to have his horns cut off; and, having now lost his only means of defense, fell an easy prey to the Lion.
...which maps to the following SIG:

This graph relates lexical components of a story to interpretive nodes such as goals, and, most importantly, relationships between goals, story states, and affectual impacts on characters, which allows us to analyze The Wily Lion in terms of the crucial fact that the lion's intention to eat the bull provides for the lion, but damages the bull, leading to conflict. Representing this type of information permits reasoning over generalized patterns, such as betrayal and revenge, to try to find similar structures across a corpus of stories.

Meanwhile, Skald stories are unordered collections of "case frames" representing goals, states, and actions. Case frames are simply records with labels like "actor", "object", "scale", "to", "type", and "value", which, e.g. for actions, track the subject and direct and indirect objects of the action.
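
As a rough illustration of that record structure, here is a sketch in Python of what a few frames for The Wily Lion might look like. Only the labels ("actor", "object", "to", "type", "value") come from the description above; the frame contents and the `frames_of_type` helper are hypothetical and are not Skald's actual format.

```python
# A story is an unordered collection of frames; a list is used here,
# but nothing below depends on the order of its elements.
story = [
    {"type": "goal", "actor": "lion", "object": "bull", "value": "eat"},
    {"type": "action", "actor": "lion", "object": "flattery", "to": "bull"},
    {"type": "state", "actor": "bull", "object": "horns", "value": "removed"},
]


def frames_of_type(frames, kind):
    """Return every frame whose "type" label matches `kind`."""
    return [frame for frame in frames if frame.get("type") == kind]


# For an action frame, "actor" is the subject, "object" the direct object,
# and "to" the indirect object of the action.
for frame in frames_of_type(story, "action"):
    print(frame["actor"], "directs", frame["object"], "to", frame["to"])
```

Nothing in such a record says why the lion flatters the bull, which is the kind of interpretive link that a SIG makes explicit.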

So the majority of the paper describes a compilation strategy from stories in Skald case frame representation to SIGs. The conclusion of the paper then states that the same strategies for analogical reasoning and recall in Skald may be applied to SIGs in a more enriched, extensible way.

Game of the Week: The Voter Suppression Trail

Voter Suppression Trail is a tiny (news-article-length) empathy game created for the New York Times on U.S. election day. In it, you are invited to experience three distinct simulations of the voter experience across demographics and geographic locations.

While I am skeptical of the concept of "empathy games" for many of the reasons discussed by Mattie Brice and others, I find it encouraging that the notion of communicating a different viewpoint through an interactive experience is finding its way into mainstream media outlets. I find this game interesting in part because of all the things it could be but isn't: for example, to illustrate systemic oppression, it would be more convincing to encode real facts about demographics and polling locations and allow these racialized disparities to emerge as consequences. There is something of an opportunity for generative methods to provide more believable arguments along these lines, where hand-authored stories like this one can always be explained away as cherry picking.

Friday, November 4, 2016

Paper and game of the week

I'm going to try to loosen the ol' blogging joints a bit by experimenting with a weekly feature: a Paper and Game of the Week, posted every Friday morning. My goal will be to keep a record of recent research inspirations in the hopes of exposing interesting gems to others, and providing better context for my own work. I will preemptively establish the expectation that the paper and game of the week may not *literally* be a paper and a game; the objective is more like "something CS-academia-centered" and "something creativity/arts-movement centered," which for me in recent weeks has mostly meant papers and games, but at other times has included talks, interactive essays, plays, art exhibits, and weird internet art.

So without further ado:

Paper of the Week

Commonsense Interpretation of Triangle Behavior by Andrew S. Gordon (not Andrew D. Gordon, although his papers might easily feature on this blog too).

This paper is a formalization of the reasoning that psychology researchers Fritz Heider and Marianne Simmel observed takes place in humans when shown abstract animations and asked to explain what was going on. Here's the movie they showed subjects:

Their study showed that people produce consistent narratives for what is "happening" in the movie, universally ascribing human-like behavior such as beliefs, goals, emotions, and social relationships, despite there being no words or concrete human features assigned to the triangles or circle.

Gordon refers to the type of abstract but legible-as-human behavior in this video as triangle behavior, a term I enjoy partly because I now live in The Triangle and partly due to a certain artist-mathematician's love of triangles.

In Gordon's paper, he presents a hand-authored library of logical rules that perform abductive, probabilistic reasoning to capture this kind of commonsense social reasoning. The system will take as input a logically-modeled description of an animation, such as "The triangle opened the door, stepped outside and started to shake." Then it can answer queries such as, "Why did the triangle start to shake?" with options like "because it was cold" or "because it was upset." Without further context, the system responds with the first reason being the most likely. On the other hand, if this action is contextualized with a social situation, such as being attacked by another shape inside the area that it left, the system reasons that there is higher probability that the triangle is shaking because it is upset.
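
A toy version of that query can be sketched in Python as follows. This is not Gordon's rule library: the explanations, the prior weights, and the context multipliers are hypothetical numbers picked only to reproduce the behavior described above, and the scoring is a simple normalized product rather than the paper's abductive inference.

```python
# Hypothetical prior plausibility of each explanation for "the triangle shakes".
PRIORS = {"it was cold": 0.6, "it was upset": 0.4}

# Hypothetical multipliers: how strongly a contextual fact supports an explanation.
CONTEXT_BOOST = {
    ("went outside", "it was cold"): 1.5,
    ("attacked", "it was upset"): 4.0,
}


def rank_explanations(context):
    """Score each explanation by its prior times the boosts from the observed
    context, normalize the scores to sum to 1, and sort best-first."""
    scores = {}
    for explanation, prior in PRIORS.items():
        score = prior
        for fact in context:
            score *= CONTEXT_BOOST.get((fact, explanation), 1.0)
        scores[explanation] = score
    total = sum(scores.values())
    return sorted(((s / total, e) for e, s in scores.items()), reverse=True)


print(rank_explanations(["went outside"])[0][1])              # it was cold
print(rank_explanations(["went outside", "attacked"])[0][1])  # it was upset
```

The point of the sketch is only that adding a social fact to the context can flip which explanation ranks first.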

I find this paper interesting first of all because it's a AAAI paper in 2016 that uses symbolic knowledge engineering, which I had practically written off as a nonstarter (and haven't bothered submitting any of my own work there for this reason). In the structure of the research problem and solution, it's quite similar to my & my Santa Cruz collaborators' work on formalizing proceduralist readings as a logic program, where in that work the "human" reasoning we're trying to encode is understanding a high-level, experiential meaning from a game based on its mechanics and communicative affordances.

Second, this paper is quite interesting from a visual narrative point of view. Our proceduralist readings work was aimed at eventually generating games based on meanings; likewise, one could imagine such a system being run in the reverse direction to accept meanings (or alternatively, communicative goals) as input and produce animations that satisfy them. The reason this paper came across my radar in the first place, in fact, was that the Visual Narrative group here at NCSU was reading it.

Game of the Week

I have been playing Stephen's Sausage Roll very slowly. It's a wickedly hard puzzle game that, I'm told, gets increasingly interesting with time, though so far it has been mainly a slog (and occasional YouTube hunt for a solution). I'm not sure I agree with my peers that the puzzle design is flawless -- I've definitely completed puzzles for which I'm not sure I could repeat the solution -- but it's certainly satisfying to finish each new puzzle, and I keep coming back for more.

Monday, October 24, 2016

Two talks: an introduction to the POEM lab; a survey paper on story generation

Principles of Expressive Machines

Last week I gave a presentation to the first-year computer science grad student seminar on my research, AKA an introduction to the "Principles of Expressive Machines" (POEM) lab, because I am looking for students. This talk was my first attempt to organize my future research plans into something vaguely coherent and forward-looking (in more depth than my job talk described), so I thought I'd share the results of my efforts. Here are the slides:

In the talk, I outline three research agendas:

  1. Narrative knowledge representation and generation
  2. Tools for game and interactive fiction design
  3. Social multi-agent system modeling
The slides are not particularly verbose, but there should be enough in them to grant a sense of what I'm interested in.

Story Generation Survey

This semester, I'm teaching a course on Generative Methods, i.e. algorithms for producing creative artifacts -- such as stories. For the most part, students have been presenting the papers each week, but there's one fewer student than there are papers to present, so I presented this week's: A survey on story generation techniques for authoring computational narratives by Ben Kybartas and Rafael Bidarra.

The paper is a nice resource for folks looking to understand the state of the art in computational storytelling, and I hope my slides will save similar folks the effort of reading a 20-page academic article. :) 

Sunday, October 9, 2016

Augmenting reality without augmenting vision

A common narrative that people tell about virtual and augmented reality (VR and AR) goes something like this: "VR means total immersion in an environment, allowing a game designer to involve you directly in their completely hand-fabricated version of reality. It does this by completely supplanting your field of vision with a simulated 3D environment. AR, on the other hand, only supplants part of your field of vision, allowing overlays of simulated objects and information atop what is otherwise seen normally in the world."

The attentive reader will notice that one sense in particular was heavily emphasized in this explanation: vision. It seems like many people almost take it as a given that supplanting or augmenting reality means changing what we see in a very literal way, and sometimes this idea becomes almost a magic bullet, as though manipulating vision is all it takes to create compelling experiences, as if a more convincing simulation of vision is the main missing piece for telling better stories that center a human interactor. It's for this reason that I've taken to somewhat tongue-in-cheekedly referring to 3D virtual environments as "eye simulators" (thanks @zarawesome) to distinguish them from all the myriad other ways that one could consider rendering, or communicating, a simulation of space, even 3D space (I mean, you can "go up and down" between levels in games where the interface uses conventions from purely textual IF or ASCII-rendered Roguelikes).

Despite my occasional frustration with the hype surrounding VR and "immersive (visual) realism," I believe that constructed, visual virtualities have an awesome potential beyond their current use in games. Recently, Tale of Tales pointed their followers to the Real Time Art Manifesto that they published 10 years ago, and the most interesting part to me is the bit on storytelling, where they actually explicitly reject the idea of using this medium for telling constructed, drama-managed stories:

Embrace non-linearity.
Let go of the idea of plot.
Realtime is non-linear.
Tell the story through interaction.
Do not use in-game movies or other non-realtime devices to tell the story.
Do not create a “drama manager”: let go of plot!
Plot is not compatible with realtime.
Think “poetry”, not “prose”.
The ancient Greek philosopher Aristotle recognized six elements in Drama.
[Plot,] what happens in a play, the order of events,
is only one of them.
Next to plot we have
[theme,] or the main idea in the work
[character,] or the personality or role played by an actor
[diction,] the choice and delivery of words
[music,] the sound, rhythm and melody of what is being said
[spectacle,] the visual elements of the work.
All of these can be useful in non-linear realtime experiences. Except plot.
But the realtime medium offers additional elements that easily augment or replace plot.
the direct influence of the viewer on the work
the presence of the viewer in the work
every staging of the work is done for an audience of a single person in the privacy of his
or her
Perhaps this issue is not limited to visual realtime art, as it were; perhaps it's simply a reflection of the new-at-the-time, but by now well-established, idea that indeed there is a tension between allowing full manipulation of an environment (visually realized or not) by an interactor and conveying a structured plot with mandatory authorial beats. But I do think it underscores the main theme of this post: visual and narrative suspension of disbelief are not one and the same.

Perhaps that statement seems obvious, but since interactive narrative researchers who grew up on Star Trek positioned the Holodeck as the holy grail for games, it has been surprisingly difficult to disentangle these two things. Let's call it the Holodeck Fallacy.

When you presume the Holodeck Fallacy, it's all too easy to draw the conclusion that if VR aspires to be the Holodeck, then augmented reality should aspire to be the interface from Minority Report:

Microsoft certainly seems to have made this assumption with the Hololens, and to a lesser extent so has Nintendo.

But recently, I've been much more interested in the ways that our reality can be, and already has been, meaningfully augmented without manipulating one's visual field: specifically, through audio.

Audio as an alternative augmenter

Whenever we talk about augmenting reality, we need to answer two questions:

1) Which part of reality? What is the "default" thing that we expect a human to be doing, in which we are going to intervene with some computational process?

2) What are we augmenting it with?

A really interesting and increasingly well-explored answer to (1) is physical location, as in "location-based games," and to take a really specific example, running for physical exercise. Well, okay, it's interesting to me because it's an activity I happen to enjoy and engage in regularly, but here are some augmented running experiences I have had that fundamentally changed the way I experience that activity:

- Listening to a handcrafted, or algorithmically-generated, playlist of music.
- Using a GPS tracker that occasionally (on which occasion depends on app settings) reports my distance, pace, time, heart rate, and/or other information it can access.
- Playing Zombies, Run!

While the answer to (2) is "audio" for all of these, only the last example also has the answer "story." Largely, until recently, augmentation has been used primarily for adding more factual information to one's day-to-day experience; heck, even a wristwatch could be seen in this light. Collecting data about your geographical trail and repeating it back to you, which in turn may affect your behavior, seems to logically follow from other informational gadgets, but using the same thing for storytelling feels like it's treading new ground.

The way Zombies, Run! works is that, essentially, you are listening to a radio play about a zombie post-apocalypse while you run, which is narrated to you as though you are a character in the story receiving communication from a base through a headset. The story is interspersed with lengths of silence that space it out to fill the amount of time you specify, matching the length of your run.

If that were all there were to it, Zombies, Run! would be nothing but an amusing second-person podcast, which itself already does interesting things as augmented reality: it lets your imagination map what you see, along with the other bodily sensations of running, onto narrative events.

There are a few more tricks, however: 1. As you run, you (at random, I think) collect items such as med packs, clothing, and books that, after your run, can be used to build up an in-game base to which you have an interface through the app. This mechanism cleverly separates the "staring at a screen" part of the game from the running part. 2. At your own requested frequency, your run will be peppered with zombie chases: you get a warning that they are a certain distance behind you, and if you pick up your pace, you outrun them. If you don't get your pace up quickly enough, they catch you, and you lose one or more pieces of collected inventory.

This last mechanism is interesting mainly because of how it lets reality affect virtuality, not just the other way around. It's almost a cheap form of "biofeedback" that circumvents sensors by using your GPS position plotted over time as an indirect measure of your physical actions. It's one of the few ways you can actually "interact" with the game as you run, since otherwise your location doesn't really matter.
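
The idea of treating GPS position over time as a pace sensor can be sketched in Python like this. Everything here is an assumption for illustration: the app's real logic isn't something I know, and the sample format, the `speed_mps` and `escaped_chase` names, and the 20% speed-up threshold are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speed_mps(samples):
    """Average speed in m/s over a list of (timestamp_seconds, lat, lon) samples."""
    distance = sum(
        haversine_m(a[1], a[2], b[1], b[2]) for a, b in zip(samples, samples[1:])
    )
    elapsed = samples[-1][0] - samples[0][0]
    return distance / elapsed if elapsed > 0 else 0.0


def escaped_chase(before, during, speedup=1.2):
    """True if speed during the chase is at least `speedup` times the earlier speed."""
    return speed_mps(during) >= speedup * speed_mps(before)


# Hypothetical samples: about 3.3 m/s before the warning, about 4.4 m/s after it.
before = [(0, 0.0, 0.0), (10, 0.0003, 0.0)]
during = [(10, 0.0003, 0.0), (20, 0.0007, 0.0)]
print(escaped_chase(before, during))
```

Comparing against the runner's own earlier pace, rather than a fixed speed, is one way such a mechanic could stay fair to runners of different abilities.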

Of course, one could go further with the idea of adding a feedback line from the player's current situation into a narrative fiction. A story told to you while running but where your own location information played into the narration, and where turning in a particular cardinal direction could affect the course of the story, could be interesting. Such things are already sometimes hand-crafted; one thinks of old museum "audio tours," their extension into self-guided city tours, and more recent projects like Improv Everywhere's flashmob project, "The MP3 experiment."

What taking this idea further means, then, is coming up with new enumerations of augmentable activities (walking and running, yes -- but what else?), new means of augmenting them, and, to inform their pairings, new ways that these two things might influence one another. How might an audio story change the way that someone traverses a space, and vice versa? How could we use the data available through a mobile device's sensors -- voice, accelerometer, location, elevation -- to influence a response from a helpful guide or a cunning adversary? Could one make an AI version of the narrator from The Stanley Parable that crafts routes for you to follow in any given (well-mapped) location and reprimandingly adapts to your diversions?

In general, I love the idea of a voice speaking in my ear as I move about a space otherwise in solitude -- telling me things about what I am seeing, suggesting avenues for exploration, or augmenting my visual perception with fiction. The last has the power to transform the ordinary or the mundane, perhaps environments that I see every day, into magical objects and spaces, to imbue them with new meaning and appreciate them in a new light. That, to me, is the real appeal of augmented reality, and it's possible -- perhaps even better -- to do it all without a heads-up display.

Augmenting virtuality with people

One of the ways I've been augmenting my runs recently is by listening to podcasts, and this morning I discovered Imaginary Worlds, a podcast about science fiction and fantasy across different media. The first one I listened to was about Then She Fell, a recent immersive theater project by New York's Third Rail Productions.

I've thought and written before about immersive theater, but thinking about it anew in the context of augmented realities made me see connections I hadn't previously. Imaginary Worlds narrator Eric Molinsky comments, with regard to Then She Fell, that what felt compelling to him was the intimacy: the experience of having an actor delivering lines inches from your face, making excruciating amounts of eye contact. Not only that, but also listening and responding to everything you say and do with the attention and improvisational cleverness that only humans, so far, really know how to do.

This made me think that people working on immersive theater experiences are really doing something similar to what AR designers are trying to do, but the two are approaching the objective from opposite directions: where one takes "reality" as primary and then augments it with meaning that comes from something imaginary, a virtuality, the other takes the fiction, the "virtuality," as primary and substitutes a standard literary figure -- a character in the story -- with something from "reality," namely a guest to the production who doesn't know how the story will play out. In place of stage makeup and exaggerated drama, immersive theater offers the intensity our brains generate when a real live person in front of us is expecting interaction.

Molinsky notes that one thing he didn't like about Then She Fell was the ambiguity, or perhaps even under-thought quality, of the "audience character:" "I didn't know who I was supposed to be," he says. In other words, what is the audience member's role in the story? In the framing of augmented virtuality, that this was experienced as a failure mode makes perfect sense: while the flow of information from story to interactor is well-established, because humanity has plenty of examples to follow for how that direction works, the other direction of taking unpredictable interactions and reflecting them into story meaning has only video games as a guide. And in that case, linguistic and performative interfaces have been purposely limited, because otherwise, a process wouldn't know how to handle them -- unlike a human actor.

Incidentally, just last night I finally got around to playing Jason Rohrer's Sleep is Death.

Sleep is Death also plays this "augmented virtuality" trick, but with video games rather than theater as the starting point: the "typically automated" function that gets replaced with human choice is not a character reading lines in a play but the game itself. Sleep is Death substitutes the uniformity of pre-programmed game responses with on-the-fly, human responses to player-typed dialogue or actions. The game is networked two-player: one of you is the player and one the controller. After each time the player does some action (which the controller can see), the controller has a limited time window in which they can swap out scenery or sprites, type a response in a speech bubble, or provide some other game-like response. It's possible to, like a producer of a play, spend a good long while constructing your scenes before allowing an audience, but then all the dynamic action (including dialogue and switching scenes) is up to improvisation.

If you, like me, find yourself too impatient with the controller interface to experience the game first-hand, you can look at some of the "flipbooks" generated from play on the game's website, which offer some insight.

To reiterate: the main point of this post is that visual suspension of disbelief is neither necessary nor sufficient for narrative suspension of disbelief, and I worry that in the (well-deserved!) river of attention being poured into visual augmentation and VR, we are in some danger of conflating breakthroughs in these technologies with breakthroughs in storytelling. I would like to see attention paid also to the ways that other senses (audio, haptic, olfactory, gustatory, proprioceptive...?) can augment fictional experiences, as well as to the role of social play, i.e. the potentially transformative role that other real-live humans have to play in shaping these experiences, whether at a safe internet distance, inches from your face, or some virtually-distorted mediation between the two.

Sunday, August 28, 2016

Arguing for your research

Everything from paper abstracts to grant proposals to fellowship applications, at every level from an undergraduate independent study to a full grant proposal as a faculty member, requires one key task: convincing the reader that your research project is any good. Usually "good" more specifically means: does it solve an important problem? Does it address an important issue? Does it explore important unexplored territory? And, if you haven't done it yet, do you have the right tools to solve/address/explore it?

In general, I'm not a huge believer in "formulaic" writing -- the idea that every body of writing ought to be formatted the same way for best results. Especially in creative domains, so much power can be wielded in breaking traditional structures. But for scientific writing, especially project proposals or article submissions, I do find that it really helps to not have to think about how to structure something and instead just plop down a default outline. If it does happen to make sense for the writing in question, it's great -- heavy scaffolding laid down can save time when later editing the details. Philip Guo talks about this kind of scaffolding in terms of the hierarchical structure (tree, outline) of written text that mediates between the messy undirected web of concepts in our head and the linear string of text that communicates to the reader. I like this framing in general; I want to talk more specifically about the contents of a certain kind of top-level outline.

A formula that I have found especially helpful for everything from abstracts to grant proposals is something I learned from CMU's Global Communications Center, the so-called novelty moves. This strategy proposes three steps:

  1. Establish the territory;
  2. Identify a gap;
  3. Fill the gap with your research.

I have been presenting this structure to the undergraduate students I'm advising this semester in a little bit more detail; my version goes (operative words bolded):

  1. Motivate the research area.
  2. Provide context of what has been achieved in that area.
  3. Identify a gap or a compelling research question.
  4. Describe the approach we're going to take to fill or explore it.
  5. Describe the impact our work will have on (1) if we're successful.

The interesting thing to me about "structures" like these is that they're always given in sequential (list) order, the same way the final writing product will be, but what they really pertain to is an argument structure. Each step of this plan serves a communicative purpose, and the sequence as a whole satisfies a communicative goal.

A bunch of my prior research has focused on formalizing narrative structure for written text or games that are designed to entertain, to tell stories leading to rich emergent interactions between characters. With so much of my time now spent on scientific arguments, I've been thinking instead about the structure found in those. In fact, my postdoc project at UCSC involved a formalization of proceduralist readings, which are effectively arguments about what a game means. We realized that we could use logic programming techniques to construct these arguments from a set of hand-authored rules (paper coming out soon!).

Each line of the novelty moves serves a purpose -- in conjunction with some axiomatic assumptions (e.g. "My reader believes field X has value"), the line in the argument serves to satisfy the goal "convince the reader that my research solves an important problem related to field X (which they believe has value)." If it doesn't work toward that purpose (or if the reader can't infer its purpose to that end), it will confuse the reader; if one of the assumed inferences or axioms doesn't hold, the reader will fail to be convinced. Of course, formal logic was originally invented for the purpose of formalizing arguments, so it's no surprise that their structure winds up looking very proof-like. (Then again, the inference rules that occur in human cognition are pretty different from those used in formal logics.) It seems like a perfect opportunity to unify narrative discourse generation and formal logic.
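Here is a toy sketch of what I mean, written as naive forward chaining in Python rather than in a real logic programming language. To be clear, this is not the system from the proceduralist readings work; the fact names and rules below are made up for illustration. Each of my five steps contributes one fact, the reader's assumed belief is an axiom, and hand-authored rules chain them toward the goal of a convinced reader.

```python
# Hypothetical hand-authored rules: (set of premises, conclusion).
# Every name here is invented for this example.
RULES = [
    ({"reader_values_field", "area_motivated"}, "reader_cares_about_area"),
    ({"reader_cares_about_area", "context_given"}, "reader_knows_state_of_art"),
    ({"reader_knows_state_of_art", "gap_identified"}, "reader_sees_open_problem"),
    ({"reader_sees_open_problem", "approach_described"}, "reader_believes_gap_fillable"),
    ({"reader_believes_gap_fillable", "impact_described"}, "reader_convinced"),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no new conclusions can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in derived and premises <= derived:
                derived.add(conclusion)
                changed = True
    return derived


def convinces(axioms, moves, goal="reader_convinced"):
    """True if the axioms plus the moves made in the text derive the goal."""
    return goal in forward_chain(set(axioms) | set(moves), RULES)


ALL_MOVES = {
    "area_motivated",
    "context_given",
    "gap_identified",
    "approach_described",
    "impact_described",
}

# A complete argument, and one that skips the "identify a gap" step.
complete = convinces({"reader_values_field"}, ALL_MOVES)
missing_gap = convinces({"reader_values_field"}, ALL_MOVES - {"gap_identified"})
```

With these rules, `complete` comes out true and `missing_gap` false, and dropping the axiom `reader_values_field` also breaks the derivation. That matches the two failure modes above: a missing step or a failed assumption both leave the reader unconvinced.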

So there you go. I've been so fixated on research that now I want to do research on research writing.