Archive for the 'hci' Category

Television Signal Path and the Airplay Remote Control

The control systems for television aren’t very good. One reason they persist is that once a viewer is watching a selected program, the control system recedes into the background. In the course of watching a presentation, the essential controls, the ones that control sound (louder, softer, mute), generally work quite well. The rest of the control system is a disaster that people have learned to accommodate. This snarl of technology around controlling a television is generally why people think there’s room for revolutionary innovation in the “battle for the living room.”

Generally there have been a couple of approaches. The first is the universal remote, a complex remote control device that consolidates all of the other remote controls. So instead of having five or six complex remote controls, you have one really, really complex remote control. Google TV’s remote control with a keyboard pushes towards the limits of this kind of conceptual framework. The addition of voice command and Siri is another solution at the limit. The other approach involves creating a “smart” television by integrating a Network-connected computer into the television device. This new device would make all of the other devices obsolete. Various forms of this device have been foisted upon the public. It’s not that people don’t buy these “smart” televisions; it’s just that no one uses any of their capability.

The solution to this tangle of technology lies in the role of the remote control. The name “remote control” describes what the device does. It takes the control system from the television and allows it to operate at a distance from the television itself. That meant you didn’t have to get up off the sofa and walk across the room to select a program or control the sound volume. The “remote” has essentially provided the same service since it entered the living room in the mid-1950s. Nikola Tesla described its basic operation in a patent application more than 50 years earlier than that. To some extent, even cloud computing is just a variation of the same theme.

It was while researching wireless audio systems for my study that the basic change in the “remote” became clear to me. With all of my music available through a cloud storage system, I didn’t need a music system to decode physical media. From the many choices available, I selected the Bowers & Wilkins A7. It’s a single speaker that sits in a home WiFi network and listens for AirPlay signals. You can send it music via AirPlay from your phone, iPod, tablet or desktop computer, and that music can be stored remotely on the Network. Radio streams, YouTube sound, podcasts, etc. can also be sent to this audio system. The key is the change in the signal path. The “remote” is no longer just a controller; it’s the receiver/broadcaster of the audio signal. The “stereo system” now listens for AirPlay signals, then decodes and presents the sound. I liked this solution so much that I set up my traditional stereo to operate similarly, using AirPort Express as one of the auxiliary inputs.

You can see how this model would work for television. Instead of a smart television, you have a dumb television. The big screen does what the big screen does well. It shows high-definition moving pictures synchronized with sound. You can’t solve the “television problem” without changing the signal path. Once the remote control becomes a receiver/AirPlay broadcaster, all the peripheral devices hooked up to your television go away. Even your cable box becomes just another app on your phone or tablet. The interesting thing about this solution is that it doesn’t necessarily disintermediate the cable companies, the premium channels, Netflix, Amazon, Tamalpais Research Institute, Live from the Metropolitan Opera or your favorite video podcast.

In this analysis, the real problem with the television is identified as the HDMI connector. Every device connected to the screen via HDMI wants to dominate the control system of the television, and every HDMI connection spawns its own remote. Once you get rid of the HDMI connector and transform the remote control into an AirPlay receiver/broadcaster, all the remote controls disappear. The television listens for one kind of signal and plays programming from any authorized source. The new generation of wireless music systems has demonstrated that this kind of solution works, and works today. By changing the signal path and the role of the remote, the solution to the problem of television is well within reach.

Things, Not Strings: The Word Made Flesh

The headline reads: “Introducing the Knowledge Graph: things, not strings.” The implication being, “strings” are bad and limited and “things” are good and what you really wanted all along. After all people don’t want strings of arbitrary alpha-numeric characters in response to their queries, they want the things they’re looking for. And as the advertising message at the end of the introduction says, because you’re getting “things and not strings” on your search result pages, you can spend more time doing the things you love. Who wouldn’t want to do that? The end result of this technological improvement is that your life now contains “more time”— like a toothpaste tube that contains 20% more toothpaste; and that time is filled with love. One might even recast this new product as a machine for filling the world with love.

What Google seems to be introducing is a new user interface to a faceted search. Nothing more. Faceted search acknowledges that the “word” (a single string of characters) isn’t the atom of meaning. Instead it uses the “phrase” in the context of some domain of meaning: a word can be a valid token in multiple systems of meaning. These domains, or facets of meaning, are surfaced and prioritized in search results. So, in addition to Page-ranked links, we get a prioritized set of contexts in which a particular word or phrase is a valid operator. The advance is in creating an index of sub-domains of meaning through analyzing the structure of text as it’s used on the visible Network. There’s no question that faceted search is superior to classic Page-ranked search; however, the language used to describe this new product innovation seems to suggest some kind of transcendent experience.
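To make the idea of a prioritized set of contexts concrete, here is a toy sketch in Python. It is not Google's method: the index contents, the usage counts and the function name facets_for are all invented for illustration, and the only claim is that one string can be looked up in several domains of meaning and the domains returned in priority order.

```python
# Hypothetical facet index: phrase -> {domain of meaning: usage count}.
# Every entry and count here is an illustrative assumption, not real data.
FACET_INDEX = {
    "jaguar": {"animals": 120, "cars": 340, "operating systems": 45},
    "mercury": {"planets": 210, "chemical elements": 180, "mythology": 60},
}


def facets_for(phrase):
    """Return the domains in which the phrase is a valid token, most used first."""
    counts = FACET_INDEX.get(phrase.lower(), {})
    return sorted(counts, key=lambda domain: counts[domain], reverse=True)


if __name__ == "__main__":
    print(facets_for("Jaguar"))
```

Note that the "meaning" lives entirely in the index: a phrase that was never collected comes back with no facets at all.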

Here’s a description of the vision that drives innovation in the search product at Google:

We’ve always believed that the perfect search engine should understand exactly what you mean and give you back exactly what you want
- Amit Singhal, SVP, Engineering at Google

But when I hear this kind of talk from engineers, their words are drowned out by the characters from Lewis Carroll’s “Through the Looking Glass”:

‘I don’t know what you mean by “glory”,’ Alice said.

Humpty Dumpty smiled contemptuously. ‘Of course you don’t — till I tell you. I meant “there’s a nice knock-down argument for you!”‘

‘But “glory” doesn’t mean “a nice knock-down argument”,’ Alice objected.

‘When I use a word,’ Humpty Dumpty said, in rather a scornful tone, ‘it means just what I choose it to mean — neither more nor less.’

‘The question is,’ said Alice, ‘whether you can make words mean so many different things.’

‘The question is,’ said Humpty Dumpty, ‘which is to be master — that’s all.’

Alice was too much puzzled to say anything; so after a minute Humpty Dumpty began again. ‘They’ve a temper, some of them — particularly verbs: they’re the proudest — adjectives you can do anything with, but not verbs — however, I can manage the whole lot of them! Impenetrability! That’s what I say!’

‘Would you tell me please,’ said Alice, ‘what that means?’

‘Now you talk like a reasonable child,’ said Humpty Dumpty, looking very much pleased. ‘I meant by “impenetrability” that we’ve had enough of that subject, and it would be just as well if you’d mention what you mean to do next, as I suppose you don’t mean to stop here all the rest of your life.’

‘That’s a great deal to make one word mean,’ Alice said in a thoughtful tone.

‘When I make a word do a lot of work like that,’ said Humpty Dumpty, ‘I always pay it extra.’

We can propose the idea that Google has a search engine that “understands exactly what you mean.” And by this what we mean is that your query corresponds to a sub-domain in the index of facets Google has previously collected. The “meaning” doesn’t lie in the “you” that has the query, but rather in the sets of sub-domains contained in Google’s index. When a word does a lot of work in multiple sub-domains of meaning, they pay extra in compute time.

The claim that Google makes is that they’ve gone from “strings” to “things.” But the sub-domains of meaning that Google is collecting are made up of computable sets of strings, not things. The leap that Google is actually trying to make is from “strings” to “words, phrases and contexts.” But the use of the word “thing” is very revealing. Words are not things, they are indexes. They point at things, suggest things, or function in a play of difference within a system of meaning. When we say that we’ve gone from “strings” to “things” we’re actually making a kind of miraculous claim. We’ve gone from “word” to “thing.” The most prominent example of this algorithm can be found in the King James Bible, in John 1:14:

And the Word was made flesh, and dwelt among us, (and we beheld his glory, the glory as of the only begotten of the Father,) full of grace and truth.

If we believe that Google’s knowledge graph provides “things” and not “strings,” we also believe something extraordinary about the power and capability of Google. Even if we take a step back and simply say that Google is merely indexing sub-domains—systems of meaning, we need to examine what this means. We could follow Wittgenstein and say that “meaning” can be described as a form of life. Therefore Google’s index produces a prioritized list of facets (forms of life) that connect to your form of life, given what they know about you. Popular forms of life that don’t currently connect to you serve as a method of discovery.

Of course, there’s also the popular trend of the flesh made word

There are registers of meaning that Google’s approach will never capture. Their index will be filled with gaps and pools of darkness. In particular, only a very limited range of metaphor (cliches) will be caught in the net. Metaphor produces meaning through an algorithmic process (per @the_eco_thought, Tim Morton). Take a noun, take another noun from a different domain and place the word “is” between them. The coffee cup is a blue angel. The metaphor machine makes meaning. Not every metaphor is a good one, but it has some modicum of meaning and it does function as a metaphor.
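The recipe above is simple enough to sketch in a few lines of Python. This is only an illustration of the noun-is-noun procedure as described here, not Morton's formulation or the code behind any real bot; the word lists, the domain names and the function name make_metaphor are assumptions.

```python
import random

# Hypothetical word lists; the domains and nouns are illustrative assumptions.
NOUNS_BY_DOMAIN = {
    "kitchen": ["coffee cup", "kettle", "spoon"],
    "heaven": ["blue angel", "halo", "cloud"],
    "finance": ["ledger", "dividend", "bond"],
}


def make_metaphor(rng):
    """Pick two nouns from two different domains and join them with 'is'."""
    first_domain, second_domain = rng.sample(sorted(NOUNS_BY_DOMAIN), 2)
    tenor = rng.choice(NOUNS_BY_DOMAIN[first_domain])
    vehicle = rng.choice(NOUNS_BY_DOMAIN[second_domain])
    return f"The {tenor} is a {vehicle}."


if __name__ == "__main__":
    print(make_metaphor(random.Random()))
```

Sampling two distinct domains is what keeps the output from collapsing into a plain statement of category; whether any given result is a good metaphor is left to the reader.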

Like the theoretical one hundred monkeys typing in front of a hundred typewriters for a hundred years, the metaphor machines are constantly operating and feeding the Network with new meaning. Darius Kazemi (@tinysubversions) has created a machine called “Metaphor a Minute” that does just this. You can follow it on Twitter at @metaphorminute. Of course, because of Twitter’s rate limits, there’s actually a new metaphor published every two minutes.

“Hold the newsreader’s nose squarely, waiter, or friendly milk will countermand my trousers.”

After thinking through Google’s new service and the language they’ve used to describe it, we discover that they are using the word “things” metaphorically. At first, we may assume that when engineers are describing the function of their new software, they’re making literal statements about what the machine they’ve constructed is doing. Instead, they’ve taken two nouns from different domains and inserted the word “is” between them. Ironically, their use of the word “things” is of a type that their new service could not understand. The narrow band of search engine results that are produced by this system is also being metaphorically called “knowledge.” In order to see these new products clearly, we need to be able to differentiate the rhetoric of hyperbole from the literal functioning of the machine. It also helps to become acquainted with how metaphors mean…

Putting Ears on the Television

There’s some slang in the CB radio world: when you want to know if someone is listening, you ask if they have their ears on. As in, “How ’bout ya JB, got ya ears on?” For some reason this is the phrase that popped into my head when thinking about the possibility of an Apple-designed television set. In earlier thoughts about the future of television, my focus settled on HDMI inputs and clumsy switching between these inputs. In essence, the HDMI input becomes the inheritor of the idea of the channel.

When you look at the inputs and outputs of the big screen, the game is to dominate the primary input. Your cable or satellite programming provider doesn’t want you to ever switch to another HDMI input. If you can be that dominant, your external boxes can commandeer the control experience from the television itself. Anyone who’s hooked up a television to a cable system has had the experience of being presented with two mutually exclusive proprietary control systems. This is the reason you can have 3 or 4 remote controls sitting on your coffee table. Each HDMI input has a separate control system and listens for control events with a separate set of ears.

Customer satisfaction surveys are a great friend to Apple. This is because customer satisfaction is usually just an accommodation to work-arounds. We’ve grown used to the way the television “works.” The work-around is the way it works, and after a while we don’t even notice the strangeness of it. And when we get that call, interrupting our dinner, asking us whether we’re happy with our television setup, we say, “sure, it’s great.” Of course, the reality is it’s a horrible mess we’ve acclimated ourselves to.

So let’s get back to that CB radio reference. Do you have your ears on? The problem with television sets is they don’t have their ears on. Or rather they’ve been trained to only listen to a single voice at a time. As a user of iOS devices, I’d like to be able to send programming to the big screen at any time via AirPlay. As things stand I can only do that when AppleTV2 is the designated input. An Apple-designed television would always be listening for AirPlay events.

As YouTube gets ready to launch a bunch of channels, I can’t help but think that “the channel” has reached the limit of its usefulness. When I ask Siri whether it’s going to snow today, I don’t need to switch the input to the Weather Channel to get an answer. When I ask my iTV whether there’s a Val Lewton movie on, I don’t want to have to know what channel it’s on. I want Siri to take care of searching my subscriptions and report back on what my options are. The effect of this would be to return control of the television to the television itself.

As things stand, Siri would have a limited domain of television programming services to search through, although this isn’t too different from the current situation with the iPhone 4S. Eventually all television services will migrate toward television over IP. It’s happened in all other mass media; television will be no different. Even your DVR will just save pointers to stream locations in the cloud.

In an interview, Steve Jobs once said that these waves of technological innovation are slow and unfold over many years. The trick is to pick the right wave and position yourself to benefit from the natural current. We can easily say that today, Siri isn’t good enough (in the sense of an innovator’s dilemma). But it’s perfectly positioned to grow and benefit from a huge wave of cloud-based data/identity services. It’ll work the same way with iTV.

Shop Windows And Tablets: Through The Looking Glass

In looking for lost house keys under the light of the street lamp, we put aside the fact that we lost them in the ditch at the other side of the road. It’s odd how we can move so swiftly in a particular direction without really knowing where we’re going. An incredible amount of ingenuity, resources and coordination has been applied to building tablet computers. There’s an unstated assumption that the post-pc era is defined by an evolution of the computer to a new human-computer interface model with a new form factor. At a technical level, there’s some truth there; at the level of the market for devices, however, there’s not enough truth.

To make sense of all this, let’s go back to a 1996 interview by Gary Wolf with Steve Jobs. Jobs was at NeXT and was gazing ahead at the future:

Wolf: What other opportunities are out there?

Jobs: Who do you think will be the main beneficiary of the Web? Who wins the most?

Wolf: People who have something -

Jobs: To sell!

Wolf: To share.

Jobs: To sell!

Wolf: You mean publishing?

Jobs: It’s more than publishing. It’s commerce. People are going to stop going to a lot of stores. And they’re going to buy stuff over the Web!

e-Commerce’s path to the Network was from the paper catalog to the electronic catalog. The Sears Catalog was one of the early prototypes for distance retailing. But what was the paper catalog? Why was it successful? The catalog was an evolution of the shop window in the arcade. And it was the shop window that enabled the romantic imagination of the consumer. Heather Marcelle Crickenberger talks about Walter Benjamin’s idea of the flâneur:

“Flâneur” is a word understood intuitively by the French to mean “stroller, idler, walker.” He has been portrayed in the past as a well-dressed man, strolling leisurely through the Parisian arcades of the nineteenth century–a shopper with no intention to buy, an intellectual parasite of the arcade. Traditionally the traits that mark the flâneur are wealth, education, and idleness. He strolls to pass the time that his wealth affords him, treating the people who pass and the objects he sees as texts for his own pleasure. An anonymous face in the multitude, the flâneur is free to probe his surroundings for clues and hints that may go unnoticed by the others.

Today we call it window shopping. It’s an exercise of the imagination in the role of the consumer. What might I look like in that outfit, listening to that music, with those kitchen appliances? A large plate of glass opened a window on to the possibilities contained within the shop. The flâneur could stroll the arcade moving from this window to that, searching for something that might catch his fancy.

Timothy Morton discusses this performance of the consumer imagination in his essay on “The Beautiful Soul.”

These performative styles are outlined by myself (Morton) and Colin Campbell. One style stands out, and that is a kind of meta-style that Campbell calls bohemianism and I call Romantic consumerism. This kind of consumerism is at one remove from regular consumerism. It is “consumerism-ism” as it were, that has realized that the true object of desire is desire as such. In brief, Romantic consumerism is window- shopping, which is hugely enabled by plate glass, or as we now do, browsing on the internet, not consuming anything but wondering what we would be like if we did. Now in the Romantic period this kind of reflexive consumerism was limited to a few avant-garde types: the Romantics themselves. To this extent Wordsworth and De Quincey are only superficially different. Wordsworth figured out that he could stroll forever in the mountains; De Quincey figured out that you didn’t need mountains, if you could consume a drug that gave you the feeling of strolling in the mountains (sublime contemplative calm, and so on). Nowadays we are all De Quinceys, all flaneurs in the shopping mall of life. This performative role, this attitude, is all the more pervasive, leading me to believe that we haven’t really exited from the Romantic period—another sense in which “prehistory” isn’t quite right for what I’m describing, but extremely right in another sense, namely that we’re still caught in an attitude that we don’t fully understand or become aware of.

When we talk about what’s assumed to be a tablet computer, we’re actually talking about a plate of glass, a shop window. In a discussion with Nick Bilton of the New York Times about why all these tablets look similar, Ryan Block hit on the key, although he may not have realized it:

“We are talking about a screen, where the screen is the entire experience and it can only really look and act one way, and that is to look similar to the iPad,” Mr. Block explained in a phone interview Thursday. “At the end of the day, they are all going to look similar, because a tablet is just a piece of glass.”

The innovations of the post-pc era aren’t to the computing device, they’re to the shop window. The ability to transact as part of the performance and the transformation of the goods from material to digital such that they can be played within the same window are the key additions to the “piece of glass.”

If you view the recent crop of tablet computers through this lens, you’ll see what separates the Apple and Amazon products from the rest. We pass the empty shop window of the deserted store as we move on down the block to see what we might find next. Of course, it’s simple to see how a technologist might confuse a shop window with a flat computing device.

Ironic Architecture: The Audience And Its Double

My eyes trace the curve of a jet black line as it snakes across the paper. There’s a point at which the line stops and my eyes keep going, tracing the trajectory of where the line might have gone. It’s within the bounds of that short distance that we travel into the future. It’s this tracing that doesn’t trace anything that is the subject of this meditation.

“and now I can go on,” is the phrase Wittgenstein used to describe a certain relationship to a series. Given “2, 4, 6, 8, 10,” I think I can see where things are going. “Even positive integers” is a possible answer, but no matter what numbers come next, a logic can be found for it. If the number is 12, that’s one sort of logic; if it’s 22, that’s another. Based purely on the visible, the adjacent invisible can always be colored in with a reasonable pattern.
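The claim that a logic can be found for any continuation can be checked mechanically. As a minimal sketch, assuming we accept "a polynomial in the position n" as a kind of logic, Lagrange interpolation will manufacture a rule for whatever sixth number we care to supply. The function name fit_rule is my own, and this is only one of many ways to color in the gap.

```python
from fractions import Fraction


def fit_rule(values):
    """Return f(n), the polynomial of degree below len(values) with f(i + 1) == values[i]."""
    points = [(Fraction(i + 1), Fraction(v)) for i, v in enumerate(values)]

    def rule(n):
        n = Fraction(n)
        total = Fraction(0)
        for j, (xj, yj) in enumerate(points):
            # Lagrange basis term: equals yj at xj and 0 at every other known position.
            term = yj
            for k, (xk, _) in enumerate(points):
                if k != j:
                    term *= (n - xk) / (xj - xk)
            total += term
        return total

    return rule


seen = [2, 4, 6, 8, 10]
for next_value in (12, 22):
    rule = fit_rule(seen + [next_value])
    print(next_value, [int(rule(n)) for n in range(1, 7)])
# prints:
# 12 [2, 4, 6, 8, 10, 12]
# 22 [2, 4, 6, 8, 10, 22]
```

Both rules agree on everything visible and part ways immediately afterwards: the first gives 14 for the seventh term, the second gives 74.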

It turns out that perception works in a similar way. The gaps in our apprehension of the world are bridged, filled in, to create the sensation of the smooth flow of time and experience. We project ourselves into the future. And our memories make liberal use of sampling to construct a rational narrative to account for the dramatic beats of our lives occurring before this one.

While past is not necessarily prologue, if you have enough data on what ‘usually happens’ you can make an educated guess about what will happen next. Through a statistical analysis of big data, the trajectory of partial behavior can be made visible, and the completion of that behavior can be projected. Correlations in the data emerge to tell a story that is unavailable to any one individual. Here the life of the human becomes actuarial, a set of probabilities for the possibilities. Once the percentages of the probabilities have exhibited some durability, casino economics can be installed to manage the risk and profit from these tendencies. The owners and operators of big data systems have a private view into a higher-dimensional phase space. And despite what these organizations tell us about good and evil, they are purely commercial enterprises.

A big data interlude: capturing big data on the Network used to be the province of spiders. In the search business, it was only through expedition, return and accumulation of pointers and meta-data that a sufficient store of big data could be created. With Twitter and Facebook, big data is created second-by-second within the walls of a single location. It’s the users who do all the traveling, sending postcards and pointers back to the archive.

As the probabilities solidify, another landscape emerges, along with the building materials for another level of architecture. For instance, using the tendencies that behavioral finance has uncovered, Thaler and Sunstein suggest building architectures that frame choice in such a way that people are ‘nudged’ into getting with the program. The program might be putting a percentage of one’s salary into a 401k to fund one’s retirement, or selecting a healthy lunch at the school cafeteria. We tend to accept the default and choose the item put in our path. Thaler and Sunstein call this activity ‘Choice Architecture’ because while an individual is free to make any choice, the selection set is tilted toward a particular policy agenda. This tilting toward a particular outcome is what they call “a nudge.”

I like to call it “Ironic Architecture,” because while any choice can theoretically be made, the character in this little story is unaware of the manipulation and tilting of the selection set. When the character accepts the nudge and acts as the statistical analysis suggests they might, another level of the story is being played out.

Here’s Fowler’s Modern Usage on irony:

“Irony is a form of utterance that postulates a double audience, consisting of one party that hearing shall hear and shall not understand, and another party that, when more is meant than meets the ear, is aware of both that more and of the outsider’s incomprehension.”

While we make a big show of talking about how we want to engage the rational needs and desires of a user in the networked hypertext environment, more and more we’re seeing choice architecture employed to win without fighting, to persuade without engaging in a rational discussion.

This kind of strategy plays out in a number of domains, in politics, it’s called framing, or a little more obscurely, heresthetic:

“Like rhetoric, heresthetic depends on the use of language to manipulate people. But unlike rhetoric, it does not require persuasion. ‘With heresthetic,’ according to Riker, ‘conviction is at least secondary and often not involved at all. The point of an heresthetical act is to structure the situation so that the actor wins, regardless of whether or not the other participants are persuaded.’”

Personal behavior data is being created and recorded at an ever increasing rate. The phrase ‘information exhaust’ is an apt description of the continuous inscription of our activities into digital media. And while we may think that some superior form of personalization will be available to us based on this large data set, it’s more likely that big data will yield correlations and trends that are built into our environments and make us characters in stories of which we are unaware.

Harry Brignull has coined the phrase ‘dark patterns’ for this kind of architecture. Brignull writes eloquently about Alan Penn’s lecture on the architecture of Ikea and how consumer movement through that environment results in the unfolding of a singular story that its characters are unaware of:

“What Ikea have done is taken away something which is very fundamental, evolved into us, and they’ve designed an environment that operates quite differently, given that we are forward facing people, embodied [...] from the way it would happen if you just looked down from outer space. Its effect is highly disorienting.”

“Ikea is highly disorienting and yet there is only one route to follow. [...] Before long, you’ve got a trolley full of stuff that is not the things that you came there for. Something in the order of 60% of purchases at Ikea are not the things that people had on their shopping list when they came in the first place. That’s phenomenal.”

The best minds of our generation are designing dark patterns to entangle us in a story in which we spend more than we intend. They’re also designing choice architectures to get us to save for retirement, eat a healthy diet, get immunizations and show up for school. But the conversation and the narrative are happening at a level we don’t have access to: rhetoric without argument.

Opening A New Interaction Surface: Microsoft and Kinect

It’s an unexpected moment for Microsoft. What was formerly called Project Natal, and is now called Kinect, has opened a new interaction surface to the Network. I’m trying to think of another example of Microsoft introducing and providing stewardship for an interaction model with this kind of uptake. Generally Mr. Softy has been a follower, an embracer and extender of pre-established modes.

You can tell that Kinect has connected because it’s immediately overflowed its use cases and taken up residence in a whole series of unanticipated projects. It’s an interaction surface that has corporate competitors starting up their copy machines and trying to find the best position as a fast follower. Somehow it’s hard to imagine Microsoft actually getting something out of their labs and on to the street for around $200.00. I suppose it could be the harbinger of a pipeline finally unclogged. At least that’s the marketing spin I’d put on it.

After an initial misstep, Microsoft seems to have embraced the so-called “hacking kinect” movement. What they seemed to think was a new kind of game controller turns out to be a general purpose interaction modality with use cases all up and down the Network. It’ll be interesting to see how Microsoft handles the stewardship of this new device. Running a race from the lead position is an entirely different kind of game.

Planes of Silence and Interruption Across The Network

A brief note on two planes of the Network landscape that have recently caught my attention. They are the terrains of interruption and silence. Each of these areas is going through a transition. Each signals changes that are starting to bubble up in other areas of the Network.

The terrain of silence, for the purposes of this discussion, will be defined as unvisited web page locations. Web servers are not purposefully asked to send these pages to waiting browsers; their activity is indistinguishable from background noise. An unvisited page published by an individual is a perfectly acceptable event; here I’m more specifically addressing the corporate CMS (content management system) driven behemoth web sites. The enterprise CMS brings the cost of brochure-ware publication down to almost zero. Marketing departments, assembled and calcified in the Web 1.0 era, churn out copy that is sent out to occupy the hard-won turf of their little section of the company’s web site. The products battle for shelf space in a self-defined, self-limited topography of Web 1.0 information architecture: home page, tabs, pages, categories, sub-categories. The navigation scheme based on the hyperlink and the outline implies an almost infinite number of potential pages that can occupy the space below the tip of the iceberg.

Many are learning that if you build it, it doesn’t mean they will come. More often than not this multitude of pages is met with silence. The analytics show that there just aren’t any clicks there. Generally companies retool to get clicks to those pages, because clearly “they” should be coming; there’s simply some adjustment that needs to be made. “User-centeredness” is bolted on so that users will understand that the pages they don’t want to look at are “needs based.” All kinds of lipstick are applied, but in the end, it might just be that the user just isn’t that into you. The conversation is one-sided in an empty room, and the analytics show it. It turns out that automated publishing of linked hypertext documents isn’t the same thing as interactive marketing. The growing silence will eventually change the character of the interaction. The old 1% response rate for junk mail is transferred to the web when the direct marketing model is employed without alteration on the Network. The web is just a way of lowering production costs; it’s a notch above the economics of spam. Think of it as the negative space of the page view model.

At the other end of this candle that burns at both ends is the terrain of the interruption. For the purposes of this discussion, this terrain will be defined as the set of Network-attached devices you’ve given permission to ping you when something important occurs. The classic examples are the doorbell and the telephone. Each was originally anchored to a specific location and would signal you with a bell when it required your attention. The telephone went mobile, and then was subsumed into the iPhone as a function of a personal computing device. The bell that signals a telephone call is still there; so is the alert that tells you a text message has arrived. But now there are a whole series of applications that will send you an interruption signal when something has occurred. A stock hits a certain price, a baseball team scores a run, you’re near a store with a sale on an item on your wishlist, or someone just commented on an item in your Facebook newsfeed.

The terrain of interruption used to be limited to a few applications that signaled a request for a real-time communication from another person. The interruption is still event-driven and unfolds in real time, but it’s no longer only an individual signaling for your attention. Now it might just be a state of the world that you’d like to keep tabs on. If any of these things happen, feel free to interrupt me. If I really don’t want to be interrupted, I’ll turn off that channel— so ping me, I’ll pick it up in real time, or as soon as I’m able. What was a sparse and barren landscape is quickly filling with apps that want the privilege of interruption. Multi-tasking becomes simply waiting for the next interruption: interruption interrupting the last interruption— or as T.S. Eliot put it in his poem Burnt Norton, “distracted from distraction by distraction.” The economics and equilibrium of the interruption have yet to find their balance. These interruptions threaten to become an always-on real-time backchannel to daily life. Constant interruption is no interruption at all.
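
The permission model described here can be sketched in a few lines. This is only an illustration: the `Interruptions` class and its method names are hypothetical, not taken from any real notification system. A channel may interrupt only while the user has left it switched on.

```python
class Interruptions:
    """Channels the user has allowed to interrupt (hypothetical model)."""

    def __init__(self):
        self._enabled = set()   # channels currently granted permission
        self.delivered = []     # interruptions that actually got through

    def allow(self, channel):
        """Grant a channel the privilege of interruption."""
        self._enabled.add(channel)

    def turn_off(self, channel):
        """Withdraw the privilege; later pings on this channel are dropped."""
        self._enabled.discard(channel)

    def ping(self, channel, message):
        """Deliver an event only if its channel is allowed.

        Returns True when the interruption was delivered.
        """
        if channel not in self._enabled:
            return False
        self.delivered.append((channel, message))
        return True
```

Nothing in the sketch limits how many channels can be allowed, which is the unsettled economics in question: every new app asks for one more entry in that set.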

Stories Without Words: Silence. Pause. More Silence. A Change In Posture.

A film is described as cinematic when the story is told primarily through the visuals. The dialogue only fills in where it needs to, where the visuals can’t convey the message. It was watching Jean-Pierre Melville’s Le Samourai that brought these thoughts into the foreground. Much of the film unfolds in silence. All of the important narrative information is disclosed outside of the dialogue.

While there’s some controversy about what percentage of human-to-human communication is non-verbal, there is general agreement that it’s more than half. The numbers are as low as 60% and as high as 93%. What happens to our non-verbal communication when a human-to-human communication is routed through a medium? A written communique, a telephone call, the internet: each of these media has a different capacity to carry the non-verbal from one end to the other.

The study of human-computer interaction examines the relationship between humans and systems. More and more, our human-computer interaction is an example of computer-mediated communications between humans; or human-computer network-human interaction. When we design human-computer interactions we try to specify everything to the nth degree. We want the interaction to be clear and simple. The user should understand what’s happening and what’s not happening. The interaction is a contract purged of ambiguity and overtones. A change in the contract is generally disconcerting to users because it introduces ambiguity into the interaction. It’s not the same anymore; it’s different now.

In human-computer network-human interactions, it’s not the clarity that matters, it’s the fullness. If we chart the direction of network technologies, we can see a rapid movement toward capturing and transmitting the non-verbal. Real-time provides the context to transmit tone of voice, facial expression, hand gestures and body language. Even the most common forms of text on the Network are forms of speech— the letters describe sounds rather than words.

While the non-verbal can be as easily misinterpreted as the verbal, the more pieces of the picture that are transmitted, the more likely the communication will be understood: not in the narrow sense of a contract or of machine understanding, but in the full sense of human understanding. While some think the deeper levels of human thought can only be accessed through long strings of text assembled into the form of a codex, humans will always gravitate toward communications media that broadcast on all channels.

Apple’s UX Strategy: I Want To Hold Your Hand

A few thoughts about the iPhone 4 and why technology does or doesn’t catch on. I’ve yet to hold one in my hand, but like everyone else I’ve got opinions. The typical gadget review takes the device’s feature list and compares it, using technical measures, to other devices deemed competitive. Using this methodology, it would be fairly simple to dismiss the iPhone as introducing no new features. The other lines of attack involve dropped calls on the AT&T network and the App Store approval process. For some people these two items trump any feature or user experience.

Google talks about their mission as organizing the world’s information. When I think of Apple’s mission, at least their mission for the last five years or so, it revolves around getting closer to the user in real time. The technology they build flows from that principle.

I’d like to focus on just two new iPhone 4 features. The first is the new display. Here’s John Gruber’s description:

It’s mentioned briefly in Apple’s promotional video about the design of the iPhone 4, but they’re using a new production process that effectively fuses the LCD and touchscreen; there is no longer any air between the two. One result of this is that the iPhone 4 should be impervious to this dust-under-the-glass issue. More importantly, though, is that it looks better. The effect is that the pixels appear to be painted on the surface of the phone; instead of looking at pixels under glass, it’s like looking at pixels on glass. Combined with the incredibly high pixel density, the overall effect is like “live print.”

The phrase that jumped out at me was “the pixels appear to be painted on the surface of the phone; instead of looking at pixels under glass.” While it seems like a small distance, a minor detail, it’s of the utmost importance. It’s the difference between touching something and touching the glass that stands in front of something. Putting the user physically in touch with the interaction surface is a major breakthrough in the emotional value of the user experience. Of course the engineering that made this kind of display is important, but it’s the design decision to get the device ever closer to the user that drove the creation of the technology. Touch creates an emotional relationship with the device, and that makes it more than just a telephone.

In a 2007 interview at the D5 conference, Steve Jobs said:

And, you know, I think of most things in life as either a Bob Dylan or a Beatles song, but there’s that one line in that one Beatles song, “you and I have memories longer than the road that stretches out ahead.”

You could say that Apple’s strategy is encapsulated in the Beatles song: I Want To Hold Your Hand.

The lines that describe the feeling Jobs wants the iPhone and iPad to create are:

And when I touch you I feel happy inside
It’s such a feeling
That my love
I can’t hide
I can’t hide
I can’t hide

The other new feature is FaceTime. Since the launch of the iPhone 3GS it’s been possible to shoot a video of something and then email it to someone, or post it to a network location that friends and family could access. Other phones had this same capability. That’s a real nice feature in an asynchronous sort of way. One of the problems with it is it has too many steps and it doesn’t work the way telephones work. Except when things are highly dysfunctional, we don’t send each other recorded audio messages to be retrieved later at a convenient time. We want to talk in real time. FaceTime allows talk + visuals in real time.

FaceTime uses phone numbers as the identity layer and works over WiFi with iPhone 4 devices only. That makes it perfectly clear under what circumstances these kinds of video calls will work. Device model and kind of connectivity are the only things a user needs to know. These constraints sound very limiting, but they dispel any ambiguity around the question of whether the user will be able to get video calls to work or not.

We often look to the network effect to explain the success of a product or a new platform. Has the product reached critical mass, where by virtue of its size and connectedness it continues to expand because new users gain immediate value from its scale? The network must absolutely be in place, but as we look at this window into our new virtual world, the question is: does the product put us in touch, in high definition, in real time? The more FaceTime calls that are made, the more FaceTime calls will be made. But the system will provide full value at the point when a few family members can talk to each other. Critical mass occurs at two.

Human Factors: Zero, One, Infinity

Software is often designed with three “numbers” in mind: zero, one and infinity. In this case, infinity tends to mean that a value can be any number. There’s no reason to put random or artificial limits on what a number might be. This idea that any number might do is at the bottom of what some people call information overload. For instance, we can very easily build a User Managed Access (UMA) system with infinite reach and granularity. Facebook, while trying to respond to a broad set of use cases, produced an access control / authorization system that answered these use cases with a complex control panel. Facebook users largely ignored it, choosing instead to wait until something smaller and more usable came along.

Allow none of foo, one of foo, or any number of foo.
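
As a minimal sketch of the rule, the three cases can be written down so that no arbitrary cap appears anywhere. The `Cardinality` type and the `allows` helper are hypothetical names invented for this illustration, not part of any real system.

```python
from enum import Enum


class Cardinality(Enum):
    """The three limits the rule permits (names are illustrative)."""
    ZERO = "zero"  # none of foo
    ONE = "one"    # at most a single foo
    ANY = "any"    # any number of foo, with no artificial cap


def allows(cardinality: Cardinality, count: int) -> bool:
    """Return True if holding `count` items is permitted under `cardinality`."""
    if count < 0:
        raise ValueError("count cannot be negative")
    if cardinality is Cardinality.ZERO:
        return count == 0
    if cardinality is Cardinality.ONE:
        return count <= 1
    return True  # ANY: every non-negative count is acceptable
```

Nothing in the sketch says “at most seven” or “at most fifty.” A human-scale limit has to come from somewhere other than the software rule itself.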

Privacy is another way of saying access control or authorization. We tend to think about privacy as personal information that is unconnected, kept in a vault that we control. When information escapes across these boundaries without our knowledge, we call this a data breach. This model of thinking is suitable for secrets that are physically encoded on paper or the surface of some other physical object. Drama is injected into this model when a message is converted to a secret code and transmitted. The other dramatic model is played out in Alfred Hitchcock’s The 39 Steps, where a secret is committed to human memory.

Personal information encoded in electronic communications systems on the Network is always already outside of your personal control. This idea of vaults and breaching boundaries is a metaphor imported from an alien landscape. When we talk about privacy in the context of the Network, it’s more a matter of knowing who or what has access to your personal information; who or what can authorize access to your personal information; and how this leg is connected to the rest of the Network. Of course, one need only Google oneself, or take advantage of any of the numerous identity search engines to see how much of the cat is already out of the bag.

The question arises, how much control do we want over our electronic personal information residing on the Network? Each day we throw off streams of data as we watch cable television, buy things with credit cards, use our discount cards at the grocery, transfer money from one account to another, use Twitter, Facebook and Foursquare. The appliances in our homes have unique electrical energy-use signatures that can be recorded as we turn on the blender, the toaster or the lights in the hallway.

In some sense, we might be attempting to recreate a Total Information Awareness (TIA) system that correlates all data that can be linked to our identity. Can you imagine managing the access controls for all these streams of data? It would be rather like having to consciously manage all the biological systems of our body. A single person probably couldn’t manage the task; we’d need to bring on a staff to take care of all the millions of details.

Total Information Awareness would be achieved by creating enormous computer databases to gather and store the personal information of everyone in the United States, including personal e-mails, social network analysis, credit card records, phone calls, medical records, and numerous other sources, without any requirement for a search warrant. This information would then be analyzed to look for suspicious activities, connections between individuals, and “threats”. Additionally, the program included funding for biometric surveillance technologies that could identify and track individuals using surveillance cameras, and other methods.

Here we need to begin thinking about human numbers, rather than abstract numbers. When we talk about human factors in a human-computer interaction, generally we’re wondering how flexible humans might be in adapting to the requirements of a computer system. The reason for this is that humans are more flexible and adapt much more quickly than computers. Tracing the adaptation of computers to humans shows that computers haven’t really made much progress.

Think about how humans process the visual information entering our system through our eyes. We ignore a very high percentage of it. We have to or we would be completely unable to focus on the tasks of survival. When you think about the things we can truly focus our attention on at any one time, they’re fewer than the fingers on one hand. We don’t want total consciousness of the ocean of data in which we swim. Much like the Total Information Awareness system, we really only care about threats and opportunities. And the reality, as Jeff Jonas notes, is that while we can record and store boundless amounts of data— we have very little ability to make sense of it.

Man continues to chase the notion that systems should be capable of digesting daunting volumes of data and making sufficient sense of this data such that novel, specific, and accurate insight can be derived without direct human involvement.  While there are many major breakthroughs in computation and storage, advances in sensemaking systems have not enjoyed the same significant gains.

When we admire simplicity in design, we enjoy finding a set of interactions with a human scale. We see an elegant proportion between the conscious and the unconscious elements of a system. The unconscious aspects of the system only surface at the right moment, in the right context. A newly surfaced aspect displaces another item to keep the size of focus roughly the same. Jeff Jonas advocates designing systems that engage in perpetual analytics, always observing the context to understand what’s changed. The unconscious cloud is always changing to reflect the possibilities of the conscious context.
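
The idea of a focus that stays roughly the same size can be sketched as a small bounded collection. This is a toy model under stated assumptions: the `Focus` class is hypothetical, and the default size of four simply echoes the earlier “fewer than the fingers on one hand.” It is not a description of how Jonas’s systems work.

```python
from collections import deque


class Focus:
    """A fixed-size set of items in conscious view (hypothetical model)."""

    def __init__(self, size: int = 4):
        if size < 1:
            raise ValueError("size must be at least 1")
        # A deque with maxlen silently drops its oldest entry when full.
        self._items = deque(maxlen=size)

    def surface(self, item: str) -> None:
        """Bring an item into focus, displacing the oldest one if full."""
        if item in self._items:
            self._items.remove(item)  # re-surfacing moves it to the newest slot
        self._items.append(item)

    def visible(self) -> list[str]:
        """Items currently in focus, oldest first."""
        return list(self._items)
```

However many items are surfaced over time, `visible()` never returns more than `size` of them; everything else stays in the unconscious cloud until context calls for it.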

We’re starting to see the beginnings of this model emerge in location-aware devices like the iPhone and iPad. Mobile computing applications are constantly asking about location context in order to find relevant information streams. Generally, an app provides a focused context in which to orchestrate unconscious clouds of data. It’s this balance between the conscious and the unconscious that will define the new era of applications. We’ll be drawn to applications and platforms that are built with human dimensions, that mimic, in their structure, the way the human mind works.

Our lives are filled with infinities, but we can only live them because they are hidden.
