
Category: social graph

Poindexter, Jonas and The Birth of Real-Time Dot Connecting

There’s a case to be made that John Poindexter is the godfather of the real-time Network. I came to this conclusion after reading Shane Harris’s excellent book, The Watchers: The Rise of America’s Surveillance State. When you think about real-time systems, you might start with the question: who has the most at stake? Who perceives a fully-functional toolset working within a real-time electronic network as critical to survival?

To some, Poindexter will primarily be remembered for his role in the Iran-Contra Affair. Others may know something about his role in coordinating intelligence across organizational silos in the Achille Lauro Incident. It was Poindexter who looked at the increasing number of surprise terrorist attacks, including the 1983 Beirut Marine Barracks Bombing, and decided that we should know enough about these kinds of attacks, before they happen, to be able to prevent them. In essence, we should not be vulnerable to surprise attack from non-state terrorist actors.

After the fact, it’s fairly easy to look at all the intelligence across multiple sources and, at our leisure, connect the dots. We then turn to those in charge and ask why they couldn’t have done the same thing in real time. We slap our heads and say, ‘this could have been prevented.’ We collected all the dots we needed; what stopped us from connecting them?

The easy answer would be to say it can’t be done: we don’t currently have the technology, and there is no legal framework or precedent that would support this kind of data collection and correlation. You can’t predict what will happen next if you don’t know what’s happening right now, in real time. And in the case of non-state actors, you may not even know who you’re looking for. Poindexter believed it could be done, and he began work on a program, eventually called Total Information Awareness, to make it happen.

[Figure: TIA system diagram]

In his book, Shane Harris posits a central metaphor for understanding Poindexter’s pursuit. Admiral Poindexter served on submarines and spent time using sonar to gather intelligible patterns from the general background of noise filling the depths of the ocean. Poindexter believed that if he could pull in electronic credit card transactions, travel records, phone records, email, web site activity, etc., he could find the patterns of behavior that were necessary precursors to a terrorist attack.

In order to use real-time tracking for pattern recognition, TIA (Total Information Awareness) had to pull in everything about everyone. That meant good guys, bad guys and bystanders would all be scooped up in the same net. To connect the dots in real time, you need all the dots in real time. Poindexter realized that this presented a personal privacy issue.

As a central part of TIA’s architecture, Poindexter proposed that the system encrypt the personal identities attached to all the dots it gathered. TIA was looking for patterns of behavior. Only when the patterns and scenarios the system was tracking had emerged from the background, and had been reviewed by human analysts, would a request be made to decrypt the personal identities. In addition, every human user of the TIA system would be subject to a granular audit trail. The TIA system itself would be watching the watchers.
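A toy sketch can make those two mechanisms concrete: identities are swapped for opaque tokens before any pattern analysis runs, and every request to unmask a token is written to an append-only audit log. The keyed-pseudonym scheme, the escrow class and every name below are assumptions made for illustration; none of it describes how TIA was actually built.

```python
# Illustrative sketch only: pseudonymized analysis plus an audited
# re-identification step. The scheme is an assumption, not TIA's design.
import hashlib
import hmac
import json
import time

SECRET_KEY = b"hypothetical-escrowed-key"

def pseudonymize(identity: str) -> str:
    """Replace a personal identifier with a stable, opaque token."""
    return hmac.new(SECRET_KEY, identity.encode(), hashlib.sha256).hexdigest()[:16]

class AuditedEscrow:
    """Holds the token-to-identity map; every unmasking request is logged."""
    def __init__(self):
        self._vault = {}      # token -> identity
        self.audit_log = []   # append-only record of analyst requests

    def register(self, identity: str) -> str:
        token = pseudonymize(identity)
        self._vault[token] = identity
        return token

    def reveal(self, token: str, analyst: str, justification: str) -> str:
        self.audit_log.append({"ts": time.time(), "analyst": analyst,
                               "token": token, "why": justification})
        return self._vault.get(token, "<unknown>")

# Pattern analysis sees only tokens; identities stay sealed until a human
# analyst files a justified request, and that request itself is recorded.
escrow = AuditedEscrow()
record = {"who": escrow.register("Jane Q. Public"), "bought": "one-way ticket"}
print(record)
print(escrow.reveal(record["who"], analyst="analyst-7", justification="matched scenario X"))
print(json.dumps(escrow.audit_log, indent=2))
```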

The fundamental divide in the analysis and interpretation of real-time dot connecting was raised when Jeff Jonas entered the picture. Jonas had made a name for himself by developing real-time systems to identify fraudsters and hackers in Las Vegas casinos. Jonas and Poindexter met at a small conference and hit it off. Eventually Jonas parted ways with Poindexter on the issue of whether a real-time system could reliably pinpoint the identity of individual terrorists and their social networks through analysis of emergent patterns. Jonas believed you had to work from a list of suspected bad actors. Using this approach, Jonas had been very successful in the world of casinos in correlating data across multiple silos in real time to determine when a bad actor was about to commit a bad act.
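Harris describes Jonas’s casino work as subject-based: start from a list of known or suspected bad actors and correlate their traces across otherwise disconnected data silos as events arrive. The sketch below shows that shape of computation; the silos, field names and watchlist are all invented for illustration.

```python
# Hedged sketch of subject-based correlation across silos: only records
# that touch a watchlisted identifier are surfaced. All data is invented.
watchlist = {"alice@example.com", "card-4412"}

silos = {
    "hotel":   [{"guest": "alice@example.com", "room": "1207", "night": "2001-08-30"}],
    "casino":  [{"player": "card-4412", "table": "baccarat-3", "night": "2001-08-30"}],
    "payroll": [{"employee": "bob@example.com", "dept": "surveillance"}],
}

def correlate(watchlist, silos):
    """Yield every record, in any silo, that mentions a watchlisted identifier."""
    for silo_name, records in silos.items():
        for rec in records:
            if watchlist & set(map(str, rec.values())):
                yield silo_name, rec

for hit in correlate(watchlist, silos):
    print(hit)   # the hotel and casino records correlate; payroll does not
```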

Jonas thought that Poindexter’s approach with TIA would result in too many false positives and too many bad leads for law enforcement to follow up. Poindexter countered that the system was meant to identify smaller data sets of possible bad actors through emergent patterns. These smaller sets would then be run through the additional filter of human analysts. The final output would be a high-value list of potential investigations.

Of course, once Total Information Awareness was exposed to the harsh light of the daily newspaper and congressional committees, its goose was cooked. No one wanted the government spying on them without a warrant and strong oversight. Eventually Congress voted to dismantle the program. This didn’t change the emerging network-connected information environment, nor did it change the expectation that we should be able to coordinate and correlate data across multiple data silos to stop terrorist attacks in real time. Alongside the shutting down of TIA, and other similar government efforts, came the rise of Google, social networks, and other systems that used network-based personal data to predict consumer purchases, guess which web site a user might be looking for, and even bet on the direction of stocks trading on exchanges.

Poindexter had developed the ideas and systems for TIA in the open. Once the program was shut down, the system was disassembled and portions of it were ported over to the black-ops part of the budget. The system simply became opaque, because the people and agencies charged with catching bad actors in real time still needed a toolset. The tragedy of this, as Shane Harris points out, is that Poindexter’s vision of protecting individual privacy through identity encryption was left behind. It was deemed too expensive and too difficult. But real-time data correlation, social graph analysis, in-memory data stores and real-time pattern recognition are all still at work.

It’s likely that the NSA, and other agencies, are using a combination of Poindexter’s and Jonas’s approaches right now: real-time data correlation around suspected bad actors and their social graphs, combined with a general sonar-like scanning of the ocean of real-time information to pick up emergent patterns that match the precursors of terrorist acts. What’s missing is a dialogue about our expectations, our rights to privacy and the reality of the real-time networked information environment we inhabit. We understood the idea of wiretapping a telephone, but what does that mean in the age of the iPhone?

Looking at the structure of these real-time data correlation systems, it’s easy to see their migration pattern. They’ve moved from the intelligence community to Wall Street to the technology community to daily commerce. Social CRM is the buzzword that describes the corporate implementation; some form of real-time VRM will be the consumer’s version of the system. The economics of the ecosystem of the Network has begun to move these techniques and tools to the center of our lives. We’ve always wanted to alter our relationship to time; we want to know, with a very high probability, what is going to happen next. We start with the highest-value targets and move all the way down to predicting which television show we’ll want to watch and which laundry detergent we’ll end up telling a friend about.

Shane Harris begins The Watchers with the story of Able Danger, an effort to use data mining, social graph analysis and correlation techniques on the public Network to understand Al Qaeda. This was before much was known about the group or its structure. One of the individuals working on Able Danger was Erik Kleinsmith, among the first to use these techniques to uncover and visualize a terrorist network. And while he may not have been able to predict the 9/11 attacks, his analysis seemed to connect more dots than any other approach. But without a legal context for this kind of analysis of the public Network, the data and the intelligence were deleted and went unused.

Working under the code name Able Danger, Kleinsmith compiled an enormous digital dossier on the terrorist outfit (Al Qaeda). The volume was extraordinary for its size— 2.5 terabytes, equal to about one-tenth of all printed pages held by the Library of Congress— but more so for its intelligence significance. Kleinsmith had mapped Al Qaeda’s global footprint. He had diagrammed how its members were related, how they moved money, and where they had placed operatives. Kleinsmith showed military commanders and intelligence chiefs where to hit the network, how to dismantle it, how to annihilate it. This was priceless information but also an alarm bell– the intelligence showed that Al Qaeda had established a presence inside the United States, and signs pointed to an imminent attack.

That’s when he ran into his present troubles. Rather than relying on classified intelligence databases, which were often scant on details and hopelessly fragmentary, Kleinsmith had created his Al Qaeda map with data drawn from the Internet, home to a bounty of chatter and observations about terrorists and holy war. He cast a digital net over thousands of Web sites, chat rooms, and bulletin boards. Then he used graphing and modeling programs to turn the raw data into three-dimensional topographic maps. These tools displayed seemingly random data as a series of peaks and valleys that showed how people, places, and events were connected. Peaks near each other signaled a connection in the data underlying them. A series of peaks signaled that Kleinsmith should take a closer look.

…Army lawyers had put him on notice: Under military regulations Kleinsmith could only store his intelligence for ninety days if it contained references to U.S. persons. At the end of that brief period, everything had to go. Even the inadvertent capture of such information amounted to domestic spying. Kleinsmith could go to jail.

As he stared at his computer terminal, Kleinsmith ached at the thought of what he was about to do. This is terrible.

He pulled up some relevant files on his hard drive, hovered over them with his cursor, and selected the whole lot. Then he pushed the delete key. Kleinsmith did this for all the files on his computer, until he’d eradicated everything related to Able Danger. It took less than half an hour to destroy what he’d spent three months building. The blueprint for global terrorism vanished into the electronic ether.


Bloomsday, The Coffee House and The Network

June 16th is known as Bloomsday; it’s the single day, in 1904, on which James Joyce’s novel Ulysses takes place. The day is commemorated around the world with readings of the book and the hoisting of a pint or two.

Stately, plump Buck Mulligan came from the stairhead, bearing a bowl of lather on which a mirror and a razor lay crossed. A yellow dressinggown, ungirdled, was sustained gently behind him by the mild morning air. He held the bowl aloft and intoned:

Introibo ad altare Dei.

Halted, he peered down the dark winding stairs and called up coarsely:

— Come up, Kinch! Come up, you fearful jesuit!

Solemnly he came forward and mounted the round gunrest. He faced about and blessed gravely thrice the tower, the surrounding country and the awakening mountains. Then, catching sight of Stephen Dedalus, he bent towards him and made rapid crosses in the air, gurgling in his throat and shaking his head. Stephen Dedalus, displeased and sleepy, leaned his arms on the top of the staircase and looked coldly at the shaking, gurgling face that blessed him, equine in its length, and at the light untonsured hair, grained and hued like pale oak.

Buck Mulligan peeped an instant under the mirror and then covered the bowl smartly.

Joyce’s book brought to popular notice the idea of stream-of-consciousness literature. The term “stream of consciousness” was coined by the philosopher William James in an attempt to describe the mind-world connection as it relates to the concept of truth. As a literary technique, it involves writing as a kind of transcription of the inner thought process of a character. In Ulysses, we find that stream rife with puns, allusions and parodies. Joyce was trying to capture another aspect of truth.

What challenged the reader of the day as avant garde and daring has become a relatively normal part of our network-connected lives.

Twitter has become a part of my daystream
– Roger Ebert

The stream of tweets flowing out of Twitter could aptly be described as a stream of collective consciousness. And so today, we think a great deal about various real-time streams and how they wend their way through networks of social connection. The water metaphors we use to speak about these things have roots in our shared history; they describe another kind of network of connections.

Steven Johnson, in his book The Invention of Air, makes the case for the London coffeehouse as an early prototype for the internet:

With the university system languishing amid archaic traditions, and corporate R&D labs still on the distant horizon, the public space of the coffeehouse served as the central hub of innovation in British society. How much of the Enlightenment do we owe to coffee? Most of the epic developments in England between 1650 and 1800 that still warrant a mention in the history textbooks have a coffeehouse lurking at some crucial juncture in their story. The restoration of Charles II, Newton’s theory of gravity, the South Sea Bubble— they all came about, in part, because England had developed a taste for coffee, and a fondness for the kind of informal networking and shoptalk that the coffeehouses enabled. Lloyd’s of London was once just Edward Lloyd’s coffeehouse, until the shipowners and merchants started clustering there, and collectively invented the modern insurance company. …coffeehouse culture was cross-disciplinary by nature, the conversations freely roaming from electricity, to the abuses of Parliament, to the fate of dissenting churches.

But the coffeehouse as a nexus of debate was only half of the picture. Cultural practice at the time was to drink beer and wine, and maybe a little gin, at every opportunity. Water was not safe to drink, and so alcoholic alternatives were fondly embraced. The introduction of coffee and tea as popular beverages had a significant impact on the flow of valuable ideas. Again here’s Johnson:

The rise of coffeehouse culture influenced more than just the information networks of the Enlightenment; it also transformed the neurochemical networks in the brains of all those newfound coffee-drinkers. Coffee is a stimulant that has been clinically proven to improve cognitive function— particularly for memory related tasks— during the first cup or two. Increase the amount of “smart” drugs flowing through individual brains, and the collective intelligence of the culture will become smarter, if enough people get hooked.

In our day, the coffeehouse connected to a wifi network has been an accelerant for the businesses populating the Network. When Starbucks announced that they would be introducing free one-click wifi in their stores, it reminded me of Steven Johnson’s descriptions of the London coffeehouses. The coffeehouse provided a physical meeting place, and the caffeine in the coffee provided a force multiplier for the ideas flowing through the people. There was a noticeable change in the rhythm of the age. By layering a virtual real-time social medium over a physical meeting place that serves legal stimulants, Starbucks replays a classic formula. Oddly, there’s a kind of collaborative energy in the coffeehouse that has been completely expunged from the corporate workplace. Starbucks ups the ante by running a broadcast web service network through the connection. Here we see wifi emerging as the new backbone for narrowcast television.

As we try to weave value-laden real-time message streams through the collaborative groupware surgically attached to the corporate balance sheet, we may do well to look back toward Bloomsday and also ask for a stream of unconsciousness. It’s in the empty moments, between the times when we focus our attention, that daydreams and poetic thought creep into the mix. Those “empty moments” are under attack as a kind of system latency. However, it’s in those daydreams, poetic thoughts and napkin scribbles that we find the source of the non-linear jump. Without those moments in our waking life, we’re limited to only those things deemed “possible.”


Internet Identity: Speaking in the Third Person

It’s common to think of someone who refers to themselves in the third person as narcissistic. They’ve posited a third person outside of themselves, an entity who in some way is not fully identical with the one who is speaking. When we speak on a social network, we speak in the third person. We see our comment enter the stream not attributed to an “I”, but in the third person.

The name “narcissism” is derived from Greek mythology. Narcissus was a handsome Greek youth who had never seen his reflection, but because of a prediction by an Oracle, looked in a pool of water and saw his reflection for the first time. The nymph Echo–who had been punished by Hera for gossiping and cursed to forever have the last word–had seen Narcissus walking through the forest and wanted to talk to him, but, because of her curse, she wasn’t able to speak first. As Narcissus was walking along, he got thirsty and stopped to take a drink; it was then he saw his reflection for the first time, and, not knowing any better, started talking to it. Echo, who had been following him, then started repeating the last thing he said back. Not knowing about reflections, Narcissus thought his reflection was speaking to him. Unable to consummate his love, Narcissus pined away at the pool and changed into the flower that bears his name, the narcissus.

The problem of internet identity might easily be solved by having all people and systems use the third person. A Google identity would be referred to within Google in the third person, as though it came from outside of Google. Google’s authentication and authorization systems would be decentralized into an external hub, and Google would use them in the same way as a third party. Facebook, Twitter, Microsoft, Apple and Yahoo, of course, would follow suit. In this environment a single internet identity process could be used across every web property. Everyone is a stranger, everyone is from somewhere else.
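As a thought experiment, the idea reduces to a single external hub that answers identity questions the same way for everyone, including the property that issued the account. The sketch below is a deliberately naive illustration of that symmetry; the classes, methods and account names are hypothetical and do not correspond to any real provider’s API.

```python
# Hypothetical sketch: an external identity hub that both the issuing
# property and total strangers query identically, in the "third person".
from dataclasses import dataclass

@dataclass
class Assertion:
    subject: str        # always referred to from the outside: "that account"
    relying_party: str
    claim: str

class IdentityHub:
    """A shared hub; no relying party gets privileged access to 'its own' users."""
    def __init__(self):
        self._registry = {}   # subject -> set of claims vouched for

    def register(self, subject: str, claim: str) -> None:
        self._registry.setdefault(subject, set()).add(claim)

    def verify(self, assertion: Assertion) -> bool:
        # The hub does not care whether the relying party issued the account.
        return assertion.claim in self._registry.get(assertion.subject, set())

hub = IdentityHub()
hub.register("acct:someone@example.org", "can-post")

# The issuing property and a stranger ask the same question the same way.
print(hub.verify(Assertion("acct:someone@example.org", "example.org", "can-post")))  # True
print(hub.verify(Assertion("acct:someone@example.org", "other.net", "can-delete")))  # False
```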

When we think of our electronic identity on the Network, we point over there and say, “that’s me.” But “I” can’t claim sole authorship of the “me” at which I gesture. If you were to gather up and value all the threads across all the transaction streams, you’d see that self-asserted identity doesn’t hold a lot of water. It’s what other people say about you when you’re out of the room that really matters.

What does it matter who is speaking, someone said, what does it matter who is speaking?
– Samuel Beckett, Texts for Nothing

Speaking in the third person depersonalizes speech. Identity is no longer my identity; instead, it’s the set of qualities that can be used to describe a third person. And if you think about the world of commercial transactions, a business doesn’t care about who you are; it cares whether the conditions for a successful transaction are present. It may, though, care about collecting metadata that allows it to predict the probability that the conditions for a transaction will recur.

When avatars speak to each other, the conversation is in the third person. Even when the personal pronoun “I” is invoked, we see it from the outside. We view the conversation just as anyone might.


Human Factors: Zero, One, Infinity

Software is often designed with three “numbers” in mind: zero, one and infinity. In this case, infinity tends to mean that a value can be any number; there’s no reason to put random or artificial limits on what that number might be. This idea that any number might do is at the bottom of what some people call information overload. For instance, we can very easily build a User-Managed Access (UMA) system with infinite reach and granularity. Facebook, while trying to respond to a broad set of use cases, produced an access control / authorization system that answered those use cases with a complex control panel. Facebook users largely ignored it, choosing instead to wait until something smaller and more usable came along.

Allow none of foo, one of foo, or any number of foo.
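Stated as code, the rule is just a constraint on how we model things: a field either doesn’t exist, holds exactly one value, or holds an unbounded collection. Any other fixed limit (say, “at most five phone numbers”) is the kind of arbitrary constraint the rule warns against. The data model below is a minimal sketch, not any particular system’s schema.

```python
# Minimal illustration of the zero-one-infinity rule in a data model.
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Contact:
    name: str                                                # exactly one
    nickname: Optional[str] = None                           # zero or one
    phone_numbers: List[str] = field(default_factory=list)   # zero to "infinity"

c = Contact(name="Ada")
c.phone_numbers.extend(["+1-555-0100", "+1-555-0101", "+1-555-0102"])
print(c)
```

The trouble the post points to is what happens when that unbounded list is a list of access-control switches: the model is technically complete, but the human being asked to manage it simply walks away.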

Privacy is another way of saying access control or authorization. We tend to think about privacy as personal information that is unconnected, kept in a vault that we control. When information escapes across these boundaries without our knowledge, we call this a data breach. This model of thinking is suitable for secrets that are physically encoded on paper or the surface of some other physical object. Drama is injected into this model when a message is converted to a secret code and transmitted. The other dramatic model is played out in Alfred Hitchcock’s The 39 Steps, where a secret is committed to human memory.

Personal information encoded in electronic communications systems on the Network is always already outside of your personal control. This idea of vaults and breached boundaries is a metaphor imported from an alien landscape. When we talk about privacy in the context of the Network, it’s more a matter of knowing who or what has access to your personal information; who or what can authorize access to your personal information; and how this leg is connected to the rest of the Network. Of course, one need only Google oneself, or take advantage of any of the numerous identity search engines, to see how much of the cat is already out of the bag.

The question arises, how much control do we want over our electronic personal information residing on the Network? Each day we throw off streams of data as we watch cable television, buy things with credit cards, use our discount cards at the grocery, transfer money from one account to another, use Twitter, Facebook and Foursquare. The appliances in our homes have unique electrical energy-use signatures that can be recorded as we turn on the blender, the toaster or the lights in the hallway.

In some sense, we might be attempting to recreate a Total Information Awareness (TIA) system that correlates all data that can be linked to our identity. Can you imagine managing the access controls for all these streams of data? It would be rather like having to consciously manage all the biological systems of our body. A single person probably couldn’t manage the task; we’d need to bring on a staff to take care of all the millions of details.

Total Information Awareness would be achieved by creating enormous computer databases to gather and store the personal information of everyone in the United States, including personal e-mails, social network analysis, credit card records, phone calls, medical records, and numerous other sources, without any requirement for a search warrant. This information would then be analyzed to look for suspicious activities, connections between individuals, and “threats”. Additionally, the program included funding for biometric surveillance technologies that could identify and track individuals using surveillance cameras, and other methods.

Here we need to begin thinking about human numbers, rather than abstract numbers. When we talk about human factors in a human-computer interaction, generally we’re wondering how flexible humans might be in adapting to the requirements of a computer system. The reason for this is that humans are more flexible and adapt much more quickly than computers. Tracing the adaptation of computers to humans shows that computers haven’t really made much progress.

Think about how humans process the visual information entering our system through our eyes. We ignore a very high percentage of it. We have to, or we would be completely unable to focus on the tasks of survival. When you think about the things we can truly focus our attention on at any one time, they’re fewer than the fingers on one hand. We don’t want total consciousness of the ocean of data in which we swim. Much like the Total Information Awareness system, we really only care about threats and opportunities. And the reality, as Jeff Jonas notes, is that while we can record and store boundless amounts of data, we have very little ability to make sense of it.

Man continues to chase the notion that systems should be capable of digesting daunting volumes of data and making sufficient sense of this data such that novel, specific, and accurate insight can be derived without direct human involvement.  While there are many major breakthroughs in computation and storage, advances in sensemaking systems have not enjoyed the same significant gains.

When we admire simplicity in design, we enjoy finding a set of interactions with a human scale. We see an elegant proportion between the conscious and the unconscious elements of a system. The unconscious aspects of the system only surface at the right moment, in the right context. A newly surfaced aspect displaces another item to keep the size of focus roughly the same. Jeff Jonas advocates designing systems that engage in perpetual analytics, always observing the context to understand what’s changed; the unconscious cloud is always shifting to reflect the possibilities of the conscious context.
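One way to read that principle is as a bounded focus set: relevance is recomputed as context changes, and a newly surfaced item displaces the least relevant one so the size of focus stays roughly constant. The sketch below is an illustration of the design idea only, assuming a simple relevance score; it is not Jonas’s actual software.

```python
# Hedged sketch of a bounded "focus set": new observations can displace
# the least relevant item, keeping the conscious surface a fixed size.
import heapq

class FocusSet:
    def __init__(self, capacity: int = 4):
        self.capacity = capacity
        self._items = []   # min-heap of (relevance, item)

    def observe(self, item: str, relevance: float) -> None:
        """Surface an item; if focus is full, the least relevant one drops out."""
        heapq.heappush(self._items, (relevance, item))
        if len(self._items) > self.capacity:
            heapq.heappop(self._items)

    def focus(self):
        return [item for _, item in sorted(self._items, reverse=True)]

f = FocusSet(capacity=3)
for item, score in [("coffee shop nearby", 0.2), ("meeting in 10 minutes", 0.9),
                    ("friend checked in", 0.5), ("traffic alert", 0.7)]:
    f.observe(item, score)

print(f.focus())   # the lowest-relevance item has been displaced
```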

We’re starting to see the beginnings of this model emerge in location-aware devices like the iPhone and iPad. Mobile computing applications are constantly asking about location context in order to find relevant information streams. Generally, an app provides a focused context in which to orchestrate unconscious clouds of data. It’s this balance between the conscious and the unconscious that will define the new era of applications. We’ll be drawn to applications and platforms that are built with human dimensions, that mimic, in their structure, the way the human mind works.

Our lives are filled with infinities, but we can only live them because they are hidden.
