Identity in China: Square Pegs, Round Holes

This morning, over a cup of tea and the NY Times, I discovered a major new identity system. On the edges we argue about user-centered identity, aggregated or fragmented identity across social networks, and the meaning of custodial identity and its role in commercial or financial transactions. Sharon LaFraniere of the NY Times writes about the bestowing of names, the written Chinese language, and databases, and about a new identity system for China’s 1.3 billion citizens.

By law, every Chinese citizen must carry an identity card; the legacy system is a handwritten card. The government is transitioning to a computer-readable card that will feature a color photo and an embedded microchip containing data including home address, work history, background, ethnicity, religion and medical insurance. Within this transition we can observe what is lost as we move from the handwritten to the computer-readable.

Let’s start with some numbers:

There are roughly 55,000 written Chinese characters
China’s Public Security Bureau database is programmed to read 32,252 Chinese characters
A government linguistics official has suggested that the new standardized list will only include 8,000 characters
About 3,500 characters are in everyday use

Although China has a large population, it has very few surnames:

100 surnames cover 85% of China’s population
70,000 surnames cover 90% of the U.S.’s population

Because many people have identical surnames, it has become common to bestow an unusual given name to create a unique identity.

“Government officials suggest that names have gotten out of hand, with too many parents picking the most obscure characters they can find or even making up characters, like linguistic fashion accessories. But many Chinese couples take pride in searching the rich archives of classical Chinese to find a distinctive, pleasing name, partly to help their children stand out in a society with strikingly few surnames.”

While the Chinese writing system may be one of the most difficult in which to manage data, it is also the oldest system of writing in continuous use. Since these new identity databases can’t read unusual characters, the government will be asking people to change their names to something machine readable. Given a logographic written language, a handwritten identity card could accommodate an infinite variety. Alphabetic writing systems don’t have this problem as they attempt to convey phonemes rather than morphemes.
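To make the "can't read unusual characters" problem concrete, here is a minimal Python sketch of how a fixed character repertoire rejects a name. It rests on assumptions: the article does not say which encoding the Public Security Bureau database uses, so the older GB2312 character set (available as a standard Python codec) stands in for "the characters the system knows," and the function name and sample names are purely illustrative.

```python
def unsupported_characters(name, codec="gb2312"):
    """Return the characters in `name` that `codec` cannot encode.

    Assumption: `codec` is only a stand-in for whatever fixed character
    list a real identity database supports.
    """
    missing = []
    for ch in name:
        try:
            ch.encode(codec)
        except UnicodeEncodeError:
            # The character lies outside the codec's repertoire.
            missing.append(ch)
    return missing


# A very common name: every character is inside the repertoire.
common_name = "王伟"

# Hypothetical name whose given-name character (U+20000, from a rare-character
# extension block of Unicode) lies outside the repertoire.
rare_name = "王\U00020000"

print(unsupported_characters(common_name))
print(unsupported_characters(rare_name))
```

A handwritten card has no such repertoire check. The limit appears only when the name has to pass through a fixed list of characters.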

This story surfaces a number of issues with regard to technology and identity. The first and most obvious is what personal data should be contained on a government-issued identity card: who controls that data and who has access to it. A more subtle issue is what is possible with language (written and spoken) as humans use it, and what is possible within the subset of “language” that machines can “understand.” If your name can’t be parsed by the government’s identity database, do you exist? And further, should you change your name to suit the system? Should the landscape change its features to accommodate the limited technology of map making? And if you’re creating an Internet identity system, should it be in English? Should it be national or global? How should it relate to writing systems, the marks we make to suggest things or states of the world?

What does the technology of identity reveal about the identity of technology?