The Seven Laws of Identity
As we have seen consistently in Chapter 1, a common mistake in the evolution of the IT industry has been trying to extend the use of existing technologies "as is" to deal with problems that are only apparently similar to the ones the original technology was meant to solve.
Breaking this impasse requires distancing ourselves from the tools we (maybe erroneously) believe we may use for solving the problem and trying to consider just the problem itself. By doing so, we might not get an instant solution, but we can certainly obtain precious and unbiased insights into what an acceptable solution may look like. At least as important, we can also learn to recognize nonsolutions. By understanding which properties we can't do without, we gain a valuable compass for navigating the problem space toward a solution.
Kim Cameron, an architect at Microsoft, tried to do exactly that. In 2004, he created a blog, www.identityblog.com, from which he elicited discussions on identity management. The focus was on understanding what worked and what didn't in current and past identity management efforts, with an accent on understanding the deep reasons for why things went one way or the other. Issues were examined from multiple angles: technology, social considerations, usability, and privacy. Vendor differences were suspended in the name of understanding the problem from a broad industry perspective. No topic was off limits; in fact, one of the most studied topics was the shortcomings of the most ambitious universal authentication scheme attempt at the time, Microsoft Passport. Cameron successfully involved key industry players and thought leaders from the entire community in the dialogue, gaining consensus even from the least expected sources, such as prominent figures in the open source world.
In 2005, Cameron distilled the results of the discussions in a single white paper, "The Laws of Identity," where the main findings are summarized in concise format. The white paper lists seven "laws." They are principles with which, according to the previously mentioned investigations, an identity management system must comply to be viable. Since the white paper's publication, the identity laws have become immensely popular and are considered by many the manifesto of the new user-centered identity management movement. The seven laws, listed in their concise form, are as follows:
- User Control and Consent
- Minimal Disclosure for a Constrained Use
- Justifiable Parties
- Directed Identity
- Pluralism of Operators and Technologies
- Human Integration
- Consistent Experience Across Contexts
The identity laws are not dogmatic by any measure, nor are they blindly prescriptive. Ultimately, they are a set of sound and pragmatic principles, derived from real-world experience, that anybody can verify at any given moment. Their goal is to give rise to a system that can enjoy true acceptance while serving the intended purpose of an identity system to the full satisfaction of all the parties involved. The seven identity laws define how to successfully extend the Internet with an identity management layer. In the remainder of this section, we examine the laws one by one.
In the following section, "The Identity Metasystem," we describe a solution that abides by such laws. The Identity Metasystem is the model of reference for which Windows CardSpace has been designed.
User Control and Consent
- Technical identity systems must only reveal information identifying a user with the user's consent.
- —The Laws of Identity, Cameron, 2005
This is truly the most fundamental principle of an identity management system.
The user must be able to decide to whom he discloses information, which specific data is being shared, when exchanges take place, what the purpose is for which the information is gathered in the first place, and what the trail is that a specific transaction may leave behind. To make that degree of control even possible, the user must understand what is going on. Always.
In today's practices, we witness gross violations of the first law everywhere. Remember the concept of server authentication, discussed in the sections "The Babel of Cryptography" and "The Babel of Web User Interfaces" in Chapter 1? The lousy job we do today of making users able to understand to whom they are disclosing information is one of the root causes of phishing, which is by itself one of the main causes in the decline of the use of the Internet for high-value transactions. A violation of the first law of this magnitude promptly leads to diminished acceptance.
There are other, somewhat subtler violations to consider. We are used to the idea that what we transfer in an authentication transaction is just the credentials so that we can unlock our identity on the service provider. In fact, there are many occasions in which our identity can flow from one service to the other. In Chapter 1, in the section "HTTPS, Authentication, and Digital Identity," we have a real-world example in which frequent-flyer privileges of a customer are shared between two commercial partners. In the sections "Hard Tokens" and "Issued Token–Based Authentication Schemes" you saw technologies that give identities a vessel for traveling across different entities, such as the Security Assertion Markup Language (SAML) token representing the assertion, "Alice is a principal in my realm, and she just successfully logged in using username/password as credentials," mentioned in the section "SAML." This covers the feasibility of the operation from the technical standpoint but says nothing about the way in which what is happening surfaces to the user's attention.
Let's say that you are working for an important technology company that has a close partnership with a hardware provider. By virtue of that partnership, purchasers at the hardware vendor site enjoy automatic deals applied specifically for your company. The experience is seamless. While you are browsing your corporate intranet, you click a link to the hardware vendor, and the web store automatically recognizes you as an employee of a partner company; you get a welcome banner with your name, and the deals on the page are adjusted accordingly. That's the magic of single sign-on (SSO; see the section "SAML" in Chapter 1). Sometimes the transition may be so seamless (thanks to layout customizations) that you might not even realize that you are now in a different place and that an authentication step has been performed at all.
That might be very convenient from the usability standpoint, but you can't say you had much control over the information about you that flowed from your company to the hardware vendor website. From what you can see, the partner website was able to determine your name and your status as an employee. But what if much more information was transmitted without your knowledge or consent? If the hardware vendor acquires information about your salary or your home address, something that you typically would not want to disclose, the consequences range from targeted advertising on the web store to the sale of that information to marketers, junk mailers, or worse, burglars. Wouldn't it be much better to be warned that your identity is about to be disclosed, to whom, and what information is specifically being requested? Wouldn't you require, after you realize what is going on, a mechanism for opting out if you feel it is risky?
That's the essence of the first law. Knowledge is power. Awareness of the situation brings the ability to take action responsibly, which in turn brings confidence and the feeling of being in control.
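The warning and opt-out mechanism described in this section can be pictured with a minimal sketch. It is illustrative only and is not taken from the white paper or from any specific product: the names `DisclosureRequest`, `describe`, and `disclose`, as well as the `ask_user` callback, are assumptions introduced here. The point it shows is that nothing leaves the user's profile until the user has seen who is asking, for what purpose, and for which data, and has agreed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class DisclosureRequest:
    """Hypothetical description of what a service is asking for."""
    recipient: str               # who is asking for the information
    purpose: str                 # why the information is being gathered
    requested_claims: List[str]  # which specific data items are requested


def describe(request: DisclosureRequest) -> str:
    """Build the plain-language warning shown before anything is sent."""
    claims = ", ".join(request.requested_claims)
    return f"{request.recipient} requests: {claims} (purpose: {request.purpose})"


def disclose(profile: Dict[str, str],
             request: DisclosureRequest,
             ask_user: Callable[[str], bool]) -> Dict[str, str]:
    """Release the requested claims only if the user explicitly agrees."""
    if not ask_user(describe(request)):
        return {}  # the user opted out: nothing leaves the profile
    # Only the claims that were named in the request are released.
    return {name: profile[name]
            for name in request.requested_claims if name in profile}
```

In this sketch, a request from the hardware vendor for `name` and `employer` would never return the salary stored in the same profile, and a user who answers "no" to the prompt discloses nothing at all.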
Minimal Disclosure for a Constrained Use
- The solution which discloses the least amount of identifying information and best limits its use is the most stable long term solution.
- —The Laws of Identity, Cameron, 2005
Let's focus once more on the partnership example we introduced in the previous section, "User Control and Consent."
Your company negotiated access to the hardware vendor website to fulfill a business need, empowering employees to purchase devices for the company with a process as agile as possible. The purchase process needs to gather some specific data from every shopping session: the fact that you are an employee of a certain partner, your name, the business address to which items will have to be shipped, the details needed for issuing an invoice, and the spending limit that has been assigned to you or to your role. Omit any of those data, and the transaction cannot take place. Do they need to know your salary? Your home address? Your blood type? Your religious beliefs? Your hair length? They would probably be happy to have some of that information, but the answer to all these questions is a resounding no. The reason you are shopping at their website is to make purchases for your employer. The fact that you are a geek and that later that night you will buy an oscilloscope for your personal enjoyment is not relevant now, and therefore your home address should not be part of the current transaction.
Even if the hardware partner is acting in good faith and does not sell your personal data to junk mailers, disclosing more data than necessary is still a very bad idea. A rich archive of personal details is a treasure trove for identity rogues and makes the company a very palatable target of attacks. The liability is also higher in case of accidents. A laptop forgotten on a train with a list of names plus company addresses is much less likely to unleash a class action lawsuit than the same list of names with home addresses, birth dates, and so on.
The principle of minimal disclosure can and should also be applied at a finer level of granularity. A business selling wine, in a country where alcohol consumption is allowed only after a certain age, may be tempted to store the birth date of recurrent customers. That is a point of liability that could be easily avoided because it is possible to store only the aspect relevant to the business (that is, a Boolean expressing if the customer is above or below the threshold age).
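The wine seller case lends itself to a short sketch. The code below is a minimal illustration under stated assumptions, not a prescribed implementation: `LEGAL_DRINKING_AGE`, `is_of_age`, `CustomerRecord`, and `register_customer` are hypothetical names, and the threshold of 18 is an assumed value that varies by country. The birth date is examined once and then discarded, so the stored record contains only the Boolean.

```python
from dataclasses import dataclass
from datetime import date

LEGAL_DRINKING_AGE = 18  # assumption: the actual threshold varies by country


def is_of_age(birth_date: date, today: date,
              threshold: int = LEGAL_DRINKING_AGE) -> bool:
    """Return True if the person has reached the threshold age on `today`."""
    years = today.year - birth_date.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (birth_date.month, birth_date.day):
        years -= 1
    return years >= threshold


@dataclass(frozen=True)
class CustomerRecord:
    customer_id: str
    of_age: bool  # the only age-related fact the business keeps


def register_customer(customer_id: str, birth_date: date,
                      today: date) -> CustomerRecord:
    """Check the birth date once, then keep only the Boolean."""
    return CustomerRecord(customer_id, is_of_age(birth_date, today))
```

One design consequence is worth noting: a stored `False` becomes stale once the customer reaches the threshold age, so in this sketch an underage result would have to be rechecked at a later purchase rather than trusted indefinitely.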
Unfortunately, today's identity silos often invite practices in open violation of the second law. Many business operations in the United States require disclosure of the Social Security Number or SSN (see the sidebar "America and Identity Theft" in Chapter 1). It often happens that the SSN ends up being stored in the user profile, even if there's no need to know it beyond the current transaction. It is kept just in case because it is information that is difficult to obtain. In the most appalling cases, it is even misused as a record key because it is a unique identifier. The latter are the worst cases. Not only is the SSN very valuable information per se, it also provides a key for aggregating and interpreting identity data stolen elsewhere! That means spreading the damage across different identity contexts, annihilating one of the only advantages of today's identity silos: because it is so difficult for information to flow between silos, the scope of damage is often contained, too.
The principle of minimal disclosure for constrained use is very pragmatic, and the strategic value of the practice is clear. It is clearly proven architectural wisdom applied to the context of identity.
Justifiable Parties
- Digital identity systems must be designed so the disclosure of identifying information is limited to parties having a necessary and justifiable place in a given identity relationship.
- —The Laws of Identity, Cameron, 2005
One of the first adopters of Microsoft Passport was Victoria's Secret. At the time, it was not a well-known brand in Italy. When one of the authors found out that it was a lingerie brand, he was puzzled. He spent a good deal of time trying to understand the business reasons for which Microsoft needed to be informed of the details of his Valentine's Day purchases.
Understanding the circumstances requires recalling what the Internet was in the few years after Y2K. Today it is almost unthinkable for any company not to have substantial web presence. In 2001, there were still many important companies without a website, and the bursting of the dotcom bubble had scared the industry enough that they backed off any mainstream strategy related to the Web. Brick-and-mortar companies often didn't have investments in or know how to invest in web properties: Website creation and maintenance were massively outsourced, almost as experiments and PR bangs, every move clearly giving away that the energies were still on the traditional channels. The Web was not as ubiquitous as today. The demographics of habitual customers, the main target, were not expected to overlap much, from the very beginning, with those of the audience of the website. Web-based campaigns were far from today's maturity in terms of tools, demand, structured offerings, and raw material (read, eyeballs).
In that atmosphere, it should not come as a surprise that somebody saw authentication just as another "feature" of the website, and as such suitable to be handled by third parties, too.
The Passport offering was very convenient because it relieved sites from the hassle of managing their own authentication infrastructure, a very delicate aspect of the website architecture. That was the intended role of Passport in the purchase of a Valentine's Day present; Microsoft was just an infrastructure provider.
As the Internet became what it is today, many of the conditions that made authentication outsourcing appealing started to fade. It became unmistakably clear that the web presence is a strategic asset, while at the same time online activities became more complex and feature-rich. The attention and resources devoted to it by companies increased. As the number of Internet surfers grew by an order of magnitude, the importance of the Web as a medium for reaching customers grew, too. Any information about the user became precious for maintaining loyalty, predicting behavior, and targeting offerings. Online advertising exploded. It was like the offline world, but the eyeball economy made everything faster and global in reach. In these new conditions, in which somebody can earn revenue just by having you look at one page, outsourcing identity management just does not make business sense. That's why nowadays we are so surprised at the attempt to extend the Passport authentication scheme beyond Microsoft assets, but at the time there was some reasoning behind it. In fact, other big Internet players are still betting on similar systems today, while Microsoft endorses the Identity Metasystem (see the section with the same name).
While online business went through all those transformations, maturity and awareness in the usage of the Internet increased. Once past the convenience of remembering just a single set of credentials, users and operators began to realize that the web farm of one single operator was in the position of keeping track of all their movements and didn't like the idea. When the technical reasons for outsourcing authentication disappeared, or were greatly reduced, there was no justification for that situation. If you add that some websites tried to make it as unobvious as possible that they were in fact relying on Passport, you can see how users didn't feel much in control.
In fact, "Justifiable Parties" is another flavor of the "User Control and Consent" law. Every time the user discloses his identity information, he needs to be able to assess not only to whom he is sending data, but also the role of that party in the current transaction and the implications of its involvement. Let's get back to the wine seller example we introduced in the previous section. The merchant needs to know whether you are of age before serving you alcohol, and he may not take your word for it. In the offline world, the natural solution entails extracting your government-issued ID document and exhibiting it. As we have seen in Chapter 1, in the section "Hard Tokens," this is an action that more and more often we can metaphorically perform in the digital world, too. Here the reasons why the government is involved in the transaction are obvious. The merchant needs to know whether I am of age and won't take my word for it. However, he is willing to believe what the government says about me. Short of finding another entity that the merchant trusts, if I want to go on with the transaction I have no choice but to accept government involvement. (Notice that I still must be given the choice of opting out when I learn the merchant's policy.)
Again, there are finer points to be made. The user is the ultimate judge of the justifiability of the participation of somebody in a transaction, and all information for making that call must be made available. Consider this. What if every time you use your electronic ID, your government keeps track of with whom you are conducting business? Would you still say that government involvement is justified? It probably depends. Somebody will recognize that this is a necessary security measure if the transaction is applying for a visa with a foreign government, but will consider it plain abuse to keep a record of how many times you buy wine in a month; somebody else will be okay with both; and so on. This is just one among many examples.
When was the last time that a marketing company asked for your permission to monitor your buying habits? The point is that it is the user who should be the one who justifies the terms of the participation of one entity in the transaction, and a good identity schema should do everything to facilitate that judgment call. That means explicitly and clearly communicating policies about information usage.
Directed Identity
- A universal identity system must support both "omnidirectional" identifiers for use by public entities and "unidirectional" identifiers for use by private entities, thus facilitating discovery while preventing unnecessary release of correlation handles.
- —The Laws of Identity, Cameron, 2005
The fourth law further refines the concept we have of digital identity.
In Chapter 1, we debated the problem of server authentication, and we hinted at how Public Key Infrastructure (PKI), certificates, and Hypertext Transfer Protocol Secure (HTTPS) can help in pinpointing the identity of websites. Who is the beneficiary of that help? In the case of a public website, it will be the "public" itself: everybody who is not the website itself or, as Kim solipsistically put it in the white paper, "all the other identities."
We call that kind of identity omnidirectional. It is an identity meant to be understood by everybody. This identity will contain the info necessary for the public to decide if they want to do business with it. X.509 certificates and associated URLs are the most natural example in this context, but the instances in the offline world abound. You may have seen at some conferences those badges that display the attendee name, the company he or she is affiliated with, his or her role in the conference (attendee, speaker, staff), and the languages he or she can speak. That information is beamed to everybody coming within visual range of the badge and helps everyone else to recognize the bearer and the methods of interaction. The Web 2.0 breeze that blows on the Internet these days brings many means of doing the same thing online. For example, at the time of writing, Opinity (www.opinity.com) offers to its users a unique URL that provides the function of omnidirectional identifier. (The Opinity URL for Vittorio is http://vibro.opinity.com.)
When an individual enters a transaction, however, the identity he uses is unidirectional. That is, the identity transmitted is meant only to identify the user with the service provider currently engaged. If you are buying an airplane ticket on one website and booking a hotel room on another, the authentication scheme should not help the two websites to join their data and understand that you are the same person (and afterward send you advertisements about shuttle services between your destination airport and your hotel).
This is a very subtle point. A typical objection at this point is this: What if both sites require name and birth date? What can an authentication system do to prevent the two businesses from joining data together? The answer to that is, not much. If the two businesses require name and birth date to perform their function, there's nothing that can be done. You might require that data be encrypted with the public key associated with each site so that the data is not mutually visible, but that covers just the transmission. As soon as the information arrives at its intended destination, two dishonest service providers can still share profiles and search for a match. That's one of the reasons why using something unique and personal such as the SSN is really, really bad practice. The point of the Directed Identity law is that such a possibility should not be offered by the identity management schema in itself. In other words, an authentication schema should not rely on mechanisms that could give rise to correlation handles. Imagine a situation in which the services you are using require you to sign in, but they do not require any further information about you besides the credentials you use for authenticating. One example of such a service could be a photo-retouching website. After having signed in, you can upload one picture, and somebody will fix red eyes on-the-fly and send it back to you in the context of the same session. Another such service could be a traffic information service or weather reports. When you sign in, you can get information about one area of choice. For both services, you are just sending the credentials required to verify that you subscribed to the service. In that case, an authentication schema respectful of the Directed Identity law will not allow the traffic service to realize that the person who asked about the situation on Highway 90 is actually the same person who sent those "oh so weird" pictures to be retouched.
That separation will typically be obtained by the identity management scheme by ensuring that no two websites share the same identifier for the same user. But that's just an implementation detail. What counts is that the scheme does not enable the kind of abuses previously described; how it accomplishes that does not really matter.
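One possible way of ensuring that no two websites share the same identifier for the same user is to derive each identifier from a secret known only to the user's side plus the name of the site. The sketch below is just one such implementation detail and is not presented as the mechanism of any particular identity system: the function name `pairwise_identifier`, the use of HMAC-SHA-256, and the example site names are assumptions chosen for illustration.

```python
import hashlib
import hmac


def pairwise_identifier(user_secret: bytes, site: str) -> str:
    """Derive a stable identifier for one user at one site.

    The same (secret, site) pair always yields the same value, so the site
    can recognize a returning user. Values derived for different sites
    cannot be linked to each other without knowledge of the secret.
    """
    digest = hmac.new(user_secret, site.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical usage: the two services see unrelated identifiers.
secret = b"known only to the user's identity selector"
photo_id = pairwise_identifier(secret, "photo-retouching.example")
traffic_id = pairwise_identifier(secret, "traffic-info.example")
```

Even if the photo-retouching service and the traffic service compared their user tables, the identifiers alone would give them no handle for concluding that the two accounts belong to the same person.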
Pluralism of Operators and Technologies
- A universal identity system must channel and enable the inter-working of multiple identity technologies run by multiple identity providers.
- —The Laws of Identity, Cameron, 2005
We devoted a good part of Chapter 1 to describing different ways of handling authentication: certificates, SAML, and even passwords. Proposing a single authentication scheme for the Internet has been attempted, but it has failed. As the next lines will hopefully clarify, such an effort is doomed from the very start.
We have seen how the features of different systems are the result of the diverse requirements imposed by the contexts in which they are meant to operate. We should not expect those differences to go away, in much the same way as we should not expect that hammers and screwdrivers will eventually converge into one single tool. Furthermore, we have seen how today's scenarios and associated requirements greatly differ from yesterday's. By induction, we can safely assume that the future will pose challenges that we are unable to predict, and hence the solutions will also take forms we cannot foresee today.
People and businesses will have their own preferences and inclinations, and those will be reflected in their technology choices. As the value of the transaction rises, the level of security required will follow suit; different businesses will deal with risk in different ways, formulating their policies accordingly. Different users will have different degrees of tolerance for information disclosure; the concept of what is or is not acceptable in terms of safeguarding one's own privacy will vary widely by communities, cultures, or who knows what other factors. Just think of the example we made in the section "Justifiable Parties" concerning government tracking of electronic ID usage. Some will accept this unconditionally, and some will push back so hard that merchants will have to adopt different technologies to meet users' privacy demands and remain in business.
Handling such a diverse mix of tendencies requires pluralism of operator offerings and technologies available. An identity management scheme that aspires to be the universal authentication system cannot fail to take the situation into consideration. Embracing and accommodating existing and future technologies is the only way to achieve the goal.
In the section "The Identity Metasystem" we describe a natural solution to the dilemma.
Human Integration
- The universal Identity Metasystem must define the human user to be a component of the distributed system integrated through unambiguous human-machine communication mechanisms offering protection against identity attacks.
- —The Laws of Identity, Cameron, 2005
Chapter 1, and specifically the section "The Babel of Web User Interfaces," described the inadequacies of current practices in making the user understand what is going on during the authentication process. We have seen how the certificates, although perfectly sound from the purely cryptographic standpoint, are not really helping the user to deal with the server authentication problem.
We have also seen how the wide gamut of different user experiences, despite the fact that in the vast majority of cases they all account for the task of entering username and password, confuses the user to the point of making him vulnerable to the simplest phishing attacks.
If we analyze from the pure engineering standpoint the communication sequence when authenticating to a website, we discover an almost universal pattern. As long as the communication happens between machines or software entities, the protocols are predetermined and rigidly followed. Every phase mandates message formats and sequences, and the semantics of every step are unambiguously determined. A good example of this point is given in Chapter 1, in the section "SSL Client Authentication." As soon as human intervention is required, however, things change. Even if the task is almost invariably to enter password credentials, every website will implement the functionality in different ways. There is the widespread idea that the user will "figure it out," so a reasonable set of controls and a sound process behind it will do. The flaw in that reasoning lies in the fact that reasonable and sound are ill-defined. Apart from the fact that often those systems are designed by computer scientists, whose definition of reasonable is very different from that of end users, the entire idea of relying on the user's ability to "figure it out" is extremely dangerous. When the user is expected to recognize to whom he is disclosing his personal data or which kind of information will be sent, the margin for interpretation should be reduced to an absolute minimum. The way of achieving this is planning for human integration, devising interaction mechanisms that properly account for the user's capabilities, eliminating ambiguity, and reducing the room for misinterpretations. In other words, when the user deals with identity management matters, he should be constrained by a protocol, too.
Following a protocol is not exclusive to machines. Humans can do it, too, and have done so since forever, every time it is important to have predictable results. We follow a protocol on election day when we go to vote, when we clear a security checkpoint at the airport, when we sign a contract, when the fire alarm goes off in our office building, when we operate a nuclear plant, when we document a process in the context of ISO9000, when we apply for an immigrant visa. The list can go on and on. In those cases, we follow a protocol because there's a lot at stake in terms of risk or resources and, as painful and uninspiring as it may sometimes be, we accept that as a fact of life.
The way in which a universal identity system (please ignore for the time being the term metasystem in the statement of the law) should integrate humans is by maximizing comprehension while minimizing ambiguity. That is, a universal identity system should make everything as understandable and incontrovertible as it can be. That implies representing facts and entities in ways that the modern science of human-computer interaction deems appropriate and rigorously defining the actions that users can perform and their exact semantics. Clarity claims its price on freedom. A system that is easy to understand and has fixed semantics will limit the room for creativity. However, when operating a nuclear plant, creativity should not be the higher-order bit. The same goes for making all the users understand whether the information they are being requested to send will travel in the clear through an untrusted network or whether it will be encrypted.
Note that this by no means implies limitations on specific authentication technologies. It just states that a universal identity management system should properly accommodate human integration but gives no indications of the architectural layer at which such integration should take place.
Consistent Experience Across Contexts
- The unifying identity metasystem must guarantee its users a simple, consistent experience while enabling separation of contexts through multiple operators and technologies.
- —The Laws of Identity, Cameron, 2005
While using the Internet, we project our identities all the time; we just don't always realize when we do it. In fact, many users do not actually have a clear picture of what identities they have and how they are used across the various services they make use of. The current user experience in that space is so broken that talking about consistency is difficult. Users do not even have a clear perception of what a security context is by now.
Think about it for a moment. The typical user will have a handful of password credentials he uses and reuses (see the section "Decline" in Chapter 1). The actual identities of the user are the sets of relevant facts that are kept on the service provider stores and are unlocked by transmitting the correct set of credentials (see the concept of hostage identity in the section "HTTPS, Authentication, and Digital Identity," in Chapter 1). If a username-password pair is reused across two different services, it will likely correspond to two different identities; this is supremely confusing for the user, who directly manipulates just the credentials and is only vaguely conscious (if at all) of the existence of the associated identities unlocked on the service-provider side. Password manager utilities do not really help, and sometimes they make things worse. By showing that the same username is used across different websites, they may induce the user to believe that he is using the same identity across the group even though the user profiles kept on different service providers may be dramatically different. That is certainly a setback in the attempt to instill context awareness in the user.
This last thought experiment describes just what happens at authentication time. However, there are countless other times at which online applications ask you to disclose fragments of your identities. This typically happens when you engage in a high-value transaction, when the service provider needs to reach beyond the online world and gather data from your offline identity. If you are having something shipped, you need to provide your address; if you are handling some administrative matters with your government, you may have to provide your ID number; if you are verifying the status of your immigrant petition, you have to provide your application number; if you are buying something online, you may have to disclose details of the relationship you have with your credit card provider (that is, enter your credit card number). All those things happen in completely different contexts, following different processes, requiring different interaction patterns. The concept of identity, which would be so useful and the natural tool for modeling those transactions, is implicit at best and is more often than not just an emergent property of the system. No wonder that the user has a hard time handling his or her identities effectively! It is like trying to understand the paths that planets follow in the night sky without knowing that the Earth itself spins and everything revolves around the Sun. Without the latter information, those paths are extremely difficult to understand and predict. Adopting the new perspective, however, makes everything crystal clear.
If we want to solve the problem of identity management for good, we need to be like Galileo and rebuild the system on the basis of the fundamental identity mechanics we have discovered so far. That will lead to a more natural and effective way for users to think about identity. Proficiency in managing it will follow suit.
The first thing we can do is make the concept of identity explicit for users. A user should be able to think about his or her identities as clearly as he or she thinks about his or her files, documents and any other abstract entity that has a visual representation in a user experience.
Once the identities are explicitly represented, we have made an enormous step forward. Users can now create identities for all the contexts and the hats they wear: identities for web mail and low-value services, identities as employees of a certain company, identities as citizens of a certain country, identities as members of a dating service, identities as alumni of a certain university, and identities as just about any kind of digital persona they expect to use in their activities. Information will naturally fall into the right place. How much you paid for taxes last year will be in the citizen identity and not in the alumni identity, whereas for your grade point average it will be vice versa.
Once the information is packaged in explicit representations of identities, it is finally possible to reach a level of consistency in all the transactions involving disclosure of identity data. Users can choose which identities are most suitable in every given context, while services can help users to understand the context by explicitly limiting the set of identities they are willing to accept. A service asking for our email and a service that needs to know our yearly net income can now do so using the same user experience, relying on the new awareness that the user has about his identities. The system must be secure, and the differences between the two requests must be completely clear (see the section "User Control and Consent" and the discussion about unambiguous operations in the section "Human Integration"), but the semantics of the two operations are the same: disclose some part of one of your identities. It makes sense that the user experience is the same, too, just as the procedure for copying a file between two folders doesn't change regardless of whether the file content is the script of the movie Borat or the true recipe of the philosopher's stone.
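The idea of explicit identities and a single, uniform disclosure operation can be made concrete with a small sketch. It is an illustration only and does not describe how any actual product models identities: the types `Identity` and `ServiceRequest` and the functions `suitable_identities` and `disclose` are hypothetical names introduced here. Whether the service asks for an email address or for yearly net income, the same two steps apply: the user picks among the identities the service accepts, and part of that one identity is disclosed.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Identity:
    name: str               # label the user sees, for example "citizen"
    claims: Dict[str, str]  # the facts packaged in this identity


@dataclass(frozen=True)
class ServiceRequest:
    service: str
    accepted_identities: Set[str]  # identities the service is willing to accept
    required_claims: List[str]     # the part of the identity being requested


def suitable_identities(identities: List[Identity],
                        request: ServiceRequest) -> List[Identity]:
    """Return the identities the user may choose from for this request."""
    return [identity for identity in identities
            if identity.name in request.accepted_identities
            and all(claim in identity.claims
                    for claim in request.required_claims)]


def disclose(identity: Identity, request: ServiceRequest) -> Dict[str, str]:
    """The single operation: send the requested part of one identity."""
    return {claim: identity.claims[claim] for claim in request.required_claims}
```

In this sketch, the grade point average stored in an alumni identity can never be released in answer to a request that accepts only the citizen identity, because the alumni identity is simply not among the choices offered to the user.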