What is Personally-Identifiable Information (PII)

This post is part of a series of articles covering Online Privacy Policies as they relate to website and blog operators. If your site does not have a Privacy Policy, you may want to read the article that answers the question “Are Privacy Policies Required by Law?”

The purpose of these articles is to provide insight into the less obvious legal risks that entrepreneurs are exposed to in the area of privacy law and PII, simply by owning a website or blog. PII in this context is an acronym for personally-identifiable information and it will be used throughout the rest of this article.

To begin, let’s re-state what a privacy policy is:

A privacy policy is a legal document which discloses how personally-identifiable information (PII) is collected, stored, used, shared, and disposed of. It is sometimes referred to as a privacy notice, privacy statement, or online privacy policy. (For a deeper discussion, see previous post: “What is a Privacy Policy?”)

Notice the phrase personally-identifiable in the above definition. Other definitions simply use the term personal information, but personally-identifiable makes the distinction clearer: the information must identify a particular, unique person to invoke certain legal responsibilities.

As website or blog operators, we have a legal responsibility to properly handle any PII we collect from our site visitors, whether we have a privacy policy posted or not. Deceptive or unfair practices can result in legal consequences. Having a policy posted can add credibility to your site, but it will keep you out of legal hot water only if you uphold its promises.

What some online entrepreneurs do not realize is that legal consequences can result even if the mishandling of PII is unintentional. For example, if a privacy policy states that no PII is collected but the site supports third-party widgets or plug-ins that collect PII and use it for tracking purposes and targeted ads, that will probably be viewed as deceptive in the eyes of the Federal Trade Commission.

In a future post, I will discuss data collection mechanisms like cookies and third-party plug-ins. This article, however, is focused on the definition of PII and the difference between PII, non-PII, and potential PII.

Definition of PII

The term PII has become widely accepted in recent years. Here are definitions from three very credible agencies of the federal government:

1) PII is any information about an individual maintained by an agency, including (1) any information that can be used to distinguish or trace an individual’s identity, such as name, social security number, date and place of birth, mother’s maiden name, or biometric records; and (2) any other information that is linked or linkable to an individual, such as medical, educational, financial, and employment information. (Source: Guide to Protecting the Confidentiality of Personally Identifiable Information (PII), National Institute of Standards and Technology, page 2-1)

2) The Department of Homeland Security defines PII as any information that permits the identity of an individual to be directly or indirectly inferred, including any information which is linked or linkable to that individual regardless of whether the individual is a U.S. citizen, lawful permanent resident, visitor to the U.S., or employee or contractor to the Department. (Source: Handbook for Safeguarding Sensitive Personally Identifiable Information At The Department of Homeland Security, page 6)

3) PII is “information that can be used to locate or identify an individual, such as names, aliases, Social Security numbers, biometric records, and other personal information that is linked or linkable to an individual. Loss of such information may lead to identity theft or other fraudulent use of the information, resulting in substantial harm, embarrassment, and inconvenience to individuals.” (Source: Report to Congressional Requesters, Protecting Personally Identifiable Information, United States Government Accountability Office, page 1)

A key common denominator of the above definitions is the fact that the information must be linked or linkable to an individual. The best definition, however, in my opinion, and one that is most relevant to website and blog operators, can be found on the Wikipedia website:

Personally Identifiable Information (PII), as used in information security, refers to information that can be used to uniquely identify, contact, or locate a single person or can be used with other sources to uniquely identify a single individual. The abbreviation PII is widely accepted, but the phrase it abbreviates has four common variants based on personal, personally, identifiable, and identifying. Not all are equivalent, and for legal purposes the effective definitions vary depending on the jurisdiction and the purposes for which the term is being used.

Although the concept of PII is ancient, it has become much more important as information technology and the Internet have made it easier to collect PII, leading to a profitable market in collecting and reselling PII. PII can also be exploited by criminals to stalk or steal the identity of a person, or to plan a person’s murder or robbery, among other crimes. As a response to these threats, many web site privacy policies specifically address the collection of PII, and lawmakers have enacted a series of legislation to limit the distribution and accessibility of PII.

(Source: http://en.wikipedia.org/wiki/Personally_identifiable_information)

In short, PII is information that readily identifies a specific individual. During my research, however, when considering the billion-dollar industry of Internet advertising, I found that there needs to be a distinction between three types of information: actual PII, potential PII, and non-PII. This type of clarity will help website operators and advertisers stay within the confines of privacy laws.

While Wikipedia and other references offered a two-category breakdown, none that I found presented it in the above-mentioned three-tier classification. I submit that this is a more useful way for website and blog operators to classify PII, especially when developing a privacy policy.

I) Examples of PII

The following list was distilled from numerous state and federal laws that deal with PII and the Internet. Therefore, the list below does not match any particular standard that I know of. If anything, it is a bit more comprehensive. If the goal is to comply with Internet privacy laws of all states, it seems to me it’s best to err on the side of caution. These data elements clearly qualify as PII:

  1. Name (full name or first initial and last name), maiden name
  2. Email address or other online contact information such as instant messaging identifier
  3. Home or other physical address
  4. Telephone number
  5. Credit card or debit card numbers
  6. Bank account numbers
  7. Social Security number
  8. Driver’s license number or state issued ID card number
  9. Passport number
  10. Taxpayer identification number
  11. Personal characteristics such as photographic images (especially of face or other identifying characteristic), fingerprints, or other biometric data (e.g. retina scan, voice signature, facial geometry)

Items 1 through 5 are routinely collected by sites that sell goods and services. Items 1 and 2 are often collected when you sign up for a newsletter or RSS feed. The reality is that, whatever the reason, if any of the above information is collected by a website or blog, more and more states are mandating that a privacy policy be conspicuously posted.

Laws in some jurisdictions point out that their privacy rules do not include information that is lawfully obtained from publicly available records.

Sensitive Personally Identifiable Information

The Commercial Privacy Bill of Rights Act of 2011, introduced by Senators John Kerry and John McCain in April 2011, goes further to define certain types of PII as sensitive. According to section (6) of that proposed legislation, the term “sensitive personally-identifiable information” means:

(A) personally identifiable information which, if lost, compromised, or disclosed without authorization either alone or with other information, carries a significant risk of economic or physical harm; or

(B) information related to:

(i) a particular medical condition or a health record; or

(ii) the religious affiliation of an individual.

For further discussion on the Commercial Privacy Bill of Rights Act, you may be interested in reading 11 Reasons to Have a Privacy Policy Posted on Your Website or Blog.

II) Examples of Potential PII

The following are examples of “potentially personally-identifiable information”. That is, the data elements by themselves cannot be linked to a specific person but when combined with other information (such as items 1 through 11, above), they can be.

12. A persistent identifier such as a generic customer/user value held in a “cookie”
13. IP (Internet Protocol) address or host name
14. Date of birth, age
15. Racial or ethnic background
16. Religious affiliation
17. Gender
18. Height, weight
19. Marital status
20. Employment information
21. Medical information
22. Financial information
23. Credit information
24. Student information

Depending on a site visitor’s browser settings, cookies (item 12), which are small text files, are stored on the visitor’s local drive and transmitted between their browser and the servers hosting the sites visited.
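To make item 12 more concrete, here is a minimal Python sketch of how a server might issue a persistent identifier in a cookie and read it back on a later request. The cookie name `visitor_id`, the one-year lifetime, and both helper functions are assumptions made for illustration; they are not taken from any particular site or product.

```python
import uuid
from http.cookies import SimpleCookie

ONE_YEAR_SECONDS = 365 * 24 * 60 * 60  # assumed lifetime, purely illustrative


def build_set_cookie_header(visitor_id: str) -> str:
    """Return the Set-Cookie header value a server could send to the browser."""
    cookie = SimpleCookie()
    cookie["visitor_id"] = visitor_id  # hypothetical cookie name
    cookie["visitor_id"]["max-age"] = ONE_YEAR_SECONDS
    cookie["visitor_id"]["path"] = "/"
    return cookie["visitor_id"].OutputString()


def read_visitor_id(cookie_header: str):
    """Extract the identifier from the Cookie header the browser sends back."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("visitor_id")
    return morsel.value if morsel else None


# First visit: the server generates a random value and sets the cookie.
new_id = uuid.uuid4().hex
header_value = build_set_cookie_header(new_id)

# Later visit: the browser returns the same value alongside any other cookies.
returned = read_visitor_id(f"visitor_id={new_id}; theme=dark")
```

On its own, the random value names nobody. It starts to matter for privacy purposes once a site stores it next to an email address or another item from the PII list.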

The point here is, as standalone information, these data elements are not PII. They have the potential to be PII. They become PII when they are combined with other more specific data which, in total, identifies a specific person.

For example, a full-blown credit report without a link to a specific individual is not PII. It’s simply anonymous credit information. However, even though a credit report might not have a person’s first and last name, if it includes enough information to identify a particular person (e.g. date of birth + gender + ethnicity + zip code + IP address), it fits the definition of PII.
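The way harmless-looking fields become identifying in combination can be illustrated with a short Python sketch. It counts how many records in a small, entirely made-up data set are singled out by a given combination of fields. The records, field names, and the `unique_fraction` helper are all hypothetical, and the sketch is only a rough illustration of the idea, not a legal test for PII.

```python
from collections import Counter

# Made-up records with no names attached; every value here is invented.
records = [
    {"birth_year": 1975, "gender": "F", "zip": "02139"},
    {"birth_year": 1975, "gender": "F", "zip": "02139"},
    {"birth_year": 1975, "gender": "M", "zip": "02139"},
    {"birth_year": 1982, "gender": "F", "zip": "60614"},
]


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` appears exactly once."""
    if not rows:
        return 0.0
    keys = [tuple(row[f] for f in fields) for row in rows]
    counts = Counter(keys)
    singled_out = sum(1 for key in keys if counts[key] == 1)
    return singled_out / len(rows)


# Gender alone singles out one record in four.
gender_only = unique_fraction(records, ["gender"])

# Birth year + gender + zip code together single out two records in four.
combined = unique_fraction(records, ["birth_year", "gender", "zip"])
```

The more fields that are combined, the more records tend to become unique, which is why each item in the potential PII list deserves care even though none of them identifies a person by itself.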

III) Examples of Non-PII

Non-personally-identifiable information includes data that web browsers and servers typically exchange on every request or collect through scripts and cookies. For example:

  1. Browser type
  2. Browser plug-in details
  3. Local time zone
  4. Date and time of each visitor request (e.g. arrival, exit on each web page)
  5. Language preference
  6. Referring site
  7. Device type (e.g. desktop, laptop, or smartphone)
  8. Screen size, screen color depth, and system fonts

This is the type of information that websites track when analytics programs are used, for example. Some of it is tied to cookies stored on the visitor’s hard drive. Some sites also track shopping preferences in order to target advertisements. In general, non-PII is widely used across the web to track and report on aggregated statistics regarding user traffic.

The lists above are by no means exhaustive. As always, website and blog owners should use reasonable care when complying with any laws and seek professional assistance when necessary.
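For operators who want to take inventory of what their own site collects, the three-tier breakdown can be sketched as a simple lookup in Python. Every field name below is hypothetical and would need to be mapped to the names used in your own forms and logs, and the decision to treat unknown fields as potential PII is my assumption, in keeping with erring on the side of caution.

```python
from enum import Enum


class Tier(Enum):
    NON_PII = 0
    POTENTIAL_PII = 1
    PII = 2


# Hypothetical field names, loosely following the three lists in this article.
PII_FIELDS = {"name", "email", "street_address", "phone", "card_number", "ssn"}
NON_PII_FIELDS = {"browser_type", "time_zone", "language", "referrer",
                  "device_type", "screen_size"}


def classify_field(field: str) -> Tier:
    """Place a single collected field into one of the three tiers."""
    if field in PII_FIELDS:
        return Tier.PII
    if field in NON_PII_FIELDS:
        return Tier.NON_PII
    # Assumption: anything else (cookie IDs, IP addresses, date of birth,
    # unrecognized fields) is treated as potential PII to be cautious.
    return Tier.POTENTIAL_PII


def strictest_tier(fields) -> Tier:
    """Return the most sensitive tier found among the collected fields."""
    tiers = [classify_field(f) for f in fields]
    return max(tiers, key=lambda t: t.value, default=Tier.NON_PII)


newsletter_form = ["name", "email"]
analytics_only = ["browser_type", "referrer", "screen_size"]
server_log = ["ip_address", "browser_type"]

newsletter_tier = strictest_tier(newsletter_form)
analytics_tier = strictest_tier(analytics_only)
log_tier = strictest_tier(server_log)
```

A result of `Tier.PII` for any form or log is a signal that the privacy policy needs to disclose that collection. A result of `Tier.POTENTIAL_PII` is a prompt to check whether the data is ever combined with something that identifies a person.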

PII Data Breaches of Profound Proportions

The timing of this article is a bit uncanny relative to recent PII security events. During the past month, two monumental breaches of PII security took place. Both involved major companies that control the PII of millions of consumers. Given their magnitude and relevance to this topic I was compelled to mention them.

The Epsilon Episode

In early April 2011, Epsilon International, the self-proclaimed largest permission-based email marketer in the world, announced that email addresses and/or customer names were stolen from their email system. Some analysts estimate that tens of millions of consumer names and email addresses were exposed. According to media reports, Epsilon handles email marketing for 2,500 clients including Best Buy, Capital One, Citi, and JPMorgan Chase.

In a press release, Epsilon reported that the affected clients are approximately 2 percent of total clients for which Epsilon provides email services. Media coverage included headlines like “Data breach is the Exxon Valdez of privacy.”

The real significance of the story for this article is the fact that the company seems to be confused about the definition of PII. In this case, in my opinion, that’s mind-boggling. You be the judge. Here is an excerpt from an official statement released by the parent company, including the questionable claim that no PII was compromised:

Alliance Data Systems Corporation (NYSE: ADS), parent company of Epsilon, today reaffirmed Epsilon’s previous statement that the unauthorised entry into an Epsilon email system was limited to email addresses and/or customer names only. No personal identifiable information (PII) was compromised, such as social security numbers, credit card numbers or account information. (Source: Press release, Alliance Data Provides Statement Surrounding Unauthorised Entry Incident at Epsilon Subsidiary)

The Sony Saga

I use the term saga because this PII breach will be talked about for years to come. Just weeks after the Epsilon episode, hackers stole names, dates of birth, and other PII, and possibly credit card numbers, belonging to 100 million people who play online video games through Sony’s PlayStation Network. Here is an excerpt from a Sony blog entry to its customers on the PlayStation.com website:

Although we are still investigating the details of this incident, we believe that an unauthorized person has obtained the following information that you provided: name, address (city, state, zip), country, email address, birthdate, PlayStation Network/Qriocity password and login, and handle/PSN online ID. It is also possible that your profile data, including purchase history and billing address (city, state, zip), and your PlayStation Network/Qriocity password security answers may have been obtained. If you have authorized a sub-account for your dependent, the same data with respect to your dependent may have been obtained. While there is no evidence at this time that credit card data was taken, we cannot rule out the possibility. If you have provided your credit card data through PlayStation Network or Qriocity, out of an abundance of caution we are advising you that your credit card number (excluding security code) and expiration date may have been obtained.  (Source: Playstation Network, Consumer Alerts)

Numerous Internet sources have since reported that the FBI, DOJ, and EU, among other authorities, are investigating this massive privacy breach.

On a positive note, Sony subsequently confirmed that the credit card data was encrypted. While it’s small consolation to the millions of consumers affected, at least Sony was in compliance with industry standards regarding credit card info (PCI DSS: Payment Card Industry Data Security Standard). As of this writing, it remains to be seen whether the hackers have been able to decrypt and use the credit card numbers.

While Sony does not use the term PII in their public announcements, they do seem to be making a genuine effort to follow privacy rules to the letter of the law. Case in point: if you go to the PlayStation blog on this matter, you will see that they address Massachusetts customers in a separate blog entry from the rest of their U.S. customers. That’s because Massachusetts has enacted some of the toughest privacy laws in the nation (see previous post: “Are Online Privacy Policies Required by Law?”). Among other unique rights, in cases where PII security has been breached, MA law allows its residents to place a security freeze on their credit reports at no charge. Credit reporting agencies are prohibited from charging for it as long as a police report is submitted. At this time, laws of other states do not provide this right, so credit reporting agencies could potentially benefit greatly by charging a fee (e.g. $5.00) each time someone requests a security freeze or its removal.

PII – Clarity is Critical

The bottom-line message here is “clarity is critical” when it comes to defining PII. The Epsilon episode is an excellent example of how confusion can prevail without it. We need a clear set of rules, ideally at the federal level, to ensure consistency across the 50 states. With that, website owners like you and me will truly know how to comply and be in a better position to do so.

Comments

  1. Posted 12 years ago

    Very well written and comprehensive article Bob. Since there have been several noteworthy breaches (LinkedIn, Zappos, etc) since the ones you mention perhaps you should post a follow-up. Or maybe publish the press releases and comments made by the companies subsequent to the breaches. That would make for some entertaining reading, don’t you think ?


  2. Posted 13 years ago

    Wow! What an article. I am a student at CTU Online and our instructor suggested for us to read it. I want to thank you very much for posting this. It was very informative! Mary O.


Render Visions Consulting