Philip Guo (Phil Guo, Philip J. Guo, Philip Jia Guo, pgbovine)

Personal data, from private to public

Summary
I list the spectrum of personal data from private to semi-public to fully public, to get us thinking about what it really means to share our words and images with others.

For a longgg time I've been thinking about the spectrum of private to public as it relates to personal data, probably ever since 1997 when I posted my middle-school ramblings to my first website. I doubt anyone actually read what I wrote back then, even though it was technically public.

Fast forward 22 years. I just read Hidden cities by Nadia Eghbal, along with her June 2019 newsletter, which re-sparked this long-standing thought in my mind that just because something is online doesn't mean it's necessarily public. I also recently wrote Communicating, Fast and Slow, which classifies modern communication on a spectrum from fast to slow. That gave me the idea to try to classify personal data on a spectrum from private to public. Here goes!


The first category includes data that we expect to be private:

  • thoughts in our minds (super-private unless someone can coerce you into revealing them)
  • something we tell others privately in confidence [1]
    • (what if you're talking to a friend privately but you're out in a public location so others can eavesdrop and record you?)
  • something we write down on paper and keep at home (someone needs to break into your home to access)
    • this also applies to audio or video recordings kept only on portable media like memory cards or (gasp!) CD/DVD/tape [2]
  • any data [2] on an external hard drive that's not connected to the internet at home
  • any data [2] on your desktop computer at home (any internet-connected device is vulnerable to remote data theft)
  • any data [2] on your desktop computer at your office (more people can physically access your office than your home)
  • any data [2] on your laptop or mobile device (much easier than desktop computers to get lost or stolen)
  • private data that you back up to a cloud-based service (maybe better if you're paying for a service with a good reputation)
  • data that you submit to any website (e.g., online shopping, banking, HR paperwork for your employer, tax returns) that you expect them to keep private
    • these websites are often the target of cyberattacks, so think about all the security breaches that happen every year.
    • PSA: to protect yourself from these inevitable breaches, do regular credit checks, freeze your credit, and monitor your bank accounts carefully

Even though you expect all of this data to be private, anything in the cloud is less private since others can more easily access it.


The next category includes data that we intentionally communicate to an audience but expect to be kept somewhat private. Note how this is starting to relate to Communicating, Fast and Slow. Let's go from the least to the most public here:

  • phone or video call [1] (more private than text-based messaging since it's more trouble to record audio/video than text)
  • mailing a letter 🐌🐌🐌
    • (phone calls and letters might even be more private than digital data from the above category)
  • one-on-one messaging (e.g., texting, Twitter DMs, Gchat)
    • slightly more private than email since you can't forward messages as quickly as you can with email
  • messaging on a private group chat (e.g., iMessage, WhatsApp, Slack, Facebook Messenger)
  • email to one or a few hand-picked recipients
    • people can easily forward emails without your permission, but there's a social norm that they're expected to be private
  • file or document that you share using an obfuscated URL (e.g., a Dropbox file, YouTube video, or Google Doc that's unlisted but anyone with the URL can access it)
    • theoretically public but shouldn't be indexed by search engines, so only those with the URL can realistically access
  • folder that you share with others but you expect them to keep private (e.g., Dropbox or Google Drive shared folders)
  • posting to a private mailing list or discussion forum [3]
    • e.g., internal company mailing list, college class discussion forum, invite-only list/forum, members-only list/newsletter/forum that people pay to join
    • the more people on the list, the more public it is
  • posting something to social media that you expect to be semi-private, such as only being visible to your friends/followers
    • Social media is notorious for secretly making posts more widely visible than what you intended, since that gets them more user engagement. Thus, chances are way more people than you expected will be able to see your posts; and sometimes permissions change over time, so what you intended to be seen by only your friends will later be seen by, say, friends of friends, etc.
  • replying to someone else's post on social media, which you expect to be semi-private
    • replying is more public than making your own original post since you don't know the permissions that have been set on the post you're replying to

This final category includes data that we intentionally make public, but in practice not everything is equally public (see related links in Appendix). Again from the least to the most public ...

  • sending an email newsletter to your subscribers (which is also archived online and publicly accessible, unlike a private members-only newsletter)
    • people can reply privately to you but can't reply-to-all, which reduces the potential for amplification and snowball effects
  • posting to a public mailing list [3]
    • usually not well-indexed by search engines so harder for random people to find
  • writing a professional article or paper book
  • posting a public audio podcast [4]
    • harder for people to extract out a snippet to re-share
  • livestreaming [4]
    • more public if your livestream video is archived, but those videos tend to be long so they're harder to sift through than shorter public videos (see next item)
  • posting a public video [4]
    • more public than audio since someone can skim the video to pull out screenshots or clips to re-share
    • the shorter the video is, the more public it is since it's easier for people to watch, react to, and re-share
  • writing something on your own personal website [3]
    • more public than audio or video since people can easily skim and snip out excerpts to share [4]
    • more private if you set your website not to be indexed by search engines
    • more private if you don't embed social media sharing buttons on your website (some blog platforms automatically add those buttons, which makes your posts more public)
  • posting to a public discussion forum [3]
  • posting something to social media with fully-public permissions (e.g., a public Twitter or Facebook post)
    • Most notably, social media is a “push” medium where your posts get proactively pushed into the feeds of all of your friends/followers. e.g., if I write “I love spinach” in some article on my website, then only people who happen to be reading that exact article will learn this (fun!) fact; but if I post “I love spinach” on Twitter/Facebook/etc., then everyone who follows me sees that in their feed right away, regardless of what they're doing at the moment. Social media feels more public since posts are pushed into people's feeds.
    • Social media posts also spread much more easily than forums, mailing lists, or personal website posts since social media is designed for rapid sharing and amplification.
  • posting online about any topic that people might get outraged about [5], which can incite people to spread your words to audiences that you didn't originally expect to target
  • replying to a social media thread that people might get outraged about or with controversial people involved, which again can incite an online mob to virally spread your words to unintended audiences
    • again, social media is designed for rapid amplification
  • someone from an online community that you don't know about finds your website/forum post (maybe even a super-old post!) and re-shares it with that community, possibly with inflammatory commentary, which causes it to virally spread there (again see Hidden cities)
  • popular media (e.g., major websites, news aggregators, television) features something you posted, which can bring a slew of unintended publicity to something you originally intended for a niche audience; since you weren't writing for the masses, you never made your words “battle hardened” against drive-by rando criticisms
  • a popular public figure finds your post and re-shares it with their large online audience to advance their own agenda, possibly with inflammatory commentary, which shines a huge spotlight on something you never intended to be so public
  • writing something for a widely-read online publication
    • this is the most public form of data transmission, I think
    • even more public if your topic is potentially controversial

Note that just because you post something publicly doesn't mean that you're necessarily expecting a ton of people to view and react to it; people can get caught off-guard when content from their niche personal website gets featured on, say, major news outlets.

Another note: the longer your content is (whether it's writing, audio, or video), the less public it feels since it takes more effort for someone to sift through it to find something to react to. For instance, a short Twitter or Facebook post feels very public since everyone can instantly see and knee-jerk react to it within seconds. But a longform article, podcast, or book requires a lot more dedicated attention before someone spreads or reacts to it.

Also, the more well-known you are, the more of a multiplier there is on the “publicness” of anything you post. (Think of the difference between a random unknown person posting something and the CEO of a major company posting that exact same thing.)


[1] What if someone secretly records you and posts publicly without your permission? Or gossips with ill intent? This simple one-dimensional spectrum doesn't take deception into account.

[2] Encrypted data is more private than unencrypted.

[3] Using pseudonyms, throwaway accounts, or posting anonymously can make these more private.

[4] I've personally found audio and video formats to be much more expressive than writing because they feel less public. Since audio/video requires more dedication from my audience to consume, it's less likely that my words will get amplified out of context. For details, jump to 7:57 of my VidCon 2019 recap video.

[5] Often these people aren't purposely looking to incite outrage for personal gain. They're just unaware that their views are unpopular with certain audiences that they're not even targeting. But it's so easy for anyone to pick up on those words and re-share them with communities that the post wasn't originally meant to reach.

Parting Thoughts

Hopefully this spectrum has made you think more deeply about what it means for our personal data to be private, public, or somewhere in between. Modern life isn't as simple as “Well, if you don't want the entire world to see something, then don't post it online!” That's almost like telling people not to ever go out in public. Many of us communicate online publicly or semi-publicly since that's the most convenient way to reach our audience, which could be up to dozens or hundreds of likeminded people. And some of us write online to think out loud or to keep ourselves accountable/motivated, even if (almost) nobody is reading. But we're not prepared for the context collapse and unintended attention when our words spread beyond our intended bounds.

Another important facet I didn't have room to fit into this one-dimensional spectrum is time. What if you excitedly posted something super-publicly ten years ago, but now you wish it weren't online anymore since your viewpoints or identity might have changed during those years? Sometimes you can't even take your old content offline, since you don't control the website where it's posted (e.g., a discussion forum or online publication). And even if you could take it offline (e.g., from your own website), it's probably archived in the Internet Archive so someone will always be able to dig it up and re-post it with their own commentary.

Finally, what this article also didn't cover is that other people, such as our parents, friends, classmates, teachers, or coworkers, often post information about us. They often use our names, words, and images in their own posts, which we can't fully control. Like it or not, our data is always going to be less private than we expect, so we all need to learn to cope with this reality.

Appendix

Related links

How this article was written ...

Behind-the-scenes of me writing this article from scratch in a single 1.5-hour session at the coffee shop. Video is at 500% speed.


Created: 2019-07-15
Last modified: 2019-07-15