August 10, 2006
Thought Fashion: Are You In or Out?
It's true. I peeked.
Yes, I downloaded the AOL files. And I peeked. Why? Because I wanted to write this blog post and I wanted to see for myself what sort of gestures people were making as they searched for porn or socks or how to bury their pet birds or wives they'd just killed. I also needed to see the form the data was in. And I'm a voyeur just like everyone else in and around this story, and I wanted to rubberneck my way into others' private intellectual spaces.
But it's not right. What I and every news outlet, blogger, reader and looky-loo have been engaging in is judging people by their searches, making assumptions and behaving as if we ourselves have never made any searches or expressed any thoughts that would look funny to someone else.
It's also not right because the data is personally identifying. Reporters have been tracking down people based upon their searches. It's not that hard, if you yourself are a good searcher.
What was it Bob Blakely said? About how "dragging all human behavior into the public is literally totalitarian." He is the chief security and privacy scientist for IBM's Tivoli Systems. "If you erode privacy, you erode liberty, because people don't tolerate things going on in front of them that they don't approve of." I was struck by how succinctly he answered the question that is always asked of people who object to surveillance by the government or some other entity far larger and more powerful than they are: What do you have to hide, if you're not doing anything wrong?
Every article on AOL's mess-up that says something like "AOL's Disturbing Glimpse Into Users' Lives" is buying into this, whether its author knows it or not. Thank you, CNet, for reaffirming our intolerance.
Let's get clear on the definition of "aggregated" data. We geeks use this term often, as we reassure those whose data we work with that aggregation means we are removing anything personally identifying and placing the data with other users' data, so that it's just a pile of anonymized data that could never be traced back to a person. An example might be the aggregation of all the searches on "dog," where who did them is removed but we know that 38 people searched on that term during a particular hour and day.
But users don't think that way. They hear the word "aggregated" and they think the data handlers are aggregating everything the system may know about just them, specifically and personally, and lumping it all together. Talk about miscommunication. And it terrifies the non-geeks.
What we really should be saying is that the data is "anonymized" and therefore you are safe. AOL's data was not safe because it was not anonymized, and for users, it was their definition of aggregated.
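To make the two meanings concrete, here is a minimal Python sketch. The log rows, user IDs and function names are all invented for illustration and are not taken from the AOL files; it shows only the shape of the two operations, and makes no claim that counting alone makes data safe.

```python
from collections import Counter, defaultdict

# Hypothetical log rows: (user_id, hour, query). Every value here is made up.
LOG = [
    (101, "2006-03-01 14", "dog"),
    (102, "2006-03-01 14", "dog"),
    (101, "2006-03-01 15", "socks"),
    (103, "2006-03-01 14", "dog"),
    (101, "2006-03-02 09", "how to bury a pet bird"),
]


def aggregate_counts(log):
    """The geek sense: drop the user, keep how often a term appeared per hour."""
    return Counter((query, hour) for _user, hour, query in log)


def per_user_history(log):
    """The sense users fear: every query lumped together under one ID."""
    history = defaultdict(list)
    for user, _hour, query in log:
        history[user].append(query)
    return dict(history)


counts = aggregate_counts(LOG)
histories = per_user_history(LOG)
```

In the first result nothing ties the "dog" searches to anyone. In the second, user 101's searches sit side by side, and that is the form that lets a stranger start guessing who 101 is.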
The AOL data lumped each user's searches together under a user ID over three months, which made profiling and finding people easy. In some cases AOL provided enough data to indicate very specifically who the data related to. Leading to judgments by the rest of us. About the people who do or think things on the edges of society.
And why is this wrong? Because it hurts people. It makes them feel defensive about their own thoughts and ideas.
So, well, if you aren't doing anything wrong, what *do* you have to hide? Well, everyone has something they do or think about that would be an edge thought or that in one context would be in the middle, but in another, must be defended as it resides on the edge. And that would be disapproved by someone. Something the rest of society might not tolerate.
Intolerance leads to the totalitarian. We, the human race, have been intolerant since the beginning of time. What we are intolerant of is a moving target depending on the fashion of the day. In the 30's in some places it was fashionable to be intolerant of Jews and gays. In the 40's it was Germans and the Japanese, and in the 50's communists and socialists. In the 60's it was civil rights proponents and hippies and in the 70's liberals. In the 80's we were back to communists, and in the 90's it was Hispanics (remember all the state propositions outlawing them from medical care?). And what is it today? Islam? Are thoughts you think today and the cultural references associated with them that are in the middle going to fall to the edges in the next decade?
We have used the fear of all these intolerable people and their thoughts as excuses to hunt for more proof of their intolerableness by surveilling everyone in society and searching through all the detritus of our lives. With digital data more available, we think we can find the proof we need in these edge thoughts. And then we will persecute the people having them. And what better way to do it than now, when the internet, ISPs and heavily used search systems can provide one or another level of very personal thought data. Search terms, or a database of intentions as John Battelle has talked about so much, are one slice of your data that tells a lot about you. And if we can get it in a neat little file, machine readable and searchable and quantifiable, then well, why not?
If you believe that sacrificing freedom to keep freedom is the way to go, then you probably don't see any problem with demonizing people who have thoughts you don't like. Especially if those thoughts are in the form of passing gestures such as search terms plugged into a browser.
But until we decide (or default) into a Minority Report society (and change our constitution), we are not yet convicting people for thinking things. Everyone has had the thought that they'd like to kill someone once or twice in their lives. But people, the vast vast majority, don't do it. The idea that we demonize someone for searching on this, which is a gesture I would put into the fleeting thought category for almost everyone, is taking an edge thought, which we all have from time to time, and putting it firmly under the scrutiny of the middle. I believe we really only want to find people who make serious plans to hurt others, or actually carry them out. That is what our law is based on, and the premise of our society. But to track everyone, their searches, their every digital gesture, and expose it in one way or another is going to be troublesome. And it raises a question I've asked before: is your digital identity your personal intellectual property? Is your Google identity yours or someone else's? And by extension, is your clickstream a personal expression (carefully chosen and shaped by you)? In other words, can you copyright your clickstream and exert ownership?
There are at least two choices. One of them is to do what we are doing now: have ISPs and search services collect this data, and when asked by the government, have it turned over. But that means the data is still in many ways secret. Of course the companies don't want the data getting out because it is proprietary. And neither does the government, because they don't want anyone to know quite how much is out there about you, in case you are trying to cover your tracks or you want to defend yourself. But having all the data, the government has the upper hand. And secrets are powerful. How do you show, if you are being accused of something based upon your searches, that everyone else searches on those same things too? That it's actually a social norm? If you can only ask for your own searches to defend a case against you, and not everyone else's to compare yours against, you won't be able to argue social norms, which judges rely heavily on when making decisions.
But there is another choice. And that brings up the Attention Trust premise (I'm a Board Member) which is that people own a copy of their own data, no matter where they do things: Amazon.com purchases, Google searches, or AOL clickstreams, or anywhere else you might land in a browser on your computer. As a co-owner of your data, you can take it anywhere and do what you wish. There could be many business models built upon this data controlled and shared by users. Google takes all the data they collect and plugs it into AdSense. If lots of users took their own data and made it available voluntarily, a new and more 'open source' style AdSense could be created.
But much more importantly, something like Steve Gillmor's Gesture Bank, where users opt in their clickstream information in an anonymous form, exists to open up this kind of data. The Bank will make the aggregation of anonymized data available to anyone for any purpose. While this may lead to businesses working from this pool of searches and clicks, it also means that a growing pool of data is there to show the edge thoughts and potentially unpopular ideas people may exhibit. The pool can be used to defend against totalitarian efforts to single out in secret those who are out of fashion politically. Which may turn out to be you. Or someone who uses your computer.
That I think is far more important than an open source AdSense, though a business built upon this data would likely justify and make a better case for us to have a Gesture Bank of ideas and thoughts that support political freedom.
Seth Goldstein and Steve Gillmor already offer Root.net users the opportunity to put their data into the Gesture Bank if they wish, though any person can contribute to this anonymous pool of user data. And for that matter, attention streams can be sent to multiple services.
And, at the October 4 Attention Conference, Steve and Seth will announce Attention Soft. Stay tuned.
August 04, 2006
OpenID2 Developer Info Day Aug 10th Bay Area
From Kaliya Hamlin:
- I am really pleased to announce that we have an OpenID Informational Evening for Developers August 10th 6-9 in Berkeley at 2029 University, Upstairs.
- The big news is the community has converged and figured out the authentication layer - OpenID…OpenID is just the authentication layer - but on top of this ad hoc standard lots of cool stuff can happen. The goal of the evening is not to geek out on identity but to connect with developers working on applications that require users to log in.
- Find out more about what it is…how it works…how you can install. The incentives to learn are high with the $5000 bounty for having OpenID in Open Source projects.
- Presenting and answering Questions
David Recordon formerly of Live Journal/Six Apart now of Verisign will be presenting a bit about the origins of OpenID but most importantly how it works…and how you install it.
- Andy Dale from ooTao will talk a bit about i-names and how they work with OpenID2, and look forward to what comes next after authentication - profile sharing. ooTao is also doing data sharing and running ibroker services.
- Mary Hodder CEO of Dabble will talk about the work happening around the development of itags.
- I am helping coordinate the evening please RSVP to me - kaliya (at) Mac (dot) com and feel free to ask me any questions.
- If you know a developer - pass the word along.
ps. for all you Technorati guys who keep having questions, now is your chance to ask the guys who know.
- UPDATE: Scott Keveton from JanRain will be there too. He just posted an OpenID walk through on his site.
- UPDATE 2: Dick Hardt from Sxip will be in town and will also be joining us for the evening. Hopefully he will share some of the cool stuff Sxip is doing with OpenID.
August 03, 2006
Dabble Launched, etc.
Some people know that I have been busy launching Dabble, the company I founded for searching, browsing and organizing media.
That was pretty much a month underwater, as we worked around the clock for 30 days on the most insane schedule. But we are lucky, as we have an amazing team, and we were so happy with the response to the launch, which was overwhelmingly good.
We are now working on fixing things for users and adding some new features they've asked us to do.
It's great, but also tiring. The best part is when someone tells us they are really enjoying using the site. It feels really nice.
If you have feedback, send us a message at feedback at dabble.com. We are looking for ways to improve and can use all the help we can get. Thanks!

