I recently had an epiphany about data storage and digital identity. Granted, my thoughts did not form in a vacuum; suffice it to say that I have always been concerned with ownership. More specifically, I keep coming back to the idea that convenience comes at a price. I am not here to judge or criticize people's desire to minimize complexity in their lives. I too have fallen victim to convenience.
I suppose half the battle is understanding the risks. But what do we do about them?
Here is an interesting use case. Someone once told me that they did not like the fact that Flickr did not report where the visitors to their photos came from. Flickr simply reports the number of views for each photo on its system. Perhaps this person wanted to figure out a way to make revenue from the traffic? Nope, that wasn't the concern at all. The concern was that there was no simple way to understand why people were interested in those particular photo sets. What brought them to those sets of photos? Why did they find them particularly interesting? Some would probably ask, "Why do you care?" Well, it really is a matter of having some level of control over your data.
Ahh, but is the data really yours anymore? Methinks not. You gave up those rights the moment you agreed to share the content with Yahoo!
In return, Flickr will mix and slice the information to your liking. The same model applies to every so-called Web 2.0 service. You throw your data up into the cloud and rely on the convenience of the service. Some people call it SaaS. To paraphrase Eben Moglen, giving your rights to the bailiff pretty much guarantees that you no longer have control of your data. Gmail works the same way, except that the bailiff now earns residual income off your habits and the habits of your unknowing friends. It's all about data mining and ad revenue.
So, again, the question is: what are we to do about this concern? Well, consider that at a fundamental level, both Y! and GOOG use the same building materials that are readily available in the wild, in the world of Free and Open Source Software. I'm not talking about building a private datacenter. Of course that would be cost prohibitive. What I am suggesting is that you begin to store some of your own data. A terabyte of storage costs roughly a quarter of what the same quantity cost five years ago. The software that you could use to search and index your own data is free. However, the cost of learning how to manipulate and use these building materials is not free.
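To make the "search and index your own data" part a little more concrete, here is a minimal sketch of an inverted index in Python, using nothing beyond the standard library. It is only an illustration under my own assumptions: the function names (tokenize, build_index, search) and the sample notes are hypothetical, and real indexing tools handle ranking, stemming, and on-disk storage that this toy ignores.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names that contain it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the sorted names of documents containing every query word."""
    words = tokenize(query)
    if not words:
        return []
    # Start with the matches for the first word, then narrow down.
    matches = set(index.get(words[0], set()))
    for word in words[1:]:
        matches &= index.get(word, set())
    return sorted(matches)


# Hypothetical sample data standing in for files on your own disk.
notes = {
    "trip.txt": "Photos from the coast trip, shot at sunset",
    "todo.txt": "Back up photos to the new terabyte drive",
    "mail.txt": "Reply to Brad about the conference talk",
}

index = build_index(notes)
print(search(index, "photos"))        # documents mentioning "photos"
print(search(index, "photos drive"))  # documents mentioning both words
```

The point is not that this replaces a hosted service. It is that the basic idea fits on one screen, and the learning curve is the real cost.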
Well, you have to use your good judgment. What are the tradeoffs? What is the cost-benefit? I do not have all the answers. I merely pose the question. In fact, it is a question that many others have asked too. Fortunately, we still have time to think about a solution.
Eben Moglen's MySQL Conference talk
Bradley Kuhn Software Freedom Law Center
*Aside* If you listen to Brad's talk closely, you might actually hear my question about tivoization and GOOG towards the end of the Q&A period.