Sunday, April 7, 2013

Decentralized Data Decrapitude

Tim O'Reilly: "Given that you put the web into the public domain... Are you a socialist?" Tim Berners-Lee: "LOL!"
Opening of "A Conversation with Tim Berners-Lee", Web 2.0 Summit 2009.

Why does Facebook have approximately 1 billion users, Google+ 350 million, and Diaspora, an open source alternative, only a meager 406,000? You could argue that Facebook started earlier and was at the time the better platform. But why was it the better platform? Because it used the infamous Social Graph. Supposedly developed in 2002 by Philippe Bouzaglou, one of the early Facebook guys at Harvard University, it found its way into the hands of the Zuckerberg cabal. For further facts, watch the movie ;-) However, it wasn't the first attempt at crossing the boundaries of the Web as just a "bucket of text and links" (and the occasional image). As we know, the Semantic Web was thought up by Tim Berners-Lee, the very founding father of the aforementioned bucket, and the first article about it was published in Scientific American in 2001. Of course, the Semantic Web was not yet Facebook technology in any way, but was envisioned as a means not only to collect meaningful data, but to share it as well. These strategies, collection and sharing, are always paired, because when you start accumulating knowledge about persons, you can do the same for other entities, and vice versa. This kind of knowledge was what the Web then lacked: you may go to the library or a bookstore to get a book because you somehow know about it, but sometimes you borrow a book from a friend who thought you really ought to read it. Your friend knows about you and about the book, and that is why he or she recommended it to you. Interestingly, Bouzaglou now works on the collecting side of the coin, developing a semantic search engine, but unfortunately the demo currently throws an error.

In 2007 Tim Berners-Lee wanted to recoin the WWW as GGG: the Giant Global Graph. FOAF (short for Friend of a Friend) was introduced as a decentralized format for describing persons and their relations. Actually, FOAF is not a format in itself, but an RDF ontology, a way of encoding human knowledge about a certain subject. RDF is an open standard, and in that way built upon the foundation of the WWW: a totally decentralized bucket of anything. But, for some reason, RDF failed, or rather, continues to fail, because it was never adopted by the likes of Facebook or Google. Facebook calls its public interface to its data the Open Graph, but that name is a bit of a hoax. It is just a "front door": behind this door is the actual internal structure, the real connection of all the data Facebook has. Given permission, a developer can get a tiny bit out and use it to create his or her own application. This may look like standalone social data, but it cannot live outside the Facebook realm without losing its meaning, even when converted to FOAF (which, after all, is possible). How can this be? It has to do with a fundamental (philosophical) problem, namely the Frame Problem. Data is only meaningful within a certain frame, and this of course also applies to the social graph. Before Facebook, people knew nothing of any "social graph", and only now that we have it can we name it: Facebook became our frame for socially meaningful data and, despite its current decline in popularity, continues to be so. Google+ is "Google's Facebook", Diaspora is "an open source Facebook". FOAF will never become anything other than a way of serializing Facebook data. Berners-Lee, and we, the lesser gods, are merely considering the consequences of this reality after the fact ...
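To make the idea of FOAF as an RDF vocabulary a bit more concrete, here is a minimal sketch in Python that writes a tiny network out as Turtle, one of the RDF text syntaxes. The FOAF namespace and the terms foaf:Person, foaf:name and foaf:knows come from the FOAF vocabulary; the function name foaf_turtle, the local identifiers such as <#alice> and the sample people are made up for illustration, and the code assumes names contain no quotes or backslashes that would need escaping.

```python
# Namespace of the FOAF vocabulary.
FOAF_NS = "http://xmlns.com/foaf/0.1/"


def foaf_turtle(people, knows):
    """Serialize people and their 'knows' relations as FOAF in Turtle.

    people: dict mapping a local id to a display name (hypothetical data)
    knows:  list of (id, id) pairs meaning "first knows second"
    Assumes names need no escaping (no quotes or backslashes).
    """
    lines = [f"@prefix foaf: <{FOAF_NS}> .", ""]
    for pid, name in people.items():
        friends = [b for (a, b) in knows if a == pid]
        lines.append(f"<#{pid}> a foaf:Person ;")
        if friends:
            lines.append(f'    foaf:name "{name}" ;')
            objects = ", ".join(f"<#{f}>" for f in friends)
            lines.append(f"    foaf:knows {objects} .")
        else:
            lines.append(f'    foaf:name "{name}" .')
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    people = {"alice": "Alice", "bob": "Bob"}
    knows = [("alice", "bob")]
    print(foaf_turtle(people, knows))
```

Note what the output does and does not carry: it states who knows whom, but nothing about what "knowing" meant on the platform the data came from, which is exactly the frame that gets lost in conversion.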

One of the main concerns of this Facebook framing of the world is that "our" data is in the hands of a commercial, corporate entity. Of course, Facebook can easily state that "You own all of the content and information you post on Facebook", when ownership in the sense of copyright is not a moneymaker for them. Furthermore, the legal terms state that "you can control how it is shared through your privacy and application settings", which underlines that control lies with you, but only as far as the Facebook implementation goes. The real money is in the fact that only Facebook knows everything, and will allow third parties to commercially exploit a portion of this knowledge. Facebook has monopolized social data for the last couple of years, only to find itself competing with the same model in Google+. This very much resembles the way large corporations do business on the whole: stock-trading, driving up prices, offshoring, and more of that modern-day imperialism. As an aside, Google+ started out with a system called OpenSocial, which may or may not be like the hoax we encountered in Facebook, but the system was abandoned by Google last year. As far as I can tell it continues to fuel MySpace, but who cares anyway... The major players now are Facebook and Google, and while their popularity is evening out, they pursue the same strategy: data slavery. How do we want to counter this epidemic? Berners-Lee stated that "I express my network in a FOAF file, and that is a start of the revolution.", but that now seems rather misguided. It wasn't the technology in the first place that caused a revolution, it was the concept in the hands of a bunch of Harvard misfits.

What concept do we have to strike back with, if all we can come up with is that "it has to be open"? Not much. A dubious initiative operating givememydata.com wants you to get your data out of Facebook, and offers some formats, including a GraphViz file, that will at least allow you to display and explore your part of the network. Apart from its leftist motto, it resembles a third-party application in every way, including the "U.S. commercial" extension to its domain. And again, what to do with your data when it's no longer "in the grid"? Port it to Diaspora? Well, it doesn't necessarily mean the same thing there, so you might have some work to do, provided you know what you're working with. Just sling your FOAF on the web, like TBL proposes you do? That means exposing your complete shopping profile to all kinds of potential harm, possibly worse than Facebook (in the short term at least). I don't know the answer yet, but I think it will take a lot more common knowledge about what social data is and what power it harbors. It will take a system of trust and authorization that is far more fine-grained than anything available, and that needs to be usable by laymen. But the most important thing is that people need to be a little more responsible when it comes to their interaction with the web. So for now, "open" and "social" will have to become "constrained" and "responsible", and that does sound a lot more boring than "#ifihadglass I'd share the world to my almost million followers"...

Update 2013/04/08: I posted the 2010 Web 2.0 conference interview with Mark Zuckerberg in full, for the sake of completeness.

Update 2013/04/16: Shutting down the Open Knowledge Graph

