Governing and authorship models at Wikipedia and Britannica

Elsewhere, we have spoken about the complex and interesting governing and authorship model in Wikipedia. How counter-intuitive is it that a model like “anyone can edit anything they want” could produce a useful information resource?!

We have characterized the social dynamics within this community and tracked how they change over time. Interestingly, in the last few days, both Wikipedia and Britannica have been in the news for debates over their stances on the authorship and editorial model.

First, on Jan 24th, we learned from the BBC that the president of Britannica wrote a blog entry in which he outlined a new plan at Britannica to enable readers, as well as more experts and editors, to help expand and maintain the articles. While he did not mention Wikipedia by name, it was a clear nod toward the more collaborative relationship Britannica will have with its readers. Specifically, in the blog entry, Jorge Cauz says that “We believe that the creation and documentation of knowledge is a collaborative process but not a democratic one.” Most would agree that, in the past, the collaborative process that Britannica had was much more restrictive, and now they seem to have decided to open the door wider to include more people in the editorial process.

Then, today, we learned, also from the BBC, that Jimmy Wales has caused a huge stir at Wikipedia by suggesting a more restrictive approach to the editing process. He now believes that Wikipedia should follow a model in which edits from anonymous users have to be vetted by one of the site’s editors before going live.

Apparently, the heated debate is now spreading, and is being mentioned as a big news item on the Yahoo! front page after being written up by AFP. So here we have a system that has been extremely liberal with its editorial policy moving toward a more restrictive authorship model.

So what gives? Is there a right way or a wrong way to construct and compile knowledge resources? As designers of social systems, what should the governance model for these systems be?

For one thing, we still know awfully little about the social dynamics in these large social systems. We have been quoted in the past as saying that our characterization models of editors show that the top 1% of the editors in Wikipedia generate 50% of the edits. While that is true, the other 50% is generated by the other 99% of the editors. This other 50% is just as important as the first 50%!

We have recently been conducting some additional research to understand class structures in Wikipedia. We already know that the distribution of editors and their frequency of edits in Wikipedia is a classic power law curve. In order to understand editors throughout this distribution, we first ranked editors by their edit frequency, and then divided all of the edits into four quarters, according to this sort.

For one month’s worth of edit data, there are about 220 editors at the very top of the pyramid. These top (most frequent) editors produce the first quarter (25%) of the edits. The next 25% of the edits come from about 1000 editors, the third quarter comes from about 4000 editors, and the last quarter comes from about 15000 editors.
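To make the ranking-and-splitting step concrete, here is a minimal sketch in Python. It is not the code we ran on the Wikipedia data; the function name, the input format (a mapping from editor name to edit count), and the rule for editors who straddle a quarter boundary are all assumptions made for illustration.

```python
def split_into_quarters(edit_counts):
    """Rank editors by edit count and group them by quarter of total edits.

    edit_counts: dict mapping editor name -> number of edits (hypothetical input).
    Returns four lists of editor names, from most to least frequent editors.
    Assumption: an editor is placed in the quarter where their edits begin,
    i.e. based on the cumulative edits of everyone ranked above them.
    """
    quarters = [[], [], [], []]
    total = sum(edit_counts.values())
    if total == 0:
        return quarters

    # Sort by descending edit count; break ties by name so output is stable.
    ranked = sorted(edit_counts.items(), key=lambda item: (-item[1], item[0]))

    cumulative = 0  # edits contributed by all higher-ranked editors
    for editor, count in ranked:
        index = min(3, (cumulative * 4) // total)
        quarters[index].append(editor)
        cumulative += count
    return quarters


# Toy data, invented for illustration (not real Wikipedia numbers).
toy_counts = {"a": 40, "b": 20, "c": 15, "d": 10, "e": 5, "f": 5, "g": 3, "h": 2}
toy_quarters = split_into_quarters(toy_counts)
print([len(q) for q in toy_quarters])
```

Even on this toy input, the skew is visible: each of the first three quarters holds a single editor, while the last quarter holds everyone else.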

So now the research question is whether you want to design your editing policy to favor the upper class (top editors and administrators), the middle class (the 5000-6000 editors who contribute the middle 50% of all edits), or the lower class (the 15000 editors who contribute the last 25%).

One way to think about this problem is to study the amount of resistance each of these four classes of editors experiences on Wikipedia. A metric that we used is the reverts-to-edits ratio. That is, on average, what percentage of edits were reverted, as experienced by each of these four classes of editors? It turns out that the reverts-to-edits ratios for these four classes of editors were 1.3%, 1.4%, 1.5%, and 4.7%, respectively. This means that the lower class of editors clearly experiences greater resistance: on average, about 1 out of every 20 edits they contribute is reverted. Moreover, the resistance they experience has generally increased over time (from about 3% in early 2006 to 5-6% in 2007-2008, and back down to around 5% in late 2008).
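For readers who want to see the metric spelled out, the following is a rough sketch of a per-class reverts-to-edits ratio in Python. The input format is an assumption: each edit is a pair of editor name and a flag saying whether that edit was later reverted, and how a revert is detected in the edit history is left out entirely. The names are hypothetical and this is not our analysis code.

```python
def revert_ratios(edits, editor_class, num_classes=4):
    """Percentage of edits that were reverted, computed per editor class.

    edits: iterable of (editor, was_reverted) pairs (hypothetical format).
    editor_class: dict mapping editor name -> class index, 0 = top editors.
    Returns a list of percentages; None for a class with no edits.
    """
    totals = [0] * num_classes
    reverted = [0] * num_classes
    for editor, was_reverted in edits:
        cls = editor_class.get(editor)
        if cls is None:
            continue  # skip editors we have no class for
        totals[cls] += 1
        if was_reverted:
            reverted[cls] += 1
    return [100.0 * r / t if t else None for r, t in zip(reverted, totals)]


# Toy data, invented for illustration (not real Wikipedia numbers).
toy_edits = [("a", False)] * 19 + [("a", True)] + [("d", True)] + [("d", False)] * 3
toy_classes = {"a": 0, "d": 3}
print(revert_ratios(toy_edits, toy_classes))
```

In the toy data, editor "a" has 1 of 20 edits reverted and editor "d" has 1 of 4, so the ratio is computed over each class's own edits rather than over all edits.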

So, even without a “flagged revision” mechanism such as the one suggested by Jimmy Wales, it has already been getting harder for the lowest class of occasional editors to produce edits that remain as contributions in Wikipedia.

The AFP article points to the fact that the debate over the policy came about because of vandalism on Ted Kennedy’s page, which had falsely suggested he died after suffering a collapse at a luncheon during Obama’s inauguration. But apparently this was corrected within minutes, suggesting that the current system is still correcting most mistakes quite rapidly. Moreover, after I did some sleuthing in the editing history, it appears that the original vandalism edit was made by a registered user named “Gfdjklsdgiojksdkf”, and not an anonymous user.

So, it is unclear to me that the current system is not working. Are we fixing something that isn’t broken (at least not yet)?

