UNM, Rice Researchers Document the Velocity of Censorship

Jed Crandall

Censors at Sina Weibo, a Chinese website similar to Twitter, work with amazing speed and efficiency. That’s the conclusion of research conducted by UNM Assistant Professor of Computer Science Jed Crandall and Rice University Professor of Computer Science and Electrical Engineering Dan Wallach, in conjunction with an independent researcher and an undergraduate researcher from Bowdoin College.

The study, titled “The Velocity of Censorship: High-Fidelity Detection of Microblog Post Deletions,” is undergoing peer review.

“The point of our measurement study of Weibo is to take a closer look at global online censorship practices,” Crandall said. “There has been considerable debate in the U.S. recently about extending copyright law enforcement to include various kinds of filtering online. China already has laws in place for companies within China to filter online content.”

Weibo is one of the biggest social network companies in China, and it faces the dual challenge of keeping its users engaged (and thus watching advertisements and making money for Weibo) while keeping the content it hosts compliant with local laws. If Weibo had insufficient controls, the government might take action against the company. If its controls were too rigid, users might abandon it for one of its competitors. Weibo’s success implies that it has found a happy medium, and the research team says that is what makes Weibo an interesting social media platform to study.

In February 2012, Weibo had more than 300 million users, who sent about 100 million messages daily. Weibo, like Twitter, limits message length to 140 characters. It also allows embedded photos and videos, and lets comment threads be attached to posts.

In spite of the tremendous volume, the company can detect a censorship event within one minute of posting. In their paper, the researchers describe how they were able to track the censors at work and hypothesize several different filtering methods that appear to comprise Weibo’s defense-in-depth system of censorship.

“Weibo gives us a window into the future for what Internet censorship of social media around the world may look like,” Wallach said. “Former Supreme Court Justice Louis Brandeis championed transparency a century ago when he wrote, ‘sunlight is said to be the best of disinfectants.’ We hope that our research shines a light on how laws created by governments and implemented by the private sector can affect free speech everywhere, including here in the U.S.”

One of the most interesting elements of the research is the finding about the speed at which the censorship process actually works. According to Crandall, “There have been some studies on Weibo showing that posts are deleted after a day or two, but we see posts being deleted after five or ten minutes. Basically we showed that if you want to have a complete picture of Internet censorship, you have to have something that can measure very quickly, on the order of minutes, and you have to be able to measure a wide variety of things.”

About the Research
The research group first had to determine how to approach the problem, so they looked at the postings of individuals who had previously been censored. Over a period of weeks, they slowly expanded their group, adding any user with more than five deleted posts and eventually finding more than 3,500 users to track for their sample.

Once they settled on a group to monitor, they checked those users’ posts every minute to determine how long it took censors to remove a post. Their web crawler looked for posts that appeared and were subsequently deleted. Their data showed that five percent of the deletions happened in the first eight minutes, and within 30 minutes nearly 30 percent of the deletions were completed. More than 90 percent of the deletions occurred within one day after a post appeared.
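The measurement idea described above can be illustrated with a minimal Python sketch. It is not the team's crawler: the function names, the snapshot format (one set of visible post IDs per polling minute), and the sample data are all assumptions made for illustration, and it treats any post that vanishes between two polls as deleted.

```python
def deletion_delays(snapshots):
    """Estimate how long each deleted post stayed visible.

    snapshots: list of (minute, set_of_visible_post_ids), sorted by minute.
    Returns {post_id: minutes from first sighting to first absence}.
    Assumption: a post missing from a later snapshot was deleted.
    """
    first_seen = {}
    delays = {}
    previous = set()
    for minute, visible in snapshots:
        # Record the first minute at which each post was observed.
        for post_id in visible:
            first_seen.setdefault(post_id, minute)
        # Posts seen in the previous poll but absent now count as deleted.
        for post_id in previous - visible:
            delays.setdefault(post_id, minute - first_seen[post_id])
        previous = visible
    return delays


def fraction_deleted_within(delays, minutes):
    """Share of observed deletions that happened within `minutes`."""
    if not delays:
        return 0.0
    return sum(1 for d in delays.values() if d <= minutes) / len(delays)


# Hypothetical data: four one-minute polls of a small set of posts.
snapshots = [
    (0, {"a", "b"}),
    (1, {"a", "b", "c"}),
    (2, {"b", "c"}),   # "a" has disappeared
    (3, {"b"}),        # "c" has disappeared
]
delays = deletion_delays(snapshots)
print(delays)
print(fraction_deleted_within(delays, 2))
```

Polling once a minute bounds the measurement error at roughly one minute per post, which is what makes percentages such as "deleted within eight minutes" meaningful. A real crawler would also have to distinguish deletions from posts that simply scroll off a user's timeline.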

This information led the team to think about what resources it would take to monitor that river of information. They calculated it would take 4,200 workers reading 50 posts a minute in eight-hour shifts to censor using only human review of the posts. That led them to the conclusion that much of the filtering must be automated, through initial flagging and retrospective searches. This conclusion allowed them to set up six hypotheses.

  1. There is a surveillance keyword list that triggers posts to be looked at by a moderator for possible deletion.
  2. Weibo targets specific users, such as those who frequently post sensitive comments.
  3. When a sensitive post is found, a moderator will find all of its related reposts and delete them all at once.
  4. Weibo removes posts retroactively via keyword search, causing spikes in the deletion rate of a particular keyword within a short amount of time.
  5. The censors work relatively independently, in a distributed fashion. Some of them may work in their spare time.
  6. Deletion speed is related to the topic. That is, particular topics are targeted for deletion based on how sensitive they are.

The research team notes there may be many mechanisms beyond those they have hypothesized, which future work may reveal. The team did not consider interactions between social media and traditional media but suggests that would be an interesting topic for future research.
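The staffing estimate quoted earlier (4,200 workers reading 50 posts a minute in eight-hour shifts) can be roughly reproduced with a few lines of Python. This is a back-of-the-envelope sketch, not the team's own calculation: it assumes the input volume is the figure of about 100 million messages a day cited above and that each worker covers one shift per day. The function name is hypothetical.

```python
import math


def reviewers_needed(posts_per_day, posts_per_minute, shift_hours):
    """Workers required if every post is read once by a human.

    Assumption: each worker reads at a constant rate for one full shift per day.
    """
    posts_per_shift = posts_per_minute * 60 * shift_hours
    return math.ceil(posts_per_day / posts_per_shift)


# 50 posts/minute * 60 minutes * 8 hours = 24,000 posts per worker per day.
estimate = reviewers_needed(100_000_000, 50, 8)
print(estimate)  # 4167, in line with the article's figure of about 4,200
```

Under these assumptions the result lands within about one percent of the article's number, which supports the team's reasoning that purely manual review at this scale would require a workforce in the thousands.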

Media contacts: UNM, Karen Wentworth (505) 277‑5627; email: kwent2@unm.edu
Rice University, Jade Boyd (713) 348‑6778; email: jadeboyd@rice.edu
David Ruth (713) 348‑6327; email: david@rice.edu
