
Percentage of pie charts which resemble Pac-Man (as a Google pie chart)

The following pie chart is generated on the fly via Google Charts. The URL, which contains all the needed parameters (that’s why it is so long), is
http://chart.apis.google.com/chart?chxt=x,y&cht=p&chco=FAFAFA,FFFF00,FAFAFA&chs=600x300&chtt=Percentage%20of%20Google%20Chart%20Which%20Resembles%20Pac-man%20Chart%20title&chd=t:10,80,10&chl=Does%20not%20resemble%20Pac-man|Resembles%20Pac-man

From mattcutts.
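
For readers who would rather not assemble such a URL by hand, here is a minimal sketch in Python that builds the same query string from its parts. The function name and its arguments are my own illustration and not part of Google's API. Only the parameter keys (chxt, cht, chco, chs, chtt, chd, chl) come from the URL above. The sketch only builds the string; it does not check whether the chart service still answers.

```python
from urllib.parse import quote, urlencode

# Endpoint taken from the URL in the post.
CHART_ENDPOINT = "http://chart.apis.google.com/chart"


def build_chart_url(title, labels, values, colors, width=600, height=300):
    """Assemble a pie-chart URL (hypothetical helper, for illustration only)."""
    params = {
        "chxt": "x,y",                                   # axes, as in the post
        "cht": "p",                                      # chart type: pie
        "chco": ",".join(colors),                        # slice colors
        "chs": f"{width}x{height}",                      # size, plain ASCII "x"
        "chtt": title,                                   # chart title
        "chd": "t:" + ",".join(str(v) for v in values),  # data series
        "chl": "|".join(labels),                         # slice labels
    }
    # quote_via=quote encodes spaces as %20; commas, colons and pipes stay as-is.
    return CHART_ENDPOINT + "?" + urlencode(params, quote_via=quote, safe=",:|")


url = build_chart_url(
    title="Percentage of Google Chart Which Resembles Pac-man Chart title",
    labels=["Does not resemble Pac-man", "Resembles Pac-man"],
    values=[10, 80, 10],
    colors=["FAFAFA", "FFFF00", "FAFAFA"],
)
print(url)
```

The three data values and two labels mirror the original: the first and last slices are drawn in the near-white background color, which is what leaves the "mouth" open.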

Predicting the Future with Social Media (SoNet slides)

During our weekly SoNet internal research meeting, my colleague Napo presented the paper “Predicting the Future With Social Media” by Sitaram Asur and Bernardo A. Huberman, archived on arXiv in March 2010. Using Twitter posts, they are able to forecast box-office revenues for movies, outperforming market-based predictors. They also do sentiment analysis on tweets: they ask Mechanical Turk workers to tag a few tweets as positive, neutral or negative, and then train LingPipe to predict the sentiment of the millions of other tweets. Read it! Very interesting paper!
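
The "label a few by hand, then classify the rest" step can be sketched in a few lines of Python. This is not LingPipe and not the authors' classifier: it is a plain bag-of-words naive Bayes with add-one smoothing, swapped in only because it fits in a blog post. The training tweets below are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Invented examples standing in for the hand-labeled tweets.
HAND_LABELED = [
    ("loved this movie great acting", "positive"),
    ("great fun loved every minute", "positive"),
    ("hated this movie terrible plot", "negative"),
    ("terrible boring waste of time", "negative"),
    ("the movie opens friday", "neutral"),
    ("watching the movie tonight", "neutral"),
]


def train(labeled):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in labeled:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = {w for counts in word_counts.values() for w in counts}
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the label with the highest naive Bayes log-score."""
    word_counts, label_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are skipped
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(HAND_LABELED)
print(predict(model, "great fun"))
print(predict(model, "terrible plot"))
```

With only six training tweets this is a toy, of course; the point is just the workflow of training on a small labeled sample and applying the model to everything else.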

In droves: are Wikipedia editors leaving, or are new editors joining?


Is it true that “Wikipedia editors are leaving in droves”, as the Wall Street Journal wrote, picking up a study by Felipe Ortega?
Or is it that “New editors are joining English Wikipedia in droves”, as Erik Zachte, Data Analyst at the Wikimedia Foundation, replies?
The blog post by Erik is very interesting. Basically you can take it as a warning that, with the amount of data available nowadays thanks to Web 2.0 services, you can say almost anything; it really depends on how you define your quantities. Just as an example, Felipe counted as an editor every person who made one update over the years, while Erik (for Wikipedia’s internal statistics) only counts as an editor a person who has 5 or more edits in one month.
The second lesson you can take away is: if you want to get picked up by newspapers (such as the WSJ), condense your huge work (Felipe’s PhD thesis is a PDF of 228 pages) into a few catchy and dramatic headlines such as “Wikipedia editors are leaving in droves”.
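
To see how much the definition matters, here is a small Python sketch that counts "editors" both ways on the same data. The edit log is entirely made up, and the two functions are my own paraphrase of the two definitions as described above, not code from either analysis.

```python
from collections import Counter

# Made-up edit log: one (editor, "YYYY-MM") entry per edit.
EDITS = (
    [("alice", "2009-01")] * 6
    + [("bob", "2009-01")]
    + [("carol", "2009-01")] * 2
    + [("carol", "2009-02")] * 3
    + [("dave", "2009-02")] * 5
)


def editors_any_edit(edits):
    """Everyone who ever made an edit counts as an editor."""
    return {editor for editor, _month in edits}


def editors_active_in_month(edits, month, min_edits=5):
    """Only people with at least min_edits edits in the given month count."""
    per_editor = Counter(editor for editor, m in edits if m == month)
    return {editor for editor, n in per_editor.items() if n >= min_edits}


print(len(editors_any_edit(EDITS)))                 # 4
print(editors_active_in_month(EDITS, "2009-01"))    # {'alice'}
print(editors_active_in_month(EDITS, "2009-02"))    # {'dave'}
```

In this toy log carol made 5 edits overall but never 5 in a single month, so she is an editor under the first definition and not under the second. Same data, two different stories.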


Library of Congress gives every tweet bibliographic status!

Every public tweet, ever, since Twitter’s inception in March 2006, will be archived digitally at the Library of Congress, the largest library in the world.
I’m still totally puzzled by how such a simple service (basically you can post 140 characters of text and nothing more) got so widely used! A typical Matthew effect (the rich get richer)!
See more in the official Library of Congress blog post “How Tweet It Is!: Library Acquires Entire Twitter Archive”.


Review of “Feedback Effects between Similarity and Social Influence in Online Communities”

Today I presented to the other SoNetters a wonderful paper titled “Feedback Effects between Similarity and Social Influence in Online Communities” by David Crandall, Dan Cosley, Daniel Huttenlocher, Jon Kleinberg and Siddharth Suri of Cornell University, presented at the 2008 KDD conference on Knowledge Discovery and Data Mining. My review is just below the slides I used for the presentation.

Besides the points already presented in the slides, here I add a few points relevant for our research on Wikipedia.

Social influence: People become similar to those they interact with
Interaction → similarity
Selection: People seek out similar people to interact with
Similarity → interaction

They considered registered users of the English Wikipedia who have a user discussion page (~510,000 users as of April 2, 2007). These users are responsible for 61% of edits to the roughly 3.4 million articles. The authors ignore actions by users without discussion pages, who tend to have very few social connections.

User’s activity vector v(t): number of times that he or she has edited each article up to that point in time t.
Similarity(u,v): similarity between activity vectors of user u and v.
Time of first meeting for two users u and v = time at which one of them first makes a post on the user discussion page of the other.
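
The definitions above can be sketched in Python as follows. Two things here are my assumptions: the edit log format (time, user, article) is hypothetical, and I use cosine similarity as the similarity measure, which is a common choice for count vectors but is not spelled out in the definitions above.

```python
import math
from collections import Counter


def activity_vector(edits, user, t):
    """Edit counts per article for `user`, using only edits made up to time t.

    `edits` is a hypothetical log of (time, user, article) tuples.
    """
    return Counter(article for when, who, article in edits
                   if who == user and when <= t)


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse count vectors (dicts)."""
    dot = sum(count * v.get(article, 0) for article, count in u.items())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with no edits is similar to nobody
    return dot / (norm_u * norm_v)


# Invented log: times are just integers here.
LOG = [
    (1, "u", "Trento"), (2, "u", "Trento"), (3, "u", "Pac-Man"),
    (2, "v", "Trento"), (4, "v", "Pac-Man"), (5, "v", "Twitter"),
]

for t in (1, 3, 5):
    sim = cosine_similarity(activity_vector(LOG, "u", t),
                            activity_vector(LOG, "v", t))
    print(t, round(sim, 3))
```

Computing the similarity at successive times t is what lets you plot it around the time of first meeting.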

In principle, we could also try to infer social interactions based on posting to the same article’s discussion page. Moreover, we found that using simple heuristics to infer interaction based on posts to article discussion pages produced closely analogous results to what we obtain from analyzing user discussion pages.

They find that there is a sharp increase in the similarity between two editors just before they first interact (selection), with a continuing but slower increase that persists long after this first interaction (social influence).

They also create a model and estimate its unobservable parameters by maximum likelihood. The estimates are as follows:
* The probability of communicating versus editing was 0.058 (i.e. out of every 100 actions, about 6 are talk posts while 94 are page edits). We can cite it and we can even verify it across different Wikipedias and over different time slots.
* When considering article edits as actions, the article is chosen from one’s own interests with probability 0.35, from a neighbor’s interests with probability 0.081, from the overall interests of Wikipedia editors with probability 0.5, and by creating a totally new article with probability 0.069.
* When considering talks as actions, the user to communicate with is chosen randomly from the overall set of users with probability 0.71, and from those who have engaged in a common activity with the remaining probability 0.29.
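
To get a feeling for these numbers, the following Python sketch draws actions using the estimates listed above. It only samples which kind of action happens and where its target comes from; it is my own simplification and not the authors' full model. The category names are labels I made up for the four edit sources and two talk targets.

```python
import random

P_TALK = 0.058  # probability that an action is a talk post rather than an edit

# Where the edited article comes from (probabilities from the estimates above).
EDIT_SOURCES = {
    "own interests": 0.35,
    "neighbor's interests": 0.081,
    "overall interests": 0.5,
    "new article": 0.069,
}

# Who a talk post is addressed to.
TALK_TARGETS = {
    "random user": 0.71,
    "user with a common activity": 0.29,
}


def sample_action(rng):
    """Draw one action: first talk vs. edit, then the source or target."""
    if rng.random() < P_TALK:
        kind, options = "talk", TALK_TARGETS
    else:
        kind, options = "edit", EDIT_SOURCES
    choice = rng.choices(list(options), weights=list(options.values()))[0]
    return kind, choice


rng = random.Random(2010)  # fixed seed so runs are repeatable
actions = [sample_action(rng) for _ in range(100_000)]
talk_share = sum(kind == "talk" for kind, _ in actions) / len(actions)
print(f"share of talk actions: {talk_share:.3f}")
```

Note that the four edit probabilities add up to 1, as do the two talk probabilities, so each dictionary is a proper distribution.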

They also do some content analysis on 30 instances of two users meeting for the first time: “We examined the content of the initial communication and any reply, looking for references to specific articles or other artifacts in Wikipedia. We also compared the edit history of the two users.”
Of the 30 messages, 26 referenced a specific article, image, or topic. In 21 cases, the users had both recently worked on the artifact that was the subject of conversation.
The gap between co-activity and communication was usually short, often less than a day, though it stretched back three months in one case.
Informally, communications tended to fall into a few broad categories: offering thanks and praise, making requests for help, or trying to understand the editing behavior of the other person.
This sample of interactions suggests that people most often come to talk to each other in Wikipedia when they become aware of the other person through recent shared activity around an artifact. Awareness then leads to communication, and often coordination.

A really wonderful paper!

Experiment: people are less likely to help others if they are made to think about money


An interesting psychology experiment has shown that people are less likely to help others if they are made to think about money! (“The Psychological Consequences of Money,” Kathleen D. Vohs, Nicole L. Mead, and Miranda R. Goode, Science, November 17, 2006.) Stanford GSB professor Jennifer Aaker comes to a similar conclusion in her 2008 paper, The Happiness of Giving: The Time-Ask Effect.
(via Stanford Social Innovation Review)


Crowdsource world saving to everyone, through online games!

Jane McGonigal, director of Games Research & Development at the Institute for the Future, makes a passionate case for online games in which players, by playing, help save the world (‘Gamers are a human resource that we can use to do real-world work, that games are a powerful platform for change.’). At the end of her TED talk, she mentions her latest effort: Evoke, a crash course in changing the world (‘This is a game done with the World Bank Institute. If you complete the game you will be certified by the World Bank Institute as a Social Innovator, class of 2010.’). Whatever that means, you have to admit that it is clever to give players the possibility of calling themselves “World Bank Institute Certified Social Innovator”!

EVOKE trailer (a new online game) from Alchemy on Vimeo.

Google attack on Viacom (following Viacom vs. YouTube lawsuit)

Google speaks directly to everyone via the YouTube blog in its effort to stop Viacom’s lawsuit against YouTube (owned by Google).

We ask the judge to rule that the safe harbors in the Digital Millennium Copyright Act (the “DMCA”) protect YouTube from the plaintiffs’ claims.

And then, after some blah blah, the final attack:

For years, Viacom continuously and secretly uploaded its content to YouTube, even while publicly complaining about its presence there. It hired no fewer than 18 different marketing agencies to upload its content to the site. It deliberately “roughed up” the videos to make them look stolen or leaked. It opened YouTube accounts using phony email addresses. It even sent employees to Kinko’s to upload clips from computers that couldn’t be traced to Viacom. And in an effort to promote its own shows, as a matter of company policy Viacom routinely left up clips from shows that had been uploaded to YouTube by ordinary users. Executives as high up as the president of Comedy Central and the head of MTV Networks felt “very strongly” that clips from shows like The Daily Show and The Colbert Report should remain on YouTube.

Viacom’s efforts to disguise its promotional use of YouTube worked so well that even its own employees could not keep track of everything it was posting or leaving up on the site. As a result, on countless occasions Viacom demanded the removal of clips that it had uploaded to YouTube, only to return later to sheepishly ask for their reinstatement. In fact, some of the very clips that Viacom is suing us over were actually uploaded by Viacom itself.

70 Open PhD Positions in ICT at the University of Trento

The Department of Information Engineering and Computer Science (DISI) at the University of Trento has 70 open PhD positions in the ICT area, almost all of them covered by a scholarship.
The deadline for applications is April 20, 2010, before 12 noon, local time.

The Department of Information Engineering and Computer Science is one of the leading and fastest-growing research institutions, characterized by a young and international faculty and by a large, international student population. Indicators for scientific production put the department among the very top in Europe. Successful candidates will therefore have the opportunity to work in a dynamic and exciting environment. Trento is a vibrant city with a beautifully preserved historic center, consistently ranked at the top for quality of life in Italy. It offers a variety of cultural and sports opportunities all year round, as well as excellent food and wine.