Yearly Archives: 2005

New paper: Learning Contextualised Weblog Topics

I forgot about another paper I wrote: Learning Contextualised Weblog Topics (pdf) will be presented at the WWW 2005 2nd Annual Workshop on the Weblogging Ecosystem: Aggregation, Analysis and Dynamics in Chiba, Japan, on May 10th 2005. My boss was going to WWW2005 to present another paper, so we decided to submit our ongoing work to this workshop to get some feedback. We are still working on the system but we should be ready for prime time soon enough … stay tuned!
[I would have loved to meet Ethan Zuckerman, who is the invited speaker at this workshop and whose work on media attention is just delicious. (I even offered to help him code something for monitoring the Italian media world, but too bad I’m so lazy.)]

If you like, check the paper Learning Contextualised Weblog Topics (pdf)
Abstract: In this paper, we examine how a topic-centric view of the Blogosphere can be created. We characterise the problems in aligning similar concepts created by a set of distributed, autonomous users and describe current initiatives to solve the problem. We introduce the Tagsocratic project, a novel initiative to solve the concept alignment problem using techniques derived from research in language acquisition among distributed, autonomous agents.

Tag your friends

The interface of Rojo is totally unusable (at least to me); I don’t understand the interface metaphors. What attracted me was the ability to tag your friends. So, a curiosity: how would you tag me?
Our vision is that the next generation of feed reading requires new forms of organization so we built in the ability to tag your world, your content, your feeds, and even your friends.

FolkOS: Folksonomy Operating System

We used to organize our bookmarks in folders; then del.icio.us came along and we now appreciate folksonomies (flat taxonomies, just a set of free keywords you can attach to URLs). We are used to operating systems that let us categorize files (knowledge) in folders. Would it make sense to have an operating system that lets us categorize files based only on a folksonomy (just add keywords to any file, all the files are in a flat pool)? I don’t know.
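To make the "flat pool plus keywords" idea concrete, here is a minimal sketch in Python. Everything in it (the `TagPool` class, its `tag` and `find` methods, the file names) is made up for illustration; it is not how any real operating system or del.icio.us works.

```python
from collections import defaultdict


class TagPool:
    """A flat pool of files: no folders, only free-form keywords."""

    def __init__(self):
        # keyword -> set of file names carrying that keyword
        self._files_by_tag = defaultdict(set)

    def tag(self, filename, *keywords):
        """Attach one or more keywords to a file."""
        for keyword in keywords:
            self._files_by_tag[keyword.lower()].add(filename)

    def find(self, *keywords):
        """Return the files that carry every one of the given keywords."""
        sets = [self._files_by_tag.get(k.lower(), set()) for k in keywords]
        return set.intersection(*sets) if sets else set()


# Hypothetical files and keywords, just to show the lookup.
pool = TagPool()
pool.tag("www2005-paper.pdf", "paper", "blog", "tagsocratic")
pool.tag("trieste-notes.txt", "networks", "school")
pool.tag("page-rerank.pdf", "paper", "trust")

papers = pool.find("paper")
trusted_papers = pool.find("paper", "trust")
```

Finding a file becomes a set intersection over keywords instead of a walk down a folder path, which is the whole difference between the two models.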
What I know is that the total lack of competition in the Operating Systems domain (actually just one global monopoly) is depriving all of us of new ideas, new paradigms, progress. If you compare it with the vibrant Web, where a new idea gets implemented and proposed almost daily, you can maybe see how much further along we would be if there were a free market for Operating Systems.
Anyway, what could we call it? What about FolkOS? FolkOS, the Folksonomy Operating System, I can already see the advertisements…. And, yes, I patented the idea, I got every possible trademark, and not only on Earth. I patented FolkOS also on Venus and Alpha Centauri (Venusians and Alpha Centaurians beware! Don’t use my patented ideas! I have the best lawyers in the galaxy!).
[I tend to overload my emails with smilies (to show when I’m joking) but I don’t like them in blog posts, so I’m not sure my 4 readers understand when I (try to) make a joke. So, just to be sure, this is a joke … I think patenting computational ideas is total nonsense (maybe a video can help in understanding why)].

Microsoft isn’t exactly in fighting trim

From fortune.com: But Microsoft isn’t exactly in fighting trim. Its ambitious new operating system, code-named Longhorn, is more than a year late, even after having been scaled back. Linux, the free operating system that Gates once scoffed at, is fighting Microsoft for share in both the server and desktop markets, forcing the company to do the unthinkable: offer customer discounts. Last year it had to spend $1 billion to rewrite thousands of lines of code to make its programs less susceptible to viruses. Its Xbox gaming console is winning raves from players but has yet to make serious money. Meanwhile, Apple has stolen the show in online music with its hugely popular iPod and iTunes Music Store. Plus, the recently released Firefox browser, which can be downloaded free, has forced Gates to reconstitute an Internet Explorer development team. Indeed, four years have passed since Microsoft released a piece of software that generated the kind of buzz Google seems to generate every month.

Teens share innermost feelings with parents or ….?

When teens are asked to choose whether they prefer to share their innermost feelings with their parents or a blog, they are split with roughly half (51%) selecting their parents and 49% choosing a blog. (from BusinessWire, via an email on SocNet).
Yes, I didn’t follow Clay’s advice with this entry, posting a news item that is just too postable not to be posted.
Most of us will not be able to afford the calling and re-calling of sources to double-check a quote, but all of us can ask ourselves, just before we hit Submit, ‘Is this true?’. And the time we should be most careful to do that is if we feel really satisfied with what we’ve written.
This result seems so perfectly fabricated to get bloggers to post it … with self-satisfaction, and I’m brainlessly posting it without pondering enough ‘Is this true?’, but that’s how the world goes these days …

An Automatic Patent Request Generator: flooding the Patent Office?

Since really trivial patents get granted (as long as you pay), I was wondering how hard it could be to organize a Distributed Denial of Service attack on the Patent Office [the Patent Office probably reviews patent requests a bit, accepting all of them in the end since the only funding it receives comes from granting patents].
The idea: modify SCIgen – An Automatic CS Paper Generator a bit (a wonderful GPL-licensed generator of Computer Science papers that created a random paper that got accepted to a conference!) and flood the Patent Office with automatically generated patent requests. I bet that 95% of the (randomly generated) patent requests would be accepted. Did I hear “NoSoftwarePatents”?

Patenting the obvious: Google and how much a news source is trusted

Google has filed a patent for “ranking news according to quality” (or at least NewScientist says so, I didn’t check).
The database will be built by continually monitoring the number of stories from all news sources, along with average story length, number with bylines, and number of the bureaux cited, along with how long they have been in business. Google’s database will also keep track of the number of staff a news source employs, the volume of internet traffic to its website and the number of countries accessing the site. Google will take all these parameters, weight them according to formulae it is constructing, and distil them down to create a single value. This number will then be used to rank the results of any news search.
So can you patent something so obvious? It is as trivial as “I take 2 parameters (how many words you say per minute and your height) and I do a weighted sum on them”. Can it be reasonable that you patent weighted sums of A and B?
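Just to show how little there is to it, here is the "two parameters and a weighted sum" example as a few lines of Python. The function name, the parameter names and the weights are all invented for illustration; nothing here comes from Google's filing.

```python
def weighted_score(features, weights):
    """Collapse several named parameters into one number via a weighted sum."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical measurements and hypothetical weights.
speaker = {"words_per_minute": 150, "height_cm": 180}
weights = {"words_per_minute": 0.5, "height_cm": 0.25}

score = weighted_score(speaker, weights)  # 150 * 0.5 + 180 * 0.25 = 120.0
```

Swap in story length, number of bylines and site traffic for the two silly parameters and the structure stays exactly the same.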
This is why we should say nosoftwarepatents.com.
Moreover, the idea that FoxNews is a “trusted” source because many people visit its site seems really bad to me. This is what I call a global trust metric. If I tell Google that I trust Indymedia, then I should receive personalized results (personalized in the sense that the weight given to FoxNews is 0!).
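A rough sketch of what I mean by personalized versus global, in Python. The scores, the multipliers and the `rank_sources` function are all made up; this only illustrates the idea and says nothing about how any real search engine ranks news.

```python
def rank_sources(global_scores, personal_trust=None):
    """Order sources by global score, rescaled by the reader's own trust.

    personal_trust maps a source to a multiplier chosen by the reader;
    sources the reader said nothing about keep their global score (x1.0).
    """
    personal_trust = personal_trust or {}
    scored = {
        source: score * personal_trust.get(source, 1.0)
        for source, score in global_scores.items()
    }
    return sorted(scored, key=scored.get, reverse=True)


# Invented global scores: the same for every reader.
global_scores = {"FoxNews": 0.9, "Indymedia": 0.4, "NewScientist": 0.6}

# One reader's (hypothetical) trust statements: FoxNews gets weight 0.
my_trust = {"FoxNews": 0.0, "Indymedia": 2.0}

everyone_sees = rank_sources(global_scores)
i_see = rank_sources(global_scores, my_trust)
```

With a global metric every reader gets `everyone_sees`; with a local one, two readers with different trust statements get different orderings from the same data.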

Contact me if you’ll be in Trieste next week for the School on Structure and Function of Complex Networks.

I’ll be in Trieste at the Abdus Salam ICTP (a UNESCO-funded school) for the next 2 weeks (16 – 28 May 2005) for the School and Workshop on Structure and Function of Complex Networks (I advertised it some time ago and I got accepted). I’m so excited. The list of speakers is simply great (see below) and there are participants from all over the world; in fact, “Although the main purpose of the Centre is to help research workers from developing countries, a limited number of students and post-doctoral scientists from developed countries are also welcome to attend.”
If you happen to be there and want to chat a bit about the blogosphere, trust, reputation, social software, social networks, languages, globalization, … just whatever, please contact me!

My first Firefox extension: SemanticLinks

For the previously mentioned paper, I created a small Firefox extension called SemanticLinks. The purpose? Showing VoteLinks, rel="nofollow" and information about the linked resource by adding a small icon next to the link text (anchor text). SemanticLinks is a simple modification of TargetAlert, to which I added just 1%. You can find more information about SemanticLinks and how to install it on the SemanticLinks page. You might also want to see some screenshots.
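For the curious, the link classification behind this can be sketched in a few lines of Python. This is not the extension's code (the extension runs inside Firefox); it is only an illustration using Python's standard `html.parser`, and the `LinkAnnotator` class and the sample page are made up. It assumes VoteLinks are expressed as `rev="vote-for"`, `rev="vote-against"` or `rev="vote-abstain"` on the anchor.

```python
from html.parser import HTMLParser

# Values a VoteLink can carry in the rev attribute of an anchor.
VOTE_VALUES = {"vote-for", "vote-against", "vote-abstain"}


class LinkAnnotator(HTMLParser):
    """Collect (href, labels) for every anchor in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        if href is None:
            return
        # rel and rev are space-separated lists; a missing one counts as empty.
        rel = (attributes.get("rel") or "").lower().split()
        rev = (attributes.get("rev") or "").lower().split()
        labels = [value for value in rev if value in VOTE_VALUES]
        if "nofollow" in rel:
            labels.append("nofollow")
        self.links.append((href, labels))


# A hypothetical page fragment with one link of each kind.
page = """
<p><a href="http://example.org/good" rev="vote-for">agree</a>
<a href="http://example.org/spam" rel="nofollow">untrusted</a>
<a href="http://example.org/plain">plain</a></p>
"""

annotator = LinkAnnotator()
annotator.feed(page)
```

Each label found this way would correspond to one small icon next to the anchor text.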

New paper: “Page-reRank: using trusted links to re-rank authority”

I uploaded another paper of mine to the papers section. It is still under review for the Web Intelligence 2005 conference and is titled “Page-reRank: using trusted links to re-rank authority” (pdf). Let me know what you think of it, if you like.

Abstract: The basis of much of the intelligence on the Web is the hyperlink structure which represents an organising principle based on the human facility to be able to discriminate between relevant and irrelevant material. Second generation search engines like Google make use of this structure to infer the authority of particular web pages. However, the linking mechanism provided by HTML does not allow the author to express different types of links such as positive or negative endorsements of page content. Consequently, algorithms like PageRank produce rankings that do not capture the different intentions of web authors. In this paper, we review some of the initiatives for adding simple semantic extensions to the link mechanism. Using a large real world data set, we demonstrate the different page rankings produced by considering extra semantic information in page links. We conclude that Web intelligence would benefit from the adoption of languages that allow authors to easily encode simple semantic extensions to their hyperlinks.
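To give a feeling for why link semantics can change a ranking, here is a toy sketch in Python. It is not the algorithm from the paper: it is a plain power-iteration PageRank run twice on a tiny invented graph, once treating every link as an endorsement and once simply dropping the links labelled `vote-against`. The graph, the labels and the choice to drop negative links are all assumptions made for illustration.

```python
def pagerank(nodes, edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank over (source, target) pairs."""
    out_links = {n: [] for n in nodes}
    for source, target in edges:
        out_links[source].append(target)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by pages with no outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        for source, targets in out_links.items():
            for target in targets:
                new_rank[target] += damping * rank[source] / len(targets)
        rank = new_rank
    return rank


# Hypothetical labelled links: (source, target, label).
labelled_links = [
    ("a", "c", "vote-for"),
    ("b", "c", "vote-against"),
    ("d", "c", "vote-against"),
    ("a", "b", "vote-for"),
    ("c", "a", "vote-for"),
    ("d", "b", "vote-for"),
]
nodes = ["a", "b", "c", "d"]

# Every link counts as an endorsement, whatever its label.
all_edges = [(s, t) for s, t, _ in labelled_links]
# Assumption for this sketch: negative votes are simply ignored.
positive_edges = [(s, t) for s, t, label in labelled_links
                  if label != "vote-against"]

plain = pagerank(nodes, all_edges)
reranked = pagerank(nodes, positive_edges)
```

In this toy graph page `c` collects rank from two links whose authors actually disagree with it, so its score drops once those links stop counting as endorsements.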