Google, do hire Stan before Yahoo! does. Stan is the author of “Outfoxed – Personalize your internet.” I haven’t played with the code yet (it seems a Linux version is not ready at the moment, but it is on the way). Yes, the code is open source (Mozilla Public License), sweet! Anyway, the detailed description is fantastic! It is a bit like what I want to do for my PhD thesis. The difference? Stan did it! Check the site: it has a lot of interesting pages such as The Outfoxed Idea (a collection of thoughts on the theoretical aspects of Outfoxed, and the whole idea of using social networks for metadata distribution). Or at least read the page A Third Phase of Internet Search, in which Stan pictorially shows the 3 phases: Naive trust -> PageRank and inferred quality -> Social networks to determine subjective quality
Tag Archives: Metadata
hReview: a semantic microformat for reviews
Some weeks ago, Tantek introduced a new microformat, hReview.
We are pleased to announce the first public draft (v0.1) of hReview, jointly co-authored by representatives from America Online, CommerceNet Labs, Microsoft, Six Apart, Technorati, and Yahoo!. hReview is an open microformat standard for publishing and indexing distributed reviews on the Web. This standard enables users to contribute, identify, and aggregate review content on their own web sites and blogs as well as on community sites.
I haven’t had time to dig into it yet, but it is good that they analyzed previous attempts (I was trying to use RVW by Alf Eaton and to keep my list on Allconsuming, but I didn’t put too much effort into this) and that they ask for feedback; almost all the links point to wiki pages, so you can edit them directly there.
In general I really appreciate the work of Technorati (I also wrote a paper backing their proposal of VoteLinks, submitted to Web Intelligence 2005: “Page-reRank: using trusted links to re-rank authority” (pdf)).
Some other links I’ll try to digest later on: jluster on hreview, hreview on technorati, hreview on del.icio.us, organizedshopping on hreview; adriancuthbert suggested using this_is_an_hreview as a common tag (tagspace?).
It would be great to have this format widely adopted, so that the amount of decentralized published reviews soon becomes huge and I have a large amount of data for what I’m working on in my PhD: Trust-aware decentralized Recommender Systems. If interested, check my (a bit outdated) PhD proposal at my papers page.
“Tag_the_tag” tag and metadadaism
In Tagwebs, Flickr, and the Human Brain, Jakob argues “a neuron in your brain is a lot like a tag in a tagweb“. A tagweb is a network of tags whose edges are the “this tag is tagged with this tag” relationship; for example, he tags the tag “Victoria” with the tag “female”. He states that it is not possible to tag tags on flickr but there is a workaround: if you tag a page that “represents” a tag, you are implicitly tagging that tag, and you can do it with del.icio.us. I tagged some pages representing tags with the new tag “tag_the_tag” (metatag already has another meaning due to HTML). It can be a sort of wordnet, but bottom up. I’m skeptical about the rise of “tagging tags” but, if this happens, then tag spam will be an issue. Jakob ends with “I now understand how my brain works, and I can act in ways that embraces that knowledge.”, which really seems an enormous excess of “technology-driven optimism”.
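To make the tagweb idea a bit more concrete, here is a minimal sketch in Python of a network whose edges mean “this tag is tagged with this tag”. The class name `TagWeb`, its methods and the “person” tag are my own assumptions for illustration; nothing here comes from Jakob’s post or from any real service.

```python
from collections import defaultdict


class TagWeb:
    """Directed graph: an edge a -> b means "tag a is tagged with tag b"."""

    def __init__(self):
        self._tagged_with = defaultdict(set)

    def tag_the_tag(self, tag, meta_tag):
        """Record that `tag` is itself tagged with `meta_tag`."""
        self._tagged_with[tag].add(meta_tag)

    def tags_of(self, tag):
        """Tags applied directly to `tag`."""
        return set(self._tagged_with.get(tag, set()))

    def reachable_from(self, tag):
        """All tags reachable by following "is tagged with" edges."""
        seen = set()
        stack = [tag]
        while stack:
            current = stack.pop()
            for nxt in self._tagged_with.get(current, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen


web = TagWeb()
web.tag_the_tag("Victoria", "female")  # the example from the post
web.tag_the_tag("female", "person")    # hypothetical second-level tag
```

Following the edges upward is what would give the bottom-up, wordnet-like structure: `web.reachable_from("Victoria")` collects both “female” and “person”.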
[New word you find in the text: metadadaism (search for metadadaism and write metadadaism in wikipedia).]
[Note for myself: an online article with colorful pictures is more likely to attract attention (at least for me) but .mov videos are bad since I have many problems watching them on my operating system libre]
What is “Tag Spam”? Or better, does Tag Spam exist?
Leigh asks: So any signs that “tag spam” has started yet? (I found it because he uses “trust metrics”, a keyword to which I’m subscribed in a number of services). Here I ask the same question. It seems very unlikely that web spammers (they call themselves “search engine optimizers”) cannot see in seconds the value of getting the wanted URL (of the to-be-boosted book, movie, …) or photo (of the to-be-boosted movie, product, …) under my eyes. After all, we are in the attention economy, aren’t we? Getting the attention of some humans (or of aggregators and, as a consequence, of many humans) on your item is the first step towards getting reputation (and possibly money). [By the way, the same is true for this blog post.]
However, if you look at it from a biodiversity point of view, spam is good because it forces you to evolve, to differentiate, to invent new solutions.
So, any signs of “tag spam”? If you find something, write it on wikipedia pages Spam or Spamdexing (there is nothing at the moment about this) or ask Britannica to insert it in the next version (hope you get the difference…).
But first, how to define “tag spam”? Is a bot always a spammer? If you genuinely think that microsoft.com could be tagged as crap, is this not spam? But if you tag something just in order to capture the attention of other people, is this spam? If I tag this post on del.icio.us as “folksonomy“, is this spam? If I tag my papers on CiteULike as “Cool”, is this tag spam?
Rebecca pointed out that someone on flickr tagged an antisemitic protest sign as “MLK” (Martin Luther King). Is this tag spam? And in fact, Rebecca is already starting to provide anti-spam techniques. She says:
“community standards” do not, indeed, can not defend against abuse of the system–only design can do that. Off the top of my head, there are several simple things Technorati could do to prevent this sort of thing from happening in the future:
* Technorati could design their system not to publish any photo Flickr users have tagged “Might be offensive”.
* Technorati could create their own tagging system, and not publish any photo Technorati users tagged “Might be offensive”.
* Technorati could provide an email address so that users could alert staff if a photo was offensive or inappropriate, and then the staff could go in and tag the inappropriate photo so that it would not appear on Technorati’s site–or hand-select an appropriate one.
And in fact David Weinberger is also (implicitly) suggesting the use of a trust metric when he says:
“Tags work because they’re so simple and because they are so connected to the human semantic context, but having billions of tags won’t work because they’re so simple and connected to the human semantic context. Will we be able to triangulate tags with other data – especially social data – so that we can get more out of them than we put in? It doesn’t seem impossible to me – simply knowing who created a tag lets you get more out of the tag than the person put in – but it’s not up to me to invent the stuff.”
Let me make a strong point here: “Tag Spam does not exist. What does exist are different ways of viewing stuff in the world (and I hope there always will be!). What also exists are incentives to get the attention of other people”. How can we get the most out of decentralized tagging? I think that, using trust metrics, we can choose to consider only tags provided by sources we deem trustworthy and exclude all the rest. There is the risk of the Daily Me here: you will see only the world classifications of people you already agree with and you will never get exposed to different ways of thinking. I was speculating about it some time ago and I leave this topic for next time.
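The simplest form of this idea can be sketched in a few lines of Python: keep only the tags that come from sources you trust and drop the rest. This is only an illustration under my own assumptions (a binary trusted/untrusted list, invented tagger names, a hypothetical `trusted_tags` function); a real trust metric would compute the trusted set from a social network rather than take it as given.

```python
def trusted_tags(taggings, trusted_sources):
    """Keep only the tags applied by sources the reader trusts.

    taggings: iterable of (source, item, tag) triples.
    trusted_sources: set of source names the reader deems trustworthy.
    Returns a dict mapping each item to the set of tags from trusted sources.
    """
    result = {}
    for source, item, tag in taggings:
        if source in trusted_sources:
            result.setdefault(item, set()).add(tag)
    return result


# Hypothetical taggers and tags, for illustration only.
taggings = [
    ("alice", "microsoft.com", "software"),
    ("spambot42", "microsoft.com", "cheap-pills"),
    ("bob", "flickr.com/photo/1", "peace"),
]
view = trusted_tags(taggings, trusted_sources={"alice", "bob"})
```

Here `view` contains the tags from alice and bob only. Note that nothing is labelled as spam: the tags from spambot42 are simply invisible to this particular reader, which is also where the Daily Me risk comes from.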
Ok, I started with “trust metrics” and, having closed the circle, here I stop.
UPDATE: you can never stop. While I was writing, 2 posts appeared on Corante that are very relevant.
In “issues of culture in ethnoclassification/folksonomy” danah argues that tagging is culture dependent. The great example about the book “Women, Fire and Dangerous Things” tells us that if someone (of the culture described in the book) tags a picture of a woman under “danger”, this is not tag spam at all but simply a different point of view on the world, a different culture (not a better or worse one).
And in Folksonomy is better for cultural values Clay replies that the same problems apply to ontologies, only exacerbated, and that “The aggregate good of tags is not that they create consensus or accuracy; they observably don’t, and this is very observability is much of their value.” He also reports that “But the relativity can also be interesting when crossed-tabbed with the identity of the tagger; I don’t want ‘toread’ or ‘funny’ generally, but I do want Liz’s ‘toread’ tags, and Matt Webb’s ‘funny’ links.” In my jargon, he is expressing a trust statement here (I trust Liz as 1/1 in the context of the “toread” tag). What I propose is to use this information to automatically discover the identities trusted by Liz in the “toread” context and automatically suggest them to Clay. The balance between “i keep a small, direct and controllable social network of people i really know” and “i also use automated tools that can infer, based on the global social network, how much i could trust unknown users” should be a user option in my opinion. The first is more controllable; the second is more prone to serendipity and to exposure to new things and new people, but it is also less controllable and at risk of social attacks.
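As a rough illustration of the suggestion step, here is a Python sketch of one-hop trust propagation within a single context: look at whom Clay trusts for a tag, then collect whom they trust for the same tag. The data layout, the function name `suggest_taggers` and the users “Ann” and “Bob” are assumptions made up for the example; a real trust metric would also weight trust values and look further than one hop.

```python
def suggest_taggers(trust, user, context):
    """Suggest taggers trusted, in the same context, by the people
    `user` already trusts there (one hop only).

    trust: dict mapping (truster, context) -> set of trustees.
    """
    direct = trust.get((user, context), set())
    suggestions = set()
    for friend in direct:
        suggestions |= trust.get((friend, context), set())
    # Do not suggest the user or people the user already trusts.
    return suggestions - direct - {user}


trust = {
    ("Clay", "toread"): {"Liz"},
    ("Clay", "funny"): {"Matt Webb"},
    ("Liz", "toread"): {"Ann", "Clay"},  # Ann is a hypothetical user
    ("Liz", "funny"): {"Bob"},           # Bob is a hypothetical user
}
suggested = suggest_taggers(trust, "Clay", "toread")
```

With this data, Ann is suggested to Clay for “toread”, while Bob is not suggested for anything: Clay trusts Liz only in the “toread” context, so her “funny” trust statements are not used.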
Since I’m here, there are other interesting posts I found later on navigating some of the links. They are here below:
Cheap Eats at the Semantic Web Café
Folksonomy Notes: Considering the Downsides, Behavioral Trends, and Adaptation
The Politically Correct Police (PCP) are making lots of noise about how “This isn’t right and SOMETHING SHOULD BE DONE”.
Technorati Tags Set for Abuse, which is tagged as “Nude Celebrities” just to prove the concept
Shapes of knowledge, word for poodles
Making use of tags and tagsonomies
Controlled Vocabularies and Folksonomies: Why Change is Good.
Social consequences of social tagging
and i guess you will find all of them on del.icio.us’s “folksonomy” tag
Lucas, please, add tagging to WebJay
A lot of discussion about why tags are so useful (folksonomies is the current buzzword) is going on at Many2Many. As I noted in a previous post, at the moment there are services that allow you to tag: URLs (del.icio.us), photos (flickr), your emails (gmail), posts on metafilter (metafilter), posts on your blog (technorati, using <a rel="tag">), scientific papers (citeulike), todo items (43things) and books (bookswelike, which I just discovered).
But what we would really really really enjoy is MUSIC TAGGING!
So, is there a site where it is possible to apply free tags to songs? And to collections of songs? I’m not aware of such a site. I mean totally free tags (such as PsyChill or MaleNeuvoFolk) to express your personal categorization of a set of songs and not ID3 tags.
I think WebJay should be our friend here. Let me first say that I’m in total love with webjay, a site that helps you listen to and publish web playlists, i.e. collections of mp3s (and other formats) available on the web. Here are my webjay playlists.
So, coming back to the subject of the post, I definitely think we would enjoy a free-tagging music site and I think WebJay is the best candidate and Lucas Gonze (the developer of webjay) totally rocks. Anyway, Seb (@WJ) was arguing on webjay forum some time ago that “Webjay Needs Tags” and the entire discussion (7 posts) is really interesting. Lucas is not that convinced since he argues that playlists and tags are very different metaphors and very different organizing principles and also that “WJ is radically constrained by the lack of money”.
I think tags could be applied both to playlists (I tag this playlist as “mellow”) and to single songs (I tag this song as “PsyChill”). Both ways seem appealing to me for what they can produce, for example: “let me see all the playlists tagged as ‘mellowblues’” or “i’ve 30 free minutes, play me a collection of ‘coldrelax’ songs”. The good thing about tags is that they are not imposed from the top (i agree with riddle (@wj) when he says “I’m glad that you didn’t impose a preconceived notion of genre on Webjay”) but emerge from the bottom. Lastly, hideout (@WJ) proposes to use the del.icio.us system to tag playlists, since every playlist has a permanent URL. This can be a low-impact-on-webjay and very small-pieces-loosely-connected solution, and i definitely think it can make sense. Maybe it would be better if Lucas integrated the remote tagging done on del.icio.us into the WebJay interface, in order to make it visible in some way and promote its usage.
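The two example queries can be sketched in Python as follows. Everything here is hypothetical (the playlist names, the song data, the function names), and it has nothing to do with how WebJay actually stores playlists; the “30 free minutes” query is answered with a simple greedy pick in listed order, which is just one of many possible choices.

```python
def playlists_tagged(playlist_tags, tag):
    """Names of the playlists carrying `tag`, in alphabetical order."""
    return sorted(name for name, tags in playlist_tags.items() if tag in tags)


def fill_minutes(songs, tag, minutes):
    """Greedily pick songs carrying `tag`, in listed order, that still
    fit into the remaining time budget.

    songs: list of (title, length_in_seconds, set_of_tags).
    """
    chosen = []
    remaining = minutes * 60
    for title, seconds, tags in songs:
        if tag in tags and seconds <= remaining:
            chosen.append(title)
            remaining -= seconds
    return chosen


# Hypothetical data, for illustration only.
playlist_tags = {
    "late night": {"mellow", "mellowblues"},
    "sunday": {"mellowblues"},
    "workout": {"fast"},
}
songs = [
    ("A", 600, {"coldrelax"}),
    ("B", 900, {"PsyChill"}),
    ("C", 1000, {"coldrelax"}),
    ("D", 500, {"coldrelax"}),
]
mellow_blues = playlists_tagged(playlist_tags, "mellowblues")
half_hour = fill_minutes(songs, "coldrelax", 30)
```

In this example song D is tagged “coldrelax” too, but it is skipped because it no longer fits into the 30 minutes after A and C.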
What do you think? Want to share your point of view? You can either do it on WebJay Forum or with a comment here. Music free tagging is the next step! [ehm, I think I’m not good at inventing new words … so i leave to you the option to invent the new cool buzzword for music-free-tagging].
Folksonomies spreading is a river, we can just steer the kayak
“To put this metaphorically, we are not driving a car, with gas, brakes, reverse and a lot of choice as to route. We are steering a kayak, pushed rapidly and monotonically down a route determined by the environment. We have a (very small) degree of control over our course in this particular stretch of river, and that control does not extend to being able to reverse, stop, or even significantly alter the direction we’re moving in.”
I love how Clay Shirky writes! Read the entire post on Many2Many about the fact that “mass amateurization of cataloging” is going to happen anyway.
HTML tag <A> gets a new rel value: nofollow
I read on News.com that Google is promoting a new value, nofollow, for the rel attribute of the html tag <A>, to prevent comment spam.
Example: Visit my <a href="http://www.example.com/" rel="nofollow">discount pharmaceuticals</a> site.
Google will not follow such a link (because of the nofollow value) and hence the linked site will not get PageRank. This should give blogspammers fewer incentives to automatically comment on your blog with spam messages. I think it will not work, but it is an attempt at tackling spam and hence worthwhile.
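To show what “not following” such a link could mean in practice, here is a small Python sketch that extracts from a piece of HTML only the links a crawler would follow, skipping those marked with rel="nofollow". It uses the standard library `html.parser`; the class and function names are my own, and this is of course not Google’s actual crawler logic, just an assumption about how the value might be honored.

```python
from html.parser import HTMLParser


class FollowableLinks(HTMLParser):
    """Collect href values of <a> tags that are not marked nofollow."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        # rel may hold several space-separated values, e.g. "friend met".
        rel_values = (attributes.get("rel") or "").lower().split()
        if href and "nofollow" not in rel_values:
            self.links.append(href)


def followable_links(html_text):
    """Return the links in `html_text` that a crawler would follow."""
    parser = FollowableLinks()
    parser.feed(html_text)
    parser.close()
    return parser.links


comment = (
    'Visit my <a href="http://www.example.com/" rel="nofollow">discount '
    'pharmaceuticals</a> site, or read '
    '<a href="http://www.example.org/">this</a>.'
)
links = followable_links(comment)
```

For the comment above, only the second link (the one without rel="nofollow") ends up in `links`.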
What is more interesting is the “decentralized” evolution of the (HTML) language. This nofollow addition is just a proposal from Google to extend a standard language, but Google has such a high reputation that many people will follow this suggestion, and this means Google has the power to change the HTML language. Technorati did something similar by proposing rel="tag" just a few days ago. Technorati also proposed VoteLinks with rel="vote-for" and rel="vote-against", and XFN with rel="friend met" and other relationship-related values.
Actually, anyone can propose a change to the HTML language (or to whatever language/protocol), but it is of course difficult to have it accepted by a significant number of players/content creators.
It will also be interesting to see if this language evolution will produce different linking behaviours.
<a rel="tag"> and technorati aggregates your post based on category
If you want to have your category-tagged posts aggregated by Technorati, “tag” your post by including a special link:
<a href="http://technorati.com/tag/[tagname]" rel="tag">[tagname]</a>
(from Technorati Tags Help).
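If your blog software generates category links for you, the special link above could be built with something like the following Python sketch. The URL pattern is the one quoted above from Technorati Tags Help; the function name and the decision to percent-encode the tag name in the URL are my assumptions, so check the help page for how tags containing spaces should really be written.

```python
from html import escape
from urllib.parse import quote


def technorati_tag_link(tagname):
    """Build the rel="tag" link for `tagname`.

    Assumption: the tag name is percent-encoded in the URL and
    HTML-escaped in the link text.
    """
    url = "http://technorati.com/tag/" + quote(tagname, safe="")
    return '<a href="{}" rel="tag">{}</a>'.format(
        escape(url, quote=True), escape(tagname)
    )


link = technorati_tag_link("folksonomy")
```

For a one-word tag such as “folksonomy” the result is exactly the link from the help page with [tagname] filled in.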
Since I was there, I’ve modified how post categories are displayed on my blog: they should float on the right, with a little cloud image linking to the relevant tag page on Technorati. If you notice any problem (especially with Internet Explorer), please let me know.
2 more “things” technorati could aggregate: papers and todo lists.
Some entries ago I was asking if there was a repository of category-tagged blog posts somewhere (for a project I was thinking about with some colleagues on the evolution of a shared language). A few days ago, Technorati made a big step towards providing it.
It aggregates URLs bookmarked under a certain tag in del.icio.us, photos tagged under the same tag in flickr and ALSO blog posts categorized under the same “tag”. Cool! For example, see the page about the tag “peace”.
Are there other services that use tags to tag things? Yes, there are. citeUlike lets you tag scientific papers. 43things lets you tag “todo lists” (I didn’t play with 43things so I’m not really sure what you tag). For example, see citeUlike page for design “tag” and 43things page for design “tag”. Gmail as well allows you to tag received emails but of course (at least for the moment) emails are private and it is not possible to aggregate them. We will investigate “would it be useful?” next time.
Are there more services that allow you to tag things? If you know any, please report them in the comments. I especially think we could really enjoy a songs-tagging site but more about this later.