Motivations for Tagging: Categorization vs. Description

21 07 2009

UPDATE March 17 2010: More results can be found in the following publication: M. Strohmaier, C. Koerner, R. Kern, Why do Users Tag? Detecting Users’ Motivation for Tagging in Social Tagging Systems, 4th International AAAI Conference on Weblogs and Social Media (ICWSM2010), Washington, DC, USA, May 23-26, 2010. (Download pdf)

In a past post, I talked about the role of tagging motivation in social tagging systems, and a distinction between users who use tags for Categorization and users who use tags for Description purposes.

One interesting question in this context is: “What do the tag clouds of Categorizers and Describers actually look like, and what can we learn from them?”

Categorizers vs. Describers: Our previous work suggests what the tag clouds of Categorizers and Describers should look like in theory: Categorizers would tend to use general terms for tagging, terms that serve as useful labels for categories based on their model of the world. Describers, on the other hand, would use terms that are specific to a resource, or concepts that can be found directly within it, based on characteristics of the resource. That’s the theory.

Christian Körner, one of my PhD students, looked into this question empirically as part of his current work, in which he applies previously discussed measures for detecting tagging motivation (Conditional Tag Entropy and Orphaned Tags) to several tagging datasets. While we expect that most real-world tagging behaviour results from a combination of categorization and description motivation, Christian was particularly interested in “extreme” cases, i.e. “extreme” Categorizers and “extreme” Describers. Here are selected results:

Example of an Extreme Categorizer: Among 445 delicious users, the following screenshot shows the tag cloud of the single user that scored highest on our “Categorization” measure (the most extreme Categorizer in our dataset).

delicious_categorizer

An example tag cloud of an "Extreme Categorizer" (based on ~1900 bookmarks)

The results are quite intriguing: The above user clearly uses very general terms to annotate his resources, and introduces an elaborate taxonomy to categorize them. While some parts of his vocabulary are more elaborate and fine-grained (e.g. “fashion” and the corresponding sub-categories “fashion_blog” and “fashion_brand”), others are less elaborate (e.g. “games”, “health”, etc.). The user also created a controlled vocabulary and stuck to it over the course of 1900 bookmarks, which I think can be seen as another indication of this user's inclination to use tags for categorization purposes. The fact that a combination of our measures for tagging motivation (Conditional Tag Entropy and Orphaned Tags) has produced this interesting example of an extreme Categorizer provides some evidence for the plausibility of these measures. I think that’s great news.

Example of an Extreme Describer: The next screenshot shows an excerpt of a tag cloud of the user that scored highest on the “Description” measure (the most extreme Describer in our dataset).

delicious_describer

An example tag cloud of an "Extreme Describer" (excerpt, based on ~1700 bookmarks)

It is interesting to note that this tag cloud is only an excerpt; the original tag cloud of this user is roughly twice this size. The user clearly introduces a large set of tags, and uses many different variations of the same or similar concepts, without much regard for terminological or conceptual differences (e.g. exce, excel, Excel_Functions, Excel2007, Exceler, excelets, ExcelPoster, Excl, excxel). Again, the fact that our measures for tagging motivation produced this particular user as an extreme example of a Describer can be seen as an indicator that our measures are plausible in principle.

However, what is also apparent from this example is that even in the case of this extreme Describer, some categories seem to be present in his tag vocabulary (e.g. “ebooks”, “fun”, etc.). This suggests that a binary approach to understanding tagging motivation (a user is EITHER a Categorizer OR a Describer) is implausible.
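To make the contrast between the two users a bit more tangible, here is a minimal Python sketch that computes the share of rarely used tags in a user's vocabulary. Please read it as an illustration only: the function name `rare_tag_ratio`, the `divisor` threshold and the toy bookmark lists are my own assumptions for this sketch, and it is a simplified stand-in rather than the exact definition of our Orphaned Tags or Conditional Tag Entropy measures.

```python
from collections import Counter
from math import ceil


def rare_tag_ratio(bookmarks, divisor=100):
    """Share of a user's distinct tags that are used only rarely.

    bookmarks: list of tag lists, one list per bookmark.
    divisor:   hypothetical parameter; a tag counts as "rare" if it is
               used at most ceil(max_usage / divisor) times.
    """
    # Count how often each distinct tag was assigned by this user.
    counts = Counter(tag for tags in bookmarks for tag in tags)
    if not counts:
        return 0.0
    # The threshold scales with the usage of the user's most frequent tag.
    threshold = ceil(max(counts.values()) / divisor)
    rare = sum(1 for c in counts.values() if c <= threshold)
    return rare / len(counts)


# Toy data (invented for illustration, not taken from our dataset).
categorizer = [["fashion"], ["fashion"], ["games"],
               ["games"], ["health"], ["health"]]
describer = [["excel", "exce"], ["excel", "Excel2007"], ["excel", "excxel"]]

print(rare_tag_ratio(categorizer))
print(rare_tag_ratio(describer))
```

Under these assumptions, a user who reuses a small, stable vocabulary gets a low ratio, while a user who keeps coining one-off variants of the same concept gets a high one. A value somewhere in between would be consistent with the mixed behaviour noted above.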

Open Questions: Overall, the examples of two users driven by diametrically opposed motivations for tagging raise a number of interesting questions worth studying: What are the characteristics, utilities and properties of tags produced by Categorizers and Describers? How do these different types of tagging motivation influence the resulting folksonomies? And how do they influence quality attributes of algorithms (e.g. search, ranking) and applications (e.g. tag recommendation) that process folksonomy data? We are looking into some of these questions in our current research.
