Call for Papers: International Workshop on Modeling Social Media 2010 (MSM’10)

15 03 2010

I’d like to point you to a Call for Papers for a workshop I’m involved in organizing at Hypertext 2010 in Toronto this June. I’m really excited about the focus of this event, and I’m looking forward to lots of exciting discussions and presentations (check out the invited talks and panelists!).

International Workshop on
Modeling Social Media 2010 (MSM’10)


June 13, 2010, co-located with Hypertext 2010,
Toronto, Canada

Important Dates:

* Submission Deadline: April 9, 2010
* Notification of Acceptance: May 13, 2010
* Final Papers Due: May 20, 2010
* Workshop date: June 13, 2010, Toronto, Canada

Workshop Organizers:

  • Alvin Chin, Nokia Research Center, Beijing, China, alvin.chin (at)
  • Andreas Hotho, University of Wuerzburg, Germany, hotho (at)
  • Markus Strohmaier, Graz University of Technology, Austria, markus.strohmaier (at)


The workshop will be opened by an invited talk given by Ed Chi (Palo Alto Research Center). The talk will be followed by a number of peer-reviewed research and position paper presentations and a discussion panel including Barry Wellman (University of Toronto), Marti Hearst (University of California, Berkeley) and Ed Chi (Palo Alto Research Center).

Workshop’s Objectives and Goals:

The goal of this workshop is to focus the attention of researchers on the increasingly important role of modeling social media. The workshop aims to attract and discuss a wide range of modeling perspectives (such as justificative, explanative, descriptive, formative and predictive models) and approaches (statistical modeling, conceptual modeling, temporal modeling, etc.). We want to bring together researchers and practitioners with diverse backgrounds interested in 1) exploring different perspectives and approaches to modeling complex social media phenomena and systems, 2) the different purposes and applications that models of social media can serve, 3) issues of integrating and validating social media models and 4) new modeling techniques for social media. The workshop aims to start a dialogue that reflects upon and discusses these issues.


Topics may include, but are not limited to:

+ new modeling techniques and approaches for social media
+ models of propagation and influence in Twitter, blogs and social tagging systems
+ models of expertise and trust in Twitter, wikis, newsgroups, question answering systems
+ modeling of social phenomena and emergent social behavior
+ agent-based models of social media
+ models of emergent social media properties
+ models of user motivation, intent and goals in social media
+ cooperation and collaboration models
+ software-engineering and requirements models for social media
+ adapting and adaptive hypertext models for social media
+ modeling social media users and their motivations and goals
+ architectural and framework models
+ user modeling and behavioural models
+ modeling the evolution and dynamics of social media

Preliminary Program Committee (confirmed):
  • Ansgar Scherp, Koblenz University, Germany
  • Roelof van Zwol, Yahoo! Research Barcelona, Spain
  • Marti Hearst, UC Berkeley, USA
  • Ed Chi, PARC, USA
  • Peter Pirolli, PARC, USA
  • Steffen Staab, Koblenz University, Germany
  • Barry Wellman, University of Toronto, Canada
  • Daniel Gayo-Avello, University of Oviedo, Spain
  • Jordi Cabot, INRIA, France
  • Pranam Kolari, Yahoo! Research, USA
  • Tad Hogg, Institute for Molecular Manufacturing, USA
  • Wai-Tat Fu, University of Illinois at Urbana-Champaign, USA
  • Thomas Kannampallil, University of Texas, USA
  • Justin Zhan, Carnegie Mellon University, USA
  • Marc Smith, ConnectedAction, USA
  • Mark Chignell, University of Toronto, Canada


WSDM 2010 List of Accepted Papers

26 12 2009

The list of accepted papers for WSDM 2010 is available now. Lots of exciting papers; I’m particularly interested in the ones related to tagging, microblogging, search intent and user goals. Here’s an excerpt of my reading list (including links to pdf-versions whenever they were available):

  • Query Reformulation Using Anchor Text (pdf)
    Van Dang and Bruce Croft
  • Tagging Human Knowledge (technical report)
    Paul Heymann, Andreas Paepcke and Hector Garcia-Molina
  • Ranking Mechanisms in Twitter-Like Forums (pdf)
    Anish Das Sarma, Atish Das Sarma, Sreenivas Gollapudi and Rina Panigrahy
  • Large Scale Query Log Analysis of Re-Finding (pdf)
    Sarah Tyler and Jaime Teevan
  • TwitterRank: Finding Topic-sensitive Influential Twitterers (pdf)
    Jianshu Weng, Ee-peng Lim, Jing Jiang and Qi He
  • I tag, You tag: Translating tags for advanced user models (pdf)
    Robert Wetzker, Carsten Zimmermann and Christian Bauckhage
  • Folks in folksonomies: Social link prediction from shared metadata (pdf)
    Rossano Schifanella, Alain Barrat, Ciro Cattuto, Benjamin Markines and Filippo Menczer

ACM Hypertext’09 Student Competition

4 07 2009

I just came back from a road trip to ACM Hypertext’09 with my students, and I’m particularly happy that one of them, Christian Körner, won 1st place in this year’s ACM Hypertext’09 Graduate Student Research Challenge. Bravo Christian!

The competition was strong, and Christian did a great job in presenting preliminary results from his PhD research on Tagging Motivation. Here are a few links to his competition material:

I would also like to congratulate all runners-up at the competition. All participating students worked really hard to present their research to conference attendees in an engaging way. I really liked the enthusiasm of the students, and the student competition as well as the conference as a whole gave me a bunch of new ideas and perspectives.

You might also be interested in my liveblogging notes on Ricardo Baeza-Yates’ and Lada Adamic’s very interesting keynotes at the conference.


Liveblogging Wednesday @ Hypertext’09

1 07 2009

I’m sharing my live notes from the second Hypertext keynote, “Relating Content by Web Usage” by Ricardo Baeza-Yates, at Hypertext ’09.

In case you have any additions, comments or links that would make my notes more complete / more useful, please leave a comment and fill in the blanks.

On the nature of search and intent:

Ricardo starts by stating that Search is not about document retrieval anymore. Given Ricardo’s history in document retrieval, this is an interesting thing to hear.

Search is rather about mediating user goals, in particular:

  1. identifying a user’s task
  2. providing means for task completion

For search to be successful, intent of searchers needs to be related to content available on the web. Ricardo argues that rather than focusing on content, search engines need to focus on objects, such as people, places, businesses, restaurants etc. Search intent then can be satisfied by exploiting and mapping characteristics of such objects and their corresponding attributes.

On the nature of content:

So how can we learn about objects and attributes? One approach is to look into metadata, where Ricardo distinguishes between explicit (Metadata, Y! Answers, Flickr, etc.) and implicit (anchor text, queries and clickthrough, etc.) metadata. Ricardo points out that some of this metadata is private, making usage more complicated.

A key question in this context is “What is the quality of different kinds of metadata?”. Ricardo mentions that although user-generated metadata is noisy, on an aggregate level, he believes that it outperforms metadata generated by experts.

Search in Social Media:

Ricardo introduces TagExplorer, a Yahoo research prototype for tag-based, faceted navigation/search of Flickr. Facets that are supported are locations, subjects, activities, time, names and others. I didn’t fully understand how these facets are identified or determined, but it seems the selection is based on / informed by previous empirical Yahoo research on different types of tags in Flickr.

Another prototype Ricardo demonstrates is the Correlator.

Web Usage:

Ricardo starts with the assumption that “when users use the web, they think”, and he suggests that we can/should tap into the outcome of these cognitive processes and exploit them for search. One example is query logs, where users actively make relevance judgements and engage in search query formulation / reformulation strategies.

Ricardo gives a number of examples where this might be useful, for example it might help in learning about relationships between queries, sessions and documents.
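To make the idea of relating queries through usage a bit more concrete, here is a minimal sketch of my own (not something shown in the talk). It assumes a made-up click log of (query, clicked URL) pairs and treats two queries as related when the sets of URLs clicked for them overlap strongly. The log data, the function name `related_queries` and the `min_overlap` threshold are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical click log: (query, clicked_url) pairs. The data is made up.
click_log = [
    ("cheap flights", "example.com/flights"),
    ("cheap flights", "example.com/deals"),
    ("low cost airlines", "example.com/flights"),
    ("low cost airlines", "example.com/deals"),
    ("toronto weather", "example.org/weather"),
]


def related_queries(log, min_overlap=0.5):
    """Return (query1, query2, score) for query pairs whose clicked-URL
    sets have a Jaccard overlap of at least min_overlap."""
    clicks = defaultdict(set)
    for query, url in log:
        clicks[query].add(url)

    pairs = []
    for q1, q2 in combinations(sorted(clicks), 2):
        shared = clicks[q1] & clicks[q2]
        union = clicks[q1] | clicks[q2]
        score = len(shared) / len(union)  # union is never empty here
        if score >= min_overlap:
            pairs.append((q1, q2, score))
    return pairs


print(related_queries(click_log))
```

Jaccard overlap is just one simple choice of similarity; a real system would probably also weight clicks by frequency and use session information.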

Open Issues:

Ricardo concludes his talk by discussing a number of issues he feels are important for future research. He discusses the interesting research question of studying explicit social networks (where links between users are made explicit) versus implicit social networks (where links between users are inferred). Related to this problem is the problem of implicit and explicit metadata. Ricardo refers to that problem as the virtuous cycle, where both implicit and explicit metadata can and should be used to inform search.

Another problem Ricardo mentions is the question of when it is necessary to acquire more data vs. when we need to tweak our algorithms. As researchers, I guess we tend to have a bias towards working on the algorithmic rather than the data aspect.

My impressions:

I think Ricardo’s talk gave a great overview of the many activities at Yahoo Research. Due to the number of projects being presented, it was difficult for me to capture everything that was presented, and I feel that my notes in this post capture only a small part of what Ricardo talked about in his keynote. So check out Ricardo’s website / Yahoo research website / the slides of this talk to get a more complete picture of their exciting projects.

Update: I just stumbled upon Alvin Chin’s notes of Ricardo’s keynote, which nicely complement my notes here.

Liveblogging Tuesday @ Hypertext’09

30 06 2009

I’m sharing my live notes from Lada Adamic’s keynote on “The Social Hyperlink” at Hypertext ’09.

In case you have any additions, comments or links that would make my notes more complete / more useful, please leave a comment and fill in the blanks.

Lada starts by telling a story about the different social networks at MIT vs. Stanford, where at MIT fraternities are well established and play an important role in defining social communities, while at Stanford they are discouraged: each year you have to enter a room lottery that determines with whom you are going to live in the coming year. This difference can be observed in the social networks among students. But analyzing the relationships between people and the actions they perform is challenging because of the difficulty of distinguishing correlation from causation. Do two friends buy the same item because they have a social relationship (causation) or do they happen to buy the same item independent of their relation (correlation)?

The Social Hyperlink (how intent spreads through Second Life):

That’s why Lada got interested in Second Life, as in SL it is possible to trace how information (e.g. dance moves, items) spreads along social ties. In many cases, SL maintains information about previous item owners, allowing us to study how items propagate through networks of SL users. The example Lada talked about was gesture transfer among users of Second Life. Lada presented results from a study analyzing 12.6 million transfers (where 23% have accurate previous owner info). What you can do with this data is investigate patterns of information spread through the social network.


  • 48% of transfers happen between friends.
  • Cascades among friends are deeper / items are passed along social ties more often (higher percentage of non-leaf nodes)
  • But: adoption over time is weaker in social networks. Lada speculates that a reason for that is that information spread among friends is “niche” information (only relevant to a small group of homogeneous friends)
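As a rough illustration of the two cascade measures mentioned above (depth and the percentage of non-leaf nodes), here is a small sketch of my own; it is not the method used in the study. It assumes hypothetical transfer records of the form (previous owner, new owner) for a single item, and it assumes that every user receives the item at most once, so the transfers form a tree or forest.

```python
from collections import defaultdict

# Hypothetical transfer records for one item: (previous_owner, new_owner).
transfers = [("ann", "bob"), ("ann", "cat"), ("bob", "dan"), ("dan", "eve")]


def cascade_stats(edges):
    """Return (maximum cascade depth, fraction of non-leaf nodes)."""
    children = defaultdict(list)
    receivers = set()
    nodes = set()
    for giver, receiver in edges:
        children[giver].append(receiver)
        receivers.add(receiver)
        nodes.update((giver, receiver))
    if not nodes:
        return 0, 0.0

    # Roots are users who passed the item on but never received it.
    roots = nodes - receivers

    # Iterative depth-first walk; 'seen' guards against malformed cyclic data.
    max_depth = 0
    seen = set()
    stack = [(root, 0) for root in roots]
    while stack:
        node, depth = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        max_depth = max(max_depth, depth)
        for child in children.get(node, []):
            stack.append((child, depth + 1))

    # A non-leaf node is a user who passed the item on at least once.
    non_leaf = sum(1 for n in nodes if children.get(n))
    return max_depth, non_leaf / len(nodes)


print(cascade_stats(transfers))
```

Comparing these two numbers for cascades among friends against cascades among strangers would be one way to look at the "deeper cascades" observation, assuming friendship data is available alongside the transfers.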

The next question Lada deals with is whether targeting hubs/early adopters would be a promising strategy to spread information in networks, by dividing the network into early adopters and laggards:

  • early adopters (or Mavens in Gladwell’s terms) were less social (fewer friends than the average)
  • they were also not active in distributing assets, which means that they are not influencers


  • social networks influence adoption
  • niche items get a bigger boost (from social relations)
  • some individuals have more influence than others

User Intent and Social Networks: What I find interesting about this work, particularly the Second Life case, is that it allows us to study the propagation of intent in social networks. This kind of data enables us to examine how social relations influence what people want. I find this to be an important research question, because intent is generally assumed to be an attribute of individuals rather than a characteristic of social networks as a whole. I think that people tend to prefer believing that their goals are individual and intrinsic, rather than determined(?) by their social network. Studies such as the SL study have the potential to explore this question empirically.

But network analysis can be employed for other aspects of links as well; Lada gives two more examples:

The Knowledge-Exchange Hyperlink:

One of the questions Lada talked about in this context was: What motivates users to answer questions?

From interviews (Naver): altruism, learning, hobby, business, points

From crawls: filling in the blanks, correcting others

The Trust Hyperlink:

Lada got interested in Couchsurfing as a way to study trust in social networks. (The rationale being that trust is required to let somebody stay in your home.)

The study included 600,000 users, of whom 156,000 had surfed or hosted; 55,000 were in the largest strongly connected component.

Observations: Over time, people tend to engage in both surfing and hosting.

Results: direct reciprocity only accounts for 12-18% (surf the couch of the person you have hosted). Generalized reciprocity is in place. People are willing to vouch for people they only knew via couch-surfing. They tend to vouch for fewer couch-surfing friends than best friends, but overall there are more couch-surfing friends.
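For readers who want to see what a direct-reciprocity figure could look like in code, here is a minimal sketch of my own. My notes do not record how the study defined the measure, so this uses one possible definition: the fraction of distinct surfer-to-host pairs for which the reverse pair also occurs. The `stays` data and the function name are hypothetical.

```python
# Hypothetical stays: (surfer, host) pairs. The data is made up.
stays = [("a", "b"), ("b", "a"), ("a", "c"), ("c", "d"), ("d", "e")]


def direct_reciprocity(stay_records):
    """Fraction of distinct surfer->host pairs whose reverse pair also occurs."""
    # Deduplicate repeated stays and ignore self-loops.
    pairs = {(surfer, host) for surfer, host in stay_records if surfer != host}
    if not pairs:
        return 0.0
    reciprocated = sum(1 for surfer, host in pairs if (host, surfer) in pairs)
    return reciprocated / len(pairs)


print(direct_reciprocity(stays))
```

Generalized reciprocity (hosting someone other than the person who hosted you) would need a different measure, for example the share of users who appear both as surfer and as host.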

My impressions:

I really enjoyed Lada’s keynote; I think it did a great job in motivating and illustrating the potential of network analysis to explore different aspects of linked information on the web. I have come across her work many times before in my own research and I’m happy to have had the chance to hear her talk in person.

Next up are my students Christian and Mark who are pitching their posters on “Understanding the Motivation behind Tagging” (Christian Körner) and “Towards Automatically Annotating Textual Resources with Human Intent” (Mark Kröll). Good luck!

Update (Jul 4 2009): Lada’s slides of the talk are available online!

CfP’s for Upcoming Information Retrieval events

16 02 2009

I’d like to point you to two upcoming IR events in which I’m involved:

  • International Workshop on Text-based Information Retrieval (TIR 2009), co-located with DEXA 2009. Deadline for paper submission: April 01, 2009, 24:00 (CET)
    This event is co-organized by Benno Stein and my colleague at the Know-Center, Michael Granitzer
  • ACM SIGIR Conference, Posters (SIGIR 2009)
    Posters submission deadline: Feb 23, 2009

Upcoming events

17 12 2008

I’d like to direct your attention to the following exciting events:

    Great opportunities to present and discuss your research!