Towards Contextual Text Mining
Text is almost always accompanied by contextual information. This context can be explicit, such as the time and location at which a blog article was written or the author(s) of a biomedical publication; implicit, such as the positive or negative sentiment an author held when writing a product review; or complex, such as the social network of the authors. Many applications require analyzing how topic patterns vary across contexts. For instance, analyzing search logs in the context of users can reveal how to improve a commercial search engine by tailoring results to particular users, while analyzing text in the context of a social network can facilitate the discovery of more meaningful topical communities. Because contextual information significantly affects the topics and words that authors choose, it is important to incorporate it when analyzing and mining text data. In this talk, I will present a new paradigm of text mining, called contextual text mining, in which context is treated as a "first-class citizen."
I will introduce general ways of modeling and analyzing various kinds of context in text, including simple context, implicit context, and complex context, within the framework of probabilistic language models. I will then demonstrate the effectiveness of these general contextual text mining techniques through a few sample applications in web search, information retrieval, and social network analysis.