I am a Principal Applied Scientist at Microsoft's Bing R&D Division where I work on data modeling and machine learning problems in Web Search. Currently my work is focused on online user experience metrics and visualization. Since joining Bing in 2009, some of the web search projects I have been responsible for prototyping and shipping are: machine-learned ranking and placement of vertical search results on the SERP; a crawling policy that uses a machine-learned selection algorithm; index-based conflation of knowledge in the Bing Satori knowledge base; and language-independent document quality classifiers. As the Spam Dev Lead, I scaled out the spam removal process by establishing an ML+Editor-in-the-Loop process, introduced a modular ML-centric architecture for rapid model development and deployment, and contained spam attacks.
Prior to joining Microsoft, I was a Principal Scientist at Yahoo! Labs from June 2007 to July 2009. At Yahoo! I was responsible for proposing and prototyping the machine-learned algorithm for web summarization (aka snippets/captions), and I was the tech lead of the team that productized the ML summarization framework. The summarizer served over 3 billion queries per month in the US market and was deployed in various international markets.
From February 2001 to May 2006 I was a Research Staff Member at the IBM Almaden Research Center, San Jose, CA, where I was an early member of the Webfountain project and worked on web search relevance and information extraction. Prior to joining IBM I was a Co-Director of the Language and Media Processing Laboratory, at the University of Maryland, College Park. I received my PhD in Electrical Engineering from the University of Washington, Seattle, WA.
I have over 80 publications in the areas of web search and mining, information retrieval and extraction, optical character recognition, computer vision, and systems. A paper I co-authored received the best paper award at the WWW 2003 conference. At the University of Maryland, my colleagues and I raised over 3 million dollars from government and industry. I have co-chaired, and served on the program committees of, numerous conferences. I am a senior member of the IEEE.
In my spare time I enjoy mountaineering and running.
- Model Characterization Curves for Federated Search Using Click-Logs, WWW 2011.
- On Composition of a Federated Web Search Result Page, WSDM 2011
- On the Use of Long Dwell Time Clicks for Measuring User Satisfaction with Application to Web Summarization, Yahoo! Labs Technical Report, 2010
- Web Search Result Summarization and Presentation, CIKM, 2009.
- Predicting the Readability of Short Web Summaries, WSDM 2009
- Machine-Learned Sentence Selection Strategies for Query-Biased Summarization, SIGIR Workshop on Learning to Rank, 2008
- SemTag and SEEKER: Bootstrapping the Semantic Web via Automated Semantic Annotation, WWW, 2003
- An Efficient k-Means Clustering Algorithm: Analysis and Implementation, IEEE PAMI, 2002.
- T. Kanungo, J. O. Pedersen, T. Sarlos, "System and Method for Web Summary Composition," filed 2008
- D. Ciemiewicz, T. Kanungo, A. Laxminarayanan, M. Stone, "System and Method for Online Measurement of User Satisfaction Using Long Duration Clicks", 2009
- H. Shemtov, T. Kanungo, D. Metzler, R. Samdani, "System and Method for Identifying Phrases in Sentences using Stopwords," filed 2008
- T. Kanungo, D. Orr, "System and Method for Predicting Readability of Web Summaries," filed 2007
- T. Kanungo, D. Metzler, "System and Method for Ranking Sentences", filed 2007.
- Z. Bar-Yossef, T. Kanungo and R. Krauthgamer, "System, Method, and Service for Using a Focused Random Walk to Produce Samples on a Topic from a Collection of Hyper-Linked Pages," issued 2009.
- T. Kanungo, C.Y. Lin, J. O. Pedersen, M. Stone, "WSSP2009: WWW Workshop on Web Result Summarization and Presentation," Madrid, 2009