I am a Principal Applied Scientist at Microsoft's Bing R&D Division where
I work on data modeling problems in Web Search.
Currently my work is focused on online user experience metrics and visualization.
Since joining Bing in 2009, some of the web search projects I have been responsible
for prototyping and shipping are:
machine-learned ranking and placement of vertical search results on the SERP;
crawling policy that uses a machine-learned selection algorithm;
index-based conflation of knowledge in the Bing Satori knowledge base;
language-independent document quality classifiers;
and as the Spam Dev Lead, I scaled out the spam removal process
by establishing an ML+Editor-in-the-Loop process, introduced a modular ML-centric architecture for
rapid model development and deployment, and contained spam attacks.
Prior to joining Microsoft, I
was a Principal Scientist at
Yahoo! Labs from June 2007 to July 2009.
At Yahoo! I was responsible for proposing and prototyping the
machine-learned algorithm for web summarization (aka snippets/captions),
and I was the tech lead of the team that productized the ML summarization
framework.
The summarizer served over 3 billion queries per month
in the US market and was deployed in various international markets.
From February 2001 to May 2006 I was a
Research Staff Member at the
IBM Almaden Research Center, San Jose, CA,
where I was an early member of the WebFountain project and worked on web search
relevance and information extraction.
Prior to joining IBM I was a Co-Director
of the Language and Media Processing
Laboratory,
at the University of Maryland,
College Park. I received my PhD in
Electrical Engineering from the University of Washington, Seattle, WA in 1996.
I have over 80 publications in the areas of
web search and mining,
information retrieval and extraction,
optical character recognition,
computer vision, and
systems.
A paper I co-authored received the
best paper award
at the WWW 2003 conference.
At the University of Maryland, my colleagues and I raised
over 3 million dollars from government and industry.
I have co-chaired numerous conferences and served on the program committees of
many others.
I am a senior member of the IEEE.
In my spare time I enjoy mountaineering and running.
Citations
Selected Publications:
Papers
- Model Characterization Curves for Federated Search Using Click-Logs, WWW 2011
- On Composition of a Federated Web Search Result Page, WSDM 2011
- On the Use of Long Dwell Time Clicks for Measuring User Satisfaction with Application to Web Summarization, Yahoo! Labs Technical Report, 2010
- Web Search Result Summarization and Presentation, CIKM 2009
- Predicting the Readability of Short Web Summaries, WSDM 2009
- Machine-Learned Sentence Selection Strategies for Query-Biased Summarization, SIGIR Workshop on Learning to Rank, 2008
- SemTag and SEEKER: Bootstrapping the Semantic Web via Automated Semantic Annotation, WWW 2003
- An Efficient k-Means Clustering Algorithm: Analysis and Implementation, IEEE PAMI, 2002
Patents
- T. Kanungo, J. O. Pedersen, T. Sarlos, "System and Method for Web Summary Composition," filed 2008
- D. Ciemiewicz, T. Kanungo, A. Laxminarayanan, M. Stone, "System and Method for Online Measurement of User Satisfaction Using Long Duration Clicks", 2009
- H. Shemtov, T. Kanungo, D. Metzler, R. Samdani, "System and Method for Identifying Phrases in Sentences using Stopwords," filed 2008
- T. Kanungo, D. Orr, "System and Method for Predicting Readability of Web Summaries," filed 2007
- T. Kanungo, D. Metzler, "System and Method for Ranking Sentences", filed 2007.
- Z. Bar-Yossef, T. Kanungo and R. Krauthgamer, "System, method, and service for using a focused random walk to produce samples on a topic from a collection of hyper-linked pages," issued 2009.
Workshop