Text Data Management and Analysis: A Practical Introduction to Information Retrieval and Text Mining

Review

"...advanced undergraduate students might find this book to be a valuable reference for getting acquainted with both information retrieval and text mining in a single volume, a worthwhile achievement for a 500-page textbook." - Fernando Berzal for ACM Computing Reviews

Feature

★ Recommended by ACM Computing Reviews, this book condenses knowledge of two key domains—information retrieval and text mining—within a 500-page volume. Balancing theoretical foundations and practical guidance, it serves as an excellent reference textbook for advanced undergraduate students.
★ Focuses on the core needs of text data management and analysis, systematically explaining statistical and heuristic processing methods. These techniques apply across languages and topics and can address the challenges of processing massive amounts of unstructured text.
★ Aligns with the current industry trend of explosive growth in text data, providing feasible technical paths for text analysis tasks across multiple scenarios such as social media, enterprise documents, and scientific literature.

Description

Recent years have seen a dramatic growth of natural language text data, including web pages, news articles, scientific literature, emails, enterprise documents, and social media such as blog articles, forum posts, product reviews, and tweets. This has led to an increasing demand for powerful software tools to help people analyze and manage vast amounts of text data effectively and efficiently.

Unlike data generated by a computer system or sensors, text data are usually generated directly by humans, and are accompanied by semantically rich content. As such, text data are especially valuable for discovering knowledge about human opinions and preferences, in addition to many other kinds of knowledge that we encode in text.

In contrast to structured data, which conform to well-defined schemas (and are thus relatively easy for computers to handle), text has less explicit structure, so computers must process it to understand the content it encodes. Current natural language processing technology has not yet reached the point where a computer can precisely understand natural language text, but a wide range of statistical and heuristic approaches to the analysis and management of text data have been developed over the past few decades. They are usually very robust and can be applied to analyze and manage text data in any natural language and about any topic.

Author

ChengXiang Zhai is a Professor of Computer Science and Willett Faculty Scholar at the University of Illinois at Urbana-Champaign, where he is also affiliated with the Graduate School of Library and Information Science, Institute for Genomic Biology, and Department of Statistics. He received a Ph.D. in Computer Science from Nanjing University in 1990, and a Ph.D. in Language and Information Technologies from Carnegie Mellon University in 2002. He worked at Clairvoyance Corp. as a Research Scientist and then Senior Research Scientist from 1997 to 2000. His research interests include information retrieval, text mining, natural language processing, machine learning, biomedical and health informatics, and intelligent education information systems. He has published over 200 research papers in major conferences and journals. He served as an Associate Editor for Information Processing and Management, as an Associate Editor of ACM Transactions on Information Systems, and on the editorial board of Information Retrieval Journal. He was a conference program co-chair of ACM CIKM 2004, NAACL HLT 2007, ACM SIGIR 2009, ECIR 2014, ICTIR 2015, and WWW 2015, and conference general co-chair for ACM CIKM 2016. He is an ACM Distinguished Scientist and a recipient of multiple awards, including the ACM SIGIR 2004 Best Paper Award, the ACM SIGIR 2014 Test of Time Paper Award, Alfred P. Sloan Research Fellowship, IBM Faculty Award, HP Innovation Research Program Award, Microsoft Beyond Search Research Award, and the Presidential Early Career Award for Scientists and Engineers (PECASE).

Sean Massung is a Ph.D. candidate in computer science at the University of Illinois at Urbana-Champaign, where he also received both his B.S. and M.S. degrees. He is a co-founder of META and uses it in all of his research. He has been an instructor for CS 225: Data Structures and Programming Principles, CS 410: Text Information Systems, and CS 591txt: Text Mining Seminar. He is included in the 2014 List of Teachers Ranked as Excellent at the University of Illinois and has received an Outstanding Teaching Assistant Award and a CS@Illinois Outstanding Research Project Award. He has given talks at Jump Labs Champaign and at UIUC for the Data and Information Systems Seminar, Intro to Big Data, and the Teaching Assistant Seminar. His research interests include text mining applications in information retrieval, natural language processing, and education.
