As we enter the third decade of the World Wide Web (WWW), the textual revolution has seen a tremendous change in the availability of online information. Finding information for just about any need has never been more automatic: it is just a keystroke or mouse click away. While the digitalization and creation of textual materials continues at light speed, the ability to navigate, mine, or casually browse through documents too numerous to read (or print) lags far behind. What approaches to text mining are available to efficiently organize, classify, label, and extract relevant information for today's information-centric users? What algorithms and software should be used to detect emerging trends from both text streams and archives? These are just a few of the important questions addressed at the Text Mining Workshop held on April 28, 2007, in Minneapolis, MN. This workshop, the fifth in a series of annual workshops on text mining, was held on the final day of the Seventh SIAM International Conference on Data Mining (April 26–28, 2007). With close to 60 applied mathematicians and computer scientists representing universities, industrial corporations, and government laboratories, the workshop featured both invited and contributed talks on important topics such as the application of techniques of machine learning in conjunction with natural language processing, information extraction, and algebraic/mathematical approaches to computational information retrieval. The workshop's program also included an Anomaly Detection/Text Mining competition. NASA Ames Research Center of Moffett Field, CA, and SAS Institute Inc. of Cary, NC, sponsored the workshop.
Clustering: Cluster-Preserving Dimension Reduction Methods for Document Classification; Automatic Discovery of Similar Words; Principal Direction Divisive Partitioning with Kernels and k-Means Steering; Hybrid Clustering with Divergences; Text Clustering with Local Semantic Kernels.
Document Retrieval and Representation: Vector Space Models for Search and Cluster Mining; Applications of Semidefinite Programming in XML Document Classification.
Email Surveillance and Filtering: Discussion Tracking in Enron Email Using PARAFAC; Spam Filtering Based on Latent Semantic Indexing.
Anomaly Detection: A Probabilistic Model for Fast and Confident Categorization of Textual Documents; Anomaly Detection Using Nonnegative Matrix Factorization; Document Representation and Quality of Text: An Analysis.
The proliferation of digital computing devices and their use in communication has resulted in an increased demand for systems and algorithms capable of mining textual data. Thus, the development of techniques for mining unstructured, semi-structured, and fully-structured textual data has become increasingly important in both academia and industry.
This second volume continues to survey the evolving field of text mining - the application of techniques of machine learning, in conjunction with natural language processing, information extraction and algebraic/mathematical approaches, to computational information retrieval. Numerous diverse issues are addressed, ranging from the development of new learning approaches to novel document clustering algorithms, collectively spanning several major topic areas in text mining.
Features:
• Acts as an important benchmark in the development of current and future approaches to mining textual information
• Serves as an excellent companion text for courses in text and data mining, information retrieval and computational statistics
• Experts from academia and industry share their experiences in solving large-scale retrieval and classification problems
• Presents an overview of current methods and software for text mining
• Highlights open research questions in document categorization and clustering, and trend detection
• Describes new application problems in areas such as email surveillance and anomaly detection
Survey of Text Mining II offers a broad selection of state-of-the-art algorithms and software for text mining from both academic and industrial perspectives, to generate interest and insight into the state of the field. This book will be an indispensable resource for researchers, practitioners, and professionals involved in information retrieval, computational statistics, and data mining.
Michael W. Berry is a professor in the Department of Electrical Engineering and Computer Science at the University of Tennessee, Knoxville.
Malu Castellanos is a senior researcher at Hewlett-Packard Laboratories in Palo Alto, California.
Includes supplementary material: sn.pub/extras
Product details
ISBN: 9781849967136
Published: 2010-10-13
Publisher: Springer London Ltd
Height: 235 mm
Width: 155 mm
Audience level: Research, P, 06
Language: English
Format: Paperback