This book presents pattern-based problem-solving methods for a variety of machine learning and data analysis problems. The methods are all based on techniques that exploit the power of group differences. They make use of group differences represented using emerging patterns (aka contrast patterns), which are patterns that match significantly different numbers of instances in different data groups. A large number of applications outside of the computing discipline are also included.
Emerging patterns (EPs) are useful in many ways. EPs can be used as features, as simple classifiers, as subpopulation signatures/characterizations, and as triggering conditions for alerts. EPs can be used in gene ranking for complex diseases since they capture multi-factor interactions. The length of EPs can be used to detect anomalies, outliers, and novelties. Emerging/contrast pattern based methods for clustering analysis and outlier detection do not need distance metrics, avoiding pitfalls of the latter in exploratory analysis of high dimensional data. EP-based classifiers can achieve good accuracy even when the training datasets are tiny, making them useful for exploratory compound selection in drug design. EPs can serve as opportunities in opportunity-focused boosting and are useful for constructing powerful conditional ensembles. EP-based methods often produce interpretable models and results.
In general, EPs are useful for classification, clustering, outlier detection, gene ranking for complex diseases, prediction model analysis and improvement, and so on. EPs are useful for many tasks because they represent group differences, which have extraordinary power. Moreover, EPs represent multi-factor interactions, whose effective handling is of vital importance and is a major challenge in many disciplines. Based on the results presented in this book, one can clearly say that patterns are useful, especially when they are linked to issues of interest.
We believe that many effective ways to exploit the power of group differences remain to be discovered. We hope this book will inspire readers to discover such new ways, in addition to showing them existing ways, to solve various challenging problems.
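The description defines emerging patterns as patterns that match significantly different numbers of instances in different data groups. As a rough illustration only, the sketch below measures that difference as a ratio of supports (often called the growth rate in the emerging-pattern literature). The function names, the toy symptom data, and the `min_growth` threshold are assumptions made for this example and are not taken from the book, whose own definitions and mining algorithm may differ.

```python
from typing import FrozenSet, List

Instance = FrozenSet[str]


def support(pattern: Instance, group: List[Instance]) -> float:
    """Fraction of instances in `group` that contain every item of `pattern`."""
    if not group:
        return 0.0
    matches = sum(1 for instance in group if pattern <= instance)
    return matches / len(group)


def growth_rate(pattern: Instance, target: List[Instance],
                background: List[Instance]) -> float:
    """Support in `target` divided by support in `background`.

    Returns infinity when the pattern occurs only in `target`,
    and 0.0 when it occurs in neither group.
    """
    s_target = support(pattern, target)
    s_background = support(pattern, background)
    if s_background == 0.0:
        return float("inf") if s_target > 0.0 else 0.0
    return s_target / s_background


def is_emerging(pattern: Instance, target: List[Instance],
                background: List[Instance], min_growth: float = 2.0) -> bool:
    """Hypothetical test: treat a pattern as emerging if its growth rate
    reaches `min_growth` (an illustrative threshold, not from the book)."""
    return growth_rate(pattern, target, background) >= min_growth


# Toy data (invented for illustration): each instance is a set of items.
group_a = [
    frozenset({"fever", "cough"}),
    frozenset({"fever", "cough", "fatigue"}),
    frozenset({"cough"}),
    frozenset({"fever", "cough"}),
]
group_b = [
    frozenset({"cough"}),
    frozenset({"fatigue"}),
    frozenset({"fever"}),
    frozenset({"fever", "cough"}),
]

pattern = frozenset({"fever", "cough"})
print(support(pattern, group_a))               # support in group A
print(support(pattern, group_b))               # support in group B
print(growth_rate(pattern, group_a, group_b))  # ratio of the two supports
print(is_emerging(pattern, group_a, group_b))
```

Note that the pattern combines two items: neither "fever" nor "cough" alone separates the toy groups as strongly as the pair does, which is a small-scale version of the multi-factor interactions mentioned above.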
Acknowledgments.- Introduction and Overview.- General Preliminaries.- Emerging Patterns and a Flexible Mining Algorithm.- CAEP: Classification By Aggregating Multiple Matching Emerging Patterns.- CAEP for Classification on Tiny Training Datasets, Compound Selection, and Instance Selection.- OCLEP: One-Class Intrusion Detection and Anomaly Detection.- CPCQ: Contrast Pattern Based Clustering-Quality Evaluation.- CPC: Pattern-Based Clustering.- IBIG: Ranking Genes and Attributes for Complex Diseases and Complex Problems.- CPXR and CPXC: Pattern Aided Prediction Modeling and Prediction Model Analysis.- Other Approaches and Applications Using Emerging Patterns.- Bibliography.- Author's Biography.- Index.

Product details

ISBN
9783031007859
Published
2019-02-22
Publisher
Springer International Publishing AG
Height
235 mm
Width
191 mm
Audience level
Professional/practitioner, P, 06
Language
English
Format
Paperback
Original title
Exploiting the Power of Group Differences

Author

About the contributors

Dr. Guozhu Dong is a professor of Computer Science and Engineering, and a member of the Knoesis Center of Excellence, at Wright State University. He received a Ph.D. in Computer Science from the University of Southern California and a B.S. in Mathematics from Shandong University. Before joining Wright State University, he was a faculty member at the University of Melbourne. His research interests span data mining, machine learning, databases, data science, bioinformatics, and artificial intelligence. He co-authored a book on Sequence Data Mining, co-edited two books on Contrast Data Mining and on Feature Engineering, respectively, and authored a book on Exploiting the Power of Group Differences. He is known for his pioneering work and sustained effort on emerging/contrast pattern mining and on the use of such patterns in problem solving. He has published hundreds of papers at major international conferences and in top-rated journals in the fields of data mining and databases. He received several best research paper awards at major data mining conferences. At Wright State University, he was recognized for Excellence in Research in his college. He has served on hundreds of program committees of international conferences, and he has chaired the program committees for several such conferences. He is a senior member of both ACM and IEEE.