Current language technology is dominated by approaches that either enumerate a large set of rules or rely on a large amount of manually labelled data. Creating either is time-consuming and expensive, which is commonly thought to be the reason why automated natural language understanding has not yet made its way into “real-life” applications. This book sets an ambitious goal: to shift the development of language processing systems to a much more automated setting than in previous work. A new approach is defined: what if computers analysed large samples of language data on their own, identifying structural regularities that perform the necessary abstractions and generalisations in order to better understand language in the process?

After defining the framework of Structure Discovery and shedding light on the nature and the graphic structure of natural language data, the book describes several procedures that do exactly this: let the computer discover structures without supervision in order to boost the performance of language technology applications. Here, multilingual documents are sorted by language, word classes are identified, and semantic ambiguities are discovered and resolved without using a dictionary or other explicit human input. The book concludes with an outlook on the possibilities implied by this paradigm and puts the methods in perspective with respect to human-computer interaction.

The target audience is academics at all levels (undergraduate and graduate students, lecturers and professors) working in the fields of natural language processing and computational linguistics, as well as natural language engineers who are seeking to improve their systems.
Current language technology is dominated by time-consuming approaches that either enumerate a large set of rules or rely on a large amount of manually labelled data. This volume advocates a new open-source methodology that is much more automated.
Foreword by Antal van den Bosch.- 1. Introduction.- 2. Graph Models.- 3. Small Worlds of Natural Language.- 4. Graph Clustering.- 5. Unsupervised Language Separation.- 6. Unsupervised Part-of-Speech Tagging.- 7. Word Sense Induction and Disambiguation.- 8. Conclusion.- References.
The book sets an ambitious goal: to shift development of language processing systems to a much more automated setting than previous works. A new approach is defined. All software described is open source and freely available. Includes supplementary material: sn.pub/extras

Product details

ISBN
9783642442308
Published
2014-03-01
Publisher
Vendor
Springer-Verlag Berlin and Heidelberg GmbH & Co. K
Height
235 mm
Width
155 mm
Audience level
Graduate, P, 06
Language
Product language
English
Format
Product format
Paperback

Author
Foreword by