Multilingual Natural Language Processing Applications: From Theory to Practice

By Daniel Bikel, Imed Zitouni
Published Nov 15, 2011 by IBM Press. Part of the IBM Press series.

About Rough Cuts

Rough Cuts are manuscripts that are developed but not yet published, available through Safari. Rough Cuts provide you access to the very latest information on a given topic and offer you the opportunity to interact with the author to influence the final publication.

Description

Copyright 2012
Dimensions: 7" x 9-1/8"
Pages: 640
Edition: 1st

Rough Cuts
ISBN-10: 0-13-293094-3
ISBN-13: 978-0-13-293094-9

This is the Rough Cut version of the printed book.

Multilingual Natural Language Processing Applications is the first comprehensive single-source guide to building robust and accurate multilingual NLP systems. Edited by two leading experts, it integrates cutting-edge advances with practical solutions drawn from extensive field experience.

Part I introduces the core concepts and theoretical foundations of modern multilingual natural language processing, presenting today’s best practices for understanding word and document structure, analyzing syntax, modeling language, recognizing entailment, and detecting redundancy.

Part II thoroughly addresses the practical considerations associated with building real-world applications, including information extraction, machine translation, information retrieval/search, summarization, question answering, distillation, processing pipelines, and more.

This book contains important new contributions from leading researchers at IBM, Google, Microsoft, Thomson Reuters, BBN, CMU, University of Edinburgh, University of Washington, University of North Texas, and others.

Coverage includes:

Core NLP problems and today’s best algorithms for attacking them
Processing the diverse morphologies present in the world’s languages
Uncovering syntactical structure, parsing semantics, using semantic role labeling, and scoring grammaticality
Recognizing inferences, subjectivity, and opinion polarity
Managing key algorithmic and design tradeoffs in real-world applications
Extracting information via mention detection, coreference resolution, and events
Building large-scale systems for machine translation, information retrieval, and summarization
Answering complex questions through distillation and other advanced techniques
Creating dialog systems that leverage advances in speech recognition, synthesis, and dialog management
Constructing common infrastructure for multiple multilingual text processing applications

This book will be invaluable for all engineers, software developers, researchers, and graduate students who want to process large quantities of text in multiple languages, in any environment: government, corporate, or academic.



Sample Content

Preface

Acknowledgments

About the Authors

Part I: In Theory

Chapter 1: Finding the Structure of Words

1.1 Words and Their Components

1.2 Issues and Challenges

1.3 Morphological Models

1.4 Summary

Chapter 2: Finding the Structure of Documents

2.1 Introduction

2.2 Methods

2.3 Complexity of the Approaches

2.4 Performances of the Approaches

2.5 Features

2.6 Processing Stages

2.7 Discussion

2.8 Summary

Chapter 3: Syntax

3.1 Parsing Natural Language

3.2 Treebanks: A Data-Driven Approach to Syntax

3.3 Representation of Syntactic Structure

3.4 Parsing Algorithms

3.5 Models for Ambiguity Resolution in Parsing

3.6 Multilingual Issues: What Is a Token?

3.7 Summary

Chapter 4: Semantic Parsing

4.1 Introduction

4.2 Semantic Interpretation

4.3 System Paradigms

4.4 Word Sense

4.5 Predicate-Argument Structure

4.6 Meaning Representation

4.7 Summary

Chapter 5: Language Modeling

5.1 Introduction

5.2 n-Gram Models

5.3 Language Model Evaluation

5.4 Parameter Estimation

5.5 Language Model Adaptation

5.6 Types of Language Models

5.7 Language-Specific Modeling Problems

5.8 Multilingual and Crosslingual Language Modeling

5.9 Summary

Chapter 6: Recognizing Textual Entailment

6.1 Introduction

6.2 The Recognizing Textual Entailment Task

6.3 A Framework for Recognizing Textual Entailment

6.4 Case Studies


