Multilingual Natural Language Processing Applications: From Theory to Practice

Rough Cuts

  • Available to Safari subscribers
  • About Rough Cuts: Rough Cuts are manuscripts that are still in development and not yet published, available through Safari. They give you access to the latest information on a given topic and the opportunity to interact with the author and influence the final publication.

Not for Sale

Description

  • Copyright 2012
  • Dimensions: 7" x 9-1/8"
  • Pages: 640
  • Edition: 1st
  • Rough Cuts
  • ISBN-10: 0-13-293094-3
  • ISBN-13: 978-0-13-293094-9

This is the Rough Cut version of the printed book.

Multilingual Natural Language Processing Applications is the first comprehensive single-source guide to building robust and accurate multilingual NLP systems. Edited by two leading experts, it integrates cutting-edge advances with practical solutions drawn from extensive field experience.

Part I introduces the core concepts and theoretical foundations of modern multilingual natural language processing, presenting today’s best practices for understanding word and document structure, analyzing syntax, modeling language, recognizing entailment, and detecting redundancy.

Part II thoroughly addresses the practical considerations associated with building real-world applications, including information extraction, machine translation, information retrieval/search, summarization, question answering, distillation, processing pipelines, and more.

This book contains important new contributions from leading researchers at IBM, Google, Microsoft, Thomson Reuters, BBN, CMU, University of Edinburgh, University of Washington, University of North Texas, and others.

Coverage includes

Core NLP problems, and today’s best algorithms for attacking them

  • Processing the diverse morphologies present in the world’s languages
  • Uncovering syntactical structure, parsing semantics, using semantic role labeling, and scoring grammaticality
  • Recognizing inferences, subjectivity, and opinion polarity
  • Managing key algorithmic and design tradeoffs in real-world applications
  • Extracting information via mention detection, coreference resolution, and event detection
  • Building large-scale systems for machine translation, information retrieval, and summarization
  • Answering complex questions through distillation and other advanced techniques
  • Creating dialog systems that leverage advances in speech recognition, synthesis, and dialog management
  • Constructing common infrastructure for multiple multilingual text processing applications

This book will be invaluable for all engineers, software developers, researchers, and graduate students who want to process large quantities of text in multiple languages, in any environment: government, corporate, or academic.

Sample Content

Table of Contents

Preface

Acknowledgments

About the Authors

Part I: In Theory

Chapter 1: Finding the Structure of Words

1.1 Words and Their Components

1.2 Issues and Challenges

1.3 Morphological Models

1.4 Summary

Chapter 2: Finding the Structure of Documents

2.1 Introduction

2.2 Methods

2.3 Complexity of the Approaches

2.4 Performances of the Approaches

2.5 Features

2.6 Processing Stages

2.7 Discussion

2.8 Summary

Chapter 3: Syntax

3.1 Parsing Natural Language

3.2 Treebanks: A Data-Driven Approach to Syntax

3.3 Representation of Syntactic Structure

3.4 Parsing Algorithms

3.5 Models for Ambiguity Resolution in Parsing

3.6 Multilingual Issues: What Is a Token?

3.7 Summary

Chapter 4: Semantic Parsing

4.1 Introduction

4.2 Semantic Interpretation

4.3 System Paradigms

4.4 Word Sense

4.5 Predicate-Argument Structure

4.6 Meaning Representation

4.7 Summary

Chapter 5: Language Modeling

5.1 Introduction

5.2 n-Gram Models

5.3 Language Model Evaluation

5.4 Parameter Estimation

5.5 Language Model Adaptation

5.6 Types of Language Models

5.7 Language-Specific Modeling Problems

5.8 Multilingual and Crosslingual Language Modeling

5.9 Summary

Chapter 6: Recognizing Textual Entailment

6.1 Introduction

6.2 The Recognizing Textual Entailment Task

6.3 A Framework for Recognizing Textual Entailment

6.4 Case Studies