- Copyright 2022
- Edition: 2nd
- Online Video
- ISBN-10: 0-13-767018-4
- ISBN-13: 978-0-13-767018-5
5 Hours of Video Instruction
Overview
Natural Language Processing LiveLessons covers the fundamentals of Natural Language Processing in a simple and intuitive way, empowering you to add NLP to your toolkit. Using the powerful NLTK package, it gradually works through the basics of text representation, cleaning, topic detection, regular expressions, and sentiment analysis before moving on to the Keras deep learning framework to explore more advanced topics such as text classification and sequence-to-sequence models. After successfully completing these lessons, you'll be equipped with a fundamental and practical understanding of state-of-the-art Natural Language Processing tools and algorithms.
Skill Level
Intermediate
Learn How To
* Represent text
* Clean text
* Understand named entity recognition
* Model topics
* Conduct sentiment analysis
* Utilize text classification
* Understand word2vec word embeddings
* Understand GloVe embeddings
* Apply transfer learning
* Apply language detection
Who Should Take This Course
Data scientists with an interest in natural language processing
Course Requirements
Basic algebra, calculus, and statistics, plus programming experience
Lesson Descriptions
Lesson 1, Text Representations: The first step in any NLP application is the tokenization and representation of text through one-hot encodings and bags of words. Naturally, not all words are meaningful, so you next remove meaningless stopwords and identify the most relevant words for your application using TF-IDF. After that you identify n-grams. Finally, you learn how word embeddings can serve as semantically meaningful representations, and the lesson closes with a practical demo.
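To make the TF-IDF weighting concrete, here is a pure-Python sketch over a toy corpus; it illustrates the scheme only and is not the NLTK tooling the lesson uses:

```python
import math

# TF-IDF over a toy whitespace-tokenized corpus (made-up documents).
docs = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are pets".split(),
]

def tf_idf(term, doc, corpus):
    tf = doc.count(term)                      # raw term frequency
    df = sum(1 for d in corpus if term in d)  # document frequency
    return tf * math.log(len(corpus) / df)    # rare terms score higher

# "the" appears in two of three documents, "cat" in only one,
# so "cat" is weighted as the more informative term.
print(tf_idf("the", docs[0], docs), tf_idf("cat", docs[0], docs))
```

The common word "the" is down-weighted even though it occurs twice, while the rarer "cat" ends up with the higher score.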
Lesson 2, Text Cleaning: Lesson 2 builds on the text representations of Lesson 1 by applying stemming and lemmatization to identify the roots of words and reduce the size of the vocabulary. Next comes deploying regular expressions to identify words fitting specific patterns. The lesson finishes up by demoing these techniques.
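A minimal sketch of the two ideas in this lesson, stemming and regular expressions; the `crude_stem` helper is a hypothetical toy, whereas the lesson itself uses NLTK's real stemmers and lemmatizers:

```python
import re

# A crude suffix-stripping stemmer (hypothetical helper; NLTK's
# PorterStemmer implements the full, much more careful algorithm).
def crude_stem(word):
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

text = "Running models trained in 1998 and 2021 changed everything."
# Regular expressions pull out tokens matching a pattern, e.g. years.
years = re.findall(r"\b(?:19|20)\d{2}\b", text)
print([crude_stem(t) for t in text.lower().split()], years)
```

Mapping "trained" and "training" to the same root shrinks the vocabulary, which is exactly the payoff of stemming before downstream modeling.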
Lesson 3, Named Entity Recognition: In named entity recognition you develop approaches to tag words by the part of speech to which they correspond. You also identify meaningful groups of words by chunking and chinking before recognizing the named entities that are the subject of your text. The lesson ends with a demonstration of the entire pipeline from raw text to named entities.
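As a deliberately naive illustration of the final step of that pipeline, the sketch below groups consecutive capitalized tokens into candidate entities; real pipelines (for example `nltk.pos_tag` followed by `nltk.ne_chunk`) use trained models and handle cases this toy gets wrong, such as capitalized sentence-initial words:

```python
# Naive named entity recognizer: group runs of capitalized tokens.
def naive_entities(sentence):
    entities, current = [], []
    for token in sentence.split():
        word = token.strip(".,;!?")
        if word[:1].isupper():
            current.append(word)
        elif current:
            entities.append(" ".join(current))
            current = []
    if current:
        entities.append(" ".join(current))
    return entities

ents = naive_entities("Yesterday Ada Lovelace met Charles Babbage in London.")
print(ents)
```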
Lesson 4, Topic Modeling: Lesson 4 is about developing ways to identify the main subject or subjects of a text. It begins by exploring explicit semantic analysis to find documents mentioning a specific topic and then turns to clustering documents according to topic. Latent semantic analysis provides yet another powerful way to extract meaning from raw text, as does latent Dirichlet allocation. Non-negative matrix factorization enables you to identify latent dimensions in the text, make recommendations, and measure similarity. Finally, a hands-on demo guides you through the process of using all of these techniques.
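The similarity measurement underlying these techniques can be sketched with cosine similarity between bag-of-words vectors (toy data; the lesson's methods work on latent dimensions rather than raw counts):

```python
import math
from collections import Counter

def norm(v):
    return math.sqrt(sum(x * x for x in v.values()))

def cosine(a, b):
    # Cosine similarity between two bag-of-words count vectors.
    dot = sum(a[t] * b[t] for t in a)
    return dot / (norm(a) * norm(b))

docs = {
    "d1": Counter("space rocket launch orbit".split()),
    "d2": Counter("rocket engine launch pad".split()),
    "d3": Counter("recipe flour sugar oven".split()),
}
query = Counter("rocket launch orbit".split())
scores = {name: cosine(query, doc) for name, doc in docs.items()}
print(max(scores, key=scores.get))  # the most on-topic document
```

Documents sharing no terms with the query score zero, while the document with the most overlap ranks first.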
Lesson 5, Sentiment Analysis: After identifying the topics covered in a document, the natural next question is how to extract sentiment information. In other words, what kinds of sentiment are being expressed? Are the words used positive or negative? The lesson then considers how to handle negations and modifiers and how to use corpus-based approaches to define the valence of each word, as demonstrated in the lesson-ending demo.
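A tiny lexicon-based scorer with one-word negation scope illustrates the valence-plus-negation idea; the valences below are made up, and NLTK's VADER uses a curated lexicon and far richer rules for modifiers:

```python
# Minimal valence-based sentiment scorer (hypothetical lexicon).
VALENCE = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
NEGATORS = {"not", "never", "no"}

def score(sentence):
    total, negate = 0.0, False
    for token in sentence.lower().split():
        word = token.strip(".,!?")
        if word in NEGATORS:
            negate = True
            continue
        valence = VALENCE.get(word, 0.0)
        total += -valence if negate else valence
        negate = False  # negation flips only the next word
    return total

print(score("the movie was not good"), score("a great film"))
```

"Not good" scores negative even though "good" has positive valence, which is the whole point of handling negation explicitly.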
Lesson 6, Text Classification: In this lesson you learn how to use feedforward networks and convolutional neural networks to classify the sentiment of movie reviews, a test case for deploying machine learning approaches in the context of NLP. The lesson also discusses further applications of this approach before proceeding with a hands-on demo.
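The core learn-a-classifier-from-labeled-text loop can be sketched with a single-layer network over bag-of-words vectors, trained by plain gradient descent; the vocabulary and data here are made up, and this is a minimal stand-in for the Keras models built in the lesson:

```python
import numpy as np

# Toy sentiment classification: logistic regression on bag-of-words.
vocab = ["good", "great", "fun", "bad", "boring", "awful"]

def vectorize(text):
    return np.array([text.split().count(w) for w in vocab], float)

X = np.array([vectorize(t) for t in
              ["good fun", "great great", "bad boring", "awful bad"]])
y = np.array([1.0, 1.0, 0.0, 0.0])  # 1 = positive review

w, b = np.zeros(len(vocab)), 0.0
for _ in range(500):
    p = 1 / (1 + np.exp(-(X @ w + b)))  # sigmoid predictions
    w -= X.T @ (p - y) / len(y)         # gradient of the logistic loss
    b -= (p - y).mean()

prob = 1 / (1 + np.exp(-(vectorize("a fun good movie") @ w + b)))
print(round(float(prob), 2))  # probability the review is positive
```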
Lesson 7, Sequence Modelling: Lesson 7 builds on the foundations laid in the previous lesson to explore recurrent neural network architectures for text classification. It starts with the basic RNN architecture before moving on to gated recurrent units and long short-term memory. It also includes a discussion of autoencoder models and text generation. The lesson wraps up with a demo.
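The recurrence at the heart of all these architectures can be shown in a few lines: a vanilla RNN cell applied step by step over a sequence, with random, untrained weights (a sketch of the mechanism, not a full Keras model):

```python
import numpy as np

# Forward pass of a vanilla RNN cell over a short random sequence.
rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 4
W_xh = rng.normal(size=(input_dim, hidden_dim)) * 0.1   # input -> hidden
W_hh = rng.normal(size=(hidden_dim, hidden_dim)) * 0.1  # hidden -> hidden
b_h = np.zeros(hidden_dim)

h = np.zeros(hidden_dim)                    # initial hidden state
sequence = rng.normal(size=(5, input_dim))  # five input vectors
for x in sequence:
    # The core recurrence: the new state mixes the current input with
    # the previous state, so h accumulates context over the sequence.
    h = np.tanh(x @ W_xh + h @ W_hh + b_h)

print(h.shape)  # the final state summarizes the whole sequence
```

Gated units (GRU, LSTM) replace this single `tanh` update with learned gates that control what the state keeps and forgets.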
Lesson 8, Applications: This course has focused on some fundamental and not-so-fundamental tools of natural language processing. This final lesson considers specific applications and advanced topics. Perhaps one of the most important developments in NLP in recent years is the popularization of word embeddings in general and word2vec in particular. These embeddings enable you to delve deeper into vector representations of words and concepts and into how semantic relations can be expressed through vector algebra. GloVe is the main competitor to word2vec, so this lesson also explores its advantages and disadvantages. Also discussed are the potential applications of transfer learning to NLP and the question of language detection. The lesson finishes with a demo.
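The famous analogy arithmetic that word2vec-style embeddings support (king - man + woman ≈ queen) can be illustrated with hand-made toy vectors; real embeddings are learned from large corpora, and these three-dimensional values are invented purely for the demonstration:

```python
import math

# Toy word vectors (made up) illustrating embedding vector algebra.
vec = {
    "king":  [0.9, 0.8, 0.1],
    "queen": [0.9, 0.1, 0.8],
    "man":   [0.1, 0.9, 0.1],
    "woman": [0.1, 0.1, 0.9],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# king - man + woman, compared against every word in the vocabulary.
target = [k - m + w for k, m, w in zip(vec["king"], vec["man"], vec["woman"])]
best = max(vec, key=lambda word: cosine(vec[word], target))
print(best)
```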
About Pearson Video Training
Pearson publishes expert-led video tutorials covering a wide selection of technology topics designed to teach you the skills you need to succeed. These professional and personal technology videos feature world-leading author instructors published by your trusted technology brands: Addison-Wesley, Cisco Press, Pearson IT Certification, Sams, and Que. Topics include IT Certification, Network Security, Cisco Technology, Programming, Web Development, Mobile Development, and more. Learn more about Pearson Video training at http://www.informit.com/video.