AIND: Natural Language Processing

Coding exercises for the Natural Language Processing concentration, part of Udacity's Artificial Intelligence Nanodegree program.

Setup

You need Python 3.6+ and the packages listed in requirements.txt. You can install them using:

pip install -r requirements.txt

Data

Data files for the exercises are included under data/, but some of the NLP libraries need additional data for tasks such as PoS tagging and lemmatization. In particular, nltk raises a LookupError if the data it needs is not installed. Use the following Python statements to open the NLTK downloader and select the package(s) to install:

import nltk
nltk.download()

This opens a GUI. Do NOT download everything. The required files are:

Models > punkt (13MB)
Corpora > stopwords (11kB)
All Packages > averaged_perceptron_tagger (2.4MB)
All Packages > maxent_ne_chunker (12.8MB)
Corpora > words (740kB)
Corpora > wordnet (10.3MB)

For each of the above, select it and click "Download".
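If you prefer to skip the GUI, the same packages can be fetched by name. The following is a minimal sketch, not part of the original exercises: it assumes the package identifiers listed above are the ones nltk.download() accepts, and the names REQUIRED_PACKAGES and download_required are made up for illustration.

```python
# Package identifiers taken from the list above.
REQUIRED_PACKAGES = [
    "punkt",
    "stopwords",
    "averaged_perceptron_tagger",
    "maxent_ne_chunker",
    "words",
    "wordnet",
]


def download_required(download=None, packages=REQUIRED_PACKAGES):
    """Download each package by name and report which ones succeeded.

    `download` defaults to nltk.download; it is a parameter only so the
    function can be exercised without network access (an assumption made
    for this sketch, not something the exercises require).
    """
    if download is None:
        import nltk  # imported lazily so the module loads without nltk

        download = nltk.download
    results = {}
    for name in packages:
        # nltk.download returns True when the package is installed or
        # already up to date, and False otherwise.
        results[name] = bool(download(name))
    return results
```

Calling download_required() with no arguments uses nltk.download and returns a dictionary that maps each package name to True or False, so a failed download is easy to spot.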

You can also download all available NLTK data packages, which include a number of sample corpora, but that may take a while (10+ GB).
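To confirm that the required data is in place before running the exercises, you can ask NLTK to locate each resource. This is a rough sketch under stated assumptions: the resource paths below follow NLTK's usual data layout (tokenizers/, corpora/, taggers/, chunkers/) and may differ between NLTK versions, and RESOURCE_PATHS and missing_resources are hypothetical names.

```python
# Assumed locations of each required package inside the NLTK data directory.
RESOURCE_PATHS = {
    "punkt": "tokenizers/punkt",
    "stopwords": "corpora/stopwords",
    "averaged_perceptron_tagger": "taggers/averaged_perceptron_tagger",
    "maxent_ne_chunker": "chunkers/maxent_ne_chunker",
    "words": "corpora/words",
    "wordnet": "corpora/wordnet",
}


def missing_resources(find=None, paths=RESOURCE_PATHS):
    """Return the names of the packages that NLTK cannot locate.

    `find` defaults to nltk.data.find, which raises LookupError when a
    resource is not installed.
    """
    if find is None:
        import nltk.data  # imported lazily so the module loads without nltk

        find = nltk.data.find
    missing = []
    for name, path in paths.items():
        try:
            find(path)
        except LookupError:
            missing.append(name)
    return missing
```

An empty list from missing_resources() means every listed package was found; any names it returns are the ones still to be downloaded.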

Note: Install Ghostscript (on macOS with Homebrew: brew install ghostscript) to avoid the error "NLTK was unable to find the gs file!" (reference: https://stackoverflow.com/questions/36942270/nltk-was-unable-to-find-the-gs-file)
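A quick way to see whether Ghostscript is visible to Python is to look for the gs executable on your PATH. This is only a sketch: it assumes the executable is named gs (as it is with the Homebrew install above), and find_ghostscript is a hypothetical helper, not an NLTK function.

```python
import shutil


def find_ghostscript(which=shutil.which):
    """Return the full path to the `gs` executable, or None if it is not on PATH."""
    return which("gs")


def ghostscript_status(which=shutil.which):
    """Describe in one line whether `gs` was found."""
    path = find_ghostscript(which)
    if path is None:
        return "gs not found on PATH; install Ghostscript first"
    return "gs found at " + path
```

Running print(ghostscript_status()) after installing Ghostscript should report the path to gs; if it still reports that gs is missing, check that your shell's PATH includes the install location.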

Run

To run any script file, use:

python <script.py>

To open a notebook, use:

jupyter notebook <notebook.ipynb>

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Please refer to Udacity Terms of Service for further information.
