Getting started with NLTK: The First Step, Installing NLTK

May 31, 2018
By Pawan Prasad

Natural Language Processing can be implemented in Python using the Natural Language Toolkit (NLTK), a suite of Python libraries for text analysis and human language data processing.

Before starting with NLTK, let's understand what Natural Language Processing (NLP) is. NLP is about extracting information from unstructured text, such as finding the names of places, people, and other named entities in a given text, and analyzing the linguistic structure of the text, for example through semantic analysis.

NLTK is a suite of open source Python libraries for doing Natural Language Processing on human language data in text form. It includes over 50 corpora and lexical resources such as WordNet, along with a suite of text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning.

How to install NLTK

At the time of writing, NLTK is compatible with Python versions 2.7, 3.4, 3.5, and 3.6.
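As a quick sanity check, the running interpreter can be compared against the versions listed above. This is only a minimal sketch: the helper name is_supported_python and the SUPPORTED_VERSIONS tuple are made up for illustration and simply restate the list in this post; they are not part of NLTK.

```python
import sys

# Versions listed in this post (an assumption copied from the text above);
# other NLTK releases may support a different set.
SUPPORTED_VERSIONS = ((2, 7), (3, 4), (3, 5), (3, 6))


def is_supported_python(version_info=sys.version_info):
    """Return True if the (major, minor) pair appears in SUPPORTED_VERSIONS."""
    return tuple(version_info[:2]) in SUPPORTED_VERSIONS


if __name__ == "__main__":
    major, minor = sys.version_info[:2]
    print("Python %d.%d in the supported list: %s"
          % (major, minor, is_supported_python()))
```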

Mac/Unix

Install NLTK: run sudo pip install -U nltk
Install Numpy (optional): run sudo pip install -U numpy
Test installation: run python then type import nltk

For older versions of Python it might be necessary to install setuptools (see http://pypi.python.org/pypi/setuptools) and to install pip (sudo easy_install pip).
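The "Test installation" step can also be run as a small script instead of typing import nltk by hand. The sketch below assumes Python 3.4 or newer (it relies on importlib.util.find_spec), and the helper name is_installed is hypothetical. It only checks whether the packages can be found on the import path, not whether they work.

```python
import importlib.util


def is_installed(package_name):
    """Return True if a top-level package can be found on the import path."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    # numpy is optional, so "missing" is not necessarily an error for it.
    for name in ("nltk", "numpy"):
        status = "installed" if is_installed(name) else "missing"
        print("%s: %s" % (name, status))
```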

Windows

These instructions assume that you do not already have Python installed on your machine.

32-bit binary installation

Install Python 3.6: http://www.python.org/downloads/ (avoid the 64-bit versions)
Install Numpy (optional): http://sourceforge.net/projects/numpy/files/NumPy/ (the version that matches your installed Python version)
Install NLTK: http://pypi.python.org/pypi/nltk or, if you have pip, type pip install nltk
Test installation: open Python from the Start menu, then type import nltk
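Since these instructions are for the 32-bit build, it can help to confirm which build of Python actually got installed. A minimal sketch, using only the standard library (the function name interpreter_bits is made up for this example):

```python
import struct


def interpreter_bits():
    """Return the pointer size of the running interpreter in bits (32 or 64)."""
    # "P" is the struct format code for a C pointer; its size is in bytes.
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("This Python is a %d-bit build" % interpreter_bits())
```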

Installing NLTK data

NLTK comes with corpora in the form of books, labeled training data, trained models, and a stop words corpus. There are 106 corpora and trained models that can be downloaded to make language processing tasks easier.

To download all this data, run the following commands in the Python console.

>>> import nltk

>>> nltk.download()

The last command opens the NLTK downloader window, which lists the available packages and their status.

I have already downloaded the data, which is why the status shows "installed". On a fresh installation, the status would be "not installed".

Select all and click the Download button. This starts downloading all the corpus packages and models.

However, if you wish to change the download directory, click on File and then on the option to change the download directory before starting the download.
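The same download can be done without the window by passing a package id to nltk.download(), which also accepts a download_dir argument. The sketch below is one possible way to wrap that, not the only one: the helper name download_packages and its downloader parameter are hypothetical (the parameter exists so the helper can be tried without a network connection), and "stopwords" and "wordnet" are just example package ids.

```python
def download_packages(package_ids, download_dir=None, downloader=None):
    """Download each package id and return {package_id: True/False}."""
    if downloader is None:
        # Imported here so the helper can be exercised without NLTK installed.
        import nltk
        downloader = nltk.download
    results = {}
    for package_id in package_ids:
        # nltk.download returns a truthy value when the download succeeds.
        results[package_id] = bool(
            downloader(package_id, download_dir=download_dir))
    return results


if __name__ == "__main__":
    # download_dir=None lets NLTK pick its default data directory.
    print(download_packages(["stopwords", "wordnet"]))
```

If you download to a custom directory, NLTK also has to be told where to look for the data, for example by appending that directory to nltk.data.path or by setting the NLTK_DATA environment variable.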

If the download fails, your connection may be going through a proxy. To fix this, set the proxy for NLTK before calling nltk.download(). This can be done using:

>>> nltk.set_proxy('http://proxy.example.com:3128', ('USERNAME', 'PASSWORD'))

>>> nltk.download()

You have now successfully installed NLTK and the NLTK data. If you are still facing problems with the installation, please comment below and I will try to help.

#AI

#NLP

#NLTK

#Python
