spaCy is a free, open-source library for advanced natural language processing (NLP), written in Python and Cython. It is fast because its performance-critical parts are implemented in Cython.
spaCy is designed mainly for building production software, and it also supports deep learning workflows through integration with statistical models built in PyTorch and TensorFlow.
If you work with text data, you will eventually want to know more about spaCy. It can help answer questions such as how many keywords related to a product appear in a text, or what a word means in its context.
What can spaCy do?
spaCy provides accurate syntactic analysis and offers many features, including:
- Part-of-speech (POS) Tagging,
- Named Entity Recognition (NER),
- Syntactic parsing,
- Word vectors and similarity,
- Convenient methods for cleaning and normalizing text
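As a rough illustration of the first three items, the sketch below prints part-of-speech tags, dependency labels, and named entities for one sentence. The example sentence and the helper names token_table and entity_table are made up for this illustration. It assumes the en_core_web_sm model is installed; if it is not, the code falls back to a blank English pipeline, in which case the tags and entities come back empty.

```python
import spacy

MODEL_NAME = "en_core_web_sm"  # assumption: the small English model is installed


def token_table(doc):
    """Return (text, part-of-speech tag, dependency label) for every token."""
    return [(token.text, token.pos_, token.dep_) for token in doc]


def entity_table(doc):
    """Return (text, entity label) for every named entity found in the doc."""
    return [(ent.text, ent.label_) for ent in doc.ents]


if spacy.util.is_package(MODEL_NAME):
    nlp = spacy.load(MODEL_NAME)
else:
    # Fallback: tokenizer only, so tags, labels, and entities will be empty.
    nlp = spacy.blank("en")

doc = nlp("Apple is looking at buying a startup in London")

for row in token_table(doc):
    print(row)
print(entity_table(doc))
```

The exact tags and entities depend on the model and its version, so treat the printed values as something to inspect rather than a fixed result.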
spaCy offers English models in three sizes, small, medium, and large, which we can choose between depending on the use case. The small model is about 12 MB, the medium model about 43 MB, and the large model about 741 MB, so the large model takes a few seconds to load.
Install the spaCy library
We can install spaCy with either pip or Anaconda.
To install spaCy with pip, open the command line and run the following command.
pip install -U spacy
To install spaCy with Anaconda, run the following command in the Anaconda prompt.
conda install -c conda-forge spacy
The next step is to download a language model. Here, we download the small English model, which is the one loaded in the rest of this article.
python -m spacy download en_core_web_sm
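Before moving on, it can help to confirm that the installation worked. The sketch below prints the installed spaCy version and checks whether a model package is available. The helper name model_is_installed is made up for this illustration, and the model name en_core_web_sm is assumed because it is the one loaded later in this article.

```python
import spacy

MODEL_NAME = "en_core_web_sm"  # assumption: the model loaded later in this article


def model_is_installed(name):
    """Return True if `name` is installed as a Python package."""
    return spacy.util.is_package(name)


print("spaCy version:", spacy.__version__)
print(MODEL_NAME, "installed:", model_is_installed(MODEL_NAME))
```

If the second line reports False, running the download command again for that model name is the usual next step.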
Now let's import the spaCy library.
import spacy
Load the spaCy model
To use a spaCy model, we first need to load it into a variable. Here, we use the variable name nlp.
nlp = spacy.load("en_core_web_sm")
When this statement runs, spaCy takes a couple of seconds to load the model. The load function of the spaCy library loads the model, which is then stored in the nlp variable.
Note: Here we are using an English language model. Models for other languages are also available and can be downloaded as the use case requires.
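To make the note about other languages concrete, here is a minimal sketch that loads a German pipeline. The helper load_or_blank is a hypothetical name introduced for this example. It assumes the German model de_core_news_sm may or may not be installed, and falls back to a blank German pipeline (tokenizer only) when it is not.

```python
import spacy


def load_or_blank(model_name, lang_code):
    """Load a trained pipeline if it is installed, else a blank one for the language."""
    if spacy.util.is_package(model_name):
        return spacy.load(model_name)
    # Blank pipeline: tokenizer only, no tagger, parser, or entity recognizer.
    return spacy.blank(lang_code)


nlp_de = load_or_blank("de_core_news_sm", "de")
doc_de = nlp_de("Das ist ein Satz.")

print(nlp_de.lang, nlp_de.pipe_names)  # language code and pipeline components
print([token.text for token in doc_de])
```

The same pattern should work for other languages by swapping the model name and language code.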
Example with spaCy
Here we will see how to find the length of a document using Python's len() function. We load the small spaCy model into the nlp variable, pass a string to nlp to get a doc object, and then call len() on doc. Note that this returns the number of tokens in the text, not the number of characters in the string.
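import spacy
nlp = spacy.load("en_core_web_sm")  # load the small English model
text = ' 2021 is far worse than 2020 due to covid'
doc = nlp(text)  # tokenize and annotate the text
print(len(doc))  # number of tokens in the doc, not characters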
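The difference between the length of the string and the length of the doc can be seen in a small sketch. It uses a blank English pipeline so that it runs without downloading a model; a trained pipeline such as en_core_web_sm should tokenize this sentence the same way, though that is an assumption here. The sentence is the one from the example above, without the leading space.

```python
import spacy

nlp = spacy.blank("en")  # assumption: blank English pipeline, tokenizer only
text = "2021 is far worse than 2020 due to covid"
doc = nlp(text)

tokens = [token.text for token in doc]
# like_num is a token attribute that flags number-like tokens such as years.
numbers = [token.text for token in doc if token.like_num]

print(len(text))  # number of characters in the raw string
print(len(doc))   # number of tokens in the doc
print(tokens)
print(numbers)
```

Counting tokens rather than characters is usually what matters for NLP tasks, since most spaCy features operate on tokens.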
We have seen what spaCy is, what it is used for, and what it can do with text data. In upcoming articles, we will look at spaCy's features in more detail.