Introduction to Machine Learning for NLP

As discussed in the introductory article, Machine Learning, Artificial Intelligence, and NLP are closely interlinked. We need to understand Machine Learning if we want to solve NLP problems efficiently.

What is Machine Learning?

I’m sure you have heard of Machine Learning, because it has been a buzzword in recent years. The reason is that many industries now rely on this technology to automate routine tasks, among many others.

So let’s define Machine Learning first. Machine learning algorithms enable computers to learn from data, and even improve themselves, without being explicitly programmed. Now, let’s try to understand what this definition actually means.

Machine learning is a subset of Artificial Intelligence. It comprises algorithms that help computers learn from data and perform tasks without being explicitly programmed for them. In traditional algorithms, we have to set the rules and instructions that a machine should follow to complete a task.

But Machine Learning algorithms differ from traditional algorithms because we don’t have to write each and every rule that the machine must follow. Instead, the machine learns from the data, much as a human learns by taking in information.
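To make the contrast concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the toy messages, the "spam"/"ham" labels, and the function names are all hypothetical, and the scoring scheme is deliberately simplistic rather than a real ML algorithm. The point is only that the first function encodes a rule written by a programmer, while the second derives its behaviour from labeled examples.

```python
from collections import Counter

# Hypothetical toy data: (message, label) pairs invented for illustration.
TRAINING_DATA = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting moved to monday", "ham"),
    ("please review the project notes", "ham"),
]


def rule_based_is_spam(message):
    """Traditional approach: a programmer hard-codes the rule."""
    words = message.split()
    return "free" in words or "prize" in words


def learn_word_scores(examples):
    """Learning approach: derive a score per word from labeled examples.

    A word's score is (times seen in spam) minus (times seen in ham).
    """
    scores = Counter()
    for message, label in examples:
        for word in message.split():
            scores[word] += 1 if label == "spam" else -1
    return scores


def learned_is_spam(message, scores):
    """Classify by summing the learned scores of the words in the message."""
    return sum(scores[word] for word in message.split()) > 0


scores = learn_word_scores(TRAINING_DATA)

# The fixed rule knows nothing about these words.
print(rule_based_is_spam("claim your money"))
# The learned scores picked them up from the training data.
print(learned_is_spam("claim your money", scores))
```

If we want the second classifier to handle new kinds of messages, we give it more labeled examples instead of editing its code.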

The learning bit in the term Machine Learning is evident when we say that we train a model. In my previous articles, whenever we trained a model, the machine was actually learning during the training phase. Hence, it is called machine learning.

In Natural Language Processing we work with a lot of data in the form of text and sometimes speech, images, and videos. Machine Learning algorithms are applied to this data so that we can get accurate results for the tasks we want to perform.

Categorizing ML algorithms

We can broadly categorize ML algorithms into three categories, as shown in the image below.

Machine Learning Algorithms

The three categories are:

  1. Supervised learning
    • This type of algorithm involves training the model using labeled data.
    • The data that is used for training here contains the feature set (independent variables) along with the labels or values of the dependent variables.
    • So basically, with this type of ML algorithm, the machine analyzes the dataset and tries to learn a function that maps the independent variables to the dependent variables.
    • Some of the most popular supervised learning algorithms include linear regression, logistic regression, decision trees, random forests, Support Vector Machines (SVMs), Naive Bayes, etc.
  2. Unsupervised learning
    • In unsupervised learning, the model is trained on unlabeled data, that is, data with no labels at all.
    • Since there are no labels to relate the data to, training consists of analyzing the dataset to find hidden structures in it.
    • For example, the model may try to find clusters of similar data points within the data.
    • Some of the popular unsupervised learning algorithms include k-means, LDA, PCA, etc.
  3. Reinforcement learning
    • Reinforcement learning is the third type of ML algorithm, where the model learns from the rewards that it receives for performing certain actions.
    • Even though in reinforcement learning we don’t supply labels, just as in unsupervised learning, one major difference between the two is that in reinforcement learning we also supply a reward signal, which can be positive or negative.
    • An analogy for reinforcement learning is a child who touches a hot vessel and quickly withdraws his hand because the pain is a negative reward. But if we give him a toffee for doing something, he will keep doing it to get that reward.
    • Popular reinforcement learning algorithms include Q-learning, SARSA, etc.
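To see what "finding clusters without labels" looks like in practice, here is a minimal sketch of k-means on one-dimensional data. The numbers are hypothetical (think of them as sentence lengths in words), and the deterministic initialisation from the first k points is an assumption made to keep the example reproducible; real implementations usually start from random or k-means++ centroids.

```python
def k_means_1d(points, k, iterations=10):
    """Cluster 1-D points into k groups; returns (centroids, assignments)."""
    # Assumption: start from the first k points so the run is deterministic.
    centroids = [float(p) for p in points[:k]]
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        assignments = [
            min(range(k), key=lambda c: abs(p - centroids[c])) for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members) / len(members)
    return centroids, assignments


# Hypothetical unlabeled data: no point is marked "short" or "long".
sentence_lengths = [3, 4, 5, 20, 22, 24]
centroids, assignments = k_means_1d(sentence_lengths, k=2)
print(centroids)    # one centre per discovered group
print(assignments)  # the group index chosen for each point
```

Notice that we never told the algorithm which sentences are short or long. It separates the two groups purely from the structure of the data.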

Machine Learning for Natural Language Processing

Now that we have seen what Machine Learning is, how it solves problems, and the three categories its algorithms fall into, the question that arises is: how is Machine Learning used in the context of NLP?

If you remember, we discussed that machine learning algorithms teach machines (computers) to learn from data in order to predict some output. So ML is an essential part of NLP, because we cannot hand-code every rule for the computer to follow.

Take an example. Suppose we are building a chatbot, which is an advanced application of NLP. While building the chatbot, we have to train it on a lot of data. But we cannot set each and every rule for what the chatbot should reply to a query, because there is an effectively infinite number of things a user can say.

So machine learning algorithms, in spite of being computationally expensive, are critical to the NLP techniques used to solve such problems. You can think of machine learning, which includes deep learning, as the backbone of modern NLP solutions.

Now that we have a good idea of why machine learning is necessary for good NLP solutions, we will focus on the subset of algorithms in ML that are most useful.

While all three of the machine learning categories discussed above have found applications in text processing, the most popular category for implementing NLP solutions is supervised learning. Supervised learning has been used to implement reliable, industry-wide tools for a long time.

In supervised learning, the two algorithms most commonly used in the context of NLP are:

  • Naive Bayes algorithm
  • SVM algorithm

Because these two supervised ML algorithms are so widely used in Natural Language Processing, we will discuss them in detail in the upcoming articles.
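As a small preview, here is a minimal sketch of how a Naive Bayes text classifier can be written from scratch. It is only an illustration under stated assumptions: the four training sentences and their "positive"/"negative" labels are invented, the function names are hypothetical, and the model is a plain multinomial Naive Bayes with add-one (Laplace) smoothing and whitespace tokenization. A real project would more likely use a library implementation.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(examples):
    """Count documents and words per label from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocabulary.add(word)
    return label_counts, word_counts, vocabulary


def predict(text, model):
    """Return the label with the highest log posterior for the text."""
    label_counts, word_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior: how common the label is in the training data.
        score = math.log(label_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            if word not in vocabulary:
                continue  # ignore words never seen during training
            # Add-one smoothing avoids log(0) for unseen word/label pairs.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled reviews, invented for illustration.
reviews = [
    ("I loved this movie", "positive"),
    ("great acting and a great story", "positive"),
    ("I hated this movie", "negative"),
    ("boring story and terrible acting", "negative"),
]
model = train_naive_bayes(reviews)
print(predict("a great movie", model))
print(predict("terrible and boring", model))
```

The classifier works in log space because multiplying many small probabilities quickly underflows, while adding their logarithms does not.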

Final Thoughts

In this article, we have introduced Machine Learning, its subtypes, and its importance and relevance in NLP.