What is NLP and what is BOW?

Fintelics
4 min read · Mar 28, 2019

Natural language processing (NLP) is a popular field of artificial intelligence. Simply put, it takes text, such as news articles, tweets, comments, or any paragraphs or sentences made of words, as input for further analysis. Just as you read and understand an article, NLP algorithms try to do something very similar.

However, the difference between humans and NLP algorithms is that the algorithms cannot take in words literally; they can only take them in as numbers. So when you hear about NLP-related topics, people are mostly talking about ways to transform text into numbers. This transformation is the key component of an NLP algorithm.

From a macro perspective, there are two ways to transform text into numbers.

  • Word frequency
  • Underlying meaning of text

In this article, we will talk about the first one, word frequency.

Bag of Words (BOW) is a commonly used word-frequency-based NLP algorithm. It represents a text as a bag of its words, disregarding word order and meaning but keeping multiplicity (how many times each word occurs). What does this mean?

For any NLP algorithm, we always transform the text into tokenized form first, meaning that each word is treated separately. BOW counts each of the tokens (words) and represents the sentence as a vector (a combination…
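To make the idea concrete, here is a minimal sketch of BOW in Python. It is only an illustration under some simplifying assumptions: tokens are produced by lowercasing and splitting on whitespace, and the helper names `tokenize` and `bag_of_words` are made up for this example rather than taken from any library.

```python
from collections import Counter


def tokenize(text):
    # Assumption: lowercase and split on whitespace.
    # Real tokenizers also handle punctuation, contractions, etc.
    return text.lower().split()


def bag_of_words(sentences):
    # Tokenize every sentence so each word is treated separately.
    tokenized = [tokenize(s) for s in sentences]

    # The vocabulary is every distinct token, sorted so that
    # each vector position always refers to the same word.
    vocabulary = sorted({tok for toks in tokenized for tok in toks})

    # Each sentence becomes a vector of counts, one per vocabulary word.
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        vectors.append([counts[word] for word in vocabulary])
    return vocabulary, vectors


sentences = ["the cat sat on the mat", "the dog sat"]
vocabulary, vectors = bag_of_words(sentences)

print(vocabulary)  # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(vectors)     # [[1, 0, 1, 1, 1, 2], [0, 1, 0, 0, 1, 1]]
```

Notice that "the" is counted twice in the first sentence (multiplicity is kept), while shuffling the words of a sentence would give exactly the same vector (order is disregarded).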

