NLP with Javascript Using winkJS

NLP with Javascript Using winkJS

Danielle Maxwell | Monday, Jun 10, 2024 |  JavaScript NLP

winkJS is a collection of open source libraries, created by Graype Systems, designed to make Natural Language Processing (NLP), machine learning, and statistical analysis accessible in JavaScript. Let’s take an in-depth look into their winkNLP library and how it can be a handy tool for developers, including those who are new to machine learning.

What is winkNLP?

In a blog post about machine learning libraries written in JavaScript, the winkNLP library by winkJS was recommended. It has a multilingual tokenizer, can work with sentence boundary detection (sbd), perform sentiment analysis, and provide part-of-speech tagging out the box. The library has additional tools that can provide a Flesch reading score, measure similarity, flag stop words, and so much more.

How does it work?

winkNLP has a readDoc method where text passed to it may be split into tokens, sentences, or entities. It’s possible define entities by using the learnCustomEntities method. Once the text is divided up, it can be accessed by the its or as helpers to make development a breeze.

Want to get the type for each character in a text collection? Use its.type and get an array specifying the type for each token.

Need to create a frequency table for sentences, tokens, or entities? as.freqTable will return a frequency table in descending order.

winkNLP also includes a vector tool that uses BM25 algorithm to compute weights for terms. If necessary, the vectorizer’s default configuration can be updated.

Let’s Code!

A major highlight of using winkNLP is how beginner friendly it is. Though I’m not the most experienced with NLP, I was able to create a quick demo that returns readability stats (Flesch reading score, sentiment, reading time, and complex word count) for a text input and highlight each complex word.

To get started, we’ll have to install the winkNLP package and the wink-eng-lite-web-model. Another model is available (wink-eng-lite-model), but it’s not recommended for use with browsers or Node.js versions lower than v16. No judgement here, but if you’re using a version of Node.js that’s lower than v16, I highly recommend updating to a later version.

$ npm or yarn install wink-nlp --save
$ npm or yarn install wink-eng-lite-web-mode --save

For this demo, I used winkNLP in the browser which requires the use of a bundler. I decided to use browserify, so if you’re following along, be sure to install that as well or the bundling tool of your choice.

$ npm or yarn install install browserify

Next, load the installed packages and pass the model to the nlp helper.

const winkNLP = require('wink-nlp');
const model = require('wink-eng-lite-web-model');
const nlp = winkNLP(model);

This will now allow us to access the its and as helpers.

const its = nlp.its;
// const as = nlp.as (as wasn't used in this demo. Here's how it can be accessed.)

In an index.html file, add a form with an input or text area along with a section to display the readability stats. Then let’s go ahead and access those elements by assigning them below.

const text = document.getElementById('text-input');
const submitBttn = document.getElementById('text-submit-bttn');
const readingStats = document.getElementById('reading-stats');
const highlightedText = document.getElementById('highlighted-text');

Let’s now get the text submitted via the form and use its.readability stats to perform a quick analysis. The readability stats will be returned in an object.

// Get text from the input field.
submitBttn.addEventListener('click', (e) => {
  e.preventDefault();
  const textInput = text.value;

  // Pass text as an argument to the readDoc method to create a new collection.
  const doc = nlp.readDoc(textInput);

  // Get readability stats for the text input.
  const readabilityStats = doc.out(its.readabilityStats);

  // Insert the readability stats into the DOM.
  readingStats.insertAdjacentHTML('afterbegin', `
  <h2>Readability Stats</h2>
  <ul>
    <li>Flesch Reading Ease Score: ${readabilityStats.fres}</li>
    <li>
      Reading Time: ${readabilityStats.readingTimeMins} minutes 
      and ${readabilityStats.readingTimeSecs} seconds
    </li>
    <li>Sentiment: ${readabilityStats.sentiment}</li>
    <li>Word Count: ${readabilityStats.numOfWords}</li>
    <li>Sentence Count: ${readabilityStats.numOfSentences}</li>
  </ul>
`);
});

We’re looking good so far. Now, we want to highlight the complex words inside the readabilityStats object. Let’s push all of the complex words to an empty array. Split the words in the text input into an array. This will allow us to compared each word in the text input against the complexWordsList array.

If there’s a match, we will wrap the complex word in a span element and let’s give it a class of complex word so that we can change the background color with a little CSS. Once the loop is complete, join the text and then insert it into the DOM.

const complexWords = readabilityStats.complexWords;
  let complexWordsList = [];
  for (let word in complexWords) {
    // Add the complex word to the complexWordsList array and make it lowercase to handle case insensitivity.
    complexWordsList.push(word.toLowerCase());
  };

  const textList = textInput.split(' ');
  textList.forEach((word, idx) => {
     // Remove any special characters from a word.
    const cleanWord = word.replace(/[.,\/#!$%\^&\*;:{}=\-_`~()]/g, '');
    if (complexWordsList.includes(word)) {
      textList[idx] = `<span class="complex-word">${word}</span>`
    };
  });

  const modifiedText = textList.join(' ');
  highlightedText.innerHTML = modifiedText;

And we’re done! If you followed these instructions, you’ll need to user the bundler to view the app locally. Run the following command, if you used browserify, or the one for your bundling tool of choice.

browserify index.js -o bundle.js

Take a look at the short video below to see how the app works. You may find the complete code at my GitHub.

Wrap Up

Now you see how easy it is to get started with using NLP in JavaScript with the winkNLP library created by winkJS. To expand on this demo, we could change the background based on the sentiment score or go further and allow users to input more than one text input and return a similarity score.

If you’ve created an app with winkNLP or become inspired after reading this post, let me know.

Image by fszalai from Pixabay

About This Post

Use the JavaScript machine learning library winkJS to make Natural Language Processing (NLP) beginner friendly.

Written by:

Share this post:

Recommended  Rotations

butterfly
View all

5 Javascript Libraries to Use for Machine Learning

Over the years, several JavaScript libraries have been created for machine learning. Let’s sort through the ones that can help you get started quickly, even if you don’t have much experience with machine learning or data …

Mar 11, 2024

What is Incremental Machine Learning?

Most of us got into data science because it’s exciting (if chaotic) and there’s a constant stream of new ideas, which is thrilling (if intimidating). But if learning is all about keeping up, why can’t our models do it?

Aug 21, 2023

Using River and Vowpal Wabbit to Build Real-Time Machine Learning Models

Real-time ML models continually learn on new data as soon as it arrives, so they’re less susceptible to concept drift and data drift. Read on to learn how to use River and Vowpal Wabbit to build real-time models in Python.

Aug 11, 2023
Enter Your Email To Subscribe