NLP sentiment analysis done within the browser
As I explore deeper into the realms of AI and ML modeling, I continually uncover new engaging methods for capturing user input. Building upon the insights shared in my previous article about sentiment analysis through facial recognition, I now shift the focus to another dimension of user interaction — speech and spoken words. Words, with their inherent meanings, have the power to convey nuanced context and evoke emotions in our presentations. In this article, we explore the significance of spoken language, emphasizing not just the subtleties of speech, such as pitch and tone, but the very essence and impact of the words themselves.
Common NLP solutions
Numerous solutions on the web provide sentiment scores for your text, but most of them are backend services that generate the results for you. I recently did some work in this area at my workplace; however, like many existing solutions, it also requires a backend service for sentiment analysis. If the requirement is an on-demand, performant, low-latency solution, the sentiment model needs to run on the client side, which for web applications means in the browser.
Let's explore which backend solutions currently exist:
- Google’s NLP — https://cloud.google.com/natural-language?hl=en
- Amazon’s AWS Comprehend — https://aws.amazon.com/comprehend/
- OpenAI GPT 3.5/4 — https://openai.com/gpt-4 (using some prompt engineering to query for sentiment)
- AssemblyAI — https://www.assemblyai.com/
All of these solutions require a round-trip API call to a server for each block of text that you want to analyze.
But what if I told you this is all possible without an API call to retrieve the sentiment score? Imagine this sequence: the browser downloads the model and its metadata once, and every sentiment score after that is computed locally.
While exploring some TensorFlow models, I came across this sentiment example.
This example uses a model trained on IMDB reviews (many thousands of them). You can view the live example on their site: https://storage.googleapis.com/tfjs-examples/sentiment/dist/index.html.
So, what is happening under the hood that makes this demo possible? I dissected the code and re-implemented it as a React app in CodeSandbox to gain a full understanding of what is going on. (Note: the sentiment score ranges from 0 to 1, with 0 meaning extremely negative and 1 meaning extremely positive.)
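To make the 0 to 1 range concrete, here is a minimal sketch of how a score could be turned into a readable label. The function name `labelForScore` and the width of the neutral band are my own assumptions for illustration; the demo itself only reports the raw score.

```typescript
// Hypothetical helper: the cutoffs below are illustrative assumptions,
// not values taken from the TensorFlow.js demo.
type SentimentLabel = "negative" | "neutral" | "positive";

function labelForScore(score: number, neutralBand = 0.15): SentimentLabel {
  // Reject anything outside the documented 0..1 range.
  if (!Number.isFinite(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be between 0 and 1, got ${score}`);
  }
  if (score >= 0.5 + neutralBand) return "positive";
  if (score <= 0.5 - neutralBand) return "negative";
  return "neutral";
}

console.log(labelForScore(0.9)); // "positive"
console.log(labelForScore(0.1)); // "negative"
console.log(labelForScore(0.5)); // "neutral"
```

Widening or narrowing `neutralBand` is a product decision; a review site might want a narrow band, while a chat interface might prefer to call more text neutral.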
The solution is very “simple”. Kind of…
- Train a model on a large body of text whose sentiment scores are already known, so it learns how text maps to sentiment.
- Use TensorFlow.js to load the model into the browser, along with all the metadata the CNN model requires.
- The model used here is a pre-trained CNN model trained on IMDB reviews.
- The browser then runs the text against the model and generates a sentiment score.
- All of this processing takes only a matter of milliseconds.
For reference, here is what a CNN model is:
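The step where the browser "runs the text against the model" hides some preparation: the model takes a fixed-length array of word indices, not raw text. Below is a rough sketch of that preparation, loosely modeled on how I understand the demo to work. The names `SentimentMetadata`, `encodeText`, `PAD_INDEX` and `OOV_INDEX`, the punctuation rule, and the choice to pad on the left are all assumptions, so check them against the metadata file you actually load.

```typescript
// Field names are assumptions that mirror what the demo's metadata appears to hold.
interface SentimentMetadata {
  wordIndex: Record<string, number>; // word -> rank in the training vocabulary
  indexFrom: number;                 // offset reserved for special tokens
  maxLen: number;                    // fixed input length the model expects
  vocabularySize: number;            // indices at or above this count as unknown
}

const PAD_INDEX = 0; // assumed padding token
const OOV_INDEX = 2; // assumed "unknown word" token

function encodeText(text: string, meta: SentimentMetadata): number[] {
  // Lowercase, drop basic punctuation, and split on whitespace.
  const words = text
    .trim()
    .toLowerCase()
    .replace(/[.,!?]/g, "")
    .split(/\s+/)
    .filter((w) => w.length > 0);

  // Map each word to its vocabulary index, falling back to the unknown token.
  const indices = words.map((word) => {
    if (!Object.prototype.hasOwnProperty.call(meta.wordIndex, word)) {
      return OOV_INDEX;
    }
    const index = meta.wordIndex[word] + meta.indexFrom;
    return index >= meta.vocabularySize ? OOV_INDEX : index;
  });

  // Keep the last maxLen tokens, then left-pad to exactly maxLen.
  const kept = indices.slice(-meta.maxLen);
  const padding = new Array<number>(meta.maxLen - kept.length).fill(PAD_INDEX);
  return [...padding, ...kept];
}

// Made-up metadata for illustration only.
const sampleMeta: SentimentMetadata = {
  wordIndex: { the: 1, movie: 17, great: 84 },
  indexFrom: 3,
  maxLen: 6,
  vocabularySize: 100,
};

console.log(encodeText("The movie was great!", sampleMeta)); // [0, 0, 4, 20, 2, 87]
```

In the browser, the encoded array would then be wrapped in a tensor (for example with `tf.tensor2d([encoded], [1, meta.maxLen])`) and passed to the `predict` method of a model loaded with `tf.loadLayersModel`, which yields the 0 to 1 score.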
Think of a CNN (Convolutional Neural Network) for sentiment analysis like a smart detective for text. It reads sentences like a detective scanning for important clues. Filters help it spot key words, and it learns patterns to understand feelings. Just as a detective pieces together evidence, the CNN pieces together words to decide if the text is positive, negative, or neutral. It’s like having a virtual detective that’s really good at figuring out how people feel from what they say.
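For readers who prefer code to detectives, here is a toy sketch of the core idea: a filter slides over neighboring word vectors, and the strongest match anywhere in the sentence is kept. The function `convMaxPool`, the two-number word vectors, and the filter values are all invented for illustration. A real model learns these numbers during training and uses many filters at once.

```typescript
// A toy 1D convolution followed by global max pooling over word vectors.
// All numbers here are made up; a trained model learns them.
type Vector = number[];

function convMaxPool(sequence: Vector[], filter: Vector[], bias = 0): number {
  const width = filter.length;
  if (sequence.length < width) {
    throw new Error("sequence is shorter than the filter width");
  }
  let best = 0; // ReLU output is never below 0
  for (let start = 0; start + width <= sequence.length; start++) {
    let sum = bias;
    for (let k = 0; k < width; k++) {
      for (let d = 0; d < filter[k].length; d++) {
        sum += sequence[start + k][d] * filter[k][d];
      }
    }
    best = Math.max(best, Math.max(0, sum)); // ReLU, then max over positions
  }
  return best;
}

// Pretend [1, 0] means "not" and [0, 1] means "good"; [0, 0] is any other word.
const sentence: Vector[] = [[0, 0], [1, 0], [0, 1], [0, 0]];
// A width-2 filter that responds to "not" immediately followed by "good".
const notGoodFilter: Vector[] = [[1, 0], [0, 1]];

console.log(convMaxPool(sentence, notGoodFilter)); // 2
```

The pooled value does not depend on where in the sentence the pattern appears, which is part of why this approach works on text of varying length.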
In the demo, aside from the application loading the model and metadata from their respective URLs, there is no further interaction with the backend. All sentiment analysis is done purely client-side within the browser, as in the sequence described above.
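One way to guarantee that the model and metadata are fetched only once is to cache the result of the loader. The sketch below uses a small hypothetical helper named `once`; it is my own illustration, not code from the demo.

```typescript
// Hypothetical helper: cache whatever the loader returns so the download
// happens a single time and every later call reuses the same result.
function once<T>(loader: () => T): () => T {
  let cached: { value: T } | undefined;
  return () => {
    if (cached === undefined) {
      cached = { value: loader() };
    }
    return cached.value;
  };
}

// Stand-in loader that counts how often it runs.
let downloads = 0;
const getModel = once(() => {
  downloads += 1;
  return { name: "sentiment-model" };
});

getModel();
getModel();
getModel();
console.log(downloads); // 1
```

With TensorFlow.js the loader would return a promise, for example `once(() => tf.loadLayersModel(modelUrl))` where `modelUrl` is a placeholder, and every caller would await the same promise. Note that this simple version also caches a failed load, so a real app may want to reset the cache on error.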
Parting Thoughts
This model is specifically trained on IMDB reviews and the text content within them, allowing for the generation of sentiment scores tailored to movie reviews. While the model can be further trained to accommodate the sentiment of any system’s text content, its current support is limited to English because it was only trained with English reviews. Expanding its language capabilities would necessitate additional training efforts, but the investment becomes justifiable if the system’s business requirements demand sentiment analysis in a more focused context and in multiple languages.
With NLP offering a straightforward means to acquire sentiment scores from spoken words through trained models directly in the browser, and when combined with the face detection sentiment analysis discussed in my earlier article, the AI development scene takes a significant stride towards achieving a more feature-rich Human-Computer Interface (HCI) for seamless interactions with computer systems.