Finding the Right Sentiment Analysis Model for You: VADER vs. Spark NLP

Malik Hasan and John McCartney

Do you need a quick and easy way to filter customer feedback? Or perhaps you need to be able to monitor the perception of your brand on social media? Sentiment analysis answers these questions by automatically classifying text chunks by whether their sentiment is positive, negative, or neutral. Building your own sentiment analysis model can be a lot of work but can be simplified by existing plug-and-play technologies.

Choosing the right solution for your specific use case can be a daunting task with all of the options currently available. Spark NLP and VADER are two of the most popular and free sentiment analysis models, but they approach the problem in fundamentally different ways. Natural language processing (NLP) is inherently a domain-specific problem—different forms of media use widely different styles of grammar and jargon.

As a result, the same model often fails to perform equally well on different media such as social media posts versus news articles. Therefore, it is key to weigh the fundamental differences between these models when identifying the best solution. Credera has conducted this comparison analysis to determine which models to include in a full suite sentiment analysis solution.

Credera has created a social media analytics accelerator that combines a host of tools that leverage machine learning/artificial intelligence-based systems along with deep expertise in data science and data engineering, applying the most effective tool to the right business case. The tool does the work of scanning billions of inputs, transforming data into information, and information into insights in near real time.

The accelerator utilizes sentiment analysis to provide trends over time on brand perception, opinions about products, customer service experiences, and much more, taking luck and manual work out of insight discovery. It achieves this by combining multiple state-of-the-art NLP libraries such as VADER and Spark NLP to support sentiment analysis insights.

This blog post breaks down the decision process Credera went through when choosing among the top NLP tools on the market, to help you understand how to choose the right tool for you.

What Is VADER?

Valence Aware Dictionary for sEntiment Reasoning (VADER) relies on a sentiment lexicon (a “dictionary”) that was manually created and annotated by human raters. Furthermore, the dictionary is “valence aware,” which means the terms have an associated weight indicating how positive or negative they are, rather than just a binary positive-negative classification.

Additionally, while VADER’s lexicon maps the sentiment of individual word tokens, most users are interested in analyzing full sentences or paragraphs of text rather than individual words. VADER would not be fully effective without also considering contextual information found in the sentence that can change the sentiment of a word. For this reason, VADER has five heuristics designed to detect contextual clues that affect the intensity or polarity of words in a sentence.

As an example of how this works, let’s break down the following sentence: “The service wasn’t that great, but the food was extremely GOOD!!”

First, there are three intensifying factors at play on the word “good”: the all-caps, the exclamation points, and the intensifier “extremely.” VADER’s heuristics account for these by increasing the valence score. Since “good” is a positive word, VADER will recognize that, in this context, it is even more positive than usual.
Next, the word “great,” which is generally a positive word, has been negated in this sentence. VADER considers the three words preceding the sentiment word, looking for negation words like “wasn’t” that flip the polarity of the sentiment word.
Finally, there is some truth in the quote, “Nothing someone says before the word ‘but’ really counts.” VADER weighs any sentiment declared after the word “but” more heavily than anything preceding it. More specifically, the sentiment of “The service wasn’t that great” is discounted by 50%, while the sentiment of “the food was extremely GOOD!!” is magnified by 50%.
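The three heuristics above can be sketched in a few lines of Python. This is a toy illustration, not VADER's actual implementation: the two-word lexicon, the boost constants, and the function names `word_valences` and `sentence_total` are all assumptions made up for this example. Only the three-token negation window and the 50% "but" adjustments come from the description above.

```python
import re

# Toy lexicon and constants, invented for illustration (not VADER's real values).
LEXICON = {"good": 2.0, "great": 3.0}
INTENSIFIERS = {"extremely": 0.3}
NEGATIONS = {"wasn't", "not", "never"}
CAPS_BOOST = 0.7
EXCLAIM_BOOST = 0.3
NEGATION_SCALAR = -0.75


def word_valences(sentence):
    """Return (word, adjusted valence) pairs for each lexicon word found."""
    # Normalize curly apostrophes so "wasn’t" matches the negation list.
    tokens = re.findall(r"[A-Za-z']+", sentence.replace("\u2019", "'"))
    lowered = [t.lower() for t in tokens]
    but_at = lowered.index("but") if "but" in lowered else None
    results = []
    for i, (token, low) in enumerate(zip(tokens, lowered)):
        if low not in LEXICON:
            continue  # words outside the lexicon carry no sentiment
        valence = LEXICON[low]
        sign = 1 if valence > 0 else -1
        # ALL-CAPS emphasis pushes the score further from zero.
        if token.isupper() and len(token) > 1:
            valence += sign * CAPS_BOOST
        # An intensifier directly before the word does the same.
        if i > 0 and lowered[i - 1] in INTENSIFIERS:
            valence += sign * INTENSIFIERS[lowered[i - 1]]
        # A negation within the three preceding tokens flips the polarity.
        if any(w in NEGATIONS for w in lowered[max(0, i - 3):i]):
            valence *= NEGATION_SCALAR
        # Sentiment before "but" is discounted by 50%; after it, magnified by 50%.
        if but_at is not None:
            valence *= 0.5 if i < but_at else 1.5
        results.append((low, round(valence, 3)))
    return results


def sentence_total(sentence):
    """Sum the word valences, then add emphasis for up to four '!' marks."""
    total = sum(v for _, v in word_valences(sentence))
    emphasis = min(sentence.count("!"), 4) * EXCLAIM_BOOST
    if total > 0:
        total += emphasis
    elif total < 0:
        total -= emphasis
    return round(total, 3)
```

Run on the example sentence, this sketch gives “great” a negative, halved score and “GOOD” a boosted positive one, so the sentence as a whole comes out positive. The real library also normalizes the total into a bounded score, which is omitted here.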

The limitation of VADER is that, as a rule-based sentiment classifier, it can only handle the scenarios its algorithm explicitly defines. For example, it considers the contextual effects of “but,” but not similar conjunctions like “although” or “however.” Despite this limitation, it accounts for a broad set of syntactic possibilities, and its creators report that it outperformed individual human raters at classifying the sentiment of tweets. It remains a top choice in the industry.

What Is Spark NLP?

Spark NLP is an open-source, advanced NLP library that offers production-grade NLP models. Spark NLP provides a framework for pretrained models and customizable pipeline components, which allow for flexibility in your model-building process. In addition to rule-based models, the library also provides deep-learning-based sentiment analysis models that rely on a data-driven learning approach as opposed to manually created lexicons, which are labor intensive and ultimately limited in scope.

Spark NLP is built natively on Apache Spark, which is an open source computing engine designed for processing big data. Apache Spark allows for data processing in a distributed manner, with a variety of use cases including batch processing, real-time streaming, and advanced modeling and analytics. This means you can take advantage of the speed and scalability Apache Spark provides while using Spark NLP’s models and components.

Spark NLP’s solutions are powerful and highly flexible, but they are difficult to interpret because of the black-box nature of neural networks. Although little is published about how the models are designed and what data they are trained on, they perform quite well when put to the test. Spark NLP is also reported to run orders of magnitude faster than other NLP libraries while scaling with Apache Spark, which has earned it one of the best reputations in the industry.

Deeper Comparison Between VADER and Spark NLP

The four main factors that differentiate VADER and Spark NLP are sentiment intensity (valence), ease of use, breadth of vocabulary, and flexibility in context application.

Figure B: Comparison of VADER and Spark NLP

Degree of Sentiment Intensity

If it is vital for you to know how positive or negative your input is, VADER is most likely your better option. Spark NLP’s pretrained sentiment models judge only categorically (positive, negative, or neutral) and do not return a valence score. They do output a confidence score, which is often mistaken for valence. However, confidence only describes how certain the model is about the category it assigned. It says nothing about the intensity of the sentiment itself.
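To make the distinction concrete, here is a minimal sketch of how a valence score can be reduced to a category while still preserving intensity. It assumes a VADER-style compound score between -1 and 1 and uses the ±0.05 cutoffs suggested in VADER's README. The function name `label_from_compound` and the sample scores are hypothetical.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound valence score in [-1, 1] to a sentiment category."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


# Hypothetical compound scores for two reviews (made-up numbers).
mild_praise = 0.30
strong_praise = 0.90

# Both reviews land in the same category...
same_label = label_from_compound(mild_praise) == label_from_compound(strong_praise)

# ...but the valence still tells them apart, which a label plus a
# confidence score cannot do on its own.
stronger = strong_praise > mild_praise
```

A classifier could report 99% confidence for both reviews and still give no hint that the second one is far more enthusiastic.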

Interpretability and Ease of Use

VADER’s documentation is easy to understand and readily available online. The creators have provided a research paper that details how the model was designed and why it is effective. There is also plenty of information to get you started with using the model. This makes VADER easy for developers, who have many resources to draw on during development, and for end users, who may have more confidence making decisions from a model they fully understand.

Spark NLP is not as well documented. The code is open source, but information about how the models were created and how they work is very limited. Working with Spark NLP’s sentiment analysis is in effect working with a pre-trained black box model. That being said, developing with Spark NLP is still supported with a decent bank of resources to help you use the models and integrate them with your architecture. For any major obstacles, John Snow Labs, the creator of Spark NLP, provides support and consulting for their products.

Breadth of Input Vocabulary

VADER’s lexicon contains about 7,500 lexical items, all of which are non-neutral, and it includes common slang, acronyms, and emoticons like “meh,” “wtf,” and “:)”. However, any word not found in the lexicon will be considered neutral. This is a potential shortcoming when analyzing a dataset that contains large amounts of unfamiliar vocabulary, as VADER may default to classifying most of it as neutral. The number of tokens Spark NLP’s deep learning model has seen during its training phase far exceeds VADER’s lexicon, so it is more likely to be able to handle obscure tokens. Any use case that requires classifying a large amount of niche jargon or vocabulary is better served by Spark NLP.
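One rough way to gauge this risk before committing to a lexicon-based model is to measure how many of your texts contain no lexicon entries at all, since those texts would be scored as neutral. The sketch below assumes a small stand-in lexicon and a hypothetical helper `share_without_lexicon_hits`. In practice you would load the real lexicon and use the library's own tokenization.

```python
def share_without_lexicon_hits(texts, lexicon):
    """Return the fraction of texts in which no token appears in the lexicon."""
    if not texts:
        return 0.0
    misses = 0
    for text in texts:
        # Naive tokenization: lowercase, split on whitespace, strip punctuation.
        tokens = [t.strip(".,!?;\"'") for t in text.lower().split()]
        if not any(t in lexicon for t in tokens):
            misses += 1
    return misses / len(texts)


# Stand-in lexicon and sample texts, invented for illustration.
toy_lexicon = {"meh", "bad", "great", ":)"}
sample_texts = [
    "Meh, the onboarding was bad.",
    "Churn uplift looks accretive this quarter",
]
uncovered = share_without_lexicon_hits(sample_texts, toy_lexicon)
```

A high share on a representative sample suggests that the dataset's vocabulary falls outside the lexicon and that a trained model may be the better fit.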

Media Context Flexibility

VADER is a single model that was designed to work best in microblog-like contexts. Spark NLP is more adaptable and offers a range of sentiment analysis models for various contexts. These model variants have been trained on different datasets like Twitter posts, IMDB reviews, and financial news sources. Furthermore, some Spark NLP models also come with pre-built pipelines to integrate better with architectures specific to different media contexts. Because you can choose the model variant that best suits your data, Spark NLP should be your choice if you have a specialized use case.

Given that each model has clear pros and cons, you might ask how you can leverage the advantages of both. VADER’s nuanced set of rules complements Spark NLP’s deep learning models, but integrating both models seamlessly may be difficult, which is where Credera’s tool may prove useful.

How Credera Can Help

Credera’s social media analytics accelerator utilizes both VADER and Spark NLP’s pretrained sentiment analysis models as components of its sentiment analysis tool. Using Spark NLP allows Credera to take advantage of Spark’s highly performant structure to quickly process the large volumes of ingested data. VADER’s tried-and-true algorithm complements Spark NLP by providing a sanity check for the sentiment predictions output by Spark NLP’s machine learning model. These pretrained models provide baseline trends that Credera can refine to inform more complex sentiment analysis.
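As a rough idea of what such a sanity check could look like, the sketch below flags texts where a categorical label and a valence score point in opposite directions. It is not Credera's implementation: the `Prediction` record, the `flag_disagreements` function, the ±0.05 cutoff, and the sample data are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    text: str
    model_label: str       # category from the deep learning classifier
    vader_compound: float  # valence score in [-1, 1] from the rule-based model


def flag_disagreements(predictions, threshold=0.05):
    """Return predictions where the two models assign opposite polarities."""
    flagged = []
    for p in predictions:
        if p.vader_compound >= threshold:
            vader_label = "positive"
        elif p.vader_compound <= -threshold:
            vader_label = "negative"
        else:
            vader_label = "neutral"
        # Only a positive-versus-negative clash counts as a disagreement;
        # a neutral verdict from either model is not flagged.
        opposite = {vader_label, p.model_label} == {"positive", "negative"}
        if opposite:
            flagged.append(p)
    return flagged


# Made-up predictions for illustration.
batch = [
    Prediction("Love the new formula", "positive", 0.64),
    Prediction("Broke me out after a week", "positive", -0.42),
    Prediction("Arrived on Tuesday", "neutral", 0.0),
]
needs_review = flag_disagreements(batch)
```

Flagged items can then be routed for manual review or used to decide where a pretrained model needs refinement.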

One of the largest skin care companies in the world engaged Credera after they noticed spikes in search volumes for their competitors. The client was blindsided by the sudden emergence of their competitors on the then new social media platform TikTok, so they looked to Credera to develop a solution to help them proactively respond to new changes in the market.

Credera applied our methodology and tools to meld data from the client’s social listening tool, which tracked events on social media, with sales results to provide real-time insights into the efficacy of social media strategies. Furthermore, since implementing Credera’s approach and tools, the client has been able to immediately identify changes in sentiment around both their brand and their competitors’ brands, allowing them to respond rapidly and learn from their competitors’ tactics.

Leveraging Data With NLP To Unlock Business Value

NLP can be an excellent tool to help businesses process large amounts of language data for tasks such as user analysis, trend identification, and insight extraction. If you’re interested in how you can use VADER or Spark NLP’s sentiment analysis for data analytics or how Credera’s social analytics accelerator can help you unlock your data’s full potential with actionable insights, reach out to us at marketing@credera.com.