Today, we’ll be following up on our recent post on the Google Cloud Natural Language API. In this post, we’re going to take a second look at the service and compare it to the Stanford CoreNLP, a well-known suite for Natural Language Processing (NLP). We will walk you through how to get started using the Stanford CoreNLP, and then we’ll discuss the strengths and weaknesses of the two solutions.

AI and machine learning in the cloud

Artificial intelligence and machine learning are some of the hottest topics in IT. The major cloud platforms—Amazon Web Services, Google Cloud Platform, and Microsoft Azure—are increasingly exposing a variety of these functions in a way that makes it easy for developers to integrate them into their apps.

Whether it is for recognizing the content of images (see our posts about the Google Vision API, Amazon Rekognition and a comparison of the two), for understanding words spoken in a recorded speech (Getting Started with Google Cloud Speech API), or for crunching data using robust, standard algorithms (Amazon Machine Learning: Use Cases and a Real Example in Python), today it’s easy to get started with a ready-to-go solution where all of the underlying complexity is conveniently hidden by the cloud.

The Stanford CoreNLP suite

The Stanford CoreNLP suite is a software toolkit released by the NLP research group at Stanford University. It offers Java-based modules for the solution of a range of basic NLP tasks, as well as the means to extend its functionalities with new ones. The evolution of the suite is related to cutting-edge Stanford research.

The Stanford CoreNLP is not a cloud-based service. Instead, you can:

  • Test the service on its Core NLP demo page
  • Download it as a package (Java 1.8+ required; latest CoreNLP release 3.7.0 at the time of writing)

In the following sections, we’ll focus on the second option.

The CoreNLP suite can be accessed via command line, via the native Java programmatic API, or deployed as a web API server. For the sake of coding language freedom, we’ll stick with the server option for the purposes of this post. The examples will be based on the pycorenlp Python client, but many other clients exist for the most popular languages.

CoreNLP installation

First, let’s set up the CoreNLP server, which is accomplished through a few easy steps:

  1. If you don’t have it already, install the JDK, version 1.8 or higher
  2. Download the stanford-corenlp-full zip file and unzip it in the folder of your choice
  3. From within that folder, launch

If no port is specified when launching the server, it listens on the default port, 9000.
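As a rough sketch of the launch step, the Python helper below builds the command line for the server and can optionally start it as a background process. The server class name and the -port and -timeout flags come from the CoreNLP server guide; the 4 GB heap size, the 15-second timeout, and the helper names (corenlp_server_command, start_server) are our own assumptions for illustration.

```python
import subprocess


def corenlp_server_command(port=9000, timeout_ms=15000, memory="4g"):
    """Build the argument list that starts the CoreNLP web server.

    The heap size and timeout defaults are assumptions; tune them
    for your machine and texts.
    """
    return [
        "java",
        f"-mx{memory}",   # JVM heap size
        "-cp", "*",       # put every jar in the unzipped folder on the classpath
        "edu.stanford.nlp.pipeline.StanfordCoreNLPServer",
        "-port", str(port),
        "-timeout", str(timeout_ms),
    ]


def start_server(corenlp_dir, **kwargs):
    """Start the server in the background from the unzipped CoreNLP folder."""
    return subprocess.Popen(corenlp_server_command(**kwargs), cwd=corenlp_dir)
```

Calling start_server("/path/to/stanford-corenlp-full") (a placeholder path) would launch the server on port 9000; pass port=9001, for instance, to use another port.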

Now, let’s check if everything went well. Assuming the default configuration, you should be able to reach the page http://localhost:9000 on your machine and see an interface like this:


From there, you can already start playing around and submit small texts to the CoreNLP engine through the provided User Interface. The CoreNLP server runs the NLP analysis and graphically displays the results.


Submit an analysis via code

It’s time for our first call to the server via code. First, we need to make sure that the necessary library is among our requirements (e.g., added to the requirements file and installed through pip), i.e., the pycorenlp client.

Next, we can use the following simple snippet of code to send the first piece of text to our local CoreNLP server.

Only a few parameters are needed:

  • The text to analyze.
  • The output format for your analysis results. In this case, we chose a JSON output.
  • A list of annotator keywords, which defines how you want the text to be analyzed or annotated (i.e., the NLP tasks to be performed on it). In this snippet, we are asking for basic grammar/syntax analysis (depparse), entity extraction (ner and entitymentions), and sentiment polarity detection (sentiment).
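A minimal sketch of such a call with the pycorenlp client might look like the following. It assumes the server is running locally on the default port; the helper names build_properties and analyze are ours, chosen for illustration, and pycorenlp is imported lazily so that the property-building part can be tried without the package installed.

```python
def build_properties(annotators, output_format="json"):
    """Turn a list of annotator keywords into the properties dict sent to the server."""
    return {
        "annotators": ",".join(annotators),
        "outputFormat": output_format,
    }


def analyze(text, server_url="http://localhost:9000"):
    """Send one text to a running CoreNLP server and return the parsed result."""
    # Imported here so the module loads even where pycorenlp is not installed.
    from pycorenlp import StanfordCoreNLP

    nlp = StanfordCoreNLP(server_url)
    properties = build_properties(
        ["depparse", "ner", "entitymentions", "sentiment"]
    )
    return nlp.annotate(text, properties=properties)
```

With the server up, analyze("Some text to annotate.") should return the analysis as a Python dictionary.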

For a comprehensive description of all options, have a look at the official documentation (particularly the CoreNLP server guide and the available annotator overview).

After launching the code, the server might take a few seconds (the very first analysis launched on a fresh server instance requires bootstrapping the chosen annotators). When the analysis is completed, the server returns a JSON result of the following type.

You can note a similarity between this output and that returned by Google Cloud Natural Language API (see our first post about NLP with Google API for a review of the basic concepts).

Let’s see where we can find the desired information:

  • Grammar and syntax information. This is contained in the “tokens” section and in the “Dependencies” sections (“basic,” “enhanced,” and “enhancedPlusPlus”). See the Stanford typed dependencies manual and the Enhanced Dependencies reference for details on their differences.
  • Named entity recognition. Information on relevant entities is provided in the “tokens” section, under the “ner” attribute, and more concisely in the “entitymentions” section.
  • Sentiment polarity. The “sentiment” attribute expresses the polarity (including the possibility of Neutral), with a corresponding numerical value in “sentimentValue” (high values for positive sentiment).
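To make the list above more concrete, here is a small sketch that pulls those pieces out of a result dictionary. The key names follow the sections mentioned above, plus a top-level “sentences” list and the “word”, “pos”, and “text” fields, which we assume from the JSON layout. The mock_result value is a hand-written stand-in with made-up values, not real server output.

```python
def summarize(result):
    """Collect tokens, entity mentions, and sentiment for each sentence."""
    summary = []
    for sentence in result.get("sentences", []):
        summary.append({
            # (word, part-of-speech tag, entity type) for every token
            "tokens": [
                (t["word"], t.get("pos"), t.get("ner"))
                for t in sentence.get("tokens", [])
            ],
            # the more concise entity view
            "entities": [
                (m["text"], m["ner"])
                for m in sentence.get("entitymentions", [])
            ],
            "sentiment": sentence.get("sentiment"),
            "sentiment_value": sentence.get("sentimentValue"),
        })
    return summary


# Hand-written stand-in for a server response (illustrative values only).
mock_result = {
    "sentences": [{
        "tokens": [
            {"word": "Tesla", "pos": "NNP", "ner": "PERSON"},
            {"word": "failed", "pos": "VBD", "ner": "O"},
        ],
        "entitymentions": [{"text": "Tesla", "ner": "PERSON"}],
        "sentiment": "Negative",
        "sentimentValue": "1",
    }]
}

print(summarize(mock_result))
```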

With this analysis at our disposal, we will make a few experiments to qualitatively compare the Google Cloud Natural Language API and the Stanford engine. Our analysis is limited to a few sample texts that we submitted to both the NLP tools and is not an exhaustive comparison.

Feature comparison

For the sake of our qualitative comparison, we’ll use the same text from ABC news we chose when testing the Google Cloud Natural Language API:

“Joshua Brown, 40, was killed in Florida in May when his Tesla failed to differentiate between the side of a turning truck and the sky while operating in autopilot mode.”

We can proceed category by category.

Grammar and syntax

Both engines seem to work well. Apart from dependency tree conventions that are a bit different in each, the two platforms returned roughly the same results. 

Entity extraction

The entity extraction function behaves as expected for both services and is able to detect the main entities: Joshua Brown, Florida, and Tesla.

Stanford also retrieves the number 40 and recognizes the month May, which might be very useful for several applications. However, Tesla is assigned the type PERSON. Instead, the Google API classifies Tesla as an ORGANIZATION (a bit better) but misses May as a DATE.

In this case, we will give a point to Google for its ability to link recognized entities to their Wikipedia page with quite good disambiguation capabilities. This task is backed by Google’s huge (and presumably ever-evolving) knowledge base that is likely to get even better over time.

Stanford CoreNLP also provides a similar feature for linking detected entities to their Wikipedia page. Qualitative tests suggest that Google’s disambiguation accuracy for this task is better. Another drawback is that Stanford requires a separate model file for this feature (the english-models-kbp from the CoreNLP GitHub page), and it is demanding in terms of memory.

Sentiment analysis

The sentiment analysis correctly detects a “negative” sentiment for this text in both engines. By default, CoreNLP returns only the sentiment class, while Google also provides two real numbers for polarity and magnitude. Both analyses show a separate sentiment value for all sentences in the text, but CoreNLP does not aggregate them in a single overall score.
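If you need a single document-level number from CoreNLP, one naive option is to average the per-sentence values yourself. The sketch below does just that. It assumes each sentence carries a “sentimentValue” field that can be converted to an integer, and the unweighted mean is our own simplification, not something CoreNLP provides.

```python
def overall_sentiment(result):
    """Average the per-sentence sentimentValue fields (None if there are none)."""
    values = [
        int(sentence["sentimentValue"])
        for sentence in result.get("sentences", [])
        if "sentimentValue" in sentence
    ]
    if not values:
        return None
    return sum(values) / len(values)
```

A length-weighted average or a majority vote over the sentence classes would be equally defensible choices, depending on the application.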

As a comparison with our earlier sentiment experiment with Google, we can increasingly remove polarity-relevant words from the input text and see how the CoreNLP analysis changes.

Input text and returned sentiment (CoreNLP):

  1. Removed “killed”: “Joshua Brown, 40, was in Florida in May when his Tesla failed to differentiate between the side of a turning truck and the sky while operating in autopilot mode.” (score: 1)
  2. Removed “failed to”: “Joshua Brown, 40, was in Florida in May when his Tesla differentiate between the side of a turning truck and the sky while operating in autopilot mode.” (score: 1)
  3. Removed “ 40,”, “in May”, “the side of”, and “while operating in autopilot mode”: “Joshua Brown was in Florida when his Tesla differentiate between a turning truck and the sky.” (score: 2)

The Google Cloud Natural Language API returned “negative,” “positive,” and “positive” for these inputs.

The first result is probably correct for both providers. The other two results are less clear-cut:

  • “Negative” for case 2 is probably wrong, although “positive” is questionable, too.
  • “Positive” for case 3 can also be questioned. “Neutral” is probably more correct in this case.

Overall, this very quick test does not highlight any huge quality gaps.

Extra features on Stanford CoreNLP

CoreNLP exposes several interesting functions in addition to those we explored above. One such feature is additional text annotators, which is not currently available on the Google Cloud service. Let’s take a look at some of the most interesting annotators.

Coreference annotator

Coreference resolution is the NLP task of identifying all words in a text that refer to the same entity. For example, in the following sentence, it understands that “Albert Einstein”, “he” and “his” all refer to the same person.

“Albert Einstein was a smart guy. He received his Nobel Prize in 1921.”

This capability is quite relevant when trying to aggregate all of the available information about a specific entity in one or multiple texts, since references to that entity that are hidden behind a pronoun are easily lost. In the example above, without coreference resolution we would know that Albert is smart, but not when he won the Nobel Prize.

The task is not an easy one, and the Stanford CoreNLP suite offers the dcoref annotator to take care of it. Let’s take a look at some examples, either via code or via GUI (your local one or the online demo, which is handy for quick tests).

Text1: “John met Jane in 2013. He married her a few years later.” In this case, the engine correctly retrieves the associations he → John, her → Jane, likely helped by the gender hints.


Text2: “Right now I’m testing CoreNLP and its features. It provides an annotator for coreference resolution.” The task is not limited to person mentions, as seen here. “Its” and “It” are correctly retrieved as co-referring and pointing to CoreNLP.

It’s easy to come up with difficult or ambiguous cases.

Text3: “Right now I’m testing CoreNLP and its features. It’s always a pleasure to submit tricky texts.” This use of “it” to represent an entire (following) clause is one of the curses of coreference resolution. In this case, “CoreNLP” and “its” are correctly co-referenced, but they are also wrongly linked to “it”.

You can continue testing the feature with other sentences such as:

  • “Luke’s first master was Obi-Wan. He was a good teacher.” – involves semantics
  • “The Chicago Bulls could count on Michael Jordan during their six winning seasons, and they are considered one of the strongest teams ever.” – plural pronoun
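If you run these tests via code, the coreference chains have to be read from the JSON result. The sketch below shows one way to turn them into (mention → referent) pairs. It assumes the result has a top-level “corefs” dictionary whose values are lists of mentions with “text” and “isRepresentativeMention” fields; check this layout against the output of your own server. The mock_corefs value is a hand-written stand-in modeled on Text1, not real output.

```python
def coref_links(result):
    """Return (mention, representative mention) pairs for every coreference chain."""
    links = []
    for chain in result.get("corefs", {}).values():
        # The representative mention is the one the others point to.
        representative = next(
            (m["text"] for m in chain if m.get("isRepresentativeMention")),
            None,
        )
        for mention in chain:
            if not mention.get("isRepresentativeMention"):
                links.append((mention["text"], representative))
    return links


# Hand-written stand-in modeled on Text1 (illustrative only).
mock_corefs = {
    "corefs": {
        "1": [
            {"text": "John", "isRepresentativeMention": True},
            {"text": "He", "isRepresentativeMention": False},
        ],
        "2": [
            {"text": "Jane", "isRepresentativeMention": True},
            {"text": "her", "isRepresentativeMention": False},
        ],
    }
}

print(coref_links(mock_corefs))
```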

At the time of this test, Google Natural Language did not expose a feature like this one. It would definitely be a nice addition to the service (and we believe it’s likely that Google will add this at some point).

Cool Regex engines

A cool feature that CoreNLP offers is a set of specialized regex engines. These are based not only on plain text but also on a specific parsing structure detected in the text itself (e.g. the dependency tree of a sentence). Available engines are TokensRegex, Semgrex, and Tregex for matching patterns on tokens, on semantic relations, and on parse trees, respectively. Although relying on different structures, the logic behind the three engines is similar. (You can find more examples of Semgrex in the documentation.)

Semgrex is able to find portions of text that match a query related to the entity type or role of text tokens and/or their syntactical dependence on other tokens in the sentence. For example, let’s assume that I want to identify all of the singular nouns in the following sentence:

All men are mortal. Socrates is a man. Therefore Socrates is mortal.

In Semgrex, we write the expression

to look for all nodes ({}-brackets represent the concept of node) whose part-of-speech is NN (singular common noun, in Stanford’s notation). This returns a match with “man.” You can also compose more complex conditions, e.g.,

to get common nouns (both singular and plural) and adjectives, thus obtaining “men,” “mortal,” “man,” and again “mortal.”

As mentioned, we can retrieve nodes by their entity type. For example, {ner:PERSON} returns the two instances of “Socrates.”

Expressions can also involve several nodes and the relations between them. If we look for

we get “men,” “Socrates,” and “Socrates,” i.e., all nodes that are the subject of another node.

The two patterns can be combined, for example, to get all subjects that are also entities of type PERSON, with

which returns only the two “Socrates.”
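To illustrate the matching logic of the queries above without a running server, here is a plain-Python emulation. It is not Semgrex itself: the tags, entity types, and relations in NODES are hand-annotated by us for the example sentence, and the match helper is hypothetical. In Semgrex syntax we believe the corresponding patterns look roughly like {tag:NN} for the part-of-speech query and {ner:PERSON} <nsubj {} for the combined one, but treat those spellings as assumptions and check the Semgrex documentation.

```python
from collections import namedtuple

# One node per token: word, part-of-speech tag, entity type,
# and the dependency relation linking it to its governor.
Node = namedtuple("Node", "word tag ner rel")

# Hand-annotated for illustration (punctuation omitted).
NODES = [
    Node("All", "DT", "O", "det"),
    Node("men", "NNS", "O", "nsubj"),
    Node("are", "VBP", "O", "cop"),
    Node("mortal", "JJ", "O", "root"),
    Node("Socrates", "NNP", "PERSON", "nsubj"),
    Node("is", "VBZ", "O", "cop"),
    Node("a", "DT", "O", "det"),
    Node("man", "NN", "O", "root"),
    Node("Therefore", "RB", "O", "advmod"),
    Node("Socrates", "NNP", "PERSON", "nsubj"),
    Node("is", "VBZ", "O", "cop"),
    Node("mortal", "JJ", "O", "root"),
]


def match(nodes, tag=None, ner=None, rel=None):
    """Return the words of all nodes satisfying every given condition.

    tag is a set of accepted part-of-speech tags; ner and rel are
    single values. Conditions left as None are ignored.
    """
    return [
        n.word
        for n in nodes
        if (tag is None or n.tag in tag)
        and (ner is None or n.ner == ner)
        and (rel is None or n.rel == rel)
    ]


print(match(NODES, tag={"NN"}))                  # singular common nouns
print(match(NODES, tag={"NN", "NNS", "JJ"}))     # common nouns and adjectives
print(match(NODES, ner="PERSON"))                # entities of type PERSON
print(match(NODES, rel="nsubj"))                 # subjects of another node
print(match(NODES, ner="PERSON", rel="nsubj"))   # subjects that are PERSONs
```

The real engines work on the dependency graph produced by the parser, so relation queries can also constrain the governor node, which this flat emulation does not attempt.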


The patterns can be complicated in interesting and potentially useful ways, and allow us to precisely isolate the desired information with a simple high-level language.


The features offered by Stanford CoreNLP are qualitatively comparable with those offered by Google, although the resources potentially available in a cloud environment represent a huge advantage. In several of our experiments, the entity extraction and syntax parsing of the Google API slightly outperformed CoreNLP. For example, the task of linking detected entities to their Wikipedia page shows how it’s possible to take advantage of the continuous updates of a cloud-based service without having to allocate one’s local resources to it.

Still, CoreNLP showcases some very nice features that are missing in Google’s offering, and its language coverage is a bit wider. (CoreNLP provides pre-trained models for English, Arabic, Chinese, German, French, and Spanish out of the box, plus some community-made models. Google currently supports English, Spanish, and Japanese.)

You should also keep in mind that the two engines are of markedly different origin. CoreNLP comes from academia, which means it prioritizes “novel” over “stable/business-reliable,” while Google aims at exposing a dependable service, possibly at the expense of leaving some cool features out of the picture (at least for now).

To summarize

Choose Stanford CoreNLP if:

  • You need to deploy a fully functional NLP system on your local machine
  • You want to play with cutting-edge features that are not necessarily easy to find in industrial NLP platforms, and that you can extend with custom modules

Choose Google Cloud Natural Language API if:

  • You want to take advantage of large, Google-scale computing structure and data
  • You need a reliable/industrial-strength system

If we believe in the best embodiment of the academia-industry relationship, maybe we’ll see ideas flowing from one to the other to finally provide the most sophisticated AI and NLP features with industry-level stability.

Stay tuned for more discussion of NLP in upcoming posts!
