Named Entity Recognition

Named entities, categorized by type (people, organizations, locations, products, etc.), constitute the core factual information of any content. These engines extract named entities from a given text. All flavors described below support the same list of entity types, given in the following section.

Supported Entity Types

The engines currently support the following entity types. The list is complete.

  • LOCATION: A city, state, country, region, building, monument, body of water, park, or address.
  • ORGANIZATION: A corporation, institution, agency, or other group defined by an organizational structure.
  • PERSON: A human identified by name, nickname, or alias.
  • TITLE: Appellation associated with an occupation, office, or status.
  • NATIONALITY: Reference to a country or region of origin.
  • RELIGION: Reference to an organized religion or theology, as well as its followers.
  • IDENTIFIER_CREDIT_CARD_NUM: Credit card numbers.
  • IDENTIFIER_EMAIL: Email addresses.
  • IDENTIFIER_MONEY: Currencies.
  • IDENTIFIER_PERSONAL_ID_NUM: Personal identification numbers.
  • IDENTIFIER_PHONE_NUMBER: Phone numbers.
  • IDENTIFIER_URL: Web addresses.
  • TEMPORAL_DATE: Date.
  • TEMPORAL_TIME: Time.
  • IDENTIFIER_DISTANCE: Distance.
  • IDENTIFIER_LATITUDE_LONGITUDE: Geographic locations in latitude and longitude coordinates.
  • HASHTAG: Hashtags found inside an article.
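
Client code often needs to validate or group the "type" values returned by the engine. The following is a minimal Python sketch of the list above as a constant. The names SUPPORTED_ENTITY_TYPES and type_family are hypothetical, and the grouping into IDENTIFIER and TEMPORAL "families" is only an illustrative convention based on the name prefixes, not something the engine defines.

```python
# Entity type names copied from the list of supported types.
SUPPORTED_ENTITY_TYPES = frozenset({
    "LOCATION", "ORGANIZATION", "PERSON", "TITLE", "NATIONALITY", "RELIGION",
    "IDENTIFIER_CREDIT_CARD_NUM", "IDENTIFIER_EMAIL", "IDENTIFIER_MONEY",
    "IDENTIFIER_PERSONAL_ID_NUM", "IDENTIFIER_PHONE_NUMBER", "IDENTIFIER_URL",
    "TEMPORAL_DATE", "TEMPORAL_TIME",
    "IDENTIFIER_DISTANCE", "IDENTIFIER_LATITUDE_LONGITUDE",
    "HASHTAG",
})

# Prefixes shared by several types (an assumption made for this sketch).
_FAMILY_PREFIXES = ("IDENTIFIER", "TEMPORAL")


def type_family(entity_type: str) -> str:
    """Return the prefix family of a type; unprefixed types map to themselves."""
    if entity_type not in SUPPORTED_ENTITY_TYPES:
        raise ValueError(f"unsupported entity type: {entity_type!r}")
    head = entity_type.partition("_")[0]
    return head if head in _FAMILY_PREFIXES else entity_type
```

With this helper, type_family("IDENTIFIER_EMAIL") gives "IDENTIFIER", while type_family("PERSON") gives "PERSON".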

Usage Example

This is an example of calling the Entities Extractor on Romanian text using curl:

curl -X 'POST' \
  'http://192.168.56.25:8989/rest/process' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '
  {
    "content": "Candidatul independent la Primăria Capitalei susţinut de USR, Nicuşor Dan, afirmă că bugetul Primăriei Municipiului Bucureşti pe 2020 \"nu are nicio legătură cu realitatea\", în condiţiile în care veniturile estimate – 7,2 miliarde de lei – sunt, ca şi în anii precedenţi, supraevaluate cu aproximativ 75%, transmite Agerpres.",
    "language": "ron"
  }'
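
For callers who prefer a script to the command line, the same request can be sketched in Python with the standard library only. The endpoint, headers, and the "content" and "language" fields are taken from the curl call; the function names build_request and extract_entities and the 30-second timeout are assumptions made for this sketch.

```python
import json
import urllib.request

# Endpoint from the curl example; adjust host and port for your deployment.
ENDPOINT = "http://192.168.56.25:8989/rest/process"


def build_request(content: str, language: str,
                  endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build the POST request that mirrors the curl call."""
    payload = json.dumps({"content": content, "language": language}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"accept": "application/json", "Content-Type": "application/json"},
        method="POST",
    )


def extract_entities(content: str, language: str,
                     endpoint: str = ENDPOINT, timeout: float = 30.0) -> dict:
    """Send the text to the engine and return the decoded JSON response."""
    request = build_request(content, language, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Against a reachable engine, extract_entities(text, "ron") should return a dictionary with an "entities" list. Keeping build_request separate makes it possible to inspect the payload without any network access.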

Calling the Entities Extractor as above will generate the simple JSON response below:

{
  "entities": [
    {
      "entity": "Primăria Capitalei",
      "type": "ORGANIZATION",
      "count": 1,
      "details": [
        {
          "score": 0.9728477001190186,
          "start": 26,
          "end": 44
        }
      ]
    },
    {
      "entity": "USR",
      "type": "ORGANIZATION",
      "count": 1,
      "details": [
        {
          "score": 0.9944616556167603,
          "start": 57,
          "end": 60
        }
      ]
    },
    {
      "entity": "Nicuşor Dan",
      "type": "PERSON",
      "count": 1,
      "details": [
        {
          "score": 0.9825072884559631,
          "start": 62,
          "end": 73
        }
      ]
    },
    {
      "entity": "Primăriei Municipiului Bucureşti",
      "type": "ORGANIZATION",
      "count": 1,
      "details": [
        {
          "score": 0.9880717992782593,
          "start": 93,
          "end": 125
        }
      ]
    },
    {
      "entity": "2020",
      "type": "TEMPORAL_DATE",
      "count": 1,
      "details": [
        {
          "score": 0.8982105255126953,
          "start": 129,
          "end": 133
        }
      ]
    },
    {
      "entity": "7,2 miliarde de lei",
      "type": "IDENTIFIER_MONEY",
      "count": 1,
      "details": [
        {
          "score": 0.9840089082717896,
          "start": 217,
          "end": 236
        }
      ]
    },
    {
      "entity": "Agerpres",
      "type": "ORGANIZATION",
      "count": 1,
      "details": [
        {
          "score": 0.9890756011009216,
          "start": 315,
          "end": 323
        }
      ]
    }
  ]
}
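
Each entry in "details" carries a confidence score and the position of one mention. In the sample response the "start" and "end" values behave like zero-based character offsets with an exclusive end (for instance, 26 to 44 covers the 18 characters of "Primăria Capitalei"); this is an observation from the sample, not a documented guarantee. Under that assumption, a minimal Python sketch for grouping mentions by type could look as follows. The function name mentions_by_type and the min_score parameter are hypothetical.

```python
from collections import defaultdict


def mentions_by_type(response: dict, content: str, min_score: float = 0.0) -> dict:
    """Group mention texts by entity type, slicing them out of the original content.

    Assumes start/end are character offsets into content, with end exclusive.
    """
    grouped = defaultdict(list)
    for entity in response["entities"]:
        for mention in entity["details"]:
            if mention["score"] < min_score:
                continue  # skip low-confidence mentions
            surface = content[mention["start"]:mention["end"]]
            grouped[entity["type"]].append(surface)
    return dict(grouped)


# Beginning of the Romanian sample text and the first three entities of the response.
content = "Candidatul independent la Primăria Capitalei susţinut de USR, Nicuşor Dan, afirmă"
response = {
    "entities": [
        {"entity": "Primăria Capitalei", "type": "ORGANIZATION", "count": 1,
         "details": [{"score": 0.9728477001190186, "start": 26, "end": 44}]},
        {"entity": "USR", "type": "ORGANIZATION", "count": 1,
         "details": [{"score": 0.9944616556167603, "start": 57, "end": 60}]},
        {"entity": "Nicuşor Dan", "type": "PERSON", "count": 1,
         "details": [{"score": 0.9825072884559631, "start": 62, "end": 73}]},
    ]
}

print(mentions_by_type(response, content))
```

Python strings are indexed by Unicode code point, which matches the offsets in the sample. Languages that index strings by UTF-16 code unit or by byte may need a conversion step for text outside the Basic Multilingual Plane or for byte-oriented APIs.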

XLU – Cross Language Understanding (40+ languages)

The Entity Extractor XLU is a versatile engine that generalizes across many languages. This means that you can submit text in any language, even one that is not officially listed. You could even try Farsi, if needed.

HW Requirements

Due to the large size of the model, we recommend running this engine in an environment equipped with an NVIDIA Graphics Processing Unit (GPU).

Minimum recommended requirements for the GPU:

  • Type: NVIDIA Quadro 
  • RAM: 4041MB

Other minimum recommended requirements:

  • 4xCPU
  • RAM: 4GB
  • HDD: 10GB

English Extended (XT)

This engine was trained specifically for the English language.

HW Requirements

Due to the large size of the model, we recommend running this engine in an environment equipped with an NVIDIA Graphics Processing Unit (GPU).

Minimum recommended requirements for the GPU:

  • Type: NVIDIA Quadro 
  • RAM: 4041MB

Other minimum recommended requirements:

  • 4xCPU
  • RAM: 4GB
  • HDD: 10GB

Romanian Extended (XT)

This engine was trained specifically for the Romanian language.

HW Requirements

Due to the large size of the model, we recommend running this engine in an environment equipped with an NVIDIA Graphics Processing Unit (GPU).

Alternatively, this engine has also been optimized for CPU-only environments, using the ONNX accelerator for rapid inference. Please let us know the particularities of your environment so that we can offer the engine flavor that suits it best.

Minimum recommended requirements for the GPU:

  • Type: NVIDIA Quadro 
  • RAM: 4041MB

Other minimum recommended requirements:

  • 4xCPU
  • RAM: 4GB
  • HDD: 10GB
