Self-Contained Text Analytics Engines
Built for secure, hardware-optimized, high-throughput document processing.
Key Capabilities
IntelliDockers is a complete Text Analytics software solution designed to automate human cognitive tasks through Natural Language Processing, extracting meaning from unstructured data.
Production Ready
Each capability is delivered as a production-ready Docker container that can be deployed, integrated and used in no time across various types of infrastructure: on-premises, private clouds and popular cloud platforms like Microsoft Azure, Amazon ECS or Google Cloud.
Unified REST API
All IntelliDockers return a JSON response and expose the same REST interface:
- GET /rest/info
- GET /rest/ready
- POST /rest/process
- POST /rest/process-file
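As an illustration of how a client might talk to this shared interface, here is a minimal Python sketch using only the standard library. The base URL (host and port) and the `{"text": ...}` request body are assumptions made for the example; the actual payload schema is not specified here and may differ per engine. Only the endpoint paths and the JSON response format come from the list above.

```python
import json
import urllib.request

# Assumption: the container is reachable at this address; host and port are placeholders.
BASE_URL = "http://localhost:8080"


def build_get(path: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET request, e.g. for /rest/info or /rest/ready."""
    return urllib.request.Request(base_url + path, method="GET")


def build_process(text: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST request for /rest/process.

    Assumption: the body is a JSON object with a "text" field. This field
    name is hypothetical and only serves the illustration.
    """
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        base_url + "/rest/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 10.0) -> dict:
    """Send a request and decode the JSON response returned by the engine."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With a container running, a client could call `send(build_get("/rest/ready"))` to check readiness and then `send(build_process("Some text"))` to process content. Because every engine shares the same endpoints, the same helpers would apply to any IntelliDocker by changing only the base URL.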
Multi-language Support
Over 40 languages are supported, including: English, German, Italian, Polish, Hungarian, Arabic, Russian, Persian, French, Bulgarian, Romanian, Chinese, Dutch, Estonian, Finnish, Georgian, Greek, Hebrew, Hindi, Korean, Malay, Spanish, Serbian, Thai, Turkish, Ukrainian, Urdu, Vietnamese.
Optimized hardware use
IntelliDockers run on commodity hardware and do not require GPUs. You can get started with only 2 CPU cores and 4 GB of RAM and still obtain high processing performance.
Secure Deployment
All engines come with an encrypted model and a licensing engine that restricts their use. Delivered as Docker containers, they are lightweight, stateless and OS-independent, and can be used in even the most secure air-gapped environments.
Best-of-breed AI
IntelliDockers are built using state-of-the-art Deep Learning algorithms, including Recurrent Neural Networks (RNNs) and Transformer-based Large Language Models (LLMs). All engines are trained on a broad range of data to best address all types of content.
Over 130 IntelliDockers engines are available on Docker Hub
Available Engines
Named Entity Recognition
Named entities, categorized by type (people, organizations, locations, products, other), constitute the core factual information of any content. Our engine extracts 17 types of named entities in over 40 languages.
Summarization
Lets you understand the meaning of any text by reading only its key sentences. You can optionally specify the size of the summary as a percentage between 10% and 50% of the text.
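The optional summary size could be validated on the client side before a request is sent. The sketch below is only an illustration: the function name and the `"text"` and `"size"` field names are hypothetical, and only the 10% to 50% range comes from the description above.

```python
from typing import Optional


def summary_request(text: str, size_percent: Optional[int] = None) -> dict:
    """Build a hypothetical summarization payload.

    Assumption: the field names "text" and "size" are invented for this
    example. The size is optional and is omitted when not given.
    """
    payload = {"text": text}
    if size_percent is not None:
        # The documented range for the summary size is 10% to 50% of the text.
        if not 10 <= size_percent <= 50:
            raise ValueError("summary size must be between 10 and 50 percent")
        payload["size"] = size_percent
    return payload
```

For example, `summary_request("Some long article", 25)` would request a summary of roughly a quarter of the original text, while a value such as 60 is rejected before any request is made.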
Automated Translation
Automatically translates text or documents from a multitude of languages.
Automatic Classification
Classifies content automatically using the standard IPTC and IAB taxonomies. Classifiers based on custom taxonomies (patents, cyber security, military intelligence or others) can be created on demand.
Semantic Comparison
The Semantic Similarity Engine detects when the analyzed content pieces carry the same meaning, regardless of differences in syntax or grammar. You can even compare content pieces written in different languages. It can also be used to cluster documents based on the information they contain.
Speech Transcription
Transcribes recorded or live audio into text.
Sentiment Analysis
Determines the polarity of any content: negative, neutral or positive. Entity-level sentiment analysis is also possible with the Context Splitter capability.
OCR
Extracts text from a given image, scan (.jpg, .png, .tif) or PDF file using a text recognition model (Computer Vision).
Clickbait Detection
Determines the likelihood that a news article is written in a “clickbait” style: content designed to attract attention and entice users to follow a link and read, view, or listen to the linked piece of online content, and which is typically deceptive, sensationalized, or otherwise misleading.