Neticle Wiki


Text analysis API

Description

An API for analyzing textual data with semantic and topic recognition.

Version History

Version | Release Date | Released by | Description
v2.0 | 2019.01.30. | Zoltan Csikos | Initial version for new Spring Boot foundation

Changelog

v2.0:

  • New version of the text analysis API
    • Moved project under Spring Boot framework
    • Only accepting JSON request objects in the body of the message
    • Better character handling
    • Improved error handling
    • Added functionality for exclude and filter handling

Text analysis service

Description

Neticle Text Analysis provides market-leading sentiment and semantic analysis with human-level accuracy.

  • Entity oriented sentiment analysis: only the phrases, entities and labels related to the set target entity are analyzed in the text. The target is set by its synonyms, spellings and misspellings.
  • Document level sentiment analysis: all phrases, entities and labels in the text are analyzed. In this case no synonyms are given as input parameters.
  • Attribute recognition: service and product attributes (for example: screen, bandwidth, etc.) are recognized.
  • Topic recognition: key topics (for example: 3G, mobile payment, etc.) are recognized.
  • Location recognition: related locations (for example: Hungary, Pécs, etc.) are recognized.
  • Brand recognition: related brands (for example: Audi, Mercedes, etc.) are recognized.
  • Emotion recognition: related emotions (for example: joy, etc.) are recognized.
  • Person recognition: related persons (for example: Bill Gates, etc.) are recognized.
  • Organization recognition: related organizations (for example: UNICEF, etc.) are recognized.
  • Event recognition: related events (for example: festivals, conferences, etc.) are recognized.
  • Business topic recognition: related business topics (for example: revenue, IPO, etc.) are recognized.
  • Legal topic recognition: related legal topics (for example: lawsuit, legislation, etc.) are recognized.
  • Medical topic recognition: related medical topics (for example: prescription, symptom, etc.) are recognized.
  • HR topic recognition: related HR topics (for example: job, salary, etc.) are recognized.

Base URL

https://textanalysis.neticle.com/2.0/text_analysis (requests are sent with HTTP POST, as in the sample codes)

Request object

Element | Type | Is required | Parameter description | Default value | Example
token | String | X | Token received from Neticle Labs to authenticate the user. | |
input | String | X | The text to analyze. | | I like this nice apple.
language | String | | The language of the input text. If this parameter is empty, the API will try to identify the language from the text and process it accordingly. If you know the language, it is preferable to provide it. Possible values: bg (Bulgarian), de (German), en (English), ge (Georgian), hu (Hungarian), nl (Dutch), pl (Polish), ro (Romanian), ru (Russian), ua (Ukrainian). | | en
format | String | | Format of the response. Currently only JSON is supported. | json |
replaceaccent | Boolean | | If set to true, the API will remove all accents from the characters before processing. For example, it will convert the letters é á ű ó into e a u o. | false |
lowercase | Boolean | | If set to true, the API will convert the input text to lowercase before processing. | false |
callid | String | | A field for the client's own use, to help track calls from different sources on the client's side. Maximum length is 255 characters. | | internal system 1
keywords | Array of nested keyword objects | | An array of keyword objects (see below) for keyword oriented sentiment and topic analysis, which analyzes only the parts of the text related to the given keyword. If not set (or set to an empty array), the API runs a full text analysis on the provided text. | |
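
The effect of replaceaccent and lowercase can be previewed on the client side. The sketch below is only an approximation under stated assumptions: the service's own implementation is not documented here, the function names are hypothetical, accents are stripped through Unicode decomposition, and the order of the two steps is assumed.

```python
import unicodedata


def replace_accents(text):
    """Strip accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def preprocess(text, replaceaccent=False, lowercase=False):
    """Apply the two optional normalisation steps (order is an assumption)."""
    if replaceaccent:
        text = replace_accents(text)
    if lowercase:
        text = text.lower()
    return text


print(preprocess("é á ű ó", replaceaccent=True))  # prints: e a u o
```

Keep in mind that synonyms are case-sensitive, so a keyword list written for the original text may need adjusting when lowercase is enabled.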

Keyword object

Element | Type | Is required | Parameter description | Default value | Example
id | String | | Id of the provided keyword, for easier identification of segments in the response object. | | keyword_1
synonyms | Array | | Case-sensitive spellings, misspellings and synonyms of the target entity of the sentiment analysis. For example, to analyze comments about internet coverage, set the following synonyms: internet, Internet, INTERNET, net, Net, NET. If synonyms are set, only the phrases and labels related to the target they represent are recognized (entity oriented sentiment analysis). Example: I like this neighborhood, but the net is terrible. If no synonyms are set, the full text is analyzed (document level sentiment analysis). Use this only in special cases, because it often gives poor results for precise analysis. Example: I like this neighborhood, but the net is terrible. | | internet, Internet, INTERNET, net, Net, NET
excludes | Array | | If the input text contains any of the provided exclude words, the text won't be analyzed. For example, when analyzing comments about internet coverage with the synonyms above, you might want to skip every sentence containing "fish net", as it is probably not related to the topic. | | fish, Fish, FISH
filters | Array | | To narrow a topic down to a specific subtopic without changing the defined keywords, use filter words, which provide more granular filtering. For example, you may want to analyze only 3g and 4g internet coverage and not wired connections. | | 3g, 3G, 4g, 4G
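
The way synonyms, excludes and filters combine can be sketched as a client-side pre-check. This is a minimal sketch, not the service's matching logic: case-sensitive whole-word matching is an assumption, and find_terms and would_be_analyzed are hypothetical helper names.

```python
import re


def find_terms(text, terms):
    """Return the terms that occur in text as whole words (case-sensitive)."""
    found = []
    for term in terms:
        pattern = r"(?<!\w)" + re.escape(term) + r"(?!\w)"
        if re.search(pattern, text):
            found.append(term)
    return found


def would_be_analyzed(text, keyword):
    """Apply the synonym, exclude and filter rules to one keyword object."""
    synonyms = keyword.get("synonyms", [])
    excludes = keyword.get("excludes", [])
    filters = keyword.get("filters", [])
    if synonyms and not find_terms(text, synonyms):
        return False  # synonyms were given but none occurs in the text
    if find_terms(text, excludes):
        return False  # an exclude word occurs in the text
    if filters and not find_terms(text, filters):
        return False  # filters were given but none occurs in the text
    return True


keyword = {
    "id": "net_keyword_1",
    "synonyms": ["net", "Net", "NET"],
    "excludes": ["fish", "Fish", "FISH"],
    "filters": ["3g", "3G", "4g", "4G"],
}
print(would_be_analyzed("The 4g net is terrible here.", keyword))  # prints: True
print(would_be_analyzed("The fish net is torn.", keyword))         # prints: False
```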

Example

{
  "language": "en",
  "token": "token_from_neticle",
  "input": "I like this neighborhood, but the 4g net is terrible here.",
  "callid": "internal system 1",
  "format": "json",
  "stem": false,
  "replaceaccent": false,
  "lowercase": false,
  "keywords": [
    {
      "id": "net_keyword_1",
      "synonyms": ["net","Net","NET"],
      "excludes": ["fish","Fish","FISH"],
      "filters": ["3g","3G","4g","4G"]
    }
  ]
}
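
A request body like the one above can be assembled and checked before sending. The following is a minimal sketch with a hypothetical helper name (build_request); it only enforces the rules stated in the request table (token and input are required, callid is at most 255 characters) and leaves out optional fields that are not set.

```python
import json

MAX_CALLID_LENGTH = 255  # maximum length given in the request table


def build_request(token, text, language=None, callid=None, keywords=None,
                  replaceaccent=False, lowercase=False):
    """Return a dict that can be serialised as the JSON request body."""
    if not token:
        raise ValueError("token is required")
    if not text:
        raise ValueError("input is required")
    if callid is not None and len(callid) > MAX_CALLID_LENGTH:
        raise ValueError("callid must be at most 255 characters")
    body = {
        "token": token,
        "input": text,
        "format": "json",
        "replaceaccent": replaceaccent,
        "lowercase": lowercase,
    }
    if language:
        body["language"] = language  # omitted: the API detects the language
    if callid:
        body["callid"] = callid
    if keywords:
        body["keywords"] = keywords  # omitted: full text analysis
    return body


body = build_request(
    "token_from_neticle",
    "I like this neighborhood, but the 4g net is terrible here.",
    language="en",
    keywords=[{"id": "net_keyword_1", "synonyms": ["net", "Net", "NET"]}],
)
print(json.dumps(body, indent=2))
```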

Response object

Element | Type | Description
error_code | Integer | 0 if no error occurred, in any other case please see the errors section below.
error_message | String | Empty if there was no error, otherwise a written explanation of the error.
call_id | String | A field to be used by the client, to help tracking calls from different sources from the client's side.
input_length | Integer | The number of characters in the received text.
total_processing_time_in_ms | Integer | Number of milliseconds of the total processing time.
results | Array of nested result objects | The results after processing the text. If no keyword has been sent this will contain only one element. If multiple keywords are sent this will contain one text analysis result for every keyword in the same order as the keywords were sent.
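
Because results come back in the same order as the keywords were sent, a client can pair them up by position. The sketch below assumes the response has already been decoded into a Python dict; results_by_keyword is a hypothetical helper name.

```python
def results_by_keyword(response, keyword_ids):
    """Map each sent keyword id to its result, relying on the documented order."""
    if response.get("error_code", 0) != 0:
        raise RuntimeError(
            "error %s: %s" % (response["error_code"], response.get("error_message", ""))
        )
    results = response.get("results", [])
    if not keyword_ids:
        # Without keywords the API returns a single full text result.
        return {None: results[0]} if results else {}
    if len(results) != len(keyword_ids):
        raise ValueError("expected one result per keyword")
    return dict(zip(keyword_ids, results))


response = {
    "error_code": 0,
    "error_message": "",
    "results": [
        {"keyword": "net_keyword_1", "analyzed": True},
        {"keyword": "tv_keyword_1", "analyzed": False},
    ],
}
paired = results_by_keyword(response, ["net_keyword_1", "tv_keyword_1"])
print(paired["tv_keyword_1"]["analyzed"])  # prints: False
```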

The result object

Element | Type | Description
analyzed | Boolean | True if the text was analyzed. If no keyword was sent, the text is always analyzed. If the request was sent with a keyword, the text must contain at least one of the synonyms, must not contain any of the excludes and, if filters are set, must also contain at least one of the filter words. The words the system recognized are listed in the recognized_synonyms, recognized_excludes and recognized_filters arrays.
keyword | String | The id from the keyword object in the request, for easier identification if multiple keywords were sent.
processing_time_in_ms | Integer | The time in milliseconds needed to process the request, either with full analysis if no keyword was used or with the keyword that was used.
opinion_index | Double | A score that represents how positive or negative the text is. 0 means a neutral, a negative value a negative and a positive value a positive opinion.
summarized_positive_opinion_index | Double | The summarized value of all the positive opinion indexes in the text.
summarized_negative_opinion_index | Double | The summarized value of all the negative opinion indexes in the text.
html_formatted_text | String | The HTML and CSS formatted text. A sentence part that contains a synonym is wrapped in a span tag with the contain_keyword class. Recognized phrases are wrapped in span tags with the polarity_item class. Recognized synonyms are wrapped in span tags with the synonym class.
recognized_synonyms | Array of strings | If no keyword was sent, this list will be empty. If a keyword was provided for the analysis and the keyword had synonyms, this list will contain all the synonyms that were found in the text. If synonyms were provided but none was found, the analysis will be skipped.
recognized_excludes | Array of strings | If no keyword was sent, this list will be empty. If a keyword was provided for the analysis and the keyword had excludes, this list will contain all the excludes that were found in the text. If excludes were provided and any of them was found in the text, the analysis will be skipped.
recognized_filters | Array of strings | If no keyword was sent, this list will be empty. If a keyword was provided for the analysis and the keyword had filters, this list will contain all the filters that were found in the text. If filters were provided but none was found, the analysis will be skipped.
keyword_stats | Nested keyword stats object | Contains statistics about keyword hits in the text if a keyword was provided.
recognized_negative_phrases | Array of nested recognized negative phrase objects | Negative phrases recognized in the text. (If a keyword was provided, only the negative phrases related to the synonyms are recognized. If no synonyms are set, every negative phrase in the text is recognized.)
recognized_positive_phrases | Array of nested recognized positive phrase objects | Positive phrases recognized in the text. (If a keyword was provided, only the positive phrases related to the synonyms are recognized. If no synonyms are set, every positive phrase in the text is recognized.)
entities | Array of nested entity objects | Recognized labels and entities (topics, attributes, brands, locations, etc.). (If keywords are sent with the request, only the labels, entities and phrases related to the provided keyword are recognized. If no keywords are sent, every label, entity and phrase in the text is recognized.)
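
The span classes in html_formatted_text can be read back with the standard library HTML parser. This is a minimal sketch: SpanCollector and spans_with_class are hypothetical names, and the markup in the usage example is the html_formatted_text value from the sample response in this document.

```python
from html.parser import HTMLParser


class SpanCollector(HTMLParser):
    """Collect the text content of every span together with its CSS classes."""

    def __init__(self):
        super().__init__()
        self._open = []   # stack of (classes, text_parts) for spans still open
        self.spans = []   # (classes, text) for every closed span

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            classes = (dict(attrs).get("class") or "").split()
            self._open.append((classes, []))

    def handle_data(self, data):
        # Text belongs to every span that is currently open (spans can nest).
        for _, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        if tag == "span" and self._open:
            classes, parts = self._open.pop()
            self.spans.append((classes, "".join(parts)))


def spans_with_class(html_text, css_class):
    """Return the text of all spans that carry the given class."""
    parser = SpanCollector()
    parser.feed(html_text)
    parser.close()
    return [text for classes, text in parser.spans if css_class in classes]


sample = ('I like this neighborhood, <span class="contain_keyword">but the 4g '
          '<span class="synonym">net</span> '
          '<span class="phrase_neg_lvl2 polarity_item" title="-2 ">is terrible</span>'
          ' here</span>.')
print(spans_with_class(sample, "synonym"))        # prints: ['net']
print(spans_with_class(sample, "polarity_item"))  # prints: ['is terrible']
```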

The keyword stats object

Element | Type | Description
total_keyword_hit_number | Integer | Number of times the provided keyword was mentioned in the input text. Every synonym hit is counted as one. If no synonym is set, the result is 0.
total_synonym_hit_numbers | Array of nested total synonym hit number objects | Detailed list of how many times each individual synonym was mentioned in the text.

The total synonym hit number object

Element | Type | Description
synonym_hit_number | Integer | Number of times this synonym was mentioned in the input text. Every synonym hit is counted as one.
synonym_hit | String | A synonym that was mentioned at least once in the text.
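
The shape of the keyword stats object can be illustrated by counting synonym hits locally. This is a sketch under assumptions: the service's own matching rules are not documented here, whole-word case-sensitive counting is assumed, and keyword_stats is a hypothetical function name.

```python
import re


def keyword_stats(text, synonyms):
    """Build a keyword stats structure by counting whole-word synonym hits."""
    hits = []
    for synonym in synonyms:
        pattern = r"(?<!\w)" + re.escape(synonym) + r"(?!\w)"
        count = len(re.findall(pattern, text))
        if count > 0:
            # Only synonyms mentioned at least once are listed.
            hits.append({"synonym_hit_number": count, "synonym_hit": synonym})
    return {
        "total_keyword_hit_number": sum(h["synonym_hit_number"] for h in hits),
        "total_synonym_hit_numbers": hits,
    }


stats = keyword_stats(
    "I like this neighborhood, but the 4g net is terrible here.",
    ["net", "Net", "NET"],
)
print(stats["total_keyword_hit_number"])  # prints: 1
```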

The entity object

Element | Type | Description
entity_name | String | The recognized entity.
entity_type | String | The recognized entity's type.
entity_opinion_index | Double | A score that represents how positive or negative this entity is.
mention_number | Integer | The number of times this entity was mentioned in the text.
mentions | Array of strings | The various mentions where this entity was used.
related_pos_phrases | Array of nested related phrase objects | Other positive phrases that are related to this entity.
related_neg_phrases | Array of nested related phrase objects | Other negative phrases that are related to this entity.
related_entities | Array of nested related entity objects | Other entities that are related to this entity.

The recognized positive/negative phrase object

Element | Type | Description
phrase | String | The recognized positive or negative phrase in the text.
mention_number | Integer | The number of times this phrase occurred in the text.
entity_opinion_index | Double | A score that represents how positive or negative this phrase is.
mentions | Array of strings | The various mentions where this phrase was used.
related_pos_phrases | Array of nested related phrase objects | Other positive phrases that are related to this positive or negative phrase.
related_neg_phrases | Array of nested related phrase objects | Other negative phrases that are related to this positive or negative phrase.
related_entities | Array of nested related entity objects | Other entities that are related to this positive or negative phrase.

The related phrase object

Element | Type | Description
phrase | String | The recognized related phrase in the text.
related_mention_number | Integer | The number of times this phrase was mentioned together with the parent phrase in the text.
mentions | Array of strings | The various mentions where this related phrase was used together with the parent phrase.

The related entity object

Element | Type | Description
entity_name | String | The recognized related entity.
entity_type | String | The recognized related entity's type.
related_mention_number | Integer | The number of times this entity was mentioned together with the parent phrase/entity in the text.
mentions | Array of strings | The various mentions where this related entity was used together with the parent phrase/entity.

Sample response

{
    "results": [
        {
            "analyzed": true,
            "keyword": "net_keyword_1",
            "processing_time_in_ms": 325,
            "html_formatted_text": "I like this neighborhood, <span class=\"contain_keyword\">but the 4g <span class=\"synonym\">net</span> <span class=\"phrase_neg_lvl2 polarity_item\" title=\"-2 \">is terrible</span> here</span>.",
            "recognized_negative_phrases": [
                {
                    "mentions": [
                        " but the 4g net is terrible here"
                    ],
                    "related_pos_phrases": [],
                    "related_neg_phrases": [],
                    "related_entities": [
                        {
                            "entity_name": "4g",
                            "entity_type": "topic",
                            "related_mention_number": 1,
                            "mentions": [
                                " but the 4g net is terrible here"
                            ]
                        }
                    ],
                    "mention_number": 1,
                    "entity_opinion_index": -2.0,
                    "phrase": "is terrible"
                }
            ],
            "recognized_positive_phrases": [],
            "entities": [
                {
                    "mentions": [
                        " but the 4g net is terrible here"
                    ],
                    "related_pos_phrases": [],
                    "related_neg_phrases": [],
                    "related_entities": [
                        {
                            "entity_name": "is terrible",
                            "entity_type": "neg_phrase",
                            "related_mention_number": 1,
                            "mentions": [
                                " but the 4g net is terrible here"
                            ]
                        }
                    ],
                    "mention_number": 1,
                    "entity_opinion_index": -2.0,
                    "entity_name": "4g",
                    "entity_type": "topic"
                }
            ],
            "recognized_synonyms": [
                "net"
            ],
            "recognized_excludes": [],
            "recognized_filters": [
                "4g"
            ],
            "keyword_stats": {
                "total_keyword_hit_number": 1,
                "total_synonym_hit_numbers": [
                    {
                        "synonym_hit_number": 1,
                        "synonym_hit": "net"
                    }
                ]
            },
            "opinion_index": -2.0,
            "summarized_positive_opinion_index": 0.0,
            "summarized_negative_opinion_index": -2.0
        }
    ],
    "error_code": 0,
    "error_message": "",
    "call_id": "internal system 1",
    "input_length": 58,
    "total_processing_time_in_ms": 325
}
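
A result object such as the one above can be reduced to the fields a client typically needs. The sketch below uses hypothetical helper names (polarity_label, summarize_result) and a trimmed copy of the sample result. The polarity labels follow the opinion_index description: 0 is neutral, negative values are negative, positive values are positive.

```python
def polarity_label(opinion_index):
    """Translate an opinion index into a coarse polarity label."""
    if opinion_index > 0:
        return "positive"
    if opinion_index < 0:
        return "negative"
    return "neutral"


def summarize_result(result):
    """Pick the main findings out of one result object."""
    summary = {"keyword": result.get("keyword"), "analyzed": bool(result.get("analyzed"))}
    if not summary["analyzed"]:
        return summary
    summary["polarity"] = polarity_label(result.get("opinion_index", 0.0))
    summary["negative_phrases"] = [
        p["phrase"] for p in result.get("recognized_negative_phrases", [])
    ]
    summary["positive_phrases"] = [
        p["phrase"] for p in result.get("recognized_positive_phrases", [])
    ]
    summary["entities"] = [
        (e["entity_name"], e["entity_type"]) for e in result.get("entities", [])
    ]
    return summary


# Trimmed copy of the result in the sample response.
sample_result = {
    "analyzed": True,
    "keyword": "net_keyword_1",
    "recognized_negative_phrases": [{"phrase": "is terrible", "entity_opinion_index": -2.0}],
    "recognized_positive_phrases": [],
    "entities": [{"entity_name": "4g", "entity_type": "topic"}],
    "opinion_index": -2.0,
}
print(summarize_result(sample_result)["polarity"])  # prints: negative
```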

Error codes

Neticle error code | HTTP status code | Description
1 | 403 | This IP has been blacklisted. Please contact us at dev at neticle dot com for further details.
2 | 501 | This language is not implemented yet. Further details can be found on the https://api.neticle.com website. If you would like us to implement this language, please contact us on the sales at neticle dot com email address.
3 | 400 | Missing token parameter. This is a mandatory parameter, please always include it in the request. Further details can be found on the https://api.neticle.com website.
4 | 403 | This token is not valid. If you would like to renew it, please refer to the https://api.neticle.com website or contact us at dev at neticle dot com for further details.
5 | 400 | The input parameter is missing or empty. Please refer to the https://api.neticle.com website for further details.
6 | 400 | The format parameter is incorrect. For supported formats please refer to the https://api.neticle.com website.
7 | 400 | The provided version is not supported. Please refer to the https://api.neticle.com website for further details.
8 | 400 | The callId parameter is too long, it must be less than 255 characters. Please refer to the https://api.neticle.com website for further details.
100 | 500 | Something went wrong during the process. Please contact us at dev at neticle dot com for further details.
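
On the client side, the error codes can be mapped to an exception so that callers do not have to inspect error_code by hand. This is a minimal sketch: NeticleApiError and raise_for_error are hypothetical names, and the short labels in the mapping are paraphrases of the descriptions in the table, not messages returned by the API.

```python
# Neticle error code -> (HTTP status code, short paraphrased label)
NETICLE_ERRORS = {
    1: (403, "IP blacklisted"),
    2: (501, "language not implemented"),
    3: (400, "missing token"),
    4: (403, "invalid token"),
    5: (400, "input missing or empty"),
    6: (400, "incorrect format parameter"),
    7: (400, "unsupported version"),
    8: (400, "callid too long"),
    100: (500, "internal processing error"),
}


class NeticleApiError(Exception):
    """Raised when a decoded response carries a non-zero error_code."""

    def __init__(self, code, message):
        http_status, label = NETICLE_ERRORS.get(code, (None, "unknown error"))
        super().__init__("Neticle error %s (%s): %s" % (code, label, message))
        self.code = code
        self.http_status = http_status


def raise_for_error(response):
    """Raise NeticleApiError unless the response reports error_code 0."""
    code = response.get("error_code", 0)
    if code != 0:
        raise NeticleApiError(code, response.get("error_message", ""))


try:
    raise_for_error({"error_code": 4, "error_message": "This token is not valid."})
except NeticleApiError as exc:
    print(exc.code, exc.http_status)  # prints: 4 403
```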

Sample codes

CURL call example

curl --header "Content-Type: application/json" --request POST --data '{"language":"en","token":"token_from_neticle","input":"I like this neighborhood, but the 4g net is terrible here.","callid": "internal system 1","format":"json","replaceaccent":false,"lowercase":false,"keywords":[{"id": "net_keyword_1","synonyms":["net","Net","NET"],"excludes":["fish","Fish","FISH"],"filters":["3g","3G","4g","4G"]}]}' https://textanalysis.neticle.com/2.0/text_analysis

JAVA code example

Maven dependencies:

<dependency>
    <groupId>com.mashape.unirest</groupId>
    <artifactId>unirest-java</artifactId>
    <version>1.4.9</version>
</dependency>
<dependency>
    <groupId>com.google.code.gson</groupId>
    <artifactId>gson</artifactId>
    <version>2.8.5</version>
</dependency>

Java code example:

package com.neticle;

import com.google.gson.Gson;
import com.mashape.unirest.http.HttpResponse;
import com.mashape.unirest.http.JsonNode;
import com.mashape.unirest.http.Unirest;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TestApplication {
    
    public static void main(String[] args) {
        TestApplication obj = new TestApplication();
        obj.run();
    }
    
    public void run(){
        try {
            // Creating google json parser.
            Gson gson = new Gson();
            
            TextAnalysisKeyword textAnalysisKeyword = new TextAnalysisKeyword();
            textAnalysisKeyword.setId("net_keyword_1");
            textAnalysisKeyword.setSynonyms(Arrays.asList("net","Net","NET"));
            textAnalysisKeyword.setExcludes(Arrays.asList("fish","Fish","FISH"));
            textAnalysisKeyword.setFilters(Arrays.asList("3g","3G","4g","4G"));
            
            // Creating request object.
            TextAnalysisApiRequest textAnalysisApiRequest = new TextAnalysisApiRequest();
            textAnalysisApiRequest.setCallid("internal system 1");
            textAnalysisApiRequest.setFormat("json");
            textAnalysisApiRequest.setInput("I like this neighborhood, but the 4g net is terrible here.");
            textAnalysisApiRequest.setLanguage("en");
            textAnalysisApiRequest.setLowercase(false);
            textAnalysisApiRequest.setReplaceaccent(false);
            textAnalysisApiRequest.setToken("token_from_neticle");
            textAnalysisApiRequest.setKeywords(Arrays.asList(textAnalysisKeyword));
            
            // Sending request object in body as json.
            // Receiving json object as answer.
            HttpResponse<JsonNode> jsonResponse = Unirest.post("https://textanalysis.neticle.com/2.0/text_analysis")
                .header("content-type", "application/json")
                .body(gson.toJson(textAnalysisApiRequest))
                .asJson();
            
            // Printing result.
            System.out.println(jsonResponse.getBody().getObject().toString());
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }    
    }
    
    // Static nested class: the request is a plain data holder and needs no outer instance.
    public static class TextAnalysisApiRequest {
        private String language = "";
        private String token = "";
        private String input = "";
        private String callid = "";
        private String format = "json";
        private boolean replaceaccent = false;
        private boolean lowercase = false;
        private List<TextAnalysisKeyword> keywords = new ArrayList<>();

        public TextAnalysisApiRequest() {}
        public String getLanguage() {return language;}
        public void setLanguage(String language) {this.language = language;}
        public String getToken() {return token;}
        public void setToken(String token) {this.token = token;}
        public String getInput() {return input;}
        public void setInput(String input) {this.input = input;}
        public boolean isReplaceaccent() {return replaceaccent;}
        public void setReplaceaccent(boolean replaceaccent) {this.replaceaccent = replaceaccent;}
        public String getCallid() {return callid;}
        public void setCallid(String callid) {this.callid = callid;}
        public String getFormat() {return format;}
        public void setFormat(String format) {this.format = format;}
        public boolean isLowercase() {return lowercase;}
        public void setLowercase(boolean lowercase) {this.lowercase = lowercase;}
        public List<TextAnalysisKeyword> getKeywords() {return keywords;}
        public void setKeywords(List<TextAnalysisKeyword> keywords) {this.keywords = keywords;}
    }
    
    // Static nested class: the keyword is a plain data holder and needs no outer instance.
    public static class TextAnalysisKeyword {
        private String id = "";
        private List<String> synonyms = new ArrayList<>();
        private List<String> excludes = new ArrayList<>();
        private List<String> filters = new ArrayList<>();

        public TextAnalysisKeyword() {}
        public String getId() {return id;}
        public void setId(String id) {this.id = id;}
        public List<String> getSynonyms() {return synonyms;}
        public void setSynonyms(List<String> synonyms) {this.synonyms = synonyms;}
        public List<String> getExcludes() {return excludes;}
        public void setExcludes(List<String> excludes) {this.excludes = excludes;}
        public List<String> getFilters() {return filters;}
        public void setFilters(List<String> filters) {this.filters = filters;}
    }
}

PHP code example

Composer dependency:

php composer.phar require guzzlehttp/guzzle:~6.0

PHP code example:

<?php
require 'vendor/autoload.php';

$client = new \GuzzleHttp\Client();
$httpResponse = $client->post('https://textanalysis.neticle.com/2.0/text_analysis', [
    'json' => [
        'language' => 'en',
        'token' => 'token_from_neticle',
        'input' => 'I like this neighborhood, but the 4g net is terrible here.',
        'callid' => 'internal system 1',
        'format' => 'json',
        'replaceaccent' => false,
        'lowercase' => false,
        // keywords is an array of keyword objects, so the single keyword is wrapped in a list.
        'keywords' => [
            [
                'id' => 'net_keyword_1',
                'synonyms' => ['net', 'Net', 'NET'],
                'excludes' => ['fish', 'Fish', 'FISH'],
                'filters' => ['3g', '3G', '4g', '4G'],
            ],
        ],
    ],
]);

var_dump(json_decode($httpResponse->getBody()->getContents()));

Python code example

import requests

url = 'https://textanalysis.neticle.com/2.0/text_analysis'
body = {
  'language': 'en',
  'token': 'token_from_neticle',
  'input': 'I like this neighborhood, but the 4g net is terrible here.',
  'callid': 'internal system 1',
  'format': 'json',
  'replaceaccent': False,
  'lowercase': False,
  'keywords': [
    {
      'id': 'net_keyword_1',
      'synonyms': ['net','Net','NET'],
      'excludes': ['fish','Fish','FISH'],
      'filters': ['3g','3G','4g','4G']
    }
  ]
}
headers = {'Content-type': 'application/json'}

response = requests.post(url, json=body, headers=headers)

print(response.json())

semantic_api_v2.0.txt · Last modified: 2019/04/09 12:32 by zoltancsikos