Total found: 5231. Displayed: 200.
Publication date: 08-06-2018

ASPECT-LEVEL SENTIMENT ANALYSIS USING MACHINE LEARNING METHODS

Number: RU2657173C2

The invention relates to information extraction from natural language texts. The technical result is improved accuracy of aspect-level sentiment analysis of natural language texts. In the method of sentiment analysis of natural language texts, a syntactico-semantic analysis of a part of a natural language text is performed to obtain a plurality of syntactico-semantic structures. The syntactico-semantic structures are interpreted using a set of production rules to detect, in the part of the natural language text, an aspect term representing an aspect associated with a target entity. Using text characteristics obtained during the syntactico-semantic analysis, the value of a classifier function is computed to determine the sentiment associated with the aspect term. A report is generated containing a hierarchical list of aspect terms, including the identified aspects and the sentiments of the identified aspects. Using a training ...

Publication date: 15-07-2005

MACHINE-AIDED TRANSLATION TOOLS

Number: AT0000298111T
Assignee:

Publication date: 06-07-2017

Techniques for graph based natural language processing

Number: AU2014415625A1
Assignee: Cotters Patent & Trade Mark Attorneys

Techniques for graph based natural language processing are described. In one embodiment an apparatus may comprise a client service component operative on the processor circuit to receive a natural language user request from a device and to execute the natural language user request based on matched one or more objects and a social object relation component operative on the processor circuit to match the natural language user request to the one or more objects in an object graph, the object graph comprising token mappings for objects within the object graph, the token mappings based on data extracted from a plurality of interactions by a plurality of users of the network system, wherein the one or more objects are matched with the natural language user request based on the token mappings. Other embodiments are described and claimed.

Publication date: 28-02-2019

Method of and system for inferring user intent in search input in a conversational interaction system

Number: AU2013292377B2
Assignee: Spruson & Ferguson

A method of inferring user intent in search input in a conversational interaction system is disclosed. A method of inferring user intent in a search input includes providing a user preference signature that describes preferences of the user, receiving search input from the user intended by the user to identify at least one desired item, and determining that a portion of the search input contains an ambiguous identifier. The ambiguous identifier is intended by the user to identify, at least in part, a desired item. The method further includes inferring a meaning for the ambiguous identifier based on matching portions of the search input to the preferences of the user described by the user preference signature and selecting items from a set of content items based on comparing the search input and the inferred meaning of the ambiguous identifier with metadata associated with the content items.

Publication date: 17-08-2017

SYSTEM AND ENGINE FOR SEEDED CLUSTERING OF NEWS EVENTS

Number: AU2017200585A1
Assignee: AJ PARK

The present invention provides a seeded news event clustering and retrieval system configured to first create a candidate data set of documents, second create a set of initial clusters based on nearness or duplicate similarity status, and third create an aggregate cluster by merging initial clusters with seed documents. The invention generates top-level clusters for news events based on an editorially supplied topical label or "seed" component and generates sub-topic-focused clusters based on algorithm. The system uses an agglomerative clustering algorithm to gather and structure documents into distinct result sets. Decisions on whether to merge related documents or clusters are made according to similarity of evidence derived from two distinct sources, one, relying on a digital signature based on the unstructured text in the document, the other based on the presence of named entity tags that have been assigned to the document by an event or named entity tagger such as the Thomson Reuters ...

Publication date: 27-06-2019

Method and system for extraction of relevant sections from plurality of documents

Number: AU2018279013A1
Assignee: IP SOLVED (ANZ) PTY. LTD.

Embodiments of the present disclosure implement a method of extracting relevant sections from a plurality of documents by (a) receiving an input document from a user; (b) converting the input document to a standard text file; (c) classifying the standard text file to obtain a labelled text file associated with at least one cluster from a plurality of clusters; (d) extracting, from the labelled text file, a plurality of relevant entities associated with at least one cluster in the plurality of clusters; (e) annotating the standard text file with the extracted plurality of relevant entities to obtain an annotated enriched text file; (f) identifying a plurality of section boundaries to obtain sectioned data; and (g) extracting relevant sections of the plurality of documents based on the plurality of relationships associated with the set of relevant entities. ...

Publication date: 08-01-2019

Data processing method and apparatus, and mobile terminal

Number: CN0109165292A
Author: DUAN YAOHUI
Assignee:

Publication date: 07-12-2018

A named entity identification method and apparatus

Number: CN0108959262A
Assignee:

Publication date: 23-10-2014

CJK NAME DETECTION

Number: KR0101453937B1
Author:
Assignee:

Publication date: 18-09-2019

Number: KR0102022343B1
Author:
Assignee:

Publication date: 13-12-2019

METHOD, APPARATUS AND SYSTEM FOR NAMED ENTITY LINKING AND COMPUTER PROGRAM THEREOF

Number: KR1020190138623A
Author:
Assignee:

Publication date: 28-07-2016

APPARATUS OF IDENTIFYING ENTITY′S URI IN TEXT BASED ON URI DEFINITION STATEMENT AND ADDITIONAL INFORMATION, METHOD OF ESTIMATING TOPIC DISTRIBUTION, AND METHOD OF SELECTING URI

Number: KR1020160089847A
Assignee:

An apparatus of identifying an entity′s uniform resource identifier (URI) in text based on a URI definition statement and additional information, a method of estimating topic distribution, and a method of selecting a URI are disclosed. Here, the apparatus includes: a topic distribution estimating unit generating the topic distribution of a separate URI document from a topic distribution estimated with respect to a URI document set that includes a definition statement and additional information on all URIs; and a URI selecting unit extracting an entity surface type from query text when the query text including the entity surface type, which is a target to be identified for the URI, is input, estimating the topic distribution of the query text, and selecting a URI corresponding to the entity surface type, based on the topic distribution. COPYRIGHT KIPO 2016

Figure labels: (100) Input unit; (200) Topic distribution estimation unit; (300) URI selection unit; (400) Output unit; (AA) URI identification unit ...

Publication date: 10-08-2006

SHORT QUERY-BASED SYSTEM AND METHOD FOR CONTENT SEARCHING

Number: WO2006083974A2
Assignee:

Embodiments of the invention provide systems and methods for processing queries provided as short messages. Keywords can be extracted from short messages and specific meaning can be derived and attributed to the short messages based on various attributes and context associated with the message, time of day and a user. Responses are provided that comprise content from information sources identified as best-fit search result. Responses are provided that comprise a menu having options identifying plural high probability search results. An alert system is disclosed for generating and managing alerts based on search results. Based on search results, a user can be connected to an information service or a transactional system and can be provided with advertising, marketing and help information.

Publication date: 06-08-2009

FINANCIAL EVENT AND RELATIONSHIP EXTRACTION

Number: WO2009097558A9
Assignee:

For automated text processing, the inventors devised, among other things, an exemplary system (100) that automatically extracts financial events from various unstructured text based sources, such as press releases and news articles. Extracted events, such as mergers & acquisitions, earnings guidance reports, and actual earnings announcements, are represented as structured data records which can be linked, searched, and displayed and used as a basis for controlling accessing to the source documents and other related financial documents for named entities.

Publication date: 25-09-2003

NAMED ENTITY TRANSLATION

Number: WO0003079225A1
Assignee:

Translating named entities (110) from a source language to a target language. In general, in one implementation, the technique includes: generating potential translations of a named entity from a source language to a target language using a pronunciation-based and spelling-based transliteration model (210), searching a monolingual resource (220) in the target language for information relating to usage frequency, and providing output including at least one of the potential translations based on the usage frequency (230).
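
The frequency-based selection step in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the method claimed in the patent: `rank_candidates` is a hypothetical helper, the candidate list stands in for the output of a transliteration model, and the token list stands in for a monolingual target-language resource.

```python
from collections import Counter


def rank_candidates(candidates, target_corpus_tokens):
    """Order candidate translations by how often each occurs in a target-language corpus.

    Both arguments are illustrative stand-ins: `candidates` for the output of a
    transliteration model, `target_corpus_tokens` for a monolingual resource.
    """
    frequency = Counter(target_corpus_tokens)
    # sorted() is stable, so ties keep the order produced by the candidate generator.
    return sorted(candidates, key=lambda c: frequency[c], reverse=True)


# Toy data (invented for illustration, not taken from the patent).
corpus = ["clinton", "visited", "cairo", "clinton", "klinton", "spoke"]
candidates = ["klinton", "clinton", "clintun"]
best = rank_candidates(candidates, corpus)[0]
print(best)
```

A real system would query a much larger resource and would probably combine the usage frequency with the transliteration model's own score; the sketch uses frequency alone to keep the idea visible.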

Publication date: 07-01-2015

System and method for mapping text phrases to geographical locations

Number: EP2631814A3
Author: Bier, Eric A., Wu, Anna
Assignee:

A system and method for mapping text phrases to geographical locations is provided. Entities, each comprising one of a location, person, and place, are identified in one or more documents. Possible candidate locations associated with each entity are determined. An initial score is assigned to each location. The initial scores are adjusted and the candidate location with the highest adjusted score is selected for each entity. The selected candidate location is applied to all occurrences of the entity in the documents.
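
The score-adjust-select flow described here can be sketched as below. The abstract does not say how scores are adjusted, so the adjustment rule (rewarding regions that other entities also point to), the `boost` value, the `resolve_locations` helper and the toy data are all assumptions made for illustration.

```python
def resolve_locations(candidates, boost=0.5):
    """Pick one location per entity from scored candidates.

    `candidates` maps an entity string to a list of
    (location_name, region, initial_score) tuples. The adjustment rule
    (reward regions that other entities also point to) and the `boost`
    value are assumptions made for this sketch.
    """
    # Count how many entities have at least one candidate in each region.
    region_support = {}
    for options in candidates.values():
        for region in {region for _, region, _ in options}:
            region_support[region] = region_support.get(region, 0) + 1

    resolved = {}
    for entity, options in candidates.items():
        def adjusted_score(option):
            _, region, initial_score = option
            return initial_score + boost * (region_support[region] - 1)

        # Keep the candidate with the highest adjusted score.
        resolved[entity] = max(options, key=adjusted_score)[0]
    return resolved


# Toy data (invented for illustration).
candidates = {
    "Paris": [("Paris, France", "FR", 1.0), ("Paris, Texas", "US", 0.6)],
    "Dallas": [("Dallas, Texas", "US", 1.0)],
    "Austin": [("Austin, Texas", "US", 1.0)],
}
resolved = resolve_locations(candidates)

# The selected location is applied to every occurrence of the entity.
mentions = ["Paris", "Dallas", "Paris"]
print([resolved[m] for m in mentions])
```

With this toy input, the two Texas cities pull "Paris" toward its Texas candidate even though the French one has the higher initial score, which is the kind of effect an adjustment step is meant to allow.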

Publication date: 07-04-2010

Semantically-driven extraction of relations between named entities

Number: EP2172849A1
Assignee:

A system and method of developing rules for text processing enable retrieval of instances of named entities in a predetermined semantic relation (such as the DATE and PLACE of an EVENT) by extracting patterns from text strings in which attested examples of named entities satisfying the semantic relation occur. The patterns are generalized to form rules which can be added to the existing rules of a syntactic parser and subsequently applied to text to find candidate instances of other named entities in the predetermined semantic relation.

Publication date: 15-09-2010

CJK NAME DETECTION

Number: EP2227757A1
Author: WU, Jun, XU, Hui, ZHANG, Yifei
Assignee:

Publication date: 15-11-2008

PROCEDURE FOR THE NORMALIZATION OF LARGE AND SMALL LETTERS.

Number: AT0000413651T
Author: EJERHED EVA, EJERHED, EVA
Assignee:

Publication date: 15-11-2011

UNIFORM TREATMENT OF DATA SCARCENESS AND DATA OVER ADJUSTMENT WITH THE MAXIMUM ENTROPY MODELING

Number: AT0000531034T
Assignee:

Publication date: 07-02-2019

Systems and methods for disambiguating a term based on static and temporal knowledge graphs

Number: AU2017302650A1
Assignee: Spruson & Ferguson

Systems and methods are provided herein for determining a meaning of an ambiguous term in a text segment based on a context term, a static knowledge graph and a temporal knowledge graph. These systems and methods access a first knowledge graph associated with the context term to determine a potential term that is the meaning of the unknown term. Upon determining that there are multiple potential terms in the first knowledge graph that could be the meaning of the unknown term, the systems and methods take into account the temporal dimension of relationships between entities for disambiguating the meaning of the unknown term. The systems and methods achieve this by determining a time stamp of the text segment and accessing a second knowledge graph associated with the first context term and related to the time stamp to determine the potential term that is the meaning of the first unknown term.

Publication date: 05-09-2019

Systems and methods for automatic semantic token tagging

Number: AU2018214675A1

A computing system can receive a request to apply semantic token tagging on a specified domain, and can retrieve a set of data associated with the specified domain from a data storage facility. Canonical sequences can be formed from strings included in the data set. Each canonical sequence can be permutated to form sequence variations and each sequence variation can be verified against a generalized domain. Semantic token tagging can be applied to the specified domain using a subset of the sequence variations that are successfully verified as training data.
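
The permute-then-verify idea in this abstract might look like the sketch below. The function name `verified_variations`, the toy tags, and the validity rule are all hypothetical; in particular `product_comes_last` is only a stand-in for whatever "verification against a generalized domain" means in the patent.

```python
from itertools import permutations


def verified_variations(canonical_sequence, is_valid):
    """Permute a canonical token sequence and keep the variations that pass `is_valid`."""
    return [list(p) for p in permutations(canonical_sequence) if is_valid(p)]


# Toy canonical sequence of (token, tag) pairs; the tags are invented for this sketch.
canonical = [("red", "COLOR"), ("cotton", "MATERIAL"), ("shirt", "PRODUCT")]


def product_comes_last(sequence):
    # Hypothetical stand-in for verification against a generalized domain.
    return sequence[-1][1] == "PRODUCT"


training_data = verified_variations(canonical, product_comes_last)
for variation in training_data:
    print(" ".join(token for token, _ in variation))
```

Because each token keeps its tag through the permutation, every surviving variation is already a labeled example that could serve as training data for a tagger.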

Publication date: 26-10-2000

SYSTEM AND METHOD FOR ENHANCING DOCUMENT TRANSLATABILITY

Number: CA0002371120A1
Assignee:

A teletranslation system and method for enhancing document translatability. The teletranslation system translates a document from one natural language to another. The system comprises an aggregate filter having a plurality of sections, each section performing a specific process or processes on the document in a predetermined order, each section having at least one atomic filter, and at least one MT engine for translating the processed document. The aggregate filter comprises a format conversion section, a text improvement section, a word tagging section, and a translation section. The aggregate filter analyzes the document based on a source text, format information, and a target language. The method further comprises the step of gathering specific data on the document at some atomic filters during the preprocessing step of their first pass, and using such specific data during the post-processing step of their second pass.

Publication date: 09-08-2018

SYSTEMS AND METHODS FOR AUTOMATIC SEMANTIC TOKEN TAGGING

Number: CA0003052638A1
Assignee: CASSAN MACLEAN IP AGENCY INC.

A computing system can receive a request to apply semantic token tagging on a specified domain, and can retrieve a set of data associated with the specified domain from a data storage facility. Canonical sequences can be formed from strings included in the data set. Each canonical sequence can be permutated to form sequence variations and each sequence variation can be verified against a generalized domain. Semantic token tagging can be applied to the specified domain using a subset of the sequence variations that are successfully verified as training data.

Publication date: 22-03-2019

Method and device for identifying legal entity

Number: CN0109508458A
Author: CHEN HUAJIE
Assignee:

Publication date: 15-02-2019

Named entity recognition model training method, named entity recognition method and device

Number: CN0109344401A
Author: LIU JUN, CHEN ZI'AN
Assignee:

Publication date: 06-11-2018

Smart searching method and device and computer readable memory medium

Number: CN0108763529A
Author: ZHANG MIN, DONG XIAOZHENG
Assignee:

Publication date: 29-03-2019

A named entity model and system based on the combination of active learning and in-depth learning

Number: CN0109543181A
Author: ZHANG LIWEN, CHENG GUOGEN
Assignee:

Publication date: 21-12-2018

Trade name recognition method based on full-text attention mechanism

Number: CN0109062893A
Assignee:

Publication date: 29-11-2013

Natural language input processing method for recognition of language, involves providing set of contextual equipments, and validating and/or suggesting set of solutions that is identified and/or suggested by user

Number: FR0002991077A1
Author: LIGER FRANCOIS
Assignee: ERGONOTICS SAS

The present invention relates to a method for processing natural language input, characterized in that it comprises the steps of: (a) entering a sentence via an input interface of a computing device; (b) splitting the sentence, by processing means of the device, so as to extract from it a plurality of semantic entities (E), one of the semantic entities (E) being identified as an action (A), and the others being identified as values (Bs) of at least one attribute (B) associated with the identified action (A); (c) proposing, by the processing means, a value (Bs) for at least one attribute (B) associated with the action (A) for which no value (Bs) has yet been identified and/or proposed, as a function of the values (Bs) already identified and/or proposed and of contextual elements available to the device; (d) validation by the user of the solutions (Bs) identified and/or proposed. The invention further relates to a computer program product.

Publication date: 27-06-2013

DEFINITION EXTRACTION

Number: KR0101279707B1
Author:
Assignee:

Publication date: 27-08-2009

TERM IDENTIFICATION METHODS AND APPARATUS

Number: WO2009104023A2
Assignee:

A method of assigning an identifier to a mention of an entity in a document carried out by computing apparatus including a display and one or more user operable input devices. A plurality of candidate identifiers are received from a term identification module in respect of a mention of an entity in a document, each candidate identifier being a reference to an entity in connection with which entity property data is stored in one or more entity databases. A list is displayed in a first region of the display, the list having a plurality of user-selectable entries, each entry in the list concerning the entity referred to by one of the said plurality of candidate identifiers, each entry comprising properties of the respective entity. At least one of the said properties is retrieved from the said one or more entity databases. In response to the selection by a user of an entry in the list, additional properties of the entity which the selected entry concerns are displayed in a second region of ...

Publication date: 13-03-2008

Support apparatus for object-oriented analysis and design

Number: US20080065370A1
Author: Takashi Kimoto
Assignee:

A support apparatus for object-oriented analysis and design comprising: a sentence input section that receives as input a plurality of sentences of the requirement and the design paper of a system as scenario; a sentence pickup section that picks up the input plurality of sentences one by one sequentially from the sentence input section; a subject extracting section that extracts the noun that makes the subject from one of the sentences picked up as target; an object extracting section that extracts the noun as object object; a verb extracting section that extracts the verb of the sentence; and a continuous sentences linking condition confirming section that checks the link of the first sentence and the second sentence; and a class figure preparing section that prepares a class figure from the subject object of the first sentence and the subject object of the second sentence and the link condition.

Publication date: 22-03-2018

INTERNET CLOUD-HOSTED NATURAL LANGUAGE INTERACTIVE MESSAGING SYSTEM WITH ENTITY-BASED COMMUNICATION

Number: US20180083893A1
Assignee: Oracle International Corporation

Provided are methods, systems, and computer-program products for responding to a natural language communication, sending a response to request additional information from a user, and exposing an invocable method for accessing a virtual database. Some examples relate to a bot server that can respond to natural-language messages (e.g., questions or comments) through a messaging application using natural-language messages. Other examples relate to storage of event data associated with a web page or a mobile application. Event data can describe one or more actions performed in relation to the web page and/or the mobile application. Other examples relate to behavioral analytics of the event data.

Publication date: 09-09-2019

DETECTION OF TEXT FIELDS USING NEURAL NETWORKS

Number: RU2699687C1

The group of inventions relates to the field of computer engineering and can be used to detect text fields in electronic documents using neural networks. The technical result is improved accuracy of text field detection. The method comprises the steps of: extracting from an electronic document a plurality of word features, where the plurality of features includes a plurality of character vectors representing words present in the image; processing the plurality of word features using a neural network that includes pluralities of neural network layers; detecting, by a processing device, a plurality of text fields in the electronic document based on the output of the neural network and on spatial information indicating the location of said text fields in the electronic document; and assigning, by the processing device, each of the text fields to one of a plurality of field types based on the output of the neural network, taking into account an assignment based on ...

Publication date: 04-04-2002

A METHOD AND SYSTEM FOR DESCRIBING AND IDENTIFYING CONCEPTS IN NATURAL LANGUAGE TEXT FOR INFORMATION RETRIEVAL AND PROCESSING

Number: CA0002423964A1
Assignee:

A method for information retrieval that matches occurrences of concepts in natural language text documents against descriptions of concepts in user queries. Said method, implemented in a computer system, includes a preferred version of the method that comprises (1) annotating natural language text in documents and other text-forms with linguistic information and Concepts and Concept Rules expressed in a Concept Specification Language (CSL) for a particular domain, (2) pruning and optimizing synonyms for a particular domain, (3) defining and learning said CSL Concepts and Concept Rules, (4) checking user-defined descriptions of Concepts represented in CSL (including user queries), and (5) retrieval by matching said user-defined descriptions (and queries) against said annotated text. CSL is a language for expressing linguistically-based patterns. Said patterns can represent the linguistic manifestations of concepts in text. Said concepts may derive from the sublanguages used by experts to ...

Publication date: 23-10-2003

METHOD AND SYSTEM FOR DETECTING AND EXTRACTING NAMED ENTITIES FROM SPONTANEOUS COMMUNICATIONS

Number: CA0002481080A1
Assignee:

The invention concerns a method and system for detecting and extracting named entities from spontaneous communications (Fig. 1). The method may include recognizing input communications from a user (150), detecting contextual named entities (160) from the recognized input communications (150) and outputting the contextual named entities to a language understanding unit (170).

Publication date: 07-06-1994

MACHINE TRANSLATION APPARATUS HAVING A PROCESS FUNCTION FOR PROPER NOUNS WITH ACRONYMS

Number: CA0002020058C
Assignee: SHARP KK, SHARP KABUSHIKI KAISHA

A machine translation apparatus in which the sentence construction of a source language entered by means of an input device is analyzed in order to generate the corresponding translated text after being converted into a sentence construction in a target language, wherein the machine translation apparatus comprises a device for determining whether or not a word string obtained from a sentence construction analysis is a proper noun with an acronym, a device for examining whether or not the number of first letters of each of a certain number of words corresponds to the number of letters of the acronym, and also for examining whether or not these words are registered in a dictionary, and a device for outputting the corresponding term after it is translated into a target language, when the words are registered in the dictionary, and for outputting directly the words, whose number of first letters corresponds to the number of the letters of the acronym, without translating them, when the words ...
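
The acronym check described here can be illustrated with a short sketch. It is a simplification under stated assumptions: `words_matching_acronym` is a hypothetical helper, the words are assumed to be already tokenized, and the sketch compares the initials themselves rather than only counting them. The dictionary lookup mentioned in the abstract is left out.

```python
def words_matching_acronym(words, acronym):
    """Return the trailing run of `words` whose initials spell `acronym`, else None."""
    n = len(acronym)
    if n == 0 or len(words) < n:
        return None
    # Take as many words as the acronym has letters, counting back from the end.
    window = words[-n:]
    initials = "".join(w[:1] for w in window)
    return window if initials.upper() == acronym.upper() else None


# Words preceding an acronym such as "(IBM)" in the source sentence.
preceding = ["the", "International", "Business", "Machines"]
print(words_matching_acronym(preceding, "IBM"))
print(words_matching_acronym(preceding, "ABC"))
```

When a match is found, a translator following the abstract would then decide, based on its dictionary, whether to translate the matched words or to pass them through unchanged.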

Publication date: 04-12-2018

Man-machine conversation method and system based on semantic framework

Number: CN0108932278A
Assignee:

Publication date: 12-08-2019

Number: KR1020190094078A
Author:
Assignee:

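Publication date: 20-04-2018

Deep learning-based entity type classification method applying word feature weights

Number: KR0101837262B1
Author: 맹성현, 김부근, 강준영
Assignee: 한국과학기술원

... A method of classifying the type of an entity according to one embodiment may include: calculating weights based on word features from context words; converting the context words into weight values as the weights of the context words are calculated based on the word features; combining the weight values of the context words with the vector of each word to calculate weight-reflected vectors; and classifying the weight-reflected vectors calculated from the context words into an entity type based on a pre-trained artificial neural network.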

Publication date: 26-10-2006

SYSTEM AND METHOD FOR PARSING MEDICAL DATA

Number: WO000002006113298A3
Assignee:

Certain embodiments of the present invention provide a system and method for processing medical data. The method includes the steps of identifying text strings in medical data, associating the text strings with standardized identifiers from a library, and outputting the standardized identifiers associated with the text strings. In an embodiment, a report and/or an order including the standardized identifiers associated with the text strings may be printed and/or stored. In an embodiment, the library may be modified to accommodate the text strings. A user and/or software program may be used to review the text strings to associate the standardized identifiers with the text strings, for example. In an embodiment, the text strings may be deconstructed into plurality of sub-strings. A standardized identifier is then associated with each sub-string. The standardized identifiers may be numeric values, for example.
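
The lookup-with-fallback idea in this abstract can be sketched as follows. The `standardize` helper, the whitespace-based deconstruction into sub-strings, and the library contents with their numeric identifiers are all invented for illustration; the abstract does not specify any of them.

```python
def standardize(text, library):
    """Map a text string to standardized identifiers, falling back to its sub-strings."""
    key = text.strip().lower()
    if key in library:
        return [library[key]]
    # Deconstruct into sub-strings (here: whitespace-separated words) and
    # look each one up; None marks a sub-string the library does not cover.
    return [library.get(part) for part in key.split()]


# Toy library with made-up numeric identifiers.
library = {"chest x-ray": 1001, "chest": 17, "mri": 42}

print(standardize("Chest X-Ray", library))
print(standardize("chest MRI", library))
print(standardize("knee MRI", library))
```

A `None` in the result flags a string that would need review, or a library update, in the spirit of the abstract's remark that the library may be modified to accommodate the text strings.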

Publication date: 19-05-2011

METHOD AND SYSTEM FOR REDACTING AND PRESENTING DOCUMENTS

Number: WO2011059510A9
Author: AUMANN, Yehonatan
Assignee:

A method and system that automatically analyzes documents and redacts key elements of importance to potential purchasers of the document is disclosed. The present invention serves the dual purpose of presenting potential purchasers, or others, with versions of the document that more fully reflect the particular information contained in the document while not disclosing the critical key data points that are of most importance to the purchaser. Upon reviewing a redacted version of the subject document, the viewer may elect to purchase or otherwise obtain the original document or report with information un-redacted. Alternatively, the invention enables viewers to selectively purchase portions of the document in an à la carte fashion.

Publication date: 17-04-2008

MULTI-TIERED CASCADING CRAWLING SYSTEM

Number: WO000002008046098A3
Assignee:

Provided is a multi-tiered cascading crawling system for finding on a network information related to one or more predetermined topics or subtopics of interest. In general, embodiments of the present invention provide a system that operates in multiple "tiers," where at least some of the output of one tier is used to comprise the input of the next tier. Each tier generally analyzes collections of documents on the network using successively more restrictive criteria about the subject matter of each collection and/or about which collections may be related to the one or more topics or subtopics. In general, only the final tier performs an exhaustive crawl of all of the documents of the collections that are identified by the system as being relevant to the topic or subtopic of interest.

Publication date: 28-11-2013

SYSTEMS AND METHODS FOR DETECTING REAL NAMES IN DIFFERENT LANGUAGES

Number: WO2013177359A2
Assignee:

Systems and methods for detecting real names in different languages are described, including receiving a candidate name; determining a human language of the candidate name; disassembling a structure of the candidate name by applying a rule base for at least one of a character set, a meaning, and a format of the candidate name, wherein the rule base is unique to the determined human language; verifying at least a part of the disassembled structure of the candidate name with respect to actual real name information to generate a degree of confidence that the candidate name is an actual real name; and performing an action based on the generated degree of confidence that the candidate name is the actual real name.

Publication date: 22-06-2006

Bi-dimensional rewriting rules for natural language processing

Number: US20060136196A1
Assignee: Xerox Corporation

A linguistic rewriting rule for use in linguistic processing of an ordered sequence of linguistic tokens includes a token pattern recognition rule that matches the ordered sequence of linguistic tokens with a syntactical pattern. The token pattern recognition rule incorporates a character pattern recognition rule to match characters contained in an ambiguous portion of the ordered sequence of linguistic tokens with a character pattern defining a corresponding portion of the syntactical pattern.

Publication date: 03-10-2002

Automatically adding proper names to a database

Number: US20020143828A1
Assignee: Microsoft Corporation

The correct spelling of resolved email names is automatically stored in a custom dictionary. Thereafter, a spell checker will recognize the name during the spell checking process, because the name is stored in the custom dictionary and the name will not be incorrectly marked as a spelling error. When an email editor resolves an entered email name, the email editor checks an address book or email name cache to determine whether there is an email address and/or an email display name corresponding to the name. The display name will be used to replace the email name that the user entered into the TO field. Once an email name has been resolved and the display name is provided in the TO field, the email editor will make the name available for adding to a custom dictionary. Once the display name string has been added to the custom dictionary, the email editor and any other client of the custom dictionary (e.g., a word processor) will recognize the name as being properly spelled and the name will ...

Publication date: 05-12-2019

ARTIFICIAL INTELLIGENCE BASED-DOCUMENT PROCESSING

Number: US20190370397A1
Assignee: ACCENTURE GLOBAL SOLUTIONS LIMITED

An AI-based data processing system analyzes a received information request to generate an interactive visualization including data responsive to the information request. The information request is processed to obtain the primary entity and one or more informational items related to the primary entity. Auxiliary entities and informational items related to the primary entity are identified and searches are executed on a knowledge base and the internet. The results from the searches are analyzed to obtain knowledge nuggets which are included into a selected one of a visualization template to generate the interactive visualization. If it is determined via user interactions with the interactive visualization that an informational gap exists between the information request and the data in the interactive visualization, the interactive visualization can be updated to address the informational gap.

Publication date: 16-08-2018

CONVERSATIONAL VIRTUAL ASSISTANT

Number: US20180232376A1
Assignee: Microsoft Technology Licensing, LLC

Conversational virtual assistance for delivering relevant query solutions is provided. A virtual assistant system comprises various components associated with developing a knowledge database that can be searched for finding documents that fulfill the user's intent. The virtual assistant system further comprises components for receiving a query from a user, extracting entities for understanding the user's intent, and for searching a knowledge database for documents responsive to the query. When additional information is needed for determining more relevant results, a conversation strategy is determined, and a question is formulated for generating a conversation with the user for clarifying the user's intent, confirming a solution, or obtaining additional information. The user is enabled to provide a follow-up response that is related to a previously identified entity. The entity is edited in the query, and responses are refined responsive to the edited query.

Publication date: 30-01-1995

Number: JP0007007419B2

Publication date: 28-08-2018

Automatic extraction of named entities from text

Number: RU2665239C2

The invention relates to means of recognizing named entities in an unannotated text corpus. The technical result is improved efficiency of recognizing and tagging named entities in texts. A training set of natural-language texts is selected. A processor extracts a corresponding set of features for each category of named entities. The processor trains a classification model using the training set of texts and the feature sets for each category of named entities. The processor extracts tokens from unannotated text. The processor generates a set of attributes for each token of the unannotated text based on at least deep semantic-syntactic analysis. Possible syntactic links are determined in at least one sentence of the unannotated text, which includes obtaining a plurality of syntactic attributes. A language-independent semantic structure is formed, which includes determining semantic relationships ...

Publication date: 27-09-2018

JOB ALLOCATION

Number: AU2018201691A1
Assignee: Murray Trento & Associates Pty Ltd

Examples of job allocation are described herein. In an example, a job for allocation may be received. The job may be analyzed to obtain information pertaining to the job. The information may include at least one of a domain of the job and a priority level of the job. Further, performance of resources may be determined to provide resource information. The resource information may be determined using a supervised learning model including a job vector for each job type and a resource vector corresponding to each resource. The resource information may include a list of resources with at least one of a corresponding probability of each resource completing the job and a performance score of each resource. Based on the job information and the resource information, the resource may be recommended for the job using an expertise-estimation modeling technique and the job may be assigned to the recommended resource, accordingly.

Publication date: 23-12-2008

NAMED ENTITY TRANSLATION

Number: CA0002475857C
Assignee: UNIVERSITY OF SOUTHERN CALIFORNIA

Translating named entities (110) from a source language to a target language. In general, in one implementation, the technique includes: generating potential translations of a named entity from a source language to a target language using a pronunciation-based and spelling-based transliteration model (210), searching a monolingual resource (220) in the target language for information relating to usage frequency, and providing output including at least one of the potential translations based on the usage frequency (230).
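
The final selection step described above (choosing among potential translations by usage frequency in a target-language resource) can be sketched roughly as follows. This is only an illustration under stated assumptions: the candidate list stands in for the output of a transliteration model, the token list stands in for the monolingual resource, and `rank_candidates` is a hypothetical name, not something taken from the patent.

```python
from collections import Counter


def rank_candidates(candidates, target_corpus_tokens):
    """Order candidate translations by how often each occurs in a target-language corpus."""
    counts = Counter(token.lower() for token in target_corpus_tokens)
    # sorted() is stable, so ties keep the order produced by the (assumed) transliteration model.
    return sorted(candidates, key=lambda c: -counts[c.lower()])


# Hypothetical transliteration output for one source-language name.
candidates = ["Klinton", "Clinton", "Clenton"]
# Hypothetical stand-in for the monolingual target-language resource.
corpus = "president clinton met clinton aides while klinton was misspelled".split()

ranked = rank_candidates(candidates, corpus)
print(ranked)  # most frequently attested spelling first
```

A real system would presumably query a much larger resource and combine the frequency with the transliteration model's own score; the sketch uses frequency alone to keep the idea visible.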

Publication date: 16-02-2019

EXPERT KNOWLEDGE PLATFORM

Number: CA0003014309A1

An expert system processes communication data to extract entities and topics. The expert system generates relationship graphs and relationship scores between the entities and topics. The system can identify entities that are expert in a given topic. The expert system uses a knowledge engine to provide different services and applications.

Publication date: 25-09-2003

NAMED ENTITY TRANSLATION

Number: CA0002475857A1

Translating named entities (110) from a source language to a target language. In general, in one implementation, the technique includes: generating potential translations of a named entity from a source language to a target language using a pronunciation-based and spelling-based transliteration model (210), searching a monolingual resource (220) in the target language for information relating to usage frequency, and providing output including at least one of the potential translations based on the usage frequency (230).

Publication date: 05-06-2018

Based on the pseudo-data improved interrelated data to deal with natural language

Number: CN0108124477A

Publication date: 14-06-2019

DYNAMIC ENRICHMENT OF COMMUNICATION ITEMS

Number: CN0109891444A

Publication date: 27-11-2013

Conversation Management System and Method Thereof

Number: KR1020130128717A

Publication date: 15-03-2016

METHOD AND APPARATUS FOR VOICE OUTPUTTING FULL NAME OF UNIT OR ABBREVIATION

Number: KR1020160029587A

A method and an apparatus for voice outputting full name of a unit or an abbreviation are disclosed. The method for voice outputting full name of unit or abbreviation may comprise: detecting a unit or an abbreviation from text to be voice outputted; obtaining a full name of the detected unit or abbreviation by searching the detected unit or abbreviation in a full name (original word) database; and converting the obtained full name of the unit or abbreviation into a voice format and outputting the same. According to the present invention, the context of text content is analyzed and voice converted into appropriate terms, so precise contextual meaning information can be provided. It is very helpful for a visually impaired person having a poor accessibility to the web page as well as a general user. It can also provide a smart talkback service for the accessibility of a visually impaired person in a mobile as well as a web page, and read even an application of a third party within an appropriate ...
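
The detect, look up, and substitute steps in the abstract above can be illustrated with a small sketch. It assumes a tiny in-memory dictionary in place of the full-name database, and it performs plain lookup only: the context analysis mentioned in the abstract and the actual speech synthesis are not modeled. `FULL_NAMES` and `expand_for_speech` are hypothetical names.

```python
import re

# Hypothetical full-name database; a real system would load this from storage.
FULL_NAMES = {
    "km": "kilometers",
    "kg": "kilograms",
    "WHO": "World Health Organization",
}


def expand_for_speech(text, full_names=FULL_NAMES):
    """Replace each known unit or abbreviation with its full name before speech output."""
    # Longest keys first so that longer entries win over entries that are their prefixes.
    keys = sorted(full_names, key=len, reverse=True)
    # Require non-letter boundaries so that "km" inside a longer word is left alone.
    pattern = re.compile(
        r"(?<![A-Za-z])(" + "|".join(map(re.escape, keys)) + r")(?![A-Za-z])"
    )
    return pattern.sub(lambda m: full_names[m.group(1)], text)


spoken = expand_for_speech("The WHO office is 5 km away.")
print(spoken)
```

The expanded string would then be handed to whatever text-to-speech component the system uses.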

Publication date: 10-01-2020

Proactive integration of unsolicited content to human-to-human computer conversations

Number: KR1020200003871A

Publication date: 19-01-2005

RECOGNIZING WORDS AND THEIR PARTS OF SPEECH IN ONE OR MORE NATURAL LANGUAGES

Number: KR1020050007547A
Author: PARK YOUNGJA

The present invention recognizes one or more words not listed in a dictionary database. If no valid word is obtained, a statistical process checks one or more sequences of a subset of two or more characters in the word to determine a probability that the word is a valid word. In alternative embodiments, the invention includes a prefix removal process, a suffix removal process, a root process, and/or a combination process.

Publication date: 04-12-2008

DETECTING NAME ENTITIES AND NEW WORDS

Number: WO000002008144964A1

Various aspects can be implemented for detecting name entities and/or new words from input entries. In general, one aspect can be a method that includes receiving an input entry comprising a text string. The method also includes identifying segmentation information from the input entry. The method further includes generating a candidate text string from the text string of the input entry based on the segmentation information. Other implementations of this aspect include corresponding systems, apparatus, and processing engines.

Publication date: 06-05-2010

METHOD OF COMPUTERIZED SEMANTIC INDEXING OF NATURAL LANGUAGE TEXT, METHOD OF COMPUTERIZED SEMANTIC INDEXING OF COLLECTION OF NATURAL LANGUAGE TEXTS, AND MACHINE-READABLE MEDIA

Number: WO2010050844A1

The present invention relates to the field of information technology, namely, to methods of computerized semantic indexing of natural language texts. The present invention extends the set of methods for indexing natural language texts by employing techniques of computerized linguistic analysis and using the obtained results to build indices, which ensures semantic navigation through documents and document collections, as well as highly precise and quick search of facts and documents relevant to the user's information needs, particularly for texts in highly inflectional languages. The method of computerized semantic indexing of natural language text comprises steps of: segmenting the text in the electronic form into tokens; identifying stable phrases; forming sentences; by addressing the linguistic and heuristic rules formed in the database in the predetermined linguistic environment, identifying the semantically ...

Publication date: 28-04-2011

Term identification method and apparatus

Number: JP2011513810A

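... A method, performed by a computer device that includes a display and one or more user-operable input devices, of assigning an identifier to a mention of an entity in a document. A plurality of candidate identifiers for the mention of the entity in the document are received from a term identification module, each candidate identifier being a reference to an entity for which entity property data is stored in one or more entity databases. A list is displayed in a first region of the display; the list has a plurality of user-selectable entries, each entry in the list relates to an entity referenced by one of the plurality of candidate identifiers, and each entry includes properties of the respective entity. At least one of the properties is retrieved from the one or more entity databases. In response to the user's selection of an entry in the list, additional properties of the entity for the selected entry are displayed in a second region of the display, the additional properties being retrieved at least in part from the one or more databases. In response to an identifier assignment instruction received from the user for the selected entity of a list entry, the identifier of the selected entity is assigned as the identifier of the mention of the entity. A filter is provided so that the user can restrict the entities for which list entries are presented to those meeting user-specified criteria. The properties displayed in the first and second regions of the display can be customized for different domains and applications. ...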

Publication date: 21-12-2017

INPUT ENTITY IDENTIFICATION FROM NATURAL LANGUAGE TEXT INFORMATION

Number: AU2016269573B2
Assignee: Murray Trento & Associates Pty Ltd

A device may include one or more processors. The device may receive text to be processed to identify input entities included in the text. The device may identify text sections of the text. The device may generate a list of terms included in the text sections of the text. The device may perform one or more feature extraction techniques, on the terms included in the text sections, to identify the input entities included in the text. The device may generate information that identifies the input entities included in the text, based on performing the one or more feature extraction techniques. The device may provide the information that identifies the input entities included in the text.

Publication date: 14-11-2019

NATURAL LANGUAGE PROCESSING AND ARTIFICIAL INTELLIGENCE BASED SEARCH SYSTEM

Number: AU2019201244A1
Assignee: Murray Trento & Associates Pty Ltd

In some examples, natural language processing (NLP) and artificial intelligence based searching may include identifying named entities in text from a corpus of documents. References in the text may be resolved with the identified named entities. Links between the named entities may be determined, and a bi-direction rootless graph may be generated. Semantic relationships may be determined from text of the named entities, and blacklist keywords may be identified. Machine learning classification may be performed based on a pair of the named entities and a blacklist keyword. A classification may be determined based on the pair of named entities and the blacklist keyword, and a rule may be identified that specifies which named entity in the pair is to be flagged. Further, a node in the graph may be flagged based on an association with the named entity identified according to the rule.

Publication date: 02-05-2019

Method for recognizing network text named entity based on neural network probability disambiguation

Number: AU2017416649A1
Assignee: WRAYS PTY LTD

A method for recognizing network text named entity based on neural network probability disambiguation. The method comprises: carrying out word segmentation on an unlabeled corpus, using Word2Vec to extract a word vector; converting a sample corpus into a word feature matrix and windowing same; building a deep neural network to carry out training, and adding a softmax function into an output layer of the neural network to carry out normalization processing, so as to obtain a probability matrix of the named entity category corresponding to each word; and re-windowing the probability matrix, and using a conditional random field model to carry out disambiguation, so as to obtain a final named entity annotation. In a named entity recognition task of network text, a word vector increment learning method without changing the structure of a neural network is provided, according to the characteristic that a network vocabulary and a new vocabulary exist therein, and a probability disambiguation method ...

Publication date: 07-11-2003

METHOD, SYSTEM, AND APPARATUS FOR CONVERTING NUMBERS BETWEEN MEASUREMENT SYSTEMS BASED UPON SEMANTICALLY LABELED STRINGS

Number: CA0002427797A1

A method, system, and apparatus are provided for converting numbers between different units of measurement. When a string of text is entered into an application program, the string is analyzed to determine whether the string of text includes a number having an associated unit name. If the string of text includes a number having an associated unit name, the number is semantically labeled with schema information that identifies the unit name as a source unit of measurement. When a selection is received of the number or an indication that the number is labeled with schema information, a list of actions may be provided identifying conversion options available for the number. One of the conversion options may then be selected resulting in the conversion of the selected number to the destination unit of measure identified by the selected action. The converted number may then be inserted into the string of text to replace the selected number along with a unit name corresponding to the destination ...
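
The label-then-convert flow in the abstract above can be sketched as follows. This is a minimal illustration, not the patented mechanism: a regular expression stands in for semantic labeling, the conversion table is a hypothetical two-entry example, and the sketch always converts the first labeled number instead of offering the user a list of actions.

```python
import re

# Hypothetical conversion table: source unit -> (destination unit, multiplier).
CONVERSIONS = {
    "miles": ("kilometers", 1.609344),
    "pounds": ("kilograms", 0.45359237),
}

# A number followed by one of the known unit names.
NUMBER_WITH_UNIT = re.compile(r"(\d+(?:\.\d+)?)\s*(" + "|".join(CONVERSIONS) + r")\b")


def label_numbers(text):
    """Return (start, end, value, unit) for every number followed by a known unit name."""
    return [
        (m.start(), m.end(), float(m.group(1)), m.group(2))
        for m in NUMBER_WITH_UNIT.finditer(text)
    ]


def convert_first(text):
    """Replace the first labeled number and its unit with the converted value and unit."""
    labels = label_numbers(text)
    if not labels:
        return text
    start, end, value, unit = labels[0]
    dest_unit, factor = CONVERSIONS[unit]
    return f"{text[:start]}{value * factor:.1f} {dest_unit}{text[end:]}"


converted = convert_first("The trail is 10 miles long.")
print(converted)
```

Keeping the labeling step separate from the conversion step mirrors the abstract's split between recognizing a number with a unit name and acting on it later.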

Publication date: 11-11-1999

MACHINE-ASSISTED TRANSLATION TOOLS

Number: CA0002331184A1

The present invention provides an improved method and apparatus for translating a source language to a target language. The invention uses placeables (e.g., proper nouns, titles and names, dates, times, units and measurements, numbers, formatting information such as tags or escape sequences, styles, graphics, hyperlinks) to spare the translator from retyping information that does not need to be translated and, where necessary, to provide conversions to the target locale, for example for speeds.

Publication date: 06-12-2018

METHOD FOR RECOGNIZING NETWORK TEXT NAMED ENTITY BASED ON NEURAL NETWORK PROBABILITY DISAMBIGUATION

Number: CA0003039280A1
Assignee: SMITHS IP

A method for recognizing network text named entity based on neural network probability disambiguation. The method comprises: carrying out word segmentation on an unlabeled corpus, using Word2Vec to extract a word vector; converting a sample corpus into a word feature matrix and windowing same; building a deep neural network to carry out training, and adding a softmax function into an output layer of the neural network to carry out normalization processing, so as to obtain a probability matrix of the named entity category corresponding to each word; and re-windowing the probability matrix, and using a conditional random field model to carry out disambiguation, so as to obtain a final named entity annotation. In a named entity recognition task of network text, a word vector increment learning method without changing the structure of a neural network is provided, according to the characteristic that a network vocabulary and a new vocabulary exist therein, and a probability disambiguation method ...

Publication date: 22-03-2018

SYSTEMS AND METHODS FOR ADAPTIVE PROPER NAME ENTITY RECOGNITION AND UNDERSTANDING

Number: CA0003036998A1
Assignee: SMITHS IP

Various embodiments contemplate systems and methods for performing automatic speech recognition (ASR) and natural language understanding (NLU) that enable high accuracy recognition and understanding of freely spoken utterances which may contain proper names and similar entities. The proper name entities may contain or be comprised wholly of words that are not present in the vocabularies of these systems as normally constituted. Recognition of the other words in the utterances in question, e.g. words that are not part of the proper name entities, may occur at regular, high recognition accuracy. Various embodiments provide as output not only accurately transcribed running text of the complete utterance, but also a symbolic representation of the meaning of the input, including appropriate symbolic representations of proper name entities, adequate to allow a computer system to respond appropriately to the spoken request without further analysis of the user's input.

Publication date: 12-10-2010

METHOD AND SYSTEM FOR DETECTING AND EXTRACTING NAMED ENTITIES FROM SPONTANEOUS COMMUNICATIONS

Number: CA0002481080C
Assignee: AT&T CORP.

The invention concerns a method and system for detecting and extracting named entities from spontaneous communications (Fig. 1). The method may include recognizing input communications from a user (150), detecting contextual named entities (160) from the recognized input communications (150), and outputting the contextual named entities to a language understanding unit (170).

Publication date: 02-03-2016

Entity extraction feedback

Number: CN0105378706A
Author: BLANCHFLOWER SEAN

Publication date: 07-05-2019

INFORMATION PROCESSING METHOD, INFORMATION PROCESSING APPARATUS, AND PROGRAM

Number: CN0109726269A

Publication date: 04-06-2019

System and method for visually understanding and programming conversational agents of electronic devices

Number: CN0109840089A
Author: YAO XUCHEN, CHEN GUOGUO

Publication date: 08-01-2019

Electronic medical record information extraction method, device and apparatus

Number: CN0109166608A
Author: FAN FANGLI

Publication date: 11-06-2014

Definition extraction

Number: CN101233484B

Publication date: 30-09-2016

GEOLOCATION SYSTEM BY LINGUISTIC ANALYSIS

Number: FR0002948791B1
Assignee: GEOLSEMANTICS

Publication date: 06-07-2017

METHOD FOR AUTOMATICALLY EXTRACTING FOOD HAZARD EVENT FROM NEWS AND SOCIAL NETWORK SERVICE (SNS) DATA IN REAL TIME, AND SYSTEM THEREFOR

Number: KR1020170077397A

Disclosed are a method for automatically extracting a food hazard event from news and social network service (SNS) data in real time, and a system therefor. In order to automatically extract and share food hazard events from a vast amount of news and SNS data, according to the present invention, real-time food hazard events appearing in the news and SNS of each country are automatically extracted and shared by analyzing food hazard event properties, defining and automatically expanding a food hazard event template, automatically generating and expanding food hazard information search terms, extracting food hazard event information, and using a sharing module. Accordingly, information on a food hazard event occurring at any time and in any place is extracted in real time, minimizing the damage caused by a food hazard factor, and the extracted information is shared with related organizations, companies, and the like so that it can be used for food safety precautions.

Publication date: 11-10-2007

DISAMBIGUATION OF NAMED ENTITIES

Number: WO2007115266A2

Named entities are disambiguated in search queries and other contexts using a disambiguation scoring model. The scoring model is developed using a knowledge base of articles, including articles about named entities. Various aspects of the knowledge base, including article titles, redirect pages, disambiguation pages, hyperlinks, and categories, are used to develop the scoring model.

Publication date: 24-12-2003

RECOGNIZING WORDS AND THEIR PARTS OF SPEECH IN ONE OR MORE NATURAL LANGUAGES

Number: WO0003107217A1
Author: PARK, Youngja

The present invention recognizes one or more words not listed in a dictionary database. If no valid word is obtained, a statistical process checks one or more sequences of a subset of two or more characters in the word to determine a probability that the word is a valid word. In alternative embodiments, the invention includes a prefix removal process, a suffix removal process, a root process, and/or a combination process.
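
One way to picture the statistical check on character sequences is a character-bigram model trained on known words. The sketch below is an assumption-laden illustration: the abstract does not say which statistic is used, so add-one-smoothed bigram log-probabilities are swapped in here, and `train_bigrams` and `word_score` are hypothetical names.

```python
from collections import Counter
import math


def train_bigrams(lexicon):
    """Count character bigrams (with ^ and $ boundary markers) over known words."""
    counts = Counter()
    for word in lexicon:
        padded = f"^{word.lower()}$"
        counts.update(padded[i:i + 2] for i in range(len(padded) - 1))
    return counts


def word_score(word, counts):
    """Average log-probability of the word's bigrams, using add-one smoothing."""
    total = sum(counts.values())
    vocab = len(counts) + 1  # the extra slot reserves probability mass for unseen bigrams
    padded = f"^{word.lower()}$"
    bigrams = [padded[i:i + 2] for i in range(len(padded) - 1)]
    return sum(math.log((counts[b] + 1) / (total + vocab)) for b in bigrams) / len(bigrams)


# Hypothetical stand-in for the dictionary database.
counts = train_bigrams(["table", "cable", "stable", "label", "bale"])

plausible = word_score("sable", counts)    # shares most bigrams with the lexicon
implausible = word_score("xqzvk", counts)  # shares none
print(plausible > implausible)
```

A threshold on the score would then decide whether an out-of-dictionary string is accepted as a probable word; choosing that threshold is left open here.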

Publication date: 14-12-2017

EVENT EXTRACTION FROM DOCUMENTS

Number: US20170357625A1
Assignee: NORTHROP GRUMMAN SYSTEMS CORPORATION

Systems and methods are provided for indexing a document according to identified events. An event-based indexing system includes a source interface configured to receive the document from an associated data source and format the document for processing and an indexer configured to extract event mentions from the document, with a given event mention comprising a verb and at least one of a subject and an object of the verb. A document index is configured to store the extracted event mentions such that a given document from an associated document corpus can be retrieved according to its associated event mentions ...

Publication date: 28-11-2019

SYSTEMS AND METHODS FOR AUTO DISCOVERY OF FILTERS AND PROCESSING ELECTRONIC ACTIVITIES USING THE SAME

Number: US20190361929A1
Assignee: People.ai, Inc.

The present disclosure relates to systems and methods for filtering electronic activities. Exemplary implementations may include ingesting a first electronic activity; identifying an associated entity; and selecting a first filtering model based on the entity, the first filtering model trained to indicate whether to restrict further processing of ingested electronic activities. The method may further include generating a plurality of structured data tags for the first electronic activity; applying the selected first filtering model to the plurality of structured data tags for the first electronic activity to determine whether the first electronic activity satisfies a first restriction condition; and responsive to the first electronic activity satisfying the first restriction condition, restricting the first electronic activity from further processing; or responsive to the first electronic activity not satisfying the first restriction condition, further processing, by the one or more processors ...

Publication date: 16-12-2009

NAME INDEXING FOR NAME MATCHING SYSTEMS

Number: EP2132648A2

Publication date: 03-10-2007

LOCAL ITEM EXTRACTION

Number: EP0001839211A1
Author: RILEY, Michael Dennis

Publication date: 04-08-2010

NLP-BASED CONTENT RECOMMENDER

Number: EP2212772A1

Publication date: 16-01-1998

INFORMATION RECOGNITION DEVICE

Number: JP0010011434A
Author: SHIMOMURA HIDEKI

PROBLEM TO BE SOLVED: To enable an information recognition device which recognizes an address, etc., to recognize recognition object information, which is inputted without specifying word sections and elements, quickly and accurately. SOLUTION: An element word recognition means 1a finds element word candidates for respective elements of the recognition object information and the likelihood of each element word candidate. Then a record number acquisition means 1d performs retrieval from a record storage part 1e and acquires the record numbers of records containing the respective element word candidates found by the element word recognition means 1a. Then a likelihood calculation means 1f finds the likelihood of each record by using a likelihood counter corresponding to each record. A result decision means 1h decides a record as the recognition result of the recognition object information according to the count value of the likelihood counter and the result taking-out means takes the record ...

Publication date: 22-06-2010

SYSTEM, METHOD, PROGRAM PRODUCT, AND NETWORKING USE FOR RECOGNIZING WORDS AND THEIR PARTS OF SPEECH IN ONE OR MORE NATURAL LANGUAGES

Number: CA0002488814C
Author: PARK, YOUNGJA

The present invention recognizes one or more words not listed in a dictionary database. If no valid word is obtained, a statistical process checks one or more sequences of a subset of two or more characters in the word to determine a probability that the word is a valid word. In alternative embodiments, the invention includes a prefix removal process, a suffix removal process, a root process, and/or a combination process.

Publication date: 05-09-2019

AUTOMATED COMMUNICATION DESIGN CONSTRUCTION SYSTEM

Number: CA0003010039A1
Assignee: RICHES, MCKENZIE & HERBERT LLP

An automated communication design analysis and construction system includes one or more intelligent communication design servers, comprising: a normalization module that converts communication content files for different recipients to normalized intermediate format files; an objects identification and quantification module that identifies text objects and image objects in the normalized intermediate format files; a cross recipient group analysis module configured to identify static global objects that are invariant between recipients, data variables, and variable global objects that vary between recipients in the normalized intermediate format files; and an intelligent communication content learning and constructing engine that can construct standard communication design files based on the static global objects, the data variables, and the variable global objects. A data storage stores the communication content files and the standard communication design files. A communication ...

Publication date: 12-03-2019

Answer input method, device, storage medium and electronic device

Number: CN0109460503A

Publication date: 04-05-2018

Method and device for implementing entity disambiguation, storage medium and program product

Number: CN0107992480A

Publication date: 13-06-2013

ORTHOGRAPHICAL VARIANT DETECTION APPARATUS AND ORTHOGRAPHICAL VARIANT DETECTION PROGRAM

Number: US20130151239A1

Provided is an orthographical variant detection apparatus which detects orthographical variant candidates with high precision. The orthographical variant detection apparatus includes a term extraction unit that extracts terms from document data, a similarity computation unit that computes the similarity of an arbitrary pair of the extracted terms, an orthographical variant candidate determination unit that determines, based on the similarity, whether or not the terms in the pair are orthographical variant candidates, and a group classification unit that groups the orthographical variant candidates based on a character string commonly included in the pair of terms determined to be orthographical variant candidates.
1. An orthographical variant detection apparatus comprising: a term extraction unit that extracts terms from document data; a similarity computation unit that computes similarity of an arbitrary pair of the extracted terms; an orthographical variant candidate determination unit that determines, based on the similarity, whether or not the pair of terms are orthographical variant candidates; and a group classification unit that groups the orthographical variant candidates based on a character string commonly included in the pair of terms as the orthographical variant candidates.
2. An orthographical variant detection apparatus comprising: a term extraction unit that extracts terms from document data; a similarity computation unit that performs operations including character type replacing on an arbitrary pair of the extracted terms, computes an edit distance based on the number of times of the operations, and computes similarity based on the edit distance; and an orthographical variant candidate determination unit that determines, based on the similarity, whether or not the pair of terms are orthographical variant candidates.
3. The orthographical variant detection apparatus according to claim 1, wherein the similarity computation unit obtains the edit distance by ...
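
The similarity computation and candidate determination described above can be sketched with a plain Levenshtein distance. This is an approximation under stated assumptions: the claims mention operations such as character type replacing, which are not modeled here, the 0.8 threshold is an arbitrary illustrative value, and the grouping unit is omitted. All function names are hypothetical.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with two rolling rows of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute (free if the characters match)
            ))
        prev = curr
    return prev[-1]


def similarity(a, b):
    """Map the edit distance to a 0..1 similarity, normalizing by the longer term."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1 - edit_distance(a, b) / longest


def variant_candidates(terms, threshold=0.8):
    """Return every pair of terms whose similarity meets the (assumed) threshold."""
    return [
        (a, b)
        for i, a in enumerate(terms)
        for b in terms[i + 1:]
        if similarity(a, b) >= threshold
    ]


pairs = variant_candidates(["colour", "color", "database", "data base"])
print(pairs)
```

Comparing every pair is quadratic in the number of terms, which is fine for a sketch; a production system would likely prune candidate pairs before computing distances.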

Publication date: 27-06-2013

Method and apparatus for rating documents and authors

Number: US20130166282A1
Author: Peter Ridge, Tim Musgrove
Assignee: Federated Media Publishing LLC

Methods and apparatus for determining a competence rating of an author relating to one or more topics are disclosed. An exemplary method comprises determining semantic information associated with one or more documents related to the one or more topics, determining amplification information associated with the one or more documents, determining occurrence information associated with the author; and determining a competence rating for the author based at least in part on the semantic information associated with the one or more documents, the amplification information associated with the one or more documents, and the occurrence information associated with the author. A document rating for at least one of the one or more documents may also be determined based at least in part on the one or more weighted semantic features and the amplification information.

Publication date: 27-06-2013

System and Method of Spoken Language Understanding in Human Computer Dialogs

Number: US20130166284A1
Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.

A system and method are disclosed that improve automatic speech recognition in a spoken dialog system. The method comprises partitioning speech recognizer output into self-contained clauses, identifying a dialog act in each of the self-contained clauses, qualifying dialog acts by identifying a current domain object and/or a current domain action, and determining whether further qualification is possible for the current domain object and/or current domain action. If further qualification is possible, then the method comprises identifying another domain action and/or another domain object associated with the current domain object and/or current domain action, reassigning the another domain action and/or another domain object as the current domain action and/or current domain object and then recursively qualifying the new current domain action and/or current object. This process continues until nothing is left to qualify.
1. A method comprising: partitioning, via a processor, a speech recognizer output into independent clauses; identifying, independent of domain, a dialog act for each of the independent clauses; identifying, dependent on domain, an object within each of the independent clauses; and recursively generating, for each independent clause in the independent clauses, a semantic representation using the dialog act and the object of each independent clause.
2. The method of claim 1, wherein the semantic representation is used by a dialog manager in a spoken dialog system to determine a response to a user input.
3. The method of claim 1, further comprising: identifying, dependent on domain, an action within each of the independent clauses, wherein recursively generating the semantic representation further comprises using the action.
4. The method of claim 1, wherein while recursively generating the semantic representation, additional objects are extracted from each of the independent clauses.
5. The method of claim 1, wherein identifying the object comprises ...

Publication date: 18-07-2013

Format for displaying text analytics results

Number: US20130185058A1
Assignee: DW ASSOC LLC

A system can receive text. The text can be divided into various portions. One or more significance indicators can be associated with each portion of text: these significance indicators can also be received by the system. The system can then display a portion of text and the associated significance indicators to the user.

Publication date: 08-08-2013

METHOD FOR LABELING SEMANTIC ROLE OF BILINGUAL PARALLEL SENTENCE PAIR

Number: US20130204606A1
Author: Zhuang Tao, Zong Chengqing

Methods for Semantic Role Labeling (SRL) of bilingual sentence pairs. Steps in this invention include tokenizing and part-of-speech tagging a bilingual sentence pair, performing word alignments on the bilingual sentence pair, finding word-aligned predicate pairs in the bilingual sentence pair, generating argument candidates for each predicate using monolingual SRL system, and performing joint inference to obtain the SRL results and argument alignment for each predicate pair. This method produces more accurate SRL results on both sides of bilingual sentence pairs. Furthermore, this method also aligns the predicate-argument structures between the sentence pairs.

1. A method of bilingual Semantic Role Labeling (SRL), the method comprising: Step 1: Tokenizing bilingual sentence pair and finding word-aligned predicate pairs in a bilingual sentence pair; Step 2: For each predicate, using monolingual SRL method to generate argument candidates; Step 3: For each predicate, merging its duplicate argument candidates into one; and Step 4: For each word-aligned predicate pair, performing joint inference to obtain their arguments and the alignment between these arguments.
2. The method of claim 1, wherein the step 1 further comprises tokenizing each sentence in bilingual sentence pair and performing part-of-speech tagging and word-alignment for the bilingual sentence pair.
3. The method of claim 1, wherein the step 1 further comprises finding the verb pairs that align to each other, wherein the word-aligned word pairs are examined one by one and all word-aligned verb pairs are identified.
4. The method of claim 1, wherein the step 2 further comprises: Step 2.1: For each sentence, using multiple parsing models to produce several parse trees; and Step 2.2: For each predicate, performing monolingual SRL multiple times to obtain several argument candidates, each time using one parse tree generated in Step 2.1.
5. The method of claim 4, wherein in Step 2.1, multiple ...
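
The verb-pair search described in claim 3 (examining word-aligned pairs one by one and keeping those where both sides are verbs) can be illustrated with a minimal Python sketch. The function name `aligned_predicate_pairs`, the tag lists, and the rule "a tag starting with V marks a verb" are assumptions made for illustration; the record does not specify a tag set or data layout.

```python
def aligned_predicate_pairs(src_tags, tgt_tags, alignments):
    """Return the word-aligned pairs (i, j) in which both words are verbs.

    src_tags / tgt_tags: part-of-speech tags, one per token (hypothetical input).
    alignments: iterable of (source_index, target_index) word-alignment links.
    Assumption: a tag beginning with "V" denotes a verb in both tag sets.
    """
    pairs = []
    for i, j in alignments:  # examine the aligned word pairs one by one
        if src_tags[i].startswith("V") and tgt_tags[j].startswith("V"):
            pairs.append((i, j))
    return pairs


# Toy example: three aligned tokens, only the middle pair is verb-verb.
example = aligned_predicate_pairs(
    ["PRP", "VBD", "NN"],
    ["PN", "VV", "NN"],
    [(0, 0), (1, 1), (2, 2)],
)
```

The later steps of the claim (argument candidate generation and joint inference) depend on SRL models that the excerpt does not describe, so they are left out of the sketch.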

Publication date: 08-08-2013

ATTRIBUTION USING SEMANTIC ANALYISIS

Number: US20130204877A1

A method, system, and computer program product for semantic attribution of a request. Source data statements for the request are received. A selection of a domain for the received source data statements is received. The received source data statements are semantically analyzed, which includes matching elements in the received source data statements to respective one or more entries in an ontology associated with the selected domain. The ontology includes items and relationships that define the selected domain. Each element in the received source data statements is a word or a phrase. The one or more entries are assigned to the matched elements, respectively, to annotate each matched element with a respective annotation consisting of the respective one or more entries. The annotated elements are saved with the respective annotations.

1. A method for semantic attribution of a request, said method implemented by a processor of a computer system, said method comprising: said processor receiving source data statements for the request; said processor receiving a selection of a domain for the received source data statements; said processor semantically analyzing the received source data statements, said semantically analyzing comprising matching elements in the received source data statements to respective one or more entries in an ontology associated with the selected domain, wherein the ontology comprises items and relationships that define the selected domain, and wherein each element in the received source data statements is a word or a phrase; said processor assigning the one or more entries to the matched elements, respectively, to annotate each matched element with a respective annotation consisting of the respective one or more entries; and said processor saving the annotated elements with the respective annotations.
2. The method of claim 1, wherein the method further comprises: said processor using the annotations to generate a search query for the request.
3. The ...

Publication date: 22-08-2013

Multi-Concept Latent Semantic Analysis Queries

Number: US20130218554A1
Author: Paul A. Jakubik
Assignee: Individual

A method includes accessing text, identifying a plurality of terms from the text, determining a plurality of term vectors associated with the identified plurality of terms, and clustering the determined plurality of term vectors into a plurality of clusters, the plurality of clusters comprising a first and a second cluster, the first and second clusters each comprising two or more of the determined term vectors. The method further includes creating a first pseudo-document according to the first cluster, creating a second pseudo-document according to the second cluster, identifying a first set of terms associated with the first cluster using latent semantic analysis (LSA) of the first pseudo-document, identifying a second set of terms associated with the second cluster using LSA of the second pseudo-document, and combining the first and second sets of terms into a list of output terms.

Publication date: 29-08-2013

APPARATUS AND METHOD FOR PROVIDING INTERNET DOCUMENTS BASED ON SUBJECT OF INTEREST TO USER

Number: US20130226559A1

The present invention provides an apparatus for providing Internet documents based on a subject of interest to a user, including a subject reception unit configured to receive information on a subject from a user terminal; a relevant document collection unit configured to collect relevant documents related to the information on the subject of interest using search engines; a similar sentence classification unit configured to extract a core sentence from the relevant documents, calculate similarity of sentences peripheral to the core sentence, and classify sentences similar to the core sentence into similar sentence sets based on the calculated similarity; and a similar sentence providing unit configured to provide the core sentence and the similar sentence sets to the user terminal.

1. An apparatus for providing Internet documents based on a subject of interest to a user, the apparatus comprising: a subject reception unit configured to receive information on a subject of interest from a user terminal; a relevant document collection unit configured to collect relevant documents related to the information on the subject using search engines; a similar sentence classification unit configured to extract a core sentence from the relevant documents, calculate similarity of sentences peripheral to the core sentence, and classify sentences similar to the core sentence into similar sentence sets based on the calculated similarity; and a similar sentence providing unit configured to provide the core sentence and the similar sentence sets to the user terminal.
2. The apparatus of claim 1, wherein the information on the subject of interest is information corresponding to a search word, a query word, or a keyword related to the subject of interest.
3. The apparatus of claim 1, wherein the relevant document collection unit collects the relevant documents by using a meta-search method using an open API provided by the search engines.
4. The apparatus of claim 1, ...

Publication date: 29-08-2013

SYSTEM AND METHOD FOR DISCOVERING STORY TRENDS IN REAL TIME FROM USER GENERATED CONTENT

Number: US20130226560A1
Author: Ittiachen Jebu
Assignee:

A method for identifying story trends includes identifying a set of words in a fixed size data stream based on a subword cache, and electronically determining at least one story trend associated with the set of words and electronically generating a story hash associated with the set of words. The method also includes storing the story hash in a story trend cache and updating the story trend cache according to the story hash, and retrieving one or more popular story topics according to the story trend cache. Machine readable media including program code that causes execution of a method for generating search results also are described.

1. A computer-implemented method for identifying story trends, the method comprising: identifying a set of words in a fixed size data stream based on a subword cache; electronically determining at least one story trend associated with the set of words and electronically generating a story hash associated with the set of words; storing the story hash in a story trend cache and updating the story trend cache according to the story hash; and retrieving one or more popular story topics according to the story trend cache for presentation to a user.
2. The method of claim 1, wherein the data stream is a user-generated data stream.
3. The method of claim 1, wherein generating a story hash associated with the set of words comprises generating a SimHash based on the set of words.
4. The method of claim 1, further comprising: receiving a request for stories from a user; using the story hash to calculate a hamming distance between the story hash and another story hash; and providing a story to the user based on a merger of the story hashes, if the hamming distance indicates that the story hashes are proximate.
5. The method of claim 4, wherein the merger occurs at run time.
6. Non-transitory machine readable media comprising program code that when executed by a programmable processor causes execution of a method for generating search results, ...
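
Claims 3 and 4 mention a SimHash of the word set and a Hamming distance between story hashes. A minimal Python sketch of those two generic techniques follows. It is not the patented implementation: the 64-bit width, the use of SHA-256 as the per-word hash, unweighted words, and the `max_distance` threshold are all assumptions chosen for illustration.

```python
import hashlib


def simhash(words, bits=64):
    """Compute a SimHash fingerprint for an iterable of words (unweighted)."""
    counts = [0] * bits
    for word in words:
        digest = hashlib.sha256(word.encode("utf-8")).digest()
        h = int.from_bytes(digest[: bits // 8], "big")
        for i in range(bits):
            # Each word votes +1 or -1 on every bit position.
            counts[i] += 1 if (h >> i) & 1 else -1
    fingerprint = 0
    for i, count in enumerate(counts):
        if count > 0:
            fingerprint |= 1 << i
    return fingerprint


def hamming_distance(a, b):
    """Number of bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def are_proximate(a, b, max_distance=3):
    """Hypothetical proximity rule: hashes within max_distance bits are 'close'."""
    return hamming_distance(a, b) <= max_distance


story_a = simhash(["rocket", "launch", "delayed", "weather"])
story_b = simhash(["weather", "delayed", "launch", "rocket"])
```

Because each bit is decided by a vote over all words, the fingerprint does not depend on word order, so `story_a` and `story_b` above are identical and their Hamming distance is zero.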

Publication date: 29-08-2013

SYSTEM AND METHOD FOR SEARCHING FUNCTIONS HAVING SYMBOLS

Number: US20130226562A1
Author: Arnon Adam
Assignee: EQSQUEST LTD

A system and method for searching through functions and expressions with symbols. Moreover, the system can be used to recognize and further analyze the notations of this nature and use this in order to translate, transform into audio, or solve the mathematical problems. According to at least some embodiments, the functions comprise mathematic equations which are defined by symbols and mathematic notation. The system and method enable a user to enter a mathematical equation in a WYSIWYG environment to a search engine, and to find similar or identical equations, first and foremost according to theoretical similarity, and secondly, according to visual similarity. The engine does this by understanding the meaning behind the visual symbols of the equation using a Dynamic Hidden Markov Model (hereon DHMM). The system enables the user to insert the equation with no prior knowledge of LaTeX, or any computing language, and with no need to follow a predefined generic protocol in order to insert the query.

1-50. (canceled)
51. A method for analyzing a function comprising a plurality of symbols, wherein the function comprises a scientific equation, said equation comprising at least one operator and at least one variable, the method being performed by a computer, the method comprising: providing the function in a visual format to the computer; decomposing the function into the plurality of symbols by the computer; labeling each symbol with a label by determining each character of the function; determining a plurality of possible labels for each character or combinations of characters; selecting a symbol label for each symbol according to said probabilistic model, wherein each symbol comprises a character or a plurality of characters; forming said one or more semantic connections according to each symbol label; determining a degree of probability for said one or more semantic connections and said symbol labels according to a probabilistic dynamic mathematical model by said computer, ...

Publication date: 12-09-2013

DIALOG TEXT ANALYSIS DEVICE, METHOD AND PROGRAM

Number: US20130238321A1
Assignee: NEC Corporation

A dialog text analysis device generates data for text processing from a dialog text. A negative judging means decides whether or not an event of a first utterance in a dialog text which is a text including content of a plurality of utterances is negated by a second utterance which exists subsequent to the first utterance. When the event of the first utterance is negated by the second utterance, the data for text processing generation means generates data for text processing which is data in which the negated event of the first utterance is eliminated from the dialog text.

1. A dialog text analysis device comprising: a negative judging unit which judges whether or not an event of a first utterance in a dialog text which is a text including content of a plurality of utterances is negated by a second utterance which exists subsequent to the first utterance; and a data for text processing generation unit which, when the event of the first utterance is negated by the second utterance, generates data for text processing which is data in which the negated event of the first utterance is eliminated from the dialog text.
2. The dialog text analysis device according to claim 1, further comprising an inquiry/response pair identifying unit which identifies from each utterance in an inputted dialog text an inquiry/response pair which is a pair of the first utterance which indicates content to ask to a speaker and a second utterance which exists subsequent to the first utterance and is a response to the first utterance, wherein the negative judging unit judges whether or not the event of the first utterance in the inquiry/response pair is negated by the second utterance.
3. The dialog text analysis device according to claim 1, wherein, when content of the event of the first utterance negated by the second utterance indicates an affirmative fact, the data for text processing generation unit changes the event which indicates the affirmative fact to an event ...
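
The idea of eliminating a negated event from a dialog can be sketched in a few lines of Python. This is only an illustration under strong assumptions: an inquiry is taken to be any utterance ending in a question mark, negation is detected with a small hypothetical word list (`NEGATIVE_RESPONSES`), and both the negated inquiry and its negating response are dropped. The record does not say how the negative judging unit actually works.

```python
# Hypothetical cue list; a real negative judging unit would be far richer.
NEGATIVE_RESPONSES = {"no", "nope", "not at all"}


def is_negated(response):
    """Assumed rule: the response is a bare negative or starts with 'no,'."""
    normalized = response.strip().lower().rstrip(".!")
    return normalized in NEGATIVE_RESPONSES or normalized.startswith("no,")


def text_for_processing(utterances):
    """Drop inquiry/response pairs in which the response negates the inquiry.

    utterances: list of (speaker, text) tuples in dialog order.
    """
    kept = []
    i = 0
    while i < len(utterances):
        speaker, text = utterances[i]
        is_inquiry = text.rstrip().endswith("?")
        has_next = i + 1 < len(utterances)
        if is_inquiry and has_next and is_negated(utterances[i + 1][1]):
            i += 2  # skip the negated event and the response that negates it
            continue
        kept.append((speaker, text))
        i += 1
    return kept


dialog = [
    ("agent", "Did the printer show an error?"),
    ("customer", "No."),
    ("agent", "Is the power light on?"),
    ("customer", "Yes, it is green."),
]
cleaned = text_for_processing(dialog)
```

In this toy dialog the first pair is removed, so downstream text mining would not count "printer show an error" as something that happened.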

Publication date: 03-10-2013

Handheld electronic device including indication of a selected data source, and associated method

Number: US20130262094A1
Assignee: BlackBerry Ltd

A method of enabling input into a handheld electronic device having stored therein a number of language objects includes detecting a selection of a language, making a determination that the language is a default language or a non-default language, detecting as an ambiguous input an actuation of one or more input members, outputting at least a portion of a number of the language objects that corresponds to the ambiguous input, and outputting an indication representative of the language.

Publication date: 10-10-2013

SEMANTIC ENRICHMENT BY EXPLOITING TOP-K PROCESSING

Number: US20130268261A1
Assignee: THOMSON LICENSING

Proper representation of the meaning of texts is crucial to enhancing many data mining and information retrieval tasks, including clustering, computing semantic relatedness between texts, and searching. Representing texts in the concept-space derived from Wikipedia has received growing attention recently, due to its comprehensiveness and expertise. This concept-based representation is capable of extracting semantic relatedness between texts that cannot be deduced with the bag of words model. A key obstacle, however, for using Wikipedia as a semantic interpreter is that the sheer size of the concepts derived from Wikipedia makes it hard to efficiently map texts into concept-space. An efficient algorithm is provided which is able to represent the meaning of a text by using the concepts that best match it. In particular, this approach first computes the approximate top-k concepts that are most relevant to the given text. These concepts are then leveraged to represent the meaning of the given text.

1. A method for performing semantic interpretation for keywords, the method comprising: obtaining one or more keywords for semantic interpretation; computing top-k concepts in a knowledge database for the one or more keywords; and mapping the one or more keywords into a concept space using the top-k concepts.
2. The method of claim 1, wherein the step of computing top-k concepts comprises the steps of: estimating the bounds on the number of input lines; and computing an expected score for a fully or partially unseen object.
3. The method of claim 1, wherein the step of obtaining one or more keywords for semantic interpretation comprises extracting keywords from close captioning data included with content.
4. The method of claim 1, further comprising processing concepts resulting from the mapping of the one or more keywords into the concept space.
5. The method of claim 4, wherein the processing comprises ranking the concepts.
6. The method of claim 4, wherein the processing comprises ...
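
To make the notion of "top-k concepts for a set of keywords" concrete, here is a small Python sketch. It computes an exact top-k by brute force over a toy keyword-to-concept index, which is a deliberate simplification: the abstract describes an approximate top-k computation with bound estimation, and that algorithm is not reproduced here. The index contents, the weights, and the function name `top_k_concepts` are hypothetical.

```python
import heapq
from collections import Counter


def top_k_concepts(keywords, concept_index, k=2):
    """Score concepts by summing keyword weights and return the k best.

    concept_index: dict mapping keyword -> {concept: weight} (toy stand-in
    for a Wikipedia-derived concept space).
    """
    scores = Counter()
    for keyword in keywords:
        for concept, weight in concept_index.get(keyword, {}).items():
            scores[concept] += weight
    return heapq.nlargest(k, scores.items(), key=lambda item: item[1])


toy_index = {
    "orbit": {"Astronomy": 9, "Physics": 4},
    "planet": {"Astronomy": 8, "Mythology": 3},
}
best = top_k_concepts(["orbit", "planet"], toy_index, k=2)
```

The resulting list of (concept, score) pairs is the kind of compact concept-space representation the abstract refers to; a production system would avoid scanning every posting list, which is where the approximate top-k processing comes in.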

Publication date: 10-10-2013

METHOD FOR PROCESSING NATURAL LANGUAGE AND MATHEMATICAL FORMULA AND APPARATUS THEREFOR

Number: US20130268263A1
Assignee:

The present disclosure provides an apparatus and method for processing a natural language and a mathematical formula. The apparatus includes a natural language and mathematical formula input unit configured to receive a natural language and a mathematical formula inputted; an information generation unit configured to generate parsing semantic information of the mathematical formula from combined data composed of the natural language combined with the mathematical formula; an operation information extraction unit configured to extract operation information generated by using a logical condition from the combined data; a natural language and mathematical formula structuralizing unit configured to analyze, classify in terms of specific meaning and recombine the combined data; an operation structuralizing unit configured to structuralize the operation information; and a natural language and mathematical formula indexing unit configured to index the combined data.

1. An apparatus for processing a natural language and a mathematical formula, the apparatus comprising: a natural language and mathematical formula input unit configured to receive a natural language and a mathematical formula inputted; an information generation unit configured to generate parsing semantic information of the mathematical formula from combined data including the natural language combined with the mathematical formula; an operation information extraction unit configured to extract operation information generated by using a logical condition from the combined data; a natural language and mathematical formula structuralizing unit configured to analyze, classify in terms of specific meaning and recombine the combined data; an operation structuralizing unit configured to structuralize the operation information; and a natural language and mathematical formula indexing unit configured to index the combined data.
2. The apparatus of claim 1, wherein the natural language and mathematical formula input unit ...

Publication date: 17-10-2013

Method and Apparatus for Identifying a Conversation in Multiple Strings

Number: US20130273976A1
Assignee: Nokia Corporation

Techniques for identifying conversations in multiple short strings include determining from a first plurality of strings associated with a first contact of a user, based on time separations between successive strings, a first conversation portion and a different second conversation portion. The first conversation portion (snippet) comprises a plurality of strings of the first plurality; and the second snippet comprises a different plurality of strings of the first plurality. A first semantic content for the first snippet and a second semantic content for the second snippet are determined. It is determined whether to merge the first snippet and the second snippet into a first conversation that includes the first snippet based, at least in part, on a similarity of the first semantic content to the second semantic content.

1-28. (canceled)
29. A method comprising: determining from a first plurality of strings associated at least in part with a first contact of a user, based at least in part on time separations between successive strings, a first conversation portion that comprises a plurality of strings of the first plurality and a different second conversation portion that comprises a different plurality of strings of the first plurality; determining a first semantic content for the first conversation portion and a second semantic content for the second conversation portion; and determining whether to merge the first conversation portion and the second conversation portion into a first conversation that includes the first conversation portion based, at least in part, on a similarity of the first semantic content to the second semantic content.
30. A method of claim 29, wherein determining whether to merge the first conversation portion and the second conversation portion further comprises combining the first conversation portion and the second conversation portion into the first conversation, if the similarity is determined to exceed a similarity threshold.
31. A ...
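
The two stages in the abstract (splitting strings into snippets by time separation, then deciding whether to merge snippets by semantic similarity) can be sketched as follows. The 600-second gap, the bag-of-words "semantic content", the Jaccard similarity, and the 0.2 threshold are all assumptions for illustration; the record does not specify any of them.

```python
def split_into_snippets(messages, max_gap_seconds=600):
    """Group time-ordered (timestamp_seconds, text) messages into snippets.

    A new snippet starts whenever the gap to the previous message exceeds
    max_gap_seconds (an assumed value).
    """
    snippets, current, last_time = [], [], None
    for timestamp, text in messages:
        if last_time is not None and timestamp - last_time > max_gap_seconds:
            snippets.append(current)
            current = []
        current.append(text)
        last_time = timestamp
    if current:
        snippets.append(current)
    return snippets


def semantic_content(snippet):
    """Crude stand-in for semantic content: the set of lowercased words."""
    words = {w.strip(".,!?").lower() for text in snippet for w in text.split()}
    words.discard("")
    return words


def should_merge(first, second, threshold=0.2):
    """Merge when the Jaccard similarity of the word sets exceeds threshold."""
    a, b = semantic_content(first), semantic_content(second)
    if not a or not b:
        return False
    return len(a & b) / len(a | b) > threshold


messages = [
    (0, "Dinner tonight?"),
    (60, "Sure, pizza?"),
    (5000, "Which pizza place for dinner?"),
    (5060, "Luigi's"),
]
snippets = split_into_snippets(messages)
merge = should_merge(snippets[0], snippets[1])
```

Here the long pause produces two snippets, and the shared words "dinner" and "pizza" give a similarity of 2/8 = 0.25, which is above the assumed threshold, so the snippets would be merged into one conversation.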

Publication date: 31-10-2013

METHOD FOR CLASSIFYING PIECES OF TEXT ON BASIS OF EVALUATION POLARITY, COMPUTER PROGRAM PRODUCT, AND COMPUTER

Number: US20130289978A1

A computer-implemented method, program product, and system, for extracting pieces of text from a plurality of pieces of text. The method includes: primarily evaluating a measure of positive expressions and a measure of negative expressions included in each of pieces of text; secondarily evaluating each of the pieces of text on the basis of a plurality of evaluation functions, where certain evaluation functions among the plurality of evaluation functions include, as variables, the measure of positive expressions and the measure of negative expressions; and extracting a piece of text having an evaluation result with a higher rating in preference to a piece of text having an evaluation result with a lower rating, where the individual evaluation results are based on the same evaluation function among the plurality of evaluation functions.

1. A computer-implemented method for extracting pieces of text from a plurality of pieces of text, said method comprising the steps of: primarily evaluating a measure of positive expressions and a measure of negative expressions included in each of pieces of text; secondarily evaluating each of said pieces of text on the basis of a plurality of evaluation functions, wherein certain evaluation functions among said plurality of evaluation functions include, as variables, said measure of positive expressions and said measure of negative expressions; and extracting a piece of text having an evaluation result with a higher rating in preference to a piece of text having an evaluation result with a lower rating, wherein said individual evaluation results are based on said same evaluation function among said plurality of evaluation functions.
2. The method of claim 1, wherein one evaluation function among said certain evaluation functions is a function outputting an evaluation result with a higher rating for a piece of text having an average measure of positive expressions and an average measure of negative expressions.
3. The method of claim 1, ...
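
The two-stage evaluation in the abstract can be illustrated with a short Python sketch: a primary pass counts positive and negative expressions, and a secondary pass ranks the texts under several evaluation functions that take those counts as variables. The word lists and the three evaluation functions below are hypothetical examples, not those of the patent.

```python
# Hypothetical polarity lexicons for illustration only.
POSITIVE = {"good", "great", "excellent"}
NEGATIVE = {"bad", "poor", "terrible"}

# Hypothetical evaluation functions over (positive_count, negative_count).
EVALUATION_FUNCTIONS = {
    "most_positive": lambda pos, neg: pos - neg,
    "most_negative": lambda pos, neg: neg - pos,
    "most_mixed": lambda pos, neg: min(pos, neg),
}


def primary_evaluation(text):
    """Return (positive_count, negative_count) for one piece of text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos, neg


def extract_top(texts, top_n=1):
    """For each evaluation function, extract the top_n highest-rated texts."""
    measures = [(text, *primary_evaluation(text)) for text in texts]
    extracted = {}
    for name, function in EVALUATION_FUNCTIONS.items():
        ranked = sorted(measures, key=lambda m: function(m[1], m[2]), reverse=True)
        extracted[name] = [text for text, _, _ in ranked[:top_n]]
    return extracted


reviews = [
    "Great screen, excellent battery.",
    "Terrible support and poor packaging.",
    "Good camera but bad speaker.",
]
picked = extract_top(reviews)
```

Each function compares texts only against results of that same function, which mirrors the requirement that the ratings being compared come from the same evaluation function.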

Publication date: 21-11-2013

METHOD AND SYSTEM RELATING TO SALIENT CONTENT EXTRACTION FOR ELECTRONIC CONTENT

Number: US20130311169A1
Author: Khan Shahzad
Assignee: WHYZ TECHNOLOGIES LIMITED

Individuals receive an overwhelming barrage of information which must be filtered, processed, analysed, reviewed, consolidated and distributed or acted upon. Automatic approaches to “scraping” salient content from sources of content are provided allowing the salient content to be provided to the user or subjected to further processing such as clustering or sentiment analysis for example.

1. A method comprising:
a) receiving an item of content;
b) identifying within the item of content using a microprocessor a set of lexical pattern cues for core content of the item of content and selecting a segment of the item of content having a highest likelihood as being the core content based upon a structural analysis of the item of content in dependence upon at least the set of lexical pattern cues;
c) parsing the item of content to generate a hierarchy of content within the item of content;
d) ranking the hierarchy of content in dependence upon at least the lexical pattern cues and sorting the resulting ranking;
e) identifying a gap when searching down the ranking meeting a predetermined threshold and removing those portions of the hierarchy of content below the gap to generate truncated content;
f) find all occurrences for portions of the hierarchy of content with closest match to the lexical pattern cues closest to the start of the item of content;
g) determining whether multiple matches to the lexical pattern cues exist and establishing an action in dependence upon at least whether multiple matches exist or not;
h) performing the action, establishing the occurrence for the portion of the hierarchy of content as the core content of the item of content when the determination of multiple matches is negative; and establishing the occurrence for the portion of the hierarchy of content that at least one of contains the largest portion of the item of content and is the first occurrence as the core content of the item of content when the determination of multiple matches is positive.
...
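
Steps d) and e) of the claim (sort the ranked portions, search down the ranking for a gap that meets a threshold, and drop everything below it) can be sketched in Python as follows. The scores, the portion names, and the interpretation of "gap" as the difference between adjacent scores are assumptions; the record does not define how the ranking scores are produced.

```python
def truncate_at_gap(scored_portions, gap_threshold):
    """Keep only the portions ranked above the first sufficiently large gap.

    scored_portions: list of (portion_name, score) pairs (hypothetical input).
    gap_threshold: minimum score drop between neighbours that counts as a gap.
    """
    ranked = sorted(scored_portions, key=lambda item: item[1], reverse=True)
    for i in range(1, len(ranked)):
        if ranked[i - 1][1] - ranked[i][1] >= gap_threshold:
            return ranked[:i]  # remove the portions below the gap
    return ranked  # no gap met the threshold, nothing is removed


portions = [("nav", 10), ("article", 90), ("footer", 5), ("headline", 80)]
truncated = truncate_at_gap(portions, gap_threshold=50)
```

With these toy scores the ranking is article, headline, nav, footer; the drop from 80 to 10 meets the threshold, so only the article and headline survive as truncated content.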

Publication date: 19-12-2013

Assisted Free Form Decision Definition Using Rules Vocabulary

Number: US20130339003A1
Assignee:

A method of decision definition using a rules vocabulary includes: receiving free form input; identifying terms contained within the free form input; searching the rules vocabulary objects for terms; responsive to the term being found, obtaining input from a user as to whether to use the found term; responsive to the term not being found, searching the rules vocabulary attributes for terms having attributes corresponding to the term; responsive to the term being found, obtaining input from a user as to whether to use the found term; and refactoring the free form input with the found term accepted by the user. The method also includes updating the rules vocabulary with the term identified in the free form input as a synonym for the term found in said rules vocabulary. One embodiment further provides a method of determining semantic equivalence between a plurality of rules using a rules database having preferred terms.

1. A computer-implemented method of decision definition using a rules vocabulary, the method comprising the steps of: receiving free form input; a data processing system identifying terms contained within the free form input; searching the rules vocabulary objects for terms; responsive to the term being found, obtaining input from a user as to whether to use the found term; responsive to the term not being found: searching the rules vocabulary attributes for other terms having attributes corresponding to the term that is not found; responsive to another term being found, obtaining input from a user as to whether to use the found similar term; and refactoring the free form input with the found term or the other term accepted by the user.
2. The method of claim 1, wherein said terms include one of objects, attributes, values, relationships, and verbs.
3. The method of claim 1, further comprising updating said rules vocabulary with said term identified in said free form input.
4. The method of claim 1, ...

Publication date: 30-01-2014

Labeling Context Slices To Produce a Storyline from Mobile Device Data

Number: US20140032208A1
Assignee: ARO, Inc.

Embodiments create and label context slices from observation data that together define a storyline of a user's movements. A context is a (possibly partial) specification of what a user was doing in the dimensions of time, place, and activity. Contexts can vary in their specificity, their semantic content, and their likelihood. A storyline is composed of a time-ordered sequence of contexts that partition a given span of time. A storyline is created through a process of data collection, slicing and labeling. Raw context data can be collected from a variety of observation sources with various error characteristics. Slicing refines the chaotic collection of contexts produced by data collection into a single consistent storyline composed of a sequence of contexts representing homogeneous time intervals. Labeling adds more specific and semantically meaningful data (e.g., geography, venue, activity) to the storyline produced by slicing.

1. A method of labeling context slices of a storyline of a user's movements, the method comprising: receiving a plurality of context slices derived from associated context data collected from a plurality of observation sources; for each of the plurality of context slices: identifying a set of candidate labels based on context data associated with the context slice, the candidate labels each comprising semantic data describing the context data associated with the context slice, the semantic data selected from a group consisting of geography, venue, and activity; ranking the set of candidate labels by likelihood; applying one or more of the candidate labels to the context slice; and storing a correspondence between the applied one or more labels and the context slice.
2. The method of claim 1, wherein the set of candidate labels are from public venue records, a user-specified database of places, a database of descriptions of activities commonly applicable to venue categories, calendar data, and ...

Publication date: 06-02-2014

Applying Service Levels to Transcripts

Number: US20140039880A1
Author: Finke Michael, Koll Detlef
Assignee: MULTIMODAL TECHNOLOGIES, LLC

Speech is transcribed to produce a draft transcript of the speech. Portions of the transcript having a high priority are identified. For example, particular sections of the transcript may be identified as high-priority sections. As another example, portions of the transcript requiring human verification may be identified as high-priority sections. High-priority portions of the transcript are verified at a first time, without verifying other portions of the transcript. Such other portions may or may not be verified at a later time. Limiting verification, either initially or entirely, to high-priority portions of the transcript limits the time required to perform such verification, thereby making it feasible to verify the most important portions of the transcript at an early stage without introducing an undue delay into the transcription process. Verifying the other portions of the transcript later ensures that early verification of the high-priority portions does not sacrifice overall verification accuracy.

1. A computer-implemented method comprising: (A) identifying a first semantic meaning of a first portion of a first transcript of speech; (B) assigning a first service level to the first portion of the transcript based on the first semantic meaning; (C) identifying a second semantic meaning of a second portion of the first transcript; (D) assigning a second service level to the second portion of the transcript based on the second semantic meaning; and (E) at a processor, verifying the transcript in accordance with the first and second service levels.

This application is related to U.S. patent application Ser. No. 10/923,517, filed on Aug. 20, 2004, entitled “Automated Extraction of Semantic Content and Generation of a Structured Document from Speech,” now U.S. Pat. No. 7,584,103, which is hereby incorporated by reference herein.

It is desirable in many contexts to generate a structured textual document based on human speech. In the legal profession, for example, ...

Publication date: 27-02-2014

METHOD AND SYSTEM FOR DISCOVERING SUSPICIOUS ACCOUNT GROUPS

Number: US20140058723A1

In one exemplary embodiment, a system for discovering suspicious account groups establishes a language model according to the post contents from each account of a first group of accounts during a first time interval, to describe the speech of the account, and compares the similarity among a plurality of language models of the first group of accounts to cluster the first group of accounts; and for a plurality of newly added data during a second time interval, discovers near-synonyms of at least a monitored vocabulary set, and updates the near-synonyms to a plurality of language models of a second group of accounts. The system further integrates the first and the second groups of accounts, and re-clusters an integrated group of accounts.
1. A method for discovering suspicious account groups, comprising: under a control of at least one hardware processor, establishing a language model according to one or more post contents from each account of a first group of accounts during a first time interval, to describe a linguistic fashion of the account, and comparing a similarity among a first group of language models of the first group of accounts to cluster the first group of accounts; and for a plurality of newly added data during a second time interval, discovering one or more near-synonyms of at least one monitored vocabulary set, and updating the one or more near-synonyms to a second group of language models of a second group of accounts, further integrating the first and the second groups of accounts, and re-clustering an integrated group of accounts.
2. The method as claimed in claim 1, said method further includes: for the plurality of newly added data during each updated time interval of a plurality of updated time intervals, discovering a plurality of near-synonyms of at least a monitored vocabulary set, and updating or reconstructing said plurality of near-synonyms to a plurality of language models of a different group of accounts, and further integrating said different ...
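The first clustering stage (one language model per account, then grouping by model similarity) can be sketched as below. This is a minimal illustration under assumptions that do not come from the patent: the models are plain unigram counts, similarity is cosine similarity, the clustering is a simple greedy pass, and the account names and the 0.5 threshold are made up. The near-synonym update stage is not covered.

```python
from collections import Counter
from math import sqrt

def language_model(posts):
    """Unigram counts over all posts of one account."""
    return Counter(word for post in posts for word in post.lower().split())

def cosine(a, b):
    """Cosine similarity between two unigram count models."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def cluster_accounts(models, threshold=0.5):
    """Greedy clustering: join the first cluster that holds a similar account."""
    clusters = []
    for account in sorted(models):
        for cluster in clusters:
            if any(cosine(models[account], models[other]) >= threshold for other in cluster):
                cluster.append(account)
                break
        else:
            clusters.append([account])
    return clusters

posts = {
    "acct_a": ["buy cheap pills now", "cheap pills here"],
    "acct_b": ["cheap pills buy now"],
    "acct_c": ["lovely weather in the park today"],
}
models = {name: language_model(p) for name, p in posts.items()}
print(cluster_accounts(models))
```

Re-clustering after new data arrives would amount to rebuilding `models` for the merged account set and calling `cluster_accounts` again.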

Publication date: 13-03-2014

System and Method of Spoken Language Understanding in Human Computer Dialogs

Number: US20140074477A1
Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.

A system and method are disclosed that improve automatic speech recognition in a spoken dialog system. The method comprises partitioning speech recognizer output into self-contained clauses, identifying a dialog act in each of the self-contained clauses, qualifying dialog acts by identifying a current domain object and/or a current domain action, and determining whether further qualification is possible for the current domain object and/or current domain action. If further qualification is possible, then the method comprises identifying another domain action and/or another domain object associated with the current domain object and/or current domain action, reassigning the another domain action and/or another domain object as the current domain action and/or current domain object and then recursively qualifying the new current domain action and/or current object. This process continues until nothing is left to qualify.
1. A method comprising: identifying, in a domain-independent manner, a dialog act for an independent clause of a speech recognizer output; identifying, in a domain-dependent manner, an object within the independent clause; and recursively generating, via a processor, for each sub-independent clause within the independent clause, a semantic representation using the dialog act and the object of the independent clause.
2. The method of claim 1, wherein the semantic representation is used by a dialog manager in a spoken dialog system to determine a response to a user input.
3. The method of claim 1, further comprising: identifying, in the domain-dependent manner, an action within each of the independent clause, wherein recursively generating the semantic representation further comprises using the action.
4. The method of claim 1, wherein while recursively generating the semantic representation, additional objects are extracted from the independent clause.
5. The method of claim 1, wherein identifying the object comprises using a domain specific ...

Publication date: 03-04-2014

EMOTION IDENTIFICATION SYSTEM AND METHOD

Number: US20140095149A1
Assignee:

A system and method for identifying emotion in text that connotes authentic human expression, and training an engine that produces emotional analysis at various levels of granularity and numerical distribution across a set of emotions at each level of granularity. The method may include classifying textual data as emotional textual data or non-emotional textual data, and determining duration of an emotional state.
1. A method of classifying textual data as emotional textual data or non-emotional textual data comprising: providing, with a processor, a database of data indicators that each define emotional content of textual data; receiving, at a processor, first textual data authored by an individual; processing, with a processor, the first textual data to produce a first data indicator defining emotional content of the first textual data; inputting, with a processor, the first data indicator into an emotion similarity model and the data indicators of the database into the emotion similarity model to determine at least one similarity between the first data indicator and the data indicators of the database; and classifying, with a processor, the first textual data as emotional textual data or non-emotional textual data based on the at least one similarity.
2. The method of claim 1, wherein the textual data of the database has been authored by at least one individual on a webpage of an online forum system.
3. The method of claim 2, wherein the textual data of the database has been tagged with at least one tag by an author of the textual data, the at least one tag being associated with at least one emotion and associating at least a portion of the textual data of the database with the at least one emotion.
4. The method of claim 1, wherein the first data indicator is produced using textual analysis selected from a group consisting of latent semantic analysis, and positive pointwise mutual information.
5. The method of claim 1, wherein the emotion similarity ...

Publication date: 02-01-2020

System and Method for a Digital Therapeutic Delivery of Generalized Clinician Tips (GCT)

Number: US20200000389A1
Assignee:

This invention relates generally to the field of electronic communications and the transmittance of such communications. More specifically, the invention discloses a new and useful method for delivering a therapeutic through an eco-system of digital content based on a user-mapped EMS. The method comprises: receiving a text input comprising message content from an electronic computing device associated with a user; parsing the message content comprised in the text input for emotionally-charged language; assigning a sentiment value, based on the emotionally-charged language, from a dynamic sentiment value spectrum to the text input; and, based on the sentiment value, imposing a sentiment vector, corresponding to the assigned sentiment value, to the text input, the imposed sentiment vector rendering a sensory effect on the message content designed to convey a corresponding sentiment.
1. A method for delivering generalized clinician tips (GCT), specific to a user's emotional or mental state (EMS), said method comprising the steps of: fast-capturing of at least one EMS for the user based on at least one of a user-plotted point on at least one of a displayed mood map or mood wheel or user-selected from a menu, said EMS indicating an assessment of at least one of a feeling, sensation, mood, mental state, emotional condition, or physical status of the user; delivering said GCT to at least one of a user's device (mobile device, wearable, smart watch, tablet, desktop, laptop, headphones, speaker, or smart speaker) or home entertainment system in communication with at least one of the user's device or a voice-activated Internet-of-Things (IoT) hub; and wherein the GCT is selected from a store of EMS-specific GCT and is at least one of a suggestion or recommendation for the user to perform a task with clinical-consensus benefits to address at least one of the EMS.
2. The method of claim 1, wherein the EMS is defined based on at least one of a stored or previous user-plotted or ...

Publication date: 04-01-2018

SYSTEM AND METHOD FOR SEMANTIC PROCESSING OF NATURAL LANGUAGE COMMANDS

Number: US20180001482A1
Assignee:

Disclosed are a system, method and computer-readable storage devices for processing natural language commands, such as commands to a robotic arm, using a Tag & Parse approach to semantic parsing. The system first assigns semantic tags to each word in a sentence and then parses the tag sequence into a semantic tree. The system can use a statistical approach for tagging, parsing, and reference resolution. Each stage can produce multiple hypotheses, which are re-ranked using spatial validation. Then the system selects a most likely hypothesis after spatial validation, and generates or outputs a command. In the case of a robotic arm, the command is output in Robot Control Language (RCL).
1. A method comprising: assigning, via a sequence tagger, a part of speech, a semantic tag and a label to each word in a natural language command addressed to a robotic arm to yield a tagged natural language command; semantically parsing, via a processor and a semantic parser, the tagged natural language command to yield a parsed natural language command, wherein a data set used to train the semantic parser does not include any tag from which the sequence tagger selects for assigning the semantic tag; and moving the robotic arm according to a spatial validation of a physical context of the natural language command when applied to the robotic arm.
2. The method of claim 1, further comprising: receiving the natural language command.
3. The method of claim 1, further comprising: identifying a command type for the parsed natural language command and wherein moving the robotic arm is according to the command type.
4. The method of claim 1, wherein the semantic tag identifies entity types and event types in the natural language command.
5. The method of claim 3, further comprising, after identifying the command type: performing the spatial validation, based on the command type, for the physical context of the natural language command when applied to the robotic arm; and if the spatial validation ...

Publication date: 05-01-2017

TASK MANAGEMENT BASED ON SEMANTIC ANALYSIS

Number: US20170004008A1

Example implementations relate to managing tasks for a user. In example implementations, semantic tags may be detected in electronic documents that are accessed by a user. Models of the accessed electronic documents may be generated based on semantic analysis. A priority of a task for the user may be determined based on the detected semantic tags and the generated models.
1. A method for task management, the method comprising: detecting semantic tags in electronic documents accessed by a first user; generating, based on semantic analysis, models of the electronic documents accessed by the first user; inferring, based on the generated models, user collaboration patterns of the first user; and determining, based on the detected semantic tags and the inferred user collaboration patterns, a priority of a task for the first user, wherein the task is associated with one of the accessed electronic documents.
2. The method of claim 1, further comprising receiving a user request for task management services.
3. The method of claim 2, wherein the user request is received via a text message.
4. The method of claim 2, wherein: the user request specifies a task type to manage; and semantic tags are detected in, and models are generated of, accessed electronic documents associated with the specified task type.
5. The method of claim 1, wherein determining the priority of the task comprises ranking a plurality of tasks in a task list, the method further comprising allowing the first user to modify the ranking of the plurality of tasks.
6. The method of claim 1, wherein the one of the accessed electronic documents is initially accessed on a first user device, the method further comprising: generating a first electronic calendar appointment, in an electronic calendar associated with the first user, for a time when the task is to be performed; and displaying the one of the accessed electronic documents on a second user device during the time when the task is to be ...

Publication date: 05-01-2017

Natural Language Relatedness Tool using Mined Semantic Analysis

Number: US20170004129A1
Assignee:

Mined semantic analysis techniques (MSA) include generating a first subset of concepts, from a NL corpus, that are latently associated with an NL candidate term based on (i) a second subset of concepts from the corpus that are explicitly or implicitly associated with the candidate term and (ii) a set of concept association rules. The concept association rules are mined from a transaction dictionary constructed from the corpus and defining discovered latent associations between corpus concepts. A concept space of the candidate term includes at least portions of both the first and second subsets of concepts, and includes indications of relationships between latently-associated concepts and the explicitly/implicitly-associated concepts from which the latently-associated concepts were derived. Measures of relatedness between candidate terms are deterministically determined based on their respective concept spaces. Example corpora include digital corpora such as encyclopedias, journals, intellectual property datasets, health-care related datasets/records, financial-sector related datasets/records, etc.
1. A natural language-relatedness (NLR) tool, comprising: an input interface via which an initial subset of a set of concepts of a corpus is received, each concept of the initial subset of concepts being semantically associated with a candidate term, the candidate term comprising one or more natural language words, the corpus comprising natural language, and the semantic associations of the initial subset of concepts with the candidate term determined based on a first semantic analysis of the corpus; a concept expander comprising first computer-executable instructions that are stored on one or more memories and that, when executed by one or more processors, cause the NLR tool to mine, based on the initial subset of concepts, a set of concept association rules of the corpus to discover an expansion subset of concepts of the corpus, each concept of the expansion subset of ...

Publication date: 05-01-2017

IDENTIFYING WORD-SENSES BASED ON LINGUISTIC VARIATIONS

Number: US20170004130A1
Assignee:

One or more words are received. A set of frequency of occurrence values of the received word(s) within a set of domain tables is determined. A domain table in the set of domain tables is associated to the received word(s), based on the set of frequency of occurrence values meeting a threshold value. A word-sense of the received word(s) is determined based on a corresponding word-sense in the associated domain table and/or corresponding domain dictionary.
1. one or more computer processors; one or more computer-readable storage media; program instructions stored on the computer-readable storage media for execution by at least one of the one or more processors, the program instructions comprising: program instructions to generate, by a computer, a plurality of arrays of aggregated statistical information of words, their corresponding word-senses, and temporal properties within different professional fields using an n-gram viewer, wherein the aggregated statistical information comprises frequency of usage of words, frequency of occurrence of words, frequency of co-occurrence of words with other words, and their respective corresponding word-senses; program instructions to generate, by the computer, a set of domain tables based on the generated plurality of arrays of aggregated statistical information, wherein each of the domain tables within the set of domain tables corresponds to a different professional field comprising medical, veterinary, legal, and engineering; program instructions to receive, from a remote server through a network, a digital text stream comprising metadata and one or more words from a doctor, using the computer, the network being an internet connection; program instructions to select, using the metadata, a medical frequency domain table, veterinary frequency domain table, and a word-sense domain table from the set of domain tables; program instructions to determine a frequency of occurrence value for the received digital text stream within each of the ...
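The table lookup in this abstract (measure how often the received words occur in each domain table, associate the table that meets a threshold, read the word-sense from it) can be sketched as follows. The tables, frequencies, senses, and the mean-frequency scoring rule are hypothetical values chosen for illustration; the patent's actual statistics are not reproduced here.

```python
# Hypothetical domain tables: word -> (relative frequency in the domain, word-sense).
DOMAIN_TABLES = {
    "medical": {
        "culture": (0.8, "growth of microorganisms in a medium"),
        "discharge": (0.7, "release of a patient from care"),
    },
    "legal": {
        "discharge": (0.6, "release from an obligation"),
        "brief": (0.9, "written legal argument"),
    },
}

def pick_domain(words, threshold=0.5):
    """Return the domain with the highest mean word frequency that meets the threshold."""
    if not words:
        return None
    best, best_score = None, 0.0
    for domain, table in DOMAIN_TABLES.items():
        score = sum(table.get(w, (0.0, ""))[0] for w in words) / len(words)
        if score >= threshold and score > best_score:
            best, best_score = domain, score
    return best

def word_senses(words, threshold=0.5):
    """Look up each received word's sense in the associated domain table."""
    domain = pick_domain(words, threshold)
    if domain is None:
        return {}
    table = DOMAIN_TABLES[domain]
    return {w: table[w][1] for w in words if w in table}

print(word_senses(["culture", "discharge"]))
```

With these toy tables, the ambiguous word "discharge" resolves to its medical sense because "culture" pulls the mean frequency of the medical table above the threshold.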

Publication date: 05-01-2017

SENTIMENT ANALYSIS AND INVENTORY BASED EVENT PLANNING

Number: US20170004439A1

To plan a social event, corresponding to a first invitee from an invitees list for the social event, a permission by the first invitee to obtain preference information from a data source is received. The preference information is usable to determine a sentiment of the first invitee towards an aspect of the social event. The information about the first invitee is collected from the data source. A sentiment analysis is performed using Natural Language Processing (NLP), on the information, producing the preference information of the first invitee. The preference information of the first invitee is aggregated with preference information of a second invitee in the invitee list to form an aggregated preference. A list of items needed to satisfy the aggregated preference is created.
1. A method comprising: receiving, corresponding to a first invitee from an invitees list for a social event and a permission by the first invitee to obtain preference information from a data source, the preference information being usable to determine a sentiment of the first invitee towards an aspect of the social event; collecting, from the data source, information about the first invitee; performing, by using a processor and a memory, using Natural Language Processing (NLP), a sentiment analysis on the information about the first invitee, the sentiment analysis producing the preference information of the first invitee; aggregating the preference information of the first invitee with preference information of a second invitee in the invitee list to form an aggregated preference; and creating a list of items needed to satisfy the aggregated preference.
2. The method of claim 1, further comprising: assigning a confidence rating to the aggregated preference to form a confidence rated preference; and selecting, responsive to the confidence rating exceeding a threshold confidence rating, from a set of confidence rated preferences, the confidence rated preference.
3. The method of claim 2, further ...
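The pipeline in this abstract (per-invitee sentiment, aggregation across invitees, item list) can be sketched with a toy lexicon. Everything named here is an assumption for illustration: the `POSITIVE`/`NEGATIVE` word lists, the `ASPECTS` set, and the rule that an item is needed when its aggregated score is positive. A real system would use an actual NLP sentiment model.

```python
from collections import defaultdict

POSITIVE = {"love", "like", "enjoy"}     # toy sentiment lexicon (assumption)
NEGATIVE = {"hate", "dislike", "avoid"}
ASPECTS = {"pizza", "sushi", "jazz"}     # hypothetical aspects of the event

def preferences(text):
    """Score each aspect mentioned in a sentence by that sentence's polarity."""
    scores = defaultdict(int)
    for sentence in text.lower().split("."):
        words = set(sentence.split())
        polarity = len(words & POSITIVE) - len(words & NEGATIVE)
        for aspect in words & ASPECTS:
            scores[aspect] += polarity
    return dict(scores)

def aggregate(per_invitee):
    """Sum aspect scores over all invitees to form the aggregated preference."""
    total = defaultdict(int)
    for prefs in per_invitee:
        for aspect, score in prefs.items():
            total[aspect] += score
    return dict(total)

def shopping_list(aggregated):
    """Items needed: aspects whose aggregated score is positive."""
    return sorted(a for a, s in aggregated.items() if s > 0)

invitees = ["I love pizza. I hate sushi.", "We enjoy jazz. I like pizza."]
agg = aggregate(preferences(t) for t in invitees)
print(shopping_list(agg))
```

The confidence rating of claim 2 could be layered on top, for example by requiring a minimum number of invitees to mention an aspect before it is selected.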

Publication date: 07-01-2016

Search technology using synonims and paraphrasing

Number: US20160004766A1
Assignee: ABBYY InfoPoisk LLC

The present invention is a method and a system of organizing information searches in electronic text corpora and displaying the search results in the user interface. The system and the method enable searches not just for words or word combinations, but also for specific lexical meanings of words, where a lexical meaning is a realization of a word's semantic meaning in a particular language. The completeness of search results is based on incorporating synonyms and paraphrases into the search. The method also includes searching for fragments matching the query in electronic text corpora, estimating the results, and displaying the ranked results to the user.
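A synonym-expanded search of the kind this abstract describes can be sketched in a few lines. This is a simplified illustration under assumptions: the `SYNONYMS` dictionary and the corpus are hypothetical, matching is done on surface words rather than on lexical meanings, and fragments are ranked by the number of query terms they cover.

```python
# Hypothetical synonym dictionary.
SYNONYMS = {"car": {"automobile", "vehicle"}, "buy": {"purchase"}}

def expand(query):
    """Map each query word to the set holding the word and its synonyms."""
    return [{w} | SYNONYMS.get(w, set()) for w in query.lower().split()]

def search(query, corpus):
    """Rank fragments by how many query words (or their synonyms) they match."""
    groups = expand(query)
    results = []
    for fragment in corpus:
        words = set(fragment.lower().split())
        score = sum(1 for group in groups if group & words)
        if score:
            results.append((score, fragment))
    results.sort(key=lambda r: -r[0])  # stable sort keeps corpus order for ties
    return [fragment for _, fragment in results]

corpus = ["Where to purchase an automobile", "Car wash nearby", "Train timetable"]
print(search("buy car", corpus))
```

The first fragment ranks highest even though it shares no literal word with the query, which is the gain in completeness that synonym expansion is meant to provide.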

Publication date: 07-01-2016

Method of providing relevant information and electronic device adapted to the same

Number: US20160004784A1
Assignee: SAMSUNG ELECTRONICS CO LTD

Various embodiments of the present disclosure provide a method of providing relevant information and an electronic device adapted to the method. The method includes: displaying first information; extracting one or more retrieval words in a form of keyword or a form of phrase from the first information; obtaining second information as relevant information related to the first information by using the one or more extracted retrieval words; emphasizing objects corresponding to the one or more retrieval words on the first information; and displaying, when detecting the selection of at least one from the emphasized objects, the second information including the selected objects.

Publication date: 05-01-2017

METHOD FOR BUILDING A SPEECH FEATURE LIBRARY, AND METHOD, APPARATUS, DEVICE, AND COMPUTER READABLE STORAGE MEDIA FOR SPEECH SYNTHESIS

Number: US20170004820A1
Assignee:

The present invention provides a method for building a speech feature library, as well as a method, an apparatus, a device and corresponding non-volatile, non-transitory computer readable storage media for speech synthesis. Because the speech feature library used in the present invention saves at least one context corresponding to each piece of personalized textual information and at least one piece of textual information semantically identical to the personalized textual information, when performing speech synthesis, even if the provided textual information is not personalized textual information corresponding to the desired personalized speech, personalized textual information semantically identical to the textual information to be subject to speech synthesis may be first found in the speech feature library to thereby achieve personalized speech synthesis, such that use of the personalized speech will not be restricted by aging, sickness, and death of a person.
1. A method for building a speech feature library, comprising: converting speech recording of an object into personalized textual information; analyzing and obtaining at least one context corresponding to each piece of personalized textual information and at least one semantically identical piece of textual information; saving, in a speech feature library of the object, each piece of personalized textual information and a corresponding linguistic feature, each linguistic feature indicating a context and a piece of textual information that correspond; performing audio sampling to the speech recording to obtain an audio sample value; and saving an audio feature in the speech feature library of the object, the audio feature indicating an audio sample value.
2. The method according to claim 1, further comprising: saving a speech feature corresponding to each piece of personalized textual information in the speech feature library, each speech feature indicating a piece of linguistic feature and a piece of audio ...

Publication date: 07-01-2021

CONTROLLING DOCUMENT EDITS IN A COLLABORATIVE ENVIRONMENT

Number: US20210004356A1
Assignee:

An approach is provided for controlling an edit of content. It is determined that an edit of content is being performed by a user. A criticality score of the content is determined by using a natural language processing technique. The criticality score indicates a measure of sensitivity of the content. A behavior of the user while editing the content is identified. A measure of deviation is determined between the behavior of the user and a pattern of historical editing behavior of user(s). Based on the criticality score and the measure of the deviation, an edit risk score is determined. The edit risk score is determined to exceed a threshold score. Responsive to determining that the edit risk score exceeds the threshold score, an alert is transmitted to the user. The alert indicates to the user that the edit of the content is an unexpected edit.
1. A computer-implemented method comprising: determining, by one or more processors, that an edit of content is being performed by a user; determining, by the one or more processors, a criticality score of the content by using a natural language processing technique, the criticality score indicating a measure of sensitivity of the content; identifying, by the one or more processors, a behavior of the user while editing the content; determining, by the one or more processors, a measure of deviation between the behavior of the user and a pattern of historical behavior of one or more users while the one or more users edited the content; based on the criticality score and the measure of the deviation between the behavior of the user and the pattern of historical behavior, determining, by the one or more processors, an edit risk score indicating a probability that the edit of the content is an unexpected edit of the content; determining, by the one or more processors, that the edit risk score exceeds a threshold score; and responsive to determining that the edit risk score exceeds the threshold score, transmitting, by the one or more ...
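The scoring chain in this abstract (criticality score, deviation from historical behavior, combined edit risk score, threshold check) can be sketched as follows. The specific formulas are assumptions made for illustration only: criticality is a keyword ratio standing in for an NLP technique, deviation is a capped z-score over the number of characters deleted, and the risk score is their plain average.

```python
from statistics import mean, pstdev

SENSITIVE_TERMS = {"salary", "password", "contract"}  # hypothetical term list

def criticality(text):
    """Fraction of words that are sensitive terms (stand-in for an NLP model)."""
    words = text.lower().split()
    return sum(w in SENSITIVE_TERMS for w in words) / len(words) if words else 0.0

def deviation(current, history):
    """Z-score of the current behavior against history, scaled and capped to [0, 1]."""
    mu, sigma = mean(history), pstdev(history)
    if sigma == 0:
        return 0.0 if current == mu else 1.0
    return min(abs(current - mu) / sigma / 3.0, 1.0)

def edit_risk(text, chars_deleted, history):
    """Average of criticality and deviation; the weighting is an assumption."""
    return 0.5 * criticality(text) + 0.5 * deviation(chars_deleted, history)

def should_alert(risk, threshold=0.7):
    """True when the edit risk score exceeds the threshold score."""
    return risk > threshold

history = [10, 12, 8, 10]  # characters deleted in earlier edits (made-up data)
risky = edit_risk("salary contract", 200, history)
routine = edit_risk("meeting notes for monday", 10, history)
print(should_alert(risky), should_alert(routine))
```

Only the large deletion inside sensitive content crosses the threshold here; a routine edit of ordinary text scores zero on both components.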

Publication date: 04-01-2018

STATE MACHINE BASED CONTEXT-SENSITIVE SYSTEM FOR MANAGING MULTI-ROUND DIALOG

Number: US20180004729A1
Author: Qiu Nan, Wang Haofen
Assignee:

The present invention discloses a state machine based context-sensitive multi-round dialog management system, comprising: an input module, for receiving multi-modal input information from a user; an intention identification engine module, for identifying intention information in the multi-modal input information; an intention module, for bringing multiple intention information identified by the intention identification engine module into one-to-one correspondence with multiple intention sub-modules at back ends; a state machine module, comprising a plurality of state machines for managing a relevant context in the dialog management system and providing support for an output result; an instruction parsing engine module, comprising a plurality of instruction parsing engine sub-modules for parsing corresponding intention information and acquiring the parsed multiple intention information; and an output module, for acquiring policy information according to the results from the parsing engine module and the intention identification module, and transmitting the policy information to the state machine module.
1. A state machine based context-sensitive multi-round dialog management system, comprising: an input module, for receiving multi-modal input information from a user; an intention identification engine module, for identifying intention information in the multi-modal input information; an intention module, for bringing multiple intention information identified by the intention identification engine module into one-to-one correspondence with multiple intention sub-modules at back ends; a state machine module, comprising a plurality of state machines for managing a relevant context in the dialog management system and providing support for an output result; an instruction parsing engine module, comprising a plurality of instruction parsing engine sub-modules for parsing corresponding intention information and acquiring the parsed multiple intention information; and an output ...

Publication date: 04-01-2018

CORPUS GENERATION DEVICE AND METHOD, HUMAN-MACHINE INTERACTION SYSTEM

Number: US20180004730A1
Author: Qiu Nan, Wang Haofen
Assignee:

A corpus generation device and method, the device comprising: a segmentation module, connected to at least one monolingual parallel corpus for segmenting a sentence into words and processing the segmented words by a knowledge-driven approach; a classification module, for classifying sentences having different tag sequences but the same meaning into the same sentence cluster; a mapping module, for determining the categories of sentence structures of all the sentences in the sentence cluster, recording and storing a mapping mode for transforming tags between sentence structures when different categories of sentence structures in the same sentence cluster are transformed; a sentence structure generation module, for generating sentence structures according to a first mapping mode between a first category of sentence structures in one of the sentence clusters and other categories of sentence structures in the same sentence cluster; and a corpus generation module, for nesting a word corresponding to a sequence tag to generate a new monolingual parallel corpus.
1. A knowledge-driven corpus generation device, comprising: a segmentation module, connected to at least one monolingual parallel corpus and configured for segmenting sentences in each monolingual parallel corpus into words, and tagging the segmented words by a knowledge-driven approach; a classification module, for identifying the knowledge-driven sentence, and classifying sentences having different tag sequences but the same meaning into the same sentence cluster; a mapping module, for analyzing the sentences in each sentence cluster of each monolingual parallel corpus, determining the categories of sentence structures of all the sentences in the sentence cluster, determining, recording and storing a mapping mode for transforming tags between corresponding sentence structures when different categories of sentence structures in the same sentence cluster are transformed; a sentence structure generation module, for ...

Publication date: 04-01-2018

OBTAINING TRANSLATIONS UTILIZING TEST STEP AND SUBJECT APPLICATION DISPLAYS

Number: US20180004733A1
Assignee:

In one example of the disclosure, a machine-translation for each of a plurality of strings is determined, the strings for display upon execution of a subject application. A first display of a test step to be performed by a test application during execution of the subject application is caused. A second display of a state for the subject application that includes the plurality of strings is caused concurrent with the first display. A user-translation for each of the strings is obtained, the user-translations provided via a GUI included within the second display. A translation property file associated with the subject application is amended to include the user-translations.
1. A system, comprising: a machine-translation engine, to determine a machine-translation for each of a plurality of strings, the strings for display upon execution of a subject application; a first display engine, to cause a first display of a test step to be performed by a test application during execution of the subject application; a second display engine, to cause, concurrent with the first display, a second display of a state for the subject application that includes the plurality of strings; a user-translation engine, to obtain a user-translation for each of the strings, the user-translations provided via a GUI included within the second display; and a property file engine, to amend a translation property file associated with the subject application to include the user-translations.
2. The system of claim 1, wherein the first display engine is to obtain a user-initiated instruction to begin a quality assurance test upon the subject application, and the first display engine is to cause the first display and the second display engine is to cause the second display responsive to receipt of the instruction.
3. The system of claim 1, wherein the test step is performed by the test application concurrent with the provision of the second display.
4. The system of claim 1, wherein the GUI is a ...

Publication date: 02-01-2020

Analyzing and Answering Questions

Number: US20200004753A1
Assignee:

A computer-implemented method includes receiving, at a computer system, a question; identifying one or more first semantic elements in the question; selecting, from one or more electronic documents, a plurality of candidate responses to the question based on comparison of the one or more first semantic elements to second semantic elements; determining completeness scores for the plurality of candidate responses, wherein each of the completeness scores indicates how completely a corresponding candidate response from the plurality of candidate responses answers the question; determining relevance scores for the plurality of candidate responses, wherein each of the relevance scores indicates how relevant a corresponding candidate response from the plurality of candidate responses is to the question; and providing, by the computer system, at least a portion of the plurality of candidate responses based, at least in part, on the completeness scores and the relevance scores. 1receiving, at a computer system, a question;identifying, by the computer system, one or more first semantic elements in the question;selecting, from one or more electronic documents, a plurality of candidate responses to the question based on comparison of the one or more first semantic elements to second semantic elements that have been identified in the plurality of candidate responses;determining completeness scores for the plurality of candidate responses, wherein each of the completeness scores indicates how completely a corresponding candidate response from the plurality of candidate responses answers the question;determining relevance scores for the plurality of candidate responses, wherein each of the relevance scores indicates how relevant a corresponding candidate response from the plurality of candidate responses is to the question; andproviding, by the computer system, at least a portion of the plurality of candidate responses based, at least in part, on the completeness scores and the 
...
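
The two-score ranking described in this abstract can be illustrated with a minimal sketch. It is not the patented method: representing semantic elements as plain string sets, the coverage and Jaccard formulas, the equal weights, and all names (`Candidate`, `completeness`, `relevance`, `rank`) are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    text: str
    elements: frozenset  # semantic elements found in the candidate (hypothetical representation)


def completeness(question: frozenset, cand: Candidate) -> float:
    """Share of the question's semantic elements that the candidate covers."""
    return len(question & cand.elements) / len(question) if question else 0.0


def relevance(question: frozenset, cand: Candidate) -> float:
    """Jaccard overlap between question elements and candidate elements."""
    union = question | cand.elements
    return len(question & cand.elements) / len(union) if union else 0.0


def rank(question, candidates, w_complete=0.5, w_relevant=0.5, top_k=2):
    """Return the top_k candidates by a weighted sum of both scores (weights are assumed)."""
    def score(c):
        return w_complete * completeness(question, c) + w_relevant * relevance(question, c)
    return sorted(candidates, key=score, reverse=True)[:top_k]


question = frozenset({"capital", "france"})
candidates = [
    Candidate("Weather is mild.", frozenset({"weather"})),
    Candidate("France makes wine.", frozenset({"france", "wine"})),
    Candidate("Paris is the capital of France.", frozenset({"capital", "france", "paris"})),
]
best = rank(question, candidates)
```

Keeping the two scores separate lets a caller change the weighting, or filter on one score before ranking on the other, without touching the scoring functions.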

Publication date: 07-01-2021

Method and System for Intelligently Suggesting Paraphrases

Number: US20210004432A1
Assignee: Microsoft Technology Licensing, LLC

A method and system for providing replacement text segments for a given text segment may include receiving a request to provide the replacement text segment for the text segment in the document, examining a content characteristic of the document, and examining at least one of user-specific information, organization-specific information, or non-linguistic features of the document, before identifying at least one replacement text segment for the text segment, via a machine translation system, based on the content characteristic of the document and at least one of the user-specific information, the organization-specific information, or the non-linguistic features of the document. The method and system may include providing the identified replacement text segment for display to a user, receiving an input indicating a user's selection of the identified replacement text segment, and upon receiving the input, replacing the text segment in the document with the identified replacement text segment. 1. 
A data processing system comprising: a processor; and a memory in communication with the processor, the memory comprising executable instructions that, when executed by the processor, cause the data processing system to perform functions of: receiving a request to provide a replacement text segment for a text segment in a document; upon receiving the request, examining a content characteristic of the document; examining at least one of user-specific information, organization-specific information, or non-linguistic features of the document; identifying at least one replacement text segment for the text segment, via a machine translation system, based on the content characteristic of the document and at least one of the user-specific information, the organization-specific information, or the non-linguistic features of the document; providing the identified at least one replacement text segment for display to a user; receiving an input indicating a user's selection of the identified at ...

Publication date: 07-01-2021

IDENTIFYING ENTITY ATTRIBUTE RELATIONS

Number: US20210004438A1
Authors: Iter Dan, Li Fangtao, Yu Xiao

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, that facilitate identifying entity-attribute relationships in text corpora. Methods include determining whether an attribute in a candidate entity-attribute pair is an actual attribute of the entity in the entity-attribute candidate pair. This includes generating embeddings for words in the set of sentences that include the entity and the attribute. Using known entity-attribute pairs, this also includes generating an attribute distributional embedding for the entity based on other attributes associated with the entity from the known entity-attribute pairs, and generating an attribute distributional embedding for the attribute based on known attributes associated with known entities of the attribute in the known entity-attribute pairs. Based on these embeddings, a feedforward network determines whether the attribute in the entity-attribute candidate pair is an actual attribute of the entity in the entity-attribute candidate pair. 1. A computer implemented method comprising: obtaining an entity-attribute candidate pair that defines an entity and an attribute, wherein the attribute is a candidate attribute of the entity; generating embeddings for words in the set of sentences that include the entity and the attribute; generating, using known entity-attribute pairs, an attribute distributional embedding for the entity, wherein the attribute distributional embedding for the entity specifies an embedding for the entity based on other attributes associated with the entity from the known entity-attribute pairs; generating, using the known entity-attribute pairs, an attribute distributional embedding for the attribute, wherein the attribute distributional embedding for the attribute specifies an embedding for the attribute that is based on known attributes associated with known entities of the attribute in the known entity-attribute pairs; determining, ...

Publication date: 02-01-2020

METHOD FOR PROCESSING LANGUAGE INFORMATION AND ELECTRONIC DEVICE THEREFOR

Number: US20200004768A1

The disclosure relates to an artificial intelligence (AI) system for simulating human brain functions such as perception and judgement by using a machine learning algorithm such as deep learning, and an application thereof. An operation method of an electronic device comprises the steps of: receiving an input message; determining a user's language information included in the input message; determining language information for a response corresponding to the user's language information; and outputting the response on the basis of the language information for the response. 1. A method for operating an electronic device, the method comprising: receiving an input message; determining language information of a user included in the input message; determining language information for a response corresponding to the language information of the user; and outputting the response, based on the language information for the response. 2. The method of claim 1, wherein determining the language information for the response corresponding to the language information of the user comprises: determining a value indicating a language proficiency of the user, based on the language information of the user; and determining the language information for the response corresponding to the value indicating the language proficiency of the user. 3. The method of claim 2, wherein the determining the value indicating the language proficiency of the user, based on the language information of the user, comprises determining the value indicating the language proficiency of the user according to a number of words included in the language information of the user. 4. The method of claim 2, wherein the outputting the response, based on the language information for the response, comprises: outputting a response comprised of photographs if the value indicating the language proficiency of the user is a first value; and outputting a response comprised of moving pictures if the ...
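
Claim 3 ties the proficiency value to the number of words in the user's message. A minimal sketch of that idea follows. The thresholds, the three-level scale, the prepared text variants, and the function names are all hypothetical; the claims do not specify them.

```python
def proficiency_level(message: str, thresholds=(5, 15)) -> int:
    """Map the number of words in a message to a coarse proficiency value (0, 1, or 2)."""
    n_words = len(message.split())
    if n_words < thresholds[0]:
        return 0
    if n_words < thresholds[1]:
        return 1
    return 2


def choose_response(message: str, variants: dict) -> str:
    """Pick the response variant prepared for the sender's estimated proficiency."""
    return variants[proficiency_level(message)]


# Hypothetical response variants, one per proficiency value.
variants = {
    0: "Bus 12. Stop A. 5 min.",
    1: "Take bus 12 from stop A; it leaves in 5 minutes.",
    2: "Bus 12 departs from stop A in about five minutes and reaches the station in twenty.",
}
reply = choose_response("where bus", variants)
```

Word count alone is a crude signal; it is used here only because it is the one feature the claim names.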

Publication date: 07-01-2021

KEYPHRASE EXTRACTION BEYOND LANGUAGE MODELING

Number: US20210004439A1
Assignee: Microsoft Technology Licensing, LLC

A system for extracting a key phrase from a document includes a neural key phrase extraction model (“BLING-KPE”) having a first layer to extract a word sequence from the document, a second layer to represent each word in the word sequence by ELMo embedding, position embedding, and visual features, and a third layer to concatenate the ELMo embedding, the position embedding, and the visual features to produce hybrid word embeddings. A convolutional transformer models the hybrid word embeddings to n-gram embeddings, and a feedforward layer converts the n-gram embeddings into a probability distribution over a set of n-grams and calculates a key phrase score of each n-gram. The neural key phrase extraction model is trained on annotated data based on a labeled loss function to compute cross entropy loss of the key phrase score of each n-gram as compared with a label from the annotated dataset. 1. A system for extracting a key phrase from a document, the system comprising: a processor and a memory, wherein a program executing on the processor is configured to run a neural key phrase extraction model having: a first layer to extract a word sequence from the document; a second layer to represent each word in the word sequence by word embedding and visual features; a third layer to concatenate the word embedding and the visual features to produce hybrid word embeddings; a convolutional transformer to model the hybrid word embeddings to n-gram embeddings; and a feedforward layer to convert the n-gram embeddings into a probability distribution over a set of n-grams and calculate a key phrase score of each n-gram, wherein the neural key phrase extraction model was trained on annotated data based on a labeled loss function to compute loss of the key phrase score of each n-gram as compared with a label from the annotated dataset, and wherein the key phrase score of each n-gram is used in a computer-implemented application. 2. The system of claim 1, wherein the labeled loss function is ...

Publication date: 07-01-2021

Toxic vector mapping across languages

Number: US20210004440A1
Assignee: Spectrum Labs Inc, Superset Partners Inc

Methods, systems, and devices for language mapping are described. Some machine learning models may be trained to support multiple languages. However, word embedding alignments may be too general to accurately capture the meaning of certain words when mapping different languages into a single reference vector space. To improve the accuracy of vector mapping, a system may implement a supervised learning layer to refine the cross-lingual alignment of particular vectors corresponding to a vocabulary of interest (e.g., toxic language). This supervised learning layer may be trained using a dictionary of toxic words or phrases across the different supported languages in order to learn how to weight an initial vector alignment to more accurately map the meanings behind insults, threats, or other toxic words or phrases between languages. The vector output from this weighted mapping can be sent to supervised models, trained on the reference vector space, to determine toxicity scores.

Publication date: 07-01-2021

DERIVING MULTIPLE MEANING REPRESENTATIONS FOR AN UTTERANCE IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK

Number: US20210004441A1
Authors: Sapugay Edwin, Sarda Gopal

The present approaches are generally related to an agent automation framework that is capable of extracting meaning from user utterances, such as requests received by a virtual agent (e.g., a chat agent), and suitably responding to these user utterances. In certain aspects, the agent automation framework includes a NLU framework and an intent-entity model having defined intents and entities that are associated with sample utterances. The NLU framework may include a meaning extraction subsystem designed to generate meaning representations for the sample utterances of the intent-entity model to construct an understanding model, as well as generate meaning representations for a received user utterance to construct an utterance meaning model. The disclosed NLU framework may include a meaning search subsystem that is designed to search the meaning representations of the understanding model to locate matches for meaning representations of the utterance meaning model. 1. An agent automation system, comprising: a memory configured to store a natural language understanding (NLU) framework, wherein the NLU framework includes a part-of-speech (POS) component, a correction component, a variability filter component, a parser component, and a final scoring and filtering component; and using the POS component to perform part-of-speech (POS) tagging of a set of utterances to generate a set of potential POS taggings from the set of utterances; using the variability filter component to remove one or more POS taggings from the set of potential POS taggings that are substantially similar to other POS taggings in the set of potential POS taggings; using the parser component to generate a set of potential meaning representations from the set of potential POS taggings; and using the final scoring and filtering component to calculate a respective final score for each potential meaning representation in the set of potential meaning representations and to remove potential meaning
...

Publication date: 07-01-2021

PREDICTIVE SIMILARITY SCORING SUBSYSTEM IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK

Number: US20210004442A1

Present embodiments include an agent automation framework having a similarity scoring subsystem that performs meaning representation similarity scoring to facilitate extraction of artifacts to address an utterance. The similarity scoring subsystem identifies a CCG form of an utterance-based meaning representation and queries a database to retrieve a comparison function list that enables quantifications of similarities between the meaning representation and candidates within a search space. The comparison functions enable the similarity scoring subsystem to perform computationally-cheapest and/or most efficient comparisons before other comparisons. The similarity scoring subsystem may determine an initial similarity score between the particular meaning representation and the candidates of the search space, then prune non-similar candidates from the search space. Selective search space pruning enables the similarity scoring subsystem to iteratively compare more data of the meaning representation to the search space via increasingly-complex comparison functions, while narrowing the search space to potentially-matching candidates. 1. 
An agent automation system, comprising: a memory configured to store a natural language understanding (NLU) framework including a similarity scoring subsystem having a form class database; and a processor ... receiving a meaning representation of a user utterance; identifying a cognitive construction grammar (CCG) form of the meaning representation; determining at least one form class entry of the form class database that matches the CCG form of the meaning representation; and retrieving a mathematical comparison function list from the at least one form class entry, wherein the mathematical comparison function list enables the similarity scoring subsystem to compare at least a portion of the meaning representation to at least a search-space portion of a search-space meaning representation to determine a similarity score therebetween.
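
The abstract describes running the cheapest comparison functions first and pruning dissimilar candidates before applying costlier ones. The sketch below illustrates only that control flow. The two comparison functions, the fixed threshold, and the use of plain strings in place of meaning representations are assumptions, not details from the patent.

```python
def same_first_word(query: str, candidate: str) -> float:
    """Cheap check: 1.0 if both start with the same word, else 0.0."""
    return 1.0 if query.split()[:1] == candidate.split()[:1] else 0.0


def token_jaccard(query: str, candidate: str) -> float:
    """Costlier check: Jaccard overlap of the two token sets."""
    q, c = set(query.split()), set(candidate.split())
    return len(q & c) / len(q | c) if q | c else 0.0


def prune_search(query, candidates, comparison_functions, threshold=0.5):
    """Apply comparison functions in order (cheapest first), pruning after each pass."""
    survivors = list(candidates)
    scores = {}
    for compare in comparison_functions:
        scores = {c: compare(query, c) for c in survivors}
        survivors = [c for c in survivors if scores[c] >= threshold]
        if not survivors:
            break
    return sorted(((c, scores[c]) for c in survivors), key=lambda p: p[1], reverse=True)


matches = prune_search(
    "reset my password",
    ["reset password now", "reset the router", "order a pizza"],
    [same_first_word, token_jaccard],
)
```

Because each pass only scores the survivors of the previous one, the expensive function runs on two candidates here instead of three; the saving grows with the size of the search space.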

Publication date: 07-01-2021

PINNING ARTIFACTS FOR EXPANSION OF SEARCH KEYS AND SEARCH SPACES IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK

Number: US20210004443A1

Present embodiments include an agent automation framework having an artifact pinning subsystem that pins meaning representations of a search space to enable the agent automation system to target particularly relevant candidates for improved inferences. To generate the search space, the artifact pinning subsystem may determine multiple understandings of sample utterances within intent-entity models to generate meaning representations. The sample utterances generally each belong to an identified intent that may have been labeled with a particular entity, within a structure defined by the intent-entity models. To validate the relevance of each meaning representation for an identified intent, the artifact pinning subsystem may pin meaning representations that include the particular intent and include a respective entity corresponding to the labeled entity. In addition to model-based entity pinning, the search space may also be generated with respect to a contextual intent of an on-going conversation between a user and a behavior engine. 1. An agent automation system, comprising: a memory configured to store a natural language understanding (NLU) framework including a meaning search subsystem; and receiving a plurality of sample meaning representations, wherein one or more labeled meaning representations of the plurality of sample meaning representations comprise a particular intent and a labeled entity that is associated with the particular intent; for each labeled meaning representation of the one or more labeled meaning representations, pinning a set of meaning representations from the plurality of sample meaning representations, wherein each meaning representation of the set of meaning representations comprises the particular intent and a respective entity corresponding to the labeled entity of the labeled meaning representation; and generating a search space based at least in part on each set of meaning representations corresponding to each labeled meaning ...

Publication date: 02-01-2020

INDEX GENERATION METHOD, DATA RETRIEVAL METHOD, APPARATUS OF INDEX GENERATION

Number: US20200004784A1

An index generation method includes: executing an extraction process that includes extracting a plurality of morphemes from document information; and executing a generation process that includes generating index information with respect to each of the plurality of morphemes, wherein the index information includes a first logical value and a second logical value, wherein the first logical value indicates existence of a first morpheme associated with ‘morpheme information indicating the first morpheme’ and ‘position information indicating a position of the first morpheme in the document information’, and wherein the second logical value indicates existence of the first morpheme associated with ‘attribute information indicating a semantic attribute of the first morpheme’ and ‘position information indicating the position of the first morpheme’. 1. An index generation method comprising: executing an extraction process that includes extracting a plurality of morphemes from document information; and executing a generation process that includes generating index information with respect to each of the plurality of morphemes, wherein the index information includes a first logical value and a second logical value, wherein the first logical value indicates existence of a first morpheme associated with ‘morpheme information indicating the first morpheme’ and ‘position information indicating a position of the first morpheme in the document information’, and wherein the second logical value indicates existence of the first morpheme associated with ‘attribute information indicating a semantic attribute of the first morpheme’ and ‘position information indicating the position of the first morpheme’. 2. The index generation method according to claim 1, wherein the semantic attribute represents a semantic relationship between the first morpheme and another morpheme in the document information. 3. The index generation method according to claim 2, wherein the attribute information indicates a ...
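
One way to picture the two logical values is as two bitmap indexes over token positions: one keyed by morpheme, one keyed by semantic attribute. The sketch below assumes this bitmap reading, stores each bitmap as a Python integer, and takes pre-tagged `(morpheme, attribute)` pairs as input; the attribute labels and function names are hypothetical.

```python
from collections import defaultdict


def build_index(tokens):
    """tokens: list of (morpheme, semantic_attribute) pairs in document order.

    Returns two dicts of bitmaps; bit i is set when the key occurs at position i.
    """
    morpheme_bits = defaultdict(int)
    attribute_bits = defaultdict(int)
    for pos, (morpheme, attribute) in enumerate(tokens):
        morpheme_bits[morpheme] |= 1 << pos    # morpheme present at this position
        attribute_bits[attribute] |= 1 << pos  # attribute present at this position
    return dict(morpheme_bits), dict(attribute_bits)


def positions(bits: int):
    """Decode a bitmap into the list of positions whose bit is set."""
    return [i for i in range(bits.bit_length()) if (bits >> i) & 1]


tokens = [("tokyo", "place"), ("is", "verb"), ("capital", "role"), ("tokyo", "place")]
morphemes, attributes = build_index(tokens)

# Positions where the morpheme "tokyo" occurs with the attribute "place".
hits = positions(morphemes["tokyo"] & attributes["place"])
```

With both indexes aligned on position, a combined morpheme-and-attribute query reduces to a single bitwise AND.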

Publication date: 02-01-2020

AUTOMATED WEBSITE DATA COLLECTION METHOD

Number: US20200004792A1

An automated website data collection method uses a hybrid web crawler strategy to obtain a probability distribution of the webpage tags of a webpage of a website in order to obtain an important feature of the website, then extracts the text content of the important features of the website and forms a seed vocabulary data set using a composite semantic model. A thematic vocabulary data set having a high-frequency and highly representative hierarchical structure is further generated from the seed vocabulary data set, and the thematic vocabulary data can be presented by a visualization system to show the hierarchical structure of the thematic vocabulary data set. 1. An automated website data collection method for using an electronic device to crawl a website using a hybrid web crawler to generate a text data set, comprising: specifying one of web pages of the website as an analysis web page and obtaining all specified features of the analysis web page; selecting a plurality of network addresses associated with the specified features as a web crawling seed node; crawling at least one level of the network addresses associated with each web crawling seed node of the website, and selecting a part of the network addresses from the website as a set of associated network addresses; selecting a crawling target network address from the set of associated network addresses of the website; extracting all webpage tags and corresponding text content associated with the crawling target network address of the website; and generating the text data set by using the webpage tags and corresponding text content according to a hierarchical structure of the crawling target network address. 2. The automated website data collection method as claimed in claim 1, wherein the analysis webpage is an initial page of the website. 3. The automated website data collection method as claimed in claim 1, wherein the specified feature is a distribution probability of each webpage tag in the analysis webpage. 4. The ...
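
Claim 3 defines the specified feature as the distribution probability of each webpage tag in the analysis page. A minimal sketch of computing such a distribution with Python's standard `html.parser` module follows; counting only start tags and the names `TagCounter` and `tag_distribution` are choices made for this illustration.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each start tag appears in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        self.counts[tag] += 1


def tag_distribution(html: str) -> dict:
    """Return each tag's share of all start tags in the page."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    total = sum(parser.counts.values())
    return {tag: n / total for tag, n in parser.counts.items()} if total else {}


page = "<div><a href='x'>one</a><a href='y'>two</a><p>text</p></div>"
dist = tag_distribution(page)
```

In this toy page the `a` tag accounts for half of all tags, which is the kind of signal a crawler could use to decide which elements to follow.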

Publication date: 01-01-2015

USING A RULE ENGINE TO MANIPULATE SEMANTIC OBJECTS

Number: US20150006150A1

Provided are techniques for the manipulation of semantic objects within a semantic store, including a semantic reasoning apparatus comprising a processor; a non-transitory computer-readable storage medium; a semantic store comprising a plurality of semantic objects; a semantic model; a rule, comprising a condition part and an action part; wherein the rule is based upon the semantic model and configured to execute the action part in response to a determination that the condition part is satisfied by one or more objects of the plurality of semantic objects and a semantic driver that employs the semantic model as input for driving behavior, comprising logic for determining that the condition part is satisfied by the one or more objects of the plurality of objects; and modifying a semantic object of the plurality of semantic objects in conformity to the action part in response to the determining that the condition part is satisfied. 1. A semantic reasoning apparatus, comprising: a processor; a non-transitory computer-readable storage medium (CRSM) coupled to the processor; a semantic store comprising a plurality of semantic objects; a semantic model; a rule, comprising: a condition part; and an action part; wherein the rule is based upon the semantic model and configured to execute the action part in response to a determination that the condition part is satisfied by one or more objects of the plurality of semantic objects; and a semantic driver, stored on the CRSM and executed on the processor, that employs the semantic model as input for driving behavior, comprising logic for: determining that the condition part is satisfied by the one or more objects of the plurality of objects; and modifying a semantic object of the plurality of semantic objects in conformity to the action part in response to the determining that the condition part is satisfied. 2. The apparatus of claim 1, the logic further comprising logic for inferring the rule based upon information ...
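
The condition-part and action-part structure of a rule can be sketched in a few lines. This is an illustration under simplifying assumptions: semantic objects are plain dictionaries, the semantic model is omitted, and `Rule` and `run_rules` are hypothetical names rather than anything defined in the patent.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    condition: Callable[[dict], bool]  # condition part: tested against each object
    action: Callable[[dict], None]     # action part: modifies a matching object in place


def run_rules(store: list, rules: list) -> int:
    """Apply each rule's action to every object satisfying its condition.

    Returns the number of modifications made.
    """
    fired = 0
    for rule in rules:
        for obj in store:
            if rule.condition(obj):
                rule.action(obj)
                fired += 1
    return fired


store = [{"type": "sensor", "temp": 90}, {"type": "sensor", "temp": 20}]
overheat = Rule(
    condition=lambda o: o.get("temp", 0) > 80,
    action=lambda o: o.update(status="alert"),
)
fired = run_rules(store, [overheat])
```

Keeping condition and action as separate callables mirrors the two-part rule in the claims and lets the driver decide when, and on which objects, an action runs.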

Publication date: 01-01-2015

SYSTEM AND METHOD FOR COMPUTERIZED PSYCHOLOGICAL CONTENT ANALYSIS OF COMPUTER AND MEDIA GENERATED COMMUNICATIONS TO PRODUCE COMMUNICATIONS MANAGEMENT SUPPORT, INDICATIONS AND WARNINGS OF DANGEROUS BEHAVIOR, ASSESSMENT OF MEDIA IMAGES, AND PERSONNEL SELECTION SUPPORT

Number: US20150006153A1
Author: SHAW Eric D.
Assignee: WT Technologies, LLC

At least one computer-mediated communication produced by or received by an author is collected and parsed to identify categories of information within it. The categories of information are processed with at least one analysis to quantify at least one type of information in each category. A first output communication is generated regarding the at least one computer-mediated communication, describing the psychological state, attitudes or characteristics of the author of the communication. A second output communication is generated when a difference between the quantification of at least one type of information for at least one category and a reference for the at least one category is detected involving a psychological state, attitude or characteristic of the author to which a responsive action should be taken. 1. A method of computer analysis of at least one communication originated from a person, comprising: receiving with a computer the at least one communication with each communication being comprised of a group of words originated by the person; processing a text of the received group of words in each of the received at least one communication with a computer to determine risk posed by the person represented in the text of the at least one communication; and in response to the determined risk posed by the person represented in the text of the at least one communication, generating with a computer an output communication pertaining to the risk posed by the person represented in the text obtained from the processing of the text of the received at least one communication. This application is a continuation of U.S. Ser. No. 13/446,412, filed Apr. 13, 2012 (issued as U.S. Pat. No. 8,775,152 on Jul. 8, 2014); which, in turn, is a continuation of U.S. patent application Ser. No. 13/295,138, filed Nov. 14, 2011 (now U.S. Pat. No. 8,160,867); which, in turn, is a continuation of U.S. patent application Ser. No. 12/885,806, filed Sep. 20, 2010 (now U.S. Pat. No. 8,078,453); ...

Publication date: 01-01-2015

USING A RULE ENGINE TO MANIPULATE SEMANTIC OBJECTS

Number: US20150006154A1

Provided are techniques for the manipulation of semantic objects within a semantic store, including a semantic reasoning apparatus comprising a processor; a non-transitory computer-readable storage medium; a semantic store comprising a plurality of semantic objects; a semantic model; a rule, comprising a condition part and an action part; wherein the rule is based upon the semantic model and configured to execute the action part in response to a determination that the condition part is satisfied by one or more objects of the plurality of semantic objects and a semantic driver that employs the semantic model as input for driving behavior, comprising logic for determining that the condition part is satisfied by the one or more objects of the plurality of objects; and modifying a semantic object of the plurality of semantic objects in conformity to the action part in response to the determining that the condition part is satisfied. 1. A method, comprising: evaluating, by a semantic driver that employs a semantic model as input for driving behavior, a rule, the rule comprising: a condition part; and an action part; wherein the rule is based upon the semantic model and configured to execute the action part in response to a determination that the condition part is satisfied by one or more objects of a plurality of semantic objects of a semantic store; determining that the condition part is satisfied based upon the evaluating; and, in response to the determining, modifying a semantic object of the plurality of semantic objects in conformity to the action part. 2. The method of claim 1, the logic further comprising logic for inferring the rule based upon information stored in conjunction with the semantic model. 3. The method of claim 1, the logic further comprising logic for inferring new information for inclusion in the semantic store. 4. The method of claim 1, wherein data employed to evaluate the condition part of the rule is retrieved from a ...

Publication date: 01-01-2015

Enhanced Answers in DeepQA System According to User Preferences

Number: US20150006158A1

A semantic search engine is enhanced to employ user preferences to customize answer output by, for a first user, extracting user preferences and sentiment levels associated with a first question; receiving candidate answer results of a semantic search of the first question; weighting the candidate answer results according to the sentiment levels for each of the user preferences; and producing the selected candidate answers to the first user. Optionally, user preferences and sentiment levels may be accumulated over different questions for the same user, or over different users for similar questions. Supplemental information may also be retrieved relative to a user preference in order to further tune the weighting per the preferences and sentiment levels. 1. A method for using user preferences to customize answer output comprising: for a first user and a first question, extracting by a natural language processor one or more user preferences and one or more sentiment levels, wherein each user preference and each sentiment level is associated with the first question, or associated with the first user, or associated with both the first question and the first user; performing by a computer a semantic search on the first question; receiving by a computer candidate answers from the semantic search; selecting by a computer one or more candidate answer results according to the sentiment levels and the user preferences; and producing by a computer the selected candidate answers for review by the first user. 2. The method as set forth in claim 1, wherein the user preferences and the sentiment levels are accumulated by a computer system for the first user for the first question and for at least a second question, and wherein the selecting is performed according to the accumulated user preferences and sentiment levels. 3. 
The method as set forth in wherein user preferences and the sentiment levels are accumulated by a computer system for at least a second user associated with a second ...
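
Weighting candidate answers by the sentiment level attached to each user preference can be sketched as follows. The multiplicative adjustment, the sentiment range of -1 to 1, the feature sets on each answer, and the name `weight_answers` are assumptions for illustration; the abstract does not give a formula.

```python
def weight_answers(candidates, preferences):
    """Re-rank candidate answers by user preference sentiment.

    candidates: list of (answer, base_score, features) from the search step.
    preferences: dict mapping a feature to a sentiment level in [-1, 1] (assumed scale).
    """
    ranked = []
    for answer, base_score, features in candidates:
        # Sum the sentiment of every preference the answer touches.
        adjustment = sum(preferences.get(f, 0.0) for f in features)
        ranked.append((answer, base_score * (1.0 + adjustment)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked


preferences = {"spicy": -0.8, "vegetarian": 0.5}
candidates = [
    ("Thai curry", 0.9, {"spicy"}),
    ("Veggie pasta", 0.7, {"vegetarian"}),
    ("Burger", 0.6, set()),
]
ranked = weight_answers(candidates, preferences)
```

Here the answer with the highest base score drops to last place because it conflicts with a strongly negative preference, while a neutral answer keeps its original score.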

Publication date: 07-01-2021

Cognitive Iterative Minimization of Personally Identifiable Information in Electronic Documents

Number: US20210004485A1

Mechanisms are provided to minimize personally identifiable information (PII) in an electronic document. An iterative personally identifiable information minimization (IPIIM) engine receives an electronic document comprising natural language content having a mention of a protected entity and obfuscates the mention of the protected entity to thereby generate a minimized natural language content. A question answering system processes the minimized natural language content to generate a listing of candidate answers and corresponding confidence scores and the IPIIM engine determines whether or not the minimized natural language content is sufficiently obfuscated based on the listing of candidate answers and corresponding confidence scores. In response to determining that the minimized natural language content is sufficiently obfuscated, the minimized natural language content is provided for processing by a requestor computing device. 1. A method, in a data processing system comprising at least one processor and at least one memory, the at least one memory comprising instructions executed by the at least one processor to cause the at least one processor to minimize personally identifiable information (PII) in an electronic document, the method comprising: receiving, by an iterative personally identifiable information minimization (IPIIM) engine executing in the data processing system, an electronic document comprising natural language content having a mention of a protected entity; applying, by the IPIIM engine, natural language processing and analytic analysis on the natural language content to obfuscate the mention of the protected entity and thereby generate a first minimized natural language content; processing, by a question answering computing system, the first minimized natural language content to generate a first listing of one or more candidate answers and corresponding confidence scores; determining, by the IPIIM engine, whether or not the first minimized ...
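
The iterative loop in this abstract (obfuscate, ask a question answering system how confidently the entity can still be recovered, repeat if needed) can be sketched with a stand-in scorer. Everything concrete below is an assumption: the `[REDACTED]` marker, the list of extra identifying terms, the 0.5 confidence limit, and `toy_confidence`, which merely counts surviving clues and is not a question answering system.

```python
def minimize(text, protected, identifying_terms, qa_confidence, max_confidence=0.5):
    """Redact the protected mention, then redact further terms until the
    confidence reported by qa_confidence falls to max_confidence or below."""
    minimized = text.replace(protected, "[REDACTED]")
    remaining = list(identifying_terms)
    while qa_confidence(minimized) > max_confidence and remaining:
        term = remaining.pop(0)
        minimized = minimized.replace(term, "[REDACTED]")
    return minimized


# Stand-in for the question answering step: share of clues still visible.
CLUES = ["mayor of Springfield", "red bakery"]


def toy_confidence(text: str) -> float:
    return sum(clue in text for clue in CLUES) / len(CLUES)


document = "Jane Roe, mayor of Springfield, owns a red bakery."
result = minimize(document, "Jane Roe", CLUES, toy_confidence)
```

The loop stops as soon as the scorer's confidence reaches the limit, so the text keeps as much non-identifying detail as the threshold allows.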

Publication date: 04-01-2018

SYSTEM AND METHOD FOR GENERATING PROFESSIONAL PROFILE FOR EMPLOYEES

Number: US20180004839A1

A system and method for generating a professional profile of an employee in an organization are disclosed. An organizational ontology may be built for the employee. The organizational ontology may be indicative of interrelationships between the employee, other employees and tasks performed by the employee. Further, email data associated with the employee may be stored for a complete lifecycle of the employee. The email data may be analyzed to identify insights about the employee and the professional profile may be generated based on the identified insights. 1. A system for generating an employee profile, the system comprising: a memory; and a processor coupled to the memory, wherein the processor is configured to execute instructions stored in the memory to: build an organizational ontology for an employee working in an organization, wherein the organizational ontology comprises a plurality of branches at least indicative of interrelationships between the employee, one or more other employees and one or more tasks performed by the employee; store email data for the employee, wherein the email data is stored for an employee lifecycle of the employee, wherein the employee lifecycle is indicative of a tenure completed by the employee in the organization; analyze, based on each branch of the organizational ontology, the stored email data, for determining one or more insights associated with the employee; and generate a professional profile for the employee, based at least on the one or more insights. 2. The system of claim 1, wherein the processor is further configured to execute instructions stored in the memory to analyze the email data based on one of a natural language processing analysis, syntactic parsing, entity recognition analysis, co-reference analysis, sentiment analysis, and quantitative text analysis. 3. The system of claim 1, wherein the plurality of branches of the organizational ontology are further indicative of ...

Publication date: 02-01-2020

TEXT ENTITY DETECTION AND RECOGNITION FROM IMAGES

Number: US20200004815A1
Assignee: Microsoft Technology Licensing, LLC

Named entity recognition can be performed on an image to classify any text in an image. A boundary that encompasses the classified entity may be predicted. Subsequently, upon request, optical character recognition (OCR) can be performed on just the region inside the boundary. The disclosed implementations conserve computer resources such as processing power and battery compared to performing OCR on the entire image.
1. A system, comprising: at least one computer readable device storing instructions; one or more hardware processors that are coupled to the at least computer readable device and that are configured to execute the instructions to cause the system to perform operations comprising: receiving an image comprising a plurality of entities; determining, using a neural network, a boundary of one of the plurality of entities of the image that comprises text; predicting a classification of the text of the one of the plurality of entities of the image; outputting the classification of the text; receiving a request to perform an action based upon the classification of the text; and performing the action in accordance with the request.
2. The system of claim 1, wherein the operations further comprise performing optical character recognition on only a region within the boundary.
3. The system of claim 1, wherein the optical character recognition is performed subsequent to the request.
4. The system of claim 1, wherein the action is selected from the group consisting of: making a telephone call, adding contact information, storing information to the computer readable device; searching the Internet, preparing an email message, navigating to a home address, preparing a text message, and opening a web browser to a web page.
5. The system of claim 1, wherein the operation further comprise visually indicating the boundary of the one of the plurality of entities of the image; or visually indicating the one ...

Publication date: 02-01-2020

RECOMMENDING MESSAGE WORDING BASED ON ANALYSIS OF PRIOR GROUP USAGE

Number: US20200004818A1

Embodiments generally relate to recommending message wording in a collaboration environment. In some embodiments, a method includes monitoring content characteristics of messages provided by users in a collaboration environment, where the content characteristics include individual usage statistics. The method further includes aggregating the individual usage statistics. The method further includes determining group usage statistics from the aggregated individual usage statistics. The method further includes determining community characteristics based at least in part on the group usage statistics. The method further includes providing one or more recommendations to at least one user who writes at least one new message based at least in part on the community characteristics and based at least in part on one or more recommendation policies.
1. A system comprising: at least one processor and a computer readable storage medium having program instructions embodied therewith, the program instructions executable by the at least one processor to cause the at least one processor to perform operations comprising: monitoring content characteristics of messages provided by users in a collaboration environment, wherein the content characteristics include individual usage statistics; aggregating the individual usage statistics; determining group usage statistics from the aggregated individual usage statistics; determining community characteristics based at least in part on the group usage statistics; and providing one or more recommendations to at least one user who writes at least one new message based at least in part on the community characteristics and based at least in part on one or more recommendation policies.
2. The system of claim 1, wherein the community characteristics are further based at least in part on one or more of a destination.
3. The system of claim 1, wherein the individual usage statistics comprises statistics associated with one or more of punctuation, ...
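
The aggregation steps in this abstract (individual statistics, then group statistics, then a recommendation) can be illustrated with a minimal Python sketch. The statistics chosen (word count, exclamation marks), the function names, and the `tolerance` policy are all assumptions made for illustration; the listing does not say which statistics or policies the patent actually uses.

```python
from statistics import mean

def individual_stats(message: str) -> dict:
    """Hypothetical per-message usage statistics: word count and exclamation marks."""
    return {"words": len(message.split()), "exclamations": message.count("!")}

def group_stats(messages_by_user: dict) -> dict:
    """Aggregate the individual statistics of every message into group averages."""
    per_message = [
        individual_stats(m) for msgs in messages_by_user.values() for m in msgs
    ]
    return {key: mean(s[key] for s in per_message) for key in ("words", "exclamations")}

def recommend(draft: str, group: dict, tolerance: float = 2.0) -> list:
    """Hypothetical policy: flag a draft that far exceeds the group norms."""
    stats = individual_stats(draft)
    tips = []
    if stats["words"] > tolerance * group["words"]:
        tips.append("shorten message")
    # The 0.5 floor keeps the check meaningful when the group never uses "!".
    if stats["exclamations"] > tolerance * max(group["exclamations"], 0.5):
        tips.append("use fewer exclamation marks")
    return tips

history = {"a": ["hello team", "meeting at noon today"], "b": ["ok thanks"]}
group = group_stats(history)
tips = recommend("Wow!!! This is a really long message about nothing in particular!!!", group)
```

Here `tips` flags both the length and the punctuation of the draft, because the group history consists of short messages without exclamation marks.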

Publication date: 02-01-2020

PREDICTING PROBABLITY OF OCCURRENCE OF A STRING USING SEQUENCE OF VECTORS

Number: US20200004819A1

Systems and methods are disclosed to receive a plurality of strings where each string of the plurality of strings comprises a plurality of symbols. For each string of the plurality of strings, a first sequence of vectors is generated based at least on a maximum length of word for each symbol in the string. The first sequence of vectors is provided to a machine learning unit for each string of the plurality of strings. A probability of occurrence of each string of the plurality of strings is obtained from the machine learning unit.
1. A method comprising: receiving a plurality of strings, each string of the plurality of strings comprising a plurality of symbols; for each string of the plurality of strings, generating, by a processing device, a first sequence of vectors based at least on a maximum length of word for each symbol in the string, wherein the maximum length of word for each symbol in the string corresponds to length of a longest possible word within the string that starts with the symbol, the longest possible word comprising one or more of a word: 1) with a commonplace meaning, or 2) found in a dictionary; providing to a machine learning unit the first sequence of vectors for each string of the plurality of strings; and obtaining from the machine learning unit a probability of occurrence of each string of the plurality of strings.
2. The method of claim 1, wherein each vector of the first sequence of vectors corresponds to each symbol in the string.
3. The method of claim 2, wherein each vector of the first sequence of vectors is derived by joining together a first vector comprising the maximum length of word for a given symbol in the string and a second vector comprising a symbol vector for the given symbol in the string.
4. (canceled)
5. (canceled)
6. The method of claim 3, wherein the symbol vector for each symbol in the string is based on one or more of: a symbol embedding; a unified vector for symbols of an alphabet that is different from the alphabet used in ...
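
Claims 1 to 3 describe a per-symbol feature: the length of the longest dictionary word that starts at that symbol, joined with a symbol vector. A rough Python sketch of that feature construction follows. The one-hot encoding stands in for the symbol vector as an assumption (the claims mention symbol embeddings and other options), the names are hypothetical, and the machine learning unit that consumes the vectors is not modeled.

```python
def max_word_lengths(string: str, dictionary: set) -> list:
    """For each position, the length of the longest dictionary word starting there (0 if none)."""
    lengths = []
    for i in range(len(string)):
        best = 0
        for j in range(i + 1, len(string) + 1):
            if string[i:j] in dictionary:
                best = j - i  # later matches are longer, so keep overwriting
        lengths.append(best)
    return lengths

def one_hot(symbol: str, alphabet: str) -> list:
    """Assumed stand-in for a symbol vector: one-hot over a fixed alphabet."""
    return [1.0 if symbol == a else 0.0 for a in alphabet]

def vector_sequence(string: str, dictionary: set, alphabet: str) -> list:
    """Join the max-word-length feature with the symbol vector, one vector per symbol."""
    lengths = max_word_lengths(string, dictionary)
    return [[float(n)] + one_hot(ch, alphabet) for n, ch in zip(lengths, string)]

vectors = vector_sequence("cat", {"cat", "at", "a"}, alphabet="act")
```

For the string `cat`, the first symbol starts the three-letter word `cat`, the second starts `at`, and the third starts no dictionary word, so the leading components are 3, 2 and 0.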

Publication date: 02-01-2020

CONTENT OPTIMIZATION FOR AUDIENCES

Number: US20200004820A1
Assignee: Adobe Inc.

Techniques are disclosed to assist an author in creating content variations of a given input text to better suit the mood or the affect preferences of the target audience. Affect distribution in the content is utilized to capture these psycholinguistic preferences. According to one embodiment, in a first phase the optimal/ideal psycholinguistic preference for text content aimed at a particular audience segment is determined. In a second phase, a given text content is modified to align to a target language distribution, which was determined in the first phase. In one example case, word level replacement, insertions and deletions are executed to generate a modified and coherent version of the input text. The output text thus reflects the psycholinguistic requirements of the audience.
1. A method for performing content optimization, the method comprising: processing historic data to generate an audience affect model map, wherein said audience affect model map provides a mapping between a pair of communities and a probability distribution characterizing at least one emotional affect; processing historic data to generate a topic affect model map, wherein said topic model map provides a mapping between a topic and a probability distribution characterizing at least one emotional affect; and, performing a content optimization on an input content document, using said audience affect model map and said topic affect model map by: performing a candidate word selection using said audience affect model map and said topic affect model map to generate a preliminary list of candidate words; pruning said preliminary list of candidate words to generate a final list of candidate words; generating candidate word transformations for said final list of candidate words; and, performing a transformation on said input content document using said generated candidate word transformations to generate an optimized content document.
2. The method according to claim 1, wherein generating ...

Publication date: 02-01-2020

METHOD AND DEVICE FOR EXTRACTING ACTION OF INTEREST FROM NATURAL LANGUAGE SENTENCES

Number: US20200004821A1

A method and device for extracting Action of Interest (AOI) from natural language sentences is disclosed. The method includes creating an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user. The method further includes processing for each target word, the input vector through a trained neural network with RELU activation, which is trained to identify AOI from a plurality of sentences. The method includes assigning AOI tags to each target word in the sentence based on processing of associated input vector through the trained neural network with RELU activation. The method further includes extracting AOI text from the sentence based on the AOI tags assigned to each target word in the sentence. The method further includes providing a response to the sentence inputted by the user based on the AOI text extracted from the sentence.
1. A method for extracting Action of Interest (AOI) from natural language sentences, the method comprising: creating, by a AOI processing device, an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user, wherein the plurality of parameters for each target word comprise a Part of Speech (POS) vector associated with the target word and at least two words preceding the target word, a word embedding for the target word, a word embedding for a head word of the target word in the dependency parse tree of the sentence, and a dependency label for the target word; processing for each target word, by the AOI processing device, the input vector through a trained neural network with Rectified Linear Units (RELU) activation, wherein the trained neural network with RELU activation is trained to identify words associated with AOI from a plurality of sentences; assigning, by the AOI processing device, AOI tags to each target word in the sentence based on processing of associated input vector through the trained neural network with RELU activation; extracting, by ...
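
The last step of this method, extracting AOI text from per-word tags, can be sketched in a few lines of Python. The tag scheme below (`B-AOI`, `I-AOI`, `O`) is an assumption borrowed from common sequence-labeling practice; the listing does not state which tag set the patent uses, and the neural network that would assign the tags is not modeled here.

```python
def extract_tagged_text(words: list, tags: list, label: str = "AOI") -> list:
    """Collect contiguous spans of words tagged B-<label> followed by I-<label>."""
    spans, current = [], []
    for word, tag in zip(words, tags):
        if tag == f"B-{label}":
            if current:  # a new span starts right after another one
                spans.append(" ".join(current))
            current = [word]
        elif tag == f"I-{label}" and current:
            current.append(word)
        else:
            if current:
                spans.append(" ".join(current))
            current = []
    if current:  # flush a span that runs to the end of the sentence
        spans.append(" ".join(current))
    return spans

words = "Please book a flight to Paris".split()
tags = ["O", "B-AOI", "I-AOI", "I-AOI", "O", "O"]  # hypothetical tagger output
aoi_text = extract_tagged_text(words, tags)
```

The same helper would apply to the COI-attribute and POI variants listed below by changing `label`, again assuming a comparable tag scheme.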

Publication date: 02-01-2020

METHOD AND DEVICE FOR EXTRACTING ATTRIBUTES ASSOCIATED WITH CENTRE OF INTEREST FROM NATURAL LANGUAGE SENTENCES

Number: US20200004822A1

A method and device for extracting attributes associated with Center of Interest (COI) from natural language sentences is disclosed. The method includes creating an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user. The method further includes processing for each target word, the input vector through a trained bidirectional GRU neural network, which is trained to identify attributes associated with COI from a plurality of sentences. The method includes associating COI attribute tags to each target word in the sentence based on processing of associated input vector through the trained bidirectional GRU neural network. The method further includes extracting attributes from the sentence based on the COI attribute tags associated with each target word in the sentence. The method further includes providing a response to the sentence inputted by the user based on the attributes extracted from the sentence.
1. A method for extracting attributes associated with Center of Interest (COI) from natural language sentences, the method comprising: creating, by a COI attribute processing device, an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user, wherein the plurality of parameters for each target word comprise a Part of Speech (POS) vector associated with the target word and at least two words preceding the target word, a word embedding for the target word, a word embedding for a head word of the target word in a dependency parse tree for the sentence, and a dependency label for the target word; processing for each target word, by the COI attribute processing device, the input vector through a trained bidirectional Gated Recurrent Unit (GRU) neural network, wherein the trained bidirectional GRU neural network is trained to identify attributes associated with COI from a plurality of sentences, and wherein attributes associated with a COI in a sentence augment the context of ...

Publication date: 02-01-2020

METHOD AND DEVICE FOR EXTRACTING POINT OF INTEREST FROM NATURAL LANGUAGE SENTENCES

Number: US20200004823A1

A method and device for extracting Point of Interest (POI) from natural language sentences is disclosed. The method includes creating an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user. The method further includes processing for each target word, the input vector through a trained bidirectional LSTM neural network, which is trained to identify POI from a plurality of sentences. The method includes associating POI tags to each target word in the sentence based on processing of associated input vector through the trained bidirectional LSTM neural network. The method further includes extracting POI text from the sentence based on the POI tags associated with each target word in the sentence. The method further includes providing a response to the sentence inputted by the user based on the POI text extracted from the sentence.
1. A method of extracting Point of Interest (POI) from natural language sentences, the method comprising: creating, by a POI processing device, an input vector comprising a plurality of parameters for each target word in a sentence inputted by a user, wherein the plurality of parameters for each target word comprises a Part of Speech (POS) vector associated with a target word and at least two words preceding the target word, a word embedding for the target word, a word embedding for a head word of the target word in the dependency parse tree of the input sentence, and a dependency label for the target word; processing for each target word, by the POI processing device, the input vector through a trained bidirectional Long Short Term Memory (LSTM) neural network; assigning, by the POI processing device, POI tags to each target word in the sentence based on processing of associated input vector through the trained bidirectional LSTM neural network; extracting, by the POI processing device, POI text from the sentence based on the POI tags associated with each target word in the sentence; ...

Publication date: 02-01-2020

METHOD AND SYSTEM FOR BRIDGING DISPARATE PLATFORMS TO AUTOMATE A NATURAL LANGUAGE INTERFACE

Number: US20200004824A1

Presented here is a system and method to bridge the gap between the disparate platforms, and to allow the user to interface with the disparate platforms using a natural language interface. The system can improve user interface for electronic devices because the user does not have to switch between the disparate platforms. Instead, the user can interact with the disparate platforms through a single natural language interface. The disparate platforms, some of which may not have a natural language user interface, can be enabled to interact with the user through the natural language interface when the system interfaces between a natural language processing platform and the remainder of the disparate platforms.
1. A method comprising: obtaining a plurality of disparate platforms and a plurality of actions, wherein each action in the plurality of actions can be performed by at least one platform in the plurality of disparate platforms, wherein a platform in the plurality of disparate platforms cannot communicate with a remainder of the plurality of disparate platforms; receiving a natural language request from a user by a groupware platform, wherein the groupware platform is a part of the plurality of disparate platforms; based on the plurality of actions and the plurality of disparate platforms, deciding to send the natural language request from the user to a natural language processing platform, wherein the groupware platform and the natural language processing platform are not configured to communicate to each other; formatting the natural language request into a formatted natural language request accepted by the natural language processing platform; sending the formatted natural language request to the natural language processing platform; interpreting, by the natural language processing platform, the natural language request to identify a user requested action associated with the natural language request; based on the plurality of actions and the plurality of disparate ...

Publication date: 02-01-2020

MISINFORMATION DETECTION IN ONLINE CONTENT

Number: US20200004882A1

Techniques are presented for providing misinformation detection in online content. The described techniques can identify instances of misinformation in online content and pass a misinformation result to the user. A misinformation probability analysis can be performed by applying a syntactic analysis and a semantic analysis to detect misinformation with confidence by applying featurization to a URL, text of content referenced by the URL, and metadata associated with the URL using a feature set, the feature set comprising semantic-based features and syntactic-based features, wherein the semantic features and the syntactic features are selected from the group consisting of: sentiment amplifiers, sentiment continuity disruption features, lexical features, keywords, baseline features, speech act, sensicon features, emotion detection on the obtained text, exaggerated language, strong adjectives, heuristics, bag-of-words, objectivity, colloquial-ness score, and semantic difference.
1. A method comprising: receiving a uniform resource locator (URL); obtaining text from content referenced by the URL and metadata associated with the URL, wherein the metadata associated with the URL comprises an author, a publisher, a date of publication, a headline, or a combination thereof; analyzing the URL, the obtained text, the metadata associated with the URL, or a combination thereof to determine if previously received; if previously received, providing a notification from cached information; if not previously received, performing a misinformation probability analysis of the URL, the obtained text, and the metadata associated with the URL using a feature set at a misinformation probability service, the feature set comprising semantic-based features and syntactic-based features; receiving, from the misinformation probability service, a probability value representing a misinformation confidence for the URL based on the performed misinformation probability analysis; determining ...

Publication date: 02-01-2020

PERSONALIZED ARTIFICIAL INTELLIGENCE AND NATURAL LANGUAGE MODELS BASED UPON USER-DEFINED SEMANTIC CONTEXT AND ACTIVITIES

Number: US20200004890A1

An artificial intelligence (“AI”) engine generates an activity graph that includes nodes corresponding to activities and that defines clusters of content associated with the activities. A natural language (“NL”) search engine can receive a NL query and parse the NL query to identify entities and intents specified by the NL query. Clusters of content defined by the activity graph can be identified based upon the identified entities and intents. A search can then be made of the identified clusters of content using the entities and intents. Search results identifying the content located by the search can then be returned in response to the NL query.
1. A computer-implemented method, comprising: generating, by way of an artificial intelligence (AI) engine, an activity graph comprising nodes associated with activities and defining clusters of content associated with the activities; receiving a natural language (NL) query by way of an NL search engine; parsing the NL query to identify one or more entities and intents specified by the NL query; identifying one or more clusters of the content based on the identified entities and intents; searching the content in the identified one or more clusters of content using the identified entities and intents; and returning search results identifying the content located by the search in response to the NL query.
2. The computer-implemented method of claim 1, further comprising using the activity graph to train the NL search engine to identify the entities and intents.
3. The computer-implemented method of claim 1, wherein the NL query is received by way of a search UI provided by an activity management application.
4. The computer-implemented method of claim 1, further comprising searching one or more properties associated with the activities using the identified entities and intents.
5. The computer-implemented method of claim 4, wherein the properties are defined by schema associated with the activities.
6. The computer-implemented ...

Publication date: 07-01-2021

METHOD AND APPARATUS FOR DETERMINING (RAW) VIDEO MATERIALS FOR NEWS

Number: US20210004602A1
Authors: Lu Daming, Tian Hao

The present disclosure discloses a method and apparatus for determining video material of news. The method for determining video material of news comprises: acquiring a weighted score value of a score of a keyword of a news text in a plurality of dimensions; filtering a keyword set of news based on the weighted score value of the score of the keyword; searching a pre-selected video using the keyword set of the news; and determining video material of the news based on the pre-selected video. The present disclosure improves the consistency between the video material of the news and the news text.
1. A method for determining video material of news, comprising: acquiring a weighted score value of a score of a keyword of a news text in a plurality of dimensions, the score of the keyword of the news text in the plurality of dimensions including: a score of the keyword determined based on a correlation between a word obtained by segmenting the news text and a news title; filtering a keyword set of news based on the weighted score value of the score of the keyword; searching a pre-selected video using the keyword set of the news; and determining video material of the news based on the pre-selected video.
2. The method according to claim 1, wherein the acquiring a weighted score value of a score of a keyword of a news text in a plurality of dimensions includes at least one of: acquiring the score of the keyword of the news text using an attention model of extracting the keyword, the acquiring the score of the keyword of the news text using an attention model of extracting the keyword including: acquiring, using the attention model of extracting the keyword, the score of the keyword determined based on the correlation between the word obtained by segmenting the news text and the news title; acquiring the score of the keyword of the news text using a TF-IDF; or acquiring the score of the keyword of the news text using a domain dictionary of a different granularity.
3. The method ...
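
The first two steps of claim 1 (a weighted score over several scoring dimensions, then filtering the keyword set) can be sketched as below. The dimension names, weights, and threshold are illustrative assumptions; the listing names attention-model, TF-IDF, and domain-dictionary scores as possible dimensions but gives no weights or cut-off, and the scores here are supplied by hand rather than computed.

```python
def weighted_scores(scores_by_dimension: dict, weights: dict) -> dict:
    """Combine per-dimension keyword scores into one weighted score per keyword.

    scores_by_dimension maps a dimension name to {keyword: score}.
    A keyword missing from a dimension contributes nothing for that dimension.
    """
    totals = {}
    for dimension, scores in scores_by_dimension.items():
        for keyword, score in scores.items():
            totals[keyword] = totals.get(keyword, 0.0) + weights[dimension] * score
    return totals

def filter_keywords(totals: dict, threshold: float) -> list:
    """Keep keywords at or above the threshold, highest weighted score first."""
    kept = [kw for kw, score in totals.items() if score >= threshold]
    return sorted(kept, key=lambda kw: -totals[kw])

# Hypothetical scores for one news text.
scores = {
    "attention": {"election": 0.9, "today": 0.2},
    "tfidf": {"election": 0.8, "today": 0.1, "senate": 0.6},
}
totals = weighted_scores(scores, weights={"attention": 0.5, "tfidf": 0.5})
keywords = filter_keywords(totals, threshold=0.25)
```

The resulting `keywords` list would then drive the video search described in the remaining steps, which this sketch leaves out.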

Publication date: 07-01-2021

Method and apparatus for determining (raw) video materials for news

Number: US20210004603A1
Authors: Daming Lu, Hao Tian
Assignee: Baidu USA LLC

The present disclosure discloses a method and apparatus for determining video material of news. The method for determining video material of news comprises: recognizing a person name in a news text; searching a video based on the person name, to obtain a to-be-selected video; extracting a key frame in the to-be-selected video; recognizing a person in the key frame to obtain identity information of the person; and determining the to-be-selected video as video material of news, in response to the identity information of the person conforming to the person name. The present disclosure improves the consistency between the video material of the news and the news text.

Publication date: 07-01-2016

Content-Based Audio Playback Emphasis

Number: US20160005402A1
Assignee: MModal IP LLC

Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
1. A method performed by a computer processor executing computer program instructions tangibly stored on a first computer-readable medium to perform a method ... (A) deriving, from a region of a document and a corresponding region of a spoken audio stream, a likelihood score representing a likelihood that the region of the document correctly represents content in the corresponding region of the spoken audio stream, and tangibly storing a representation of the likelihood score in a second computer-readable medium; (B) selecting a relevance score representing a measure of relevance of the region of the spoken audio stream, the measure of relevance representing a measure of importance that the region of the spoken audio stream be brought to the attention of a human proofreader, and tangibly storing a representation of the relevance score in a third computer-readable medium; and (C) deriving, by dividing the relevance score by the likelihood score, an emphasis factor for modifying emphasis placed on the region of the spoken audio stream when played back, and storing a representation of the emphasis factor in a fourth computer-readable medium.
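
Step (C) defines the emphasis factor as the relevance score divided by the likelihood score, and the abstract gives slower playback as one way to emphasize a region. The Python sketch below shows that arithmetic. The division comes from the claim; the `eps` guard, the mapping from factor to playback rate, and the clamping limits are assumptions added for illustration.

```python
def emphasis_factor(relevance: float, likelihood: float, eps: float = 1e-6) -> float:
    """Emphasis = relevance / likelihood; eps (an assumption) avoids division by zero."""
    return relevance / max(likelihood, eps)

def playback_rate(factor: float, base_rate: float = 1.0,
                  min_rate: float = 0.5, max_rate: float = 1.5) -> float:
    """Hypothetical mapping: a higher emphasis factor gives slower playback, clamped."""
    rate = base_rate / factor if factor > 0 else max_rate
    return min(max_rate, max(min_rate, rate))

# A highly relevant region that was probably transcribed incorrectly...
risky = emphasis_factor(relevance=0.8, likelihood=0.4)
# ...and a low-relevance region that was probably transcribed correctly.
safe = emphasis_factor(relevance=0.2, likelihood=0.8)
rates = (playback_rate(risky), playback_rate(safe))
```

With these inputs the risky region gets the larger factor and is played at the slowest allowed rate, while the safe region is played at the fastest.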

Publication date: 02-01-2020

ARTIFICIAL INTELLIGENCE SERVER FOR DETERMINING DEPLOYMENT AREA OF ROBOT AND METHOD FOR THE SAME

Number: US20200005085A1
Assignee: LG ELECTRONICS INC.

An artificial intelligence server for determining a deployment area of a robot includes a memory and a processor. The memory is configured to store density data for a control area. The processor is configured to obtain a plurality of current context keywords corresponding to a current time point, determine at least one related keyword among the obtained plurality of current context keywords using the density data, and determine the deployment area of the robot based on density data corresponding to the determined related keyword.
1. An artificial intelligence server for determining a deployment area of a robot, comprising: a memory configured to store density data for a control area; and a processor configured to: obtain a plurality of current context keywords corresponding to a current time point; determine at least one related keyword among the obtained plurality of current context keywords using the density data; and determine the deployment area of the robot based on density data corresponding to the determined related keyword.
2. The artificial intelligence server according to claim 1, wherein the control area includes a plurality of unit areas and is a maximum activity range of the robot.
3. The artificial intelligence server according to claim 2, wherein the plurality of current context keywords essentially include a context keyword for a time corresponding to a current time point, and further include a context keyword for at least one of day, weather, season, or date corresponding to the current time point.
4. The artificial intelligence server according to claim 3, wherein the density data includes at least one context keyword corresponding to a density distribution for the control area or a target time point of calculating the density distribution, and wherein the density distribution for the control area includes a density for each of at least one group area included in the control area.
5. The artificial ...

Publication date: 02-01-2020

OFFTRACK VIRTUAL AGENT INTERACTION SESSION DETECTION

Number: US20200005118A1

Generally discussed herein are devices, systems, and methods for detecting a conversation with a virtual agent is offtrack and responding appropriately. A method can include receiving a prompt, expected responses to the prompt, and a response of an interaction session with the virtual agent, the interaction session to solve a problem of a user, determining whether the response indicates the interaction session is in an offtrack state based on the prompt, expected responses, and response, in response to a determination that the interaction session is in the offtrack state, determining a taxonomy of the offtrack state, and providing, based on the determined taxonomy, a next prompt to the interaction session.
1. A system comprising: a virtual agent interface device to provide an interaction session in a user interface with a human user, the interaction session regarding a problem to be solved by a user; processing circuitry in operation with the virtual agent interface device to: receive a prompt, expected responses to the prompt, and a response of the interaction session; determine whether the response indicates the interaction session is in an offtrack state based on the prompt, expected responses, and response; in response to a determination that the interaction session is in the offtrack state, determine a taxonomy of the offtrack state; and provide, based on the determined taxonomy, a next prompt to the interaction session.
2. The system of claim 1, wherein the processing circuitry is configured to implement a plurality of models, wherein each of the models is configured to produce a score indicating a likelihood that a different taxonomy of the taxonomies applies to the prompt, expected responses, and response.
3. The system of claim 1, wherein the processing circuitry is further to receive context data indicating a number of prompts and responses previously presented in the interaction session and the prompts and responses ...
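
Claim 2 describes several models, each scoring how likely one offtrack taxonomy is for a given prompt, expected responses, and response. A toy Python sketch of that arrangement follows. The two taxonomy names, the rule-based scorers standing in for trained models, the threshold, and the follow-up prompts are all hypothetical; the listing does not enumerate the patent's taxonomies.

```python
def score_no_answer(prompt: str, expected: list, response: str) -> float:
    """Hypothetical model: the user supplied no usable text."""
    return 1.0 if not response.strip() else 0.0

def score_unexpected(prompt: str, expected: list, response: str) -> float:
    """Hypothetical model: the text matches none of the expected responses."""
    text = response.strip().lower()
    if not text:
        return 0.0
    return 0.0 if text in {e.lower() for e in expected} else 1.0

# One scorer per taxonomy, mirroring the one-model-per-taxonomy idea.
SCORERS = {"no_answer": score_no_answer, "unexpected_answer": score_unexpected}

NEXT_PROMPTS = {
    "no_answer": "Are you still there?",
    "unexpected_answer": "Sorry, I did not follow. Could you rephrase?",
}

def offtrack_taxonomy(prompt, expected, response, threshold=0.5):
    """Return the best-scoring taxonomy, or None when the session looks on track."""
    scores = {name: fn(prompt, expected, response) for name, fn in SCORERS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] >= threshold else None

def next_prompt(prompt, expected, response):
    """Pick a follow-up prompt from the detected taxonomy; None means continue as normal."""
    taxonomy = offtrack_taxonomy(prompt, expected, response)
    return None if taxonomy is None else NEXT_PROMPTS[taxonomy]

reply = next_prompt("Which OS do you use?", ["Windows", "Mac"], "my printer is broken")
```

An expected answer such as `windows` yields no taxonomy, so the session would simply proceed to its regular next step.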

Publication date: 03-01-2019

DOCUMENT PROCESSING

Number: US20190005012A1
Assignee: Accenture Global Solutions Limited

A document processing system receives an electronic document including component documents generated from various sources in different formats. Plain text data can be extracted from the electronic document in addition to formatting and structuring information. The plain text data is segmented into sections and various entities are extracted and linked from the sections. An interactive graphical user interface (GUI) that displays content including the plain text data is formatted according to the styling information and annotated entity relationships are determined from the linked entities. The GUI enables user edits to the annotated entity relationships.
1. A document processing system that extracts editable data from electronic documents, the system comprising: one or more processors; and a non-transitory data storage comprising machine-executable instructions that cause the one or more processors to: convert a non-editable image file into a markup file, the non-editable image file pertaining to an electronic document, and the electronic document including at least one component document, and the markup file preserving a format and a structure of the component document from the image file; parse the markup file to extract plain text data of the non-editable image file; determine styling information of the non-editable image file from the markup file; automatically segment into sections, the plain text data by analyzing contents of the markup file according to boundary rules; identify and extract entities automatically from the segmented sections of the plain text data, the identifying performed using natural language processing (NLP); link the entities using at least one of domain-specific ontologies, knowledge bases, and graphical inferences; ... the GUI displaying content, the content including the plain text data formatted according to the styling information, the GUI including annotated entity relationships determined from the linked ...

Publication date: 03-01-2019

CONVERSATION SYSTEM-BUILDING METHOD AND APPARATUS BASED ON ARTIFICIAL INTELLIGENCE, DEVICE AND COMPUTER-READABLE STORAGE MEDIUM

Number: US20190005013A1
Author: SUN Ke, WANG JU, Zhang Jingjing

The present disclosure provides a conversation system building method and apparatus based on artificial intelligence, a device and a computer-readable storage medium. In embodiments of the present disclosure, the user only needs to intervene in the annotation operation of the conversation samples in the case that the conversation system is not satisfied with the recognition parameters of the input information provided by the user, without manually participating in the annotation operations of all conversation samples. The operations are simple, the correctness rate is high, and thereby the efficiency and reliability of building the conversation system is improved.
1. A conversation system building method based on artificial intelligence, wherein the method comprises: obtaining a sample adjusting instruction of a conversation system triggered by a user, the sample adjusting instruction being triggered by the user according to the conversation system's recognition parameters for input information provided by the user; according to the sample adjusting instruction, outputting at least one adjustment option for the user to select; according to the adjustment option selected by the user, outputting an adjustment interface to obtain adjustment information provided by the user based on the adjustment interface; obtaining an adjustment parameter of the conversation service according to the adjustment information; performing data annotation processing according to the input information and the adjustment parameter of the conversation system, to obtain conversation samples for building the conversation system.
2. The method according to claim 1, wherein before obtaining a sample adjusting instruction triggered by a user, the method further comprises: obtaining input information provided by the user to perform the conversation service with the conversation system; outputting the input information; according to the input information, obtaining recognition parameters of the ...

Publication date: 03-01-2019

SYSTEMS AND METHODS FOR EXTRACTING FUNDER INFORMATION FROM TEXT

Number: US20190005020A1
Assignee: Elsevier, Inc.

Systems and methods of extracting funding information from text are disclosed herein. The method includes receiving a text document, extracting paragraphs from the text document using a natural language processing model or a machine learning model, and classifying, using a machine learning classifier, the paragraphs as having funding information or not having funding information. The method further includes labeling, using a first annotator, potential entities within the paragraphs classified as having funding information, and labeling, using a second annotator, potential entities within the paragraphs classified as having funding information, where the first annotator implements a first named-entity recognition model and the second annotator implements a second named-entity recognition model that is different from the first named-entity recognition model. The method further includes extracting the potential entities from the paragraphs classified as having funding information and determining, using an ensemble mechanism, funding information from the potential entities.
1. A method of extracting funding information from text, the method comprising: receiving, at a computing device, a text document; extracting paragraphs from the text document using at least one of a natural language processing model or a machine learning model; classifying, using a machine learning classifier, the paragraphs as having funding information or not having funding information; discarding the paragraphs classified as not having funding information; labeling, using a first annotator, one or more potential entities within the paragraphs classified as having funding information; the first annotator implements a first named-entity recognition model and the second annotator implements a second named-entity recognition model that is different from the first named-entity recognition model, and the one or more potential entities comprise at least one unique entity labeled by the first annotator or ...
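The classify, label, and combine flow described in this abstract can be sketched in Python. This is only an illustration under stated assumptions, not the patented method: the keyword check stands in for the machine learning classifier, both annotators are hypothetical regex rules standing in for two different named-entity recognition models, and the ensemble rule (keep the union of both annotators' labels) is an assumption.

```python
import re

# Assumed cue phrases; a real system would use a trained classifier.
FUNDING_CUES = ("funded by", "supported by", "grant")


def has_funding_info(paragraph):
    """Toy stand-in for the machine learning paragraph classifier."""
    text = paragraph.lower()
    return any(cue in text for cue in FUNDING_CUES)


def annotator_a(paragraph):
    """Hypothetical first annotator: capitalized phrase following 'by the'."""
    pattern = r"by the ([A-Z][a-z]+(?: [A-Z][a-z]+)*)"
    return set(re.findall(pattern, paragraph))


def annotator_b(paragraph):
    """Hypothetical second annotator: short all-caps acronyms."""
    return set(re.findall(r"\b[A-Z]{2,5}\b", paragraph))


def extract_funders(paragraphs):
    """Discard non-funding paragraphs, then merge both annotators' labels."""
    kept = [p for p in paragraphs if has_funding_info(p)]
    entities = set()
    for paragraph in kept:
        # Assumed ensemble rule: union of the two label sets.
        entities |= annotator_a(paragraph) | annotator_b(paragraph)
    return sorted(entities)


paragraphs = [
    "We thank our colleagues for helpful comments.",
    "This work was funded by the National Science Foundation (NSF).",
]
print(extract_funders(paragraphs))
```

Using two annotators with different rules means an entity missed by one can still be picked up by the other, which is the point of combining them before the final decision.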

Publication date: 03-01-2019

Automatically assisting conversations using graph database

Number: US20190005023A1
Assignee: Microsoft Technology Licensing LLC

Examples of the present disclosure describe systems and methods for automatically assisting conversations using a graph database. In order to minimize misunderstanding of words and phrases used by participants during a conversation, phrases from the conversation may be received by conversation assistance application as the conversation takes place. Entities may be extracted from the phrase based on natural language recognition according to a domain context of the participant being assisted. One or more tags may be looked up from a graph database, and may be provided to the participant as a list of hashtags related to the conversation. Links to documents may be extracted based on the tags for the participant for viewing during the conversation.

Publication date: 03-01-2019

VIRTUAL ASSISTANT PROVIDING ENHANCED COMMUNICATION SESSION SERVICES

Number: US20190005024A1

Methods for providing enhanced services to users participating in communication sessions (CS), via a virtual assistant, are disclosed. One method receives content that is exchanged by users participating in the CS. The content includes natural language expressions that encode a conversation carried out by users. The method determines content features based on natural language models. The content features indicate intended semantics of the natural language expressions. The method determines a relevance of the content and identifies portions of the content that are likely relevant to the user. Determining the relevance is based on the content features, a context of the CS, a user-interest model, and a content-relevance model of the natural language models. Identifying the likely relevant content is based on the determined relevance of the content and a relevance threshold. A summary of the CS is automatically generated from summarized versions of the likely relevant portions of the content.
1. A computerized system comprising: one or more processors; and receiving content that is exchanged within a communication session (CS), wherein the content includes one or more natural language expressions that encode a portion of a conversation carried out by a plurality of users participating in the CS; determining one or more content features based on the content and one or more natural language models, wherein the one or more content features indicate one or more intended semantics of the one or more natural language expressions; determining a relevance of the content based on the content features, a user-interest model for a first user of the plurality of users, and a content-relevance model for the first user; identifying one or more portions of the content based on the relevance of the content and one or more relevance thresholds, wherein the one or more identified portions of the content are likely relevant to the first user; and generating a summary of the CS ...

Publication date: 03-01-2019

PERFORMING SEMANTIC GRAPH SEARCH

Number: US20190005025A1
Author: Malabarba Scott

Provided are techniques for performing entity-based semantic graph search. Semantic extraction is performed on content items to identify entities. A semantic graph is generated with a vertex for each of the content items, each of the entities, and each user associated with any of the content items and with edges between vertices representing relationships, wherein each of the edges is encoded with metadata about a type of a relationship and a strength of a relationship. A vertex structure is generated that contains identifiers of the content items, the entities, and each user mapped to vertices in the semantic graph. In response to receiving a search request with a search term, using the semantic graph and the vertex structure to identify search results for the search term.
1. A method, comprising: performing, with a processor of a computer, semantic extraction on content items to identify entities; generating a semantic graph with a vertex for each of the content items, each of the entities, and each user associated with any of the content items and with edges between vertices representing relationships, wherein each of the edges is encoded with metadata about a type of a relationship and a strength of a relationship; generating a vertex structure that contains identifiers of the content items, the entities, and each user mapped to vertices in the semantic graph; and in response to receiving a search request with a search term, using the semantic graph and the vertex structure to identify search results for the search term.
2. The method of claim 1, further comprising: in response to receiving the search request with the search term, mapping the search term to an entity of the entities; in response to determining that the entity is not in the vertex structure, creating a new vertex in the semantic graph for the entity; and in response to determining that the entity is in the vertex structure, setting a vertex corresponding to the entity as an origin vertex.
3 ...
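The graph and vertex structure described in these claims can be sketched as a small Python class. This is a minimal illustration under assumptions: the class name, the `kind` labels, and the ranking rule (content items adjacent to the origin vertex, ordered by edge strength) are hypothetical and are not taken from the patent.

```python
from collections import defaultdict


class SemanticGraph:
    """Toy semantic graph; all names here are illustrative assumptions."""

    def __init__(self):
        self.vertex_structure = {}      # identifier -> vertex index
        self.names = []                 # vertex index -> identifier
        self.kinds = []                 # vertex index -> "content", "entity" or "user"
        self.edges = defaultdict(list)  # vertex index -> [(neighbor, type, strength)]

    def add_vertex(self, identifier, kind):
        """Return the vertex for an identifier, creating it if needed."""
        if identifier not in self.vertex_structure:
            self.vertex_structure[identifier] = len(self.names)
            self.names.append(identifier)
            self.kinds.append(kind)
        return self.vertex_structure[identifier]

    def add_edge(self, a, b, rel_type, strength):
        """Store an undirected edge carrying relationship type and strength."""
        ia, ib = self.vertex_structure[a], self.vertex_structure[b]
        self.edges[ia].append((ib, rel_type, strength))
        self.edges[ib].append((ia, rel_type, strength))

    def search(self, term):
        """Use the term's vertex as origin; rank adjacent content items."""
        # As in claim 2, an unknown entity gets a new vertex (with no edges yet).
        origin = self.add_vertex(term, "entity")
        hits = [(strength, self.names[n])
                for n, _rel_type, strength in self.edges[origin]
                if self.kinds[n] == "content"]
        return [name for _strength, name in sorted(hits, key=lambda h: -h[0])]


graph = SemanticGraph()
for identifier, kind in [("doc1", "content"), ("doc2", "content"),
                         ("python", "entity"), ("alice", "user")]:
    graph.add_vertex(identifier, kind)
graph.add_edge("python", "doc1", "mentions", 0.9)
graph.add_edge("python", "doc2", "mentions", 0.4)
graph.add_edge("alice", "doc1", "authored", 1.0)
print(graph.search("python"))
```

Keeping the identifier-to-vertex mapping separate from the adjacency lists lets a search term be resolved to its origin vertex in constant time before any traversal starts.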

Publication date: 03-01-2019

INFORMATION EXTRACTION METHOD AND APPARATUS

Number: US20190005026A1
Author: Zhang Zhenzhong
Assignee: BOE Technology Group Co., Ltd.

The present invention is related to an information extraction method. The information extraction method may comprise providing r semantic relationships, acquiring entity pairs corresponding to the semantic relationships, acquiring first instances based on the entity pairs, and eliminating instances that do not have the semantic relationships from the first instances to obtain second instances. r is a positive integer. Each of the entity pairs contains a pair of named entities. The first instances are sentences containing the entity pairs.
1. An information extraction method, comprising: providing r semantic relationships, acquiring entity pairs corresponding to the semantic relationships, acquiring first instances based on the entity pairs, and eliminating instances that do not have the semantic relationships from the first instances to obtain second instances, wherein r is a positive integer, each of the entity pairs contains a pair of named entities, and the first instances are sentences containing the entity pairs.
2. The information extracting method according to claim 1, wherein acquiring the entity pairs corresponding to the semantic relationships and acquiring the first instances based on the entity pairs comprises: acquiring the entity pairs corresponding to the semantic relationships from a knowledge base; tagging sentences that contain the named entities of all the entity pairs in a database using a named entity recognition tool; and retrieving the first instances containing the entity pairs from the tagged sentences.
3. The information extracting method according to claim 1, wherein eliminating the instances that do not have the semantic relationships from the first instances to obtain the second instances comprises: extracting first features from each of the first instances based on the entity pairs to construct a first instance-feature matrix M_nf, and constructing a semantic relationship-first instance matrix M_rn, wherein the first ...
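The acquisition step in claim 2 (entity pairs from a knowledge base, then sentences that contain both entities) can be sketched as follows. This is an illustration under assumptions: the knowledge base contents and sentences are made up, and plain substring matching stands in for the named entity recognition tool. The matrix-based elimination step is truncated in the listing, so it is not sketched here.

```python
# Hypothetical knowledge base: relation name -> list of (head, tail) entity pairs.
KNOWLEDGE_BASE = {
    "founder_of": [("Bill Gates", "Microsoft")],
    "capital_of": [("Paris", "France")],
}

# Hypothetical sentence collection standing in for the text database.
SENTENCES = [
    "Bill Gates founded Microsoft in 1975.",
    "Bill Gates spoke about Microsoft products.",
    "Paris is the capital of France.",
    "Berlin is a large city.",
]


def first_instances(knowledge_base, sentences):
    """Collect every sentence that mentions both entities of a known pair."""
    instances = []
    for relation, pairs in knowledge_base.items():
        for head, tail in pairs:
            for sentence in sentences:
                # Substring matching is an assumed stand-in for NER tagging.
                if head in sentence and tail in sentence:
                    instances.append((relation, head, tail, sentence))
    return instances


for instance in first_instances(KNOWLEDGE_BASE, SENTENCES):
    print(instance)
```

The second sentence mentions both entities without expressing the founder relationship, which is exactly the kind of first instance the elimination step is meant to remove.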

Publication date: 03-01-2019

System and Method For Domain-Independent Aspect Level Sentiment Detection

Number: US20190005027A1
Author: Feng Zhe, He Yifan, Xu Kui, Zhao Lin

A method for automated aspect-based sentiment analysis includes parsing reviews from a first domain to generate rhetorical structure trees and extracting rhetorical rules from the rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the rhetorical structure trees associated with a probability that the path corresponds to a positive or negative sentiment based on annotation data. The method further includes parsing reviews from a second domain to generate a second plurality of rhetorical structure trees, generating training data that associates at least one aspect in the review from the second domain with a sentiment associated with a rhetorical rule in the plurality of rhetorical rules, and training a classifier to identify sentiments in reviews from the second domain using the second plurality of reviews and the training data. 1. A method for automated sentiment analysis comprising:receiving, with a network interface device in a server, a first plurality of reviews from a first domain, each review in the first plurality of reviews being associated with annotation data that identify a plurality of sentiments and a plurality of aspects included in the first plurality of reviews;parsing, with a processor in the server, the first plurality of reviews from the first domain to generate a first plurality of rhetorical structure trees, each rhetorical structure tree in the first plurality of rhetorical structure trees corresponding to one review in the first plurality of reviews and each rhetorical structure tree in the first plurality of rhetorical structure trees including at least one span associated with a predetermined relationship;extracting, with the processor in the server, a plurality of rhetorical rules from the first plurality of rhetorical structure trees, each rhetorical rule including a path extracted from at least one span in at least one of the first plurality of rhetorical structure trees ...

Publication date: 03-01-2019

SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIUM FOR VALIDATION OF IDIOMATIC EXPRESSIONS

Number: US20190005028A1
Author: Mago Rishi

A method for validation of idiomatic expressions may include the steps of receiving an input text string; performing a search of a database based on the input text string, in which the database stores a plurality of idiomatic expressions; identifying a first set of idiomatic expressions, in which the first set includes at least one of the plurality of idiomatic expressions stored in the database, in which each idiomatic expression in the first set has an associated concordance score that meets or exceeds a predetermined concordance threshold value, in which the associated concordance score indicates a degree of similarity between the respective idiomatic expression in the first set and the input text string; and outputting the first set of idiomatic expressions. A computing device configured to implement the method and a non-transitory computer-readable medium configured to store instructions that define the method are also disclosed.
1. A computing device configured to perform validation of idiomatic expressions, the computing device comprising: one or more processors; and a memory storing computer-readable instructions that, when executed by the one or more processors, cause the computing device to: receive an input text string; perform a search of a database based on the input text string, the database storing a plurality of idiomatic expressions; identify a first set of idiomatic expressions, the first set including at least one of the plurality of idiomatic expressions stored in the database, each idiomatic expression in the first set having an associated concordance score that meets or exceeds a predetermined concordance threshold value, the associated concordance score indicating a degree of similarity between the respective idiomatic expression in the first set and the input text string; and output the first set of idiomatic expressions.
2. The computing device of claim 1, wherein: the associated concordance score is determined based on a comparison ...
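The threshold-based lookup described above can be sketched in a few lines of Python. The listing does not say how the concordance score is computed, so this sketch assumes the similarity ratio from the standard library's `difflib.SequenceMatcher`; the idiom list and the 0.8 threshold are also illustrative assumptions.

```python
from difflib import SequenceMatcher

# Hypothetical idiom database.
IDIOM_DATABASE = ["kick the bucket", "break the ice", "piece of cake"]


def concordance(text, idiom):
    """Assumed concordance score: character-level similarity in [0, 1]."""
    return SequenceMatcher(None, text.lower(), idiom.lower()).ratio()


def validate(text, threshold=0.8):
    """Return (idiom, score) pairs meeting the threshold, best match first."""
    scored = [(idiom, concordance(text, idiom)) for idiom in IDIOM_DATABASE]
    matches = [pair for pair in scored if pair[1] >= threshold]
    return sorted(matches, key=lambda pair: -pair[1])


print(validate("kick the buckets"))
print(validate("xyz"))
```

A slightly misspelled input still clears the threshold against the intended idiom, while an unrelated string returns an empty set.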

Publication date: 03-01-2019

SYSTEMS AND METHODS FOR NATURAL LANGUAGE PROCESSING OF STRUCTURED DOCUMENTS

Number: US20190005029A1

Systems and methods for natural language processing of structured documents. In another embodiment, in an information processing apparatus comprising at least one computer processor, a method for processing a structured document may include: (1) receiving a document; (2) parsing the document into a plurality of components using a statistical parser; (3) extracting a plurality of entities from each component; (4) identifying a potential relationship between two of the plurality of entities; (5) generating a numeric representation for the potential relationship; (6) confirming the potential relationship with a logical regression model; and (7) generating and storing a unified structured file for the document.
1. A method for processing a structured document, comprising: in an information processing apparatus comprising at least one computer processor: receiving a document; parsing the document into a plurality of components using a statistical parser; extracting a plurality of entities from each component; identifying a potential relationship between two of the plurality of entities; generating a numeric representation for the potential relationship; confirming the potential relationship with a logical regression model; and generating and storing a unified structured file for the document.
2. The method of claim 1, wherein the statistical parser comprises a neural network.
3. The method of claim 1, wherein the plurality of components comprise at least one of a participating party, an article, a section, a subsection, and a subsubsection.
4. The method of claim 1, wherein the statistical parser parses the document based on a first vector of word embeddings and a second vector of orthographic properties of words in the document.
5. The method of claim 1, wherein the step of parsing the document into a plurality of components comprises identifying a relationship among the plurality of components.
6. The method of ...

Publication date: 03-01-2019

REGULARITIES AND TRENDS DISCOVERY IN A FLOW OF BUSINESS DOCUMENTS

Number: US20190005050A1
Author: Proux Denys
Assignee: XEROX CORPORATION

A method for encoding documents includes building or otherwise providing a condensed dictionary including identifiers for block headers identified in text blocks extracted from a collection of training documents. For at least one test document a set of text content blocks is identified. For each of the text content blocks in the set, a block header is identified. Each block header in the training and test documents includes a sequence of no more than a predetermined maximum number of characters. An encoding of the test document is generated, based on the identifiers of the block headers identified in the test document that are in the condensed dictionary.
1. A method for encoding documents comprising: for each training document in a collection of training documents, identifying a set of text content blocks; for each of the text content blocks in the set, identifying a block header, the block header comprising a sequence consisting of no more than a predetermined maximum number of characters; building a condensed dictionary from the block headers identified for the collection of training documents, the condensed dictionary including an identifier for each of only a subset of the block headers identified in the collection of training documents; and for at least one test document: identifying a set of text content blocks; for each of the text content blocks in the set, identifying a block header, the block header comprising a sequence consisting of no more than the predetermined maximum number of characters; generating an encoding of the test document, based on the identifiers of the identified block headers that are in the condensed dictionary, wherein at least one of the identifying of the set of text content blocks, identifying of the block headers, building the condensed dictionary, and generating the encoding of the test document is performed with a processor.
2. The method of claim 1, wherein the training documents in the training collection are scanned ...
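The dictionary-building and encoding steps in this claim can be sketched as follows. This is an illustration under assumptions the listing does not confirm: blocks are taken to be paragraphs separated by blank lines, a block header is taken to be the first `MAX_HEADER_CHARS` characters of a block, and the "subset" kept in the condensed dictionary is assumed to be the headers seen in at least `min_docs` training documents.

```python
from collections import Counter

MAX_HEADER_CHARS = 10  # assumed predetermined maximum header length


def text_blocks(document):
    """Assumed block segmentation: paragraphs separated by blank lines."""
    return [block.strip() for block in document.split("\n\n") if block.strip()]


def block_header(block):
    """Assumed header rule: the first MAX_HEADER_CHARS characters, lowercased."""
    return block[:MAX_HEADER_CHARS].lower()


def build_condensed_dictionary(training_docs, min_docs=2):
    """Assign identifiers only to headers seen in at least min_docs documents."""
    counts = Counter()
    for doc in training_docs:
        counts.update({block_header(b) for b in text_blocks(doc)})
    frequent = sorted(h for h, c in counts.items() if c >= min_docs)
    return {header: identifier for identifier, header in enumerate(frequent)}


def encode(document, dictionary):
    """Encode a document as the identifiers of its known block headers."""
    headers = (block_header(b) for b in text_blocks(document))
    return [dictionary[h] for h in headers if h in dictionary]


training = [
    "Invoice No: 1\n\nTotal due: 50",
    "Invoice No: 2\n\nTotal due: 70\n\nThanks a lot",
]
dictionary = build_condensed_dictionary(training)
print(dictionary)
print(encode("Total due: 10\n\nInvoice No: 9\n\nNotes: none", dictionary))
```

Because only recurring headers receive identifiers, one-off blocks such as a closing remark drop out of the encoding, leaving a compact signature of the document's layout.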

Publication date: 03-01-2019

DYNAMIC SEMANTIC NETWORKS FOR LANGUAGE UNDERSTANDING AND QUESTION ANSWERING

Number: US20190005090A1

A computer-implemented method of answering questions comprises: receiving, by one or more processors, a query; based on the query, generating, by the one or more processors, a matrix; based on the matrix, modifying, by the one or more processors, a dynamic memory; based on the matrix, determining, by the one or more processors, a first response from the dynamic memory; based on the matrix, determining, by the one or more processors, a second response from a database; based on the first response and the second response, determining, by the one or more processors, a third response; and in response to the query, providing, by the one or more processors, the third response.
1. A computer-implemented method of answering questions, comprising: receiving, by one or more processors, a query; based on the query, generating, by the one or more processors, a matrix; based on the matrix, modifying, by the one or more processors, a dynamic memory; based on the matrix, determining, by the one or more processors, a first response from the dynamic memory; based on the matrix, determining, by the one or more processors, a second response from a database; based on the first response and the second response, determining, by the one or more processors, a third response; and in response to the query, providing, by the one or more processors, the third response.
2. The computer-implemented method of claim 1, wherein: the query comprises a plurality of words; and the generating of the matrix comprises determining a vector for each word of the plurality of words.
3. The computer-implemented method of claim 2, wherein: the vector for each word of the plurality of words has at least one hundred dimensions; and the determining of the vector for each word of the plurality of words comprises retrieving a predetermined corresponding vector for the word from a dictionary database.
4. The computer-implemented method of claim 1, wherein: providing the matrix as input to a long short-term memory (LSTM); and ...

Publication date: 03-01-2019

Method and system for creating entity records using existing data sources

Number: US20190005112A1
Assignee: Innoplexus AG

A method and a system for creating entity records. The method includes extracting data records from existing data sources, the data records including entity names and attributes associated therewith; identifying the entity names and attributes corresponding thereto, in each extracted data record; classifying identified entity attributes, based on attribute signatures associated therewith; clustering data records based on similar classified entity attributes and/or similar entity names; comparing entity attributes in clustered data records to identify entity attributes with missing attributes data; assigning representative attributes data to entity attributes with missing attributes data; combining clustered data records to form entity record segments; clustering entity record segments based on similar identity signatures thereof; comparing relevant entity attributes of clustered entity record segments to identify entity record segments having a relation therebetween; and combining related entity record segments to form entity records.
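A heavily simplified Python sketch of the cluster, fill, and combine idea in this abstract follows. It is an illustration under assumptions: records are clustered only by a normalized entity name, the "representative" value for a missing attribute is assumed to be the most common value in the cluster, and the attribute signatures and entity record segments mentioned in the abstract are not modeled.

```python
from collections import Counter, defaultdict


def normalize(name):
    """Assumed name normalization: lowercase, drop periods, collapse spaces."""
    return " ".join(name.lower().replace(".", "").split())


def build_entity_records(data_records):
    """Cluster records by normalized name and merge their attributes."""
    clusters = defaultdict(list)
    for record in data_records:
        clusters[normalize(record["name"])].append(record)

    entity_records = {}
    for key, records in clusters.items():
        merged = {}
        attributes = {a for r in records for a in r if a != "name"}
        for attribute in attributes:
            # Ignore missing values; keep the most common remaining value.
            values = [r[attribute] for r in records if r.get(attribute)]
            if values:
                merged[attribute] = Counter(values).most_common(1)[0][0]
        entity_records[key] = merged
    return entity_records


records = [
    {"name": "J. Smith", "affiliation": "MIT", "email": None},
    {"name": "j smith", "affiliation": "MIT", "email": "js@example.org"},
    {"name": "A. Lee", "affiliation": "ETH"},
]
print(build_entity_records(records))
```

The first record has no email, so the merged entity record takes it from the other record in the same cluster.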

Publication date: 03-01-2019

METHOD AND SYSTEM FOR MANAGING ASSOCIATIONS BETWEEN ENTITY RECORDS

Number: US20190005118A1

A method and system for managing associations between entity records. The method includes: identifying a first plurality of entity records in at least one curated database, associations between the first plurality of entity records based on at least one entity attribute and associations between entity attributes of the first plurality of entity records; extracting data records from existing data sources, wherein the data records comprise entity names and entity attributes associated with the entity names; identifying the entity names, and entity attributes corresponding to each of the entity names in each of the extracted data records; clustering the entity names and corresponding entity attributes to create a second plurality of entity records; and identifying associations between the second plurality of entity records based on the identified associations between the first plurality of entity records and/or identified association between entity attributes of the first plurality of entity records. 1. A method of managing associations between entity records , wherein the method comprises:identifying a first plurality of entity records in at least one curated database, associations between the first plurality of entity records based on at least one entity attribute, and associations between entity attributes of the first plurality of entity records;extracting data records from existing data sources, wherein the data records comprise entity names and entity attributes associated with the entity names;identifying the entity names, and entity attributes corresponding to each of the entity names in each of the extracted data records;clustering the entity names and corresponding entity attributes to create a second plurality of entity records; andidentifying associations between the second plurality of entity records based on the identified associations between the first plurality of entity records and/or identified association between entity attributes of the first ...

Publication date: 03-01-2019

METHOD AND SYSTEM OF PRESENTING INFORMATION RELATED TO SEARCH

Number: US20190005124A1

A method and system for presenting information related to a search. A user is operable to perform the search on a computing device using search query. The method includes providing first input area, for receiving the search query from the user, on a first user-interface of the computing device, displaying one or more query segments and relations between the query segments on a second area of the first user-interface, providing an arranged set of extracted search results in a third area of the first user-interface, providing list of one or more concepts associated with the extracted search results in a fourth area of the first user-interface, and providing information, related to a selected search result or a selected concept, on a second user-interface, in response to a selection input, from the user, based on one of a search result or a concept associated with the search results. 1. A method of presenting an information related to a search , wherein a user is operable to perform the search on a computing device using a search query , wherein the method comprises:providing a first input area, for receiving the search query from the user, on a first user-interface of the computing device;displaying one or more query segments and relations between the one or more query segments on a second area of the first user-interface, wherein the received search query is developed to obtain the one or more query segments and the relations between the one or more query segments;providing an arranged set of extracted search results in a third area of the first user-interface, wherein search results are extracted, from at least one database, based on the developed search query and arranged based on one or more parameters associated with the extracted search results;providing a list of one or more concepts associated with the extracted search results in a fourth area of the first user-interface, wherein the extracted search results are analyzed to identify the one or more concepts 
...

Publication date: 02-01-2020

METHOD AND SYSTEM FOR AUTOMATICALLY GENERATING PRODUCT VISUALIZATION FROM E-COMMERCE CONTENT MANAGING SYSTEMS

Number: US20200005387A1
Assignee: Vimeo, Inc.

A method and a system for automatically generating a product visualization using video, based on meta data obtained from a content management system (CMS) are provided herein. The method may include: obtaining product images and meta-data linked to the product from the CMS; selecting a product visualization instruction set; modifying the product visualization instruction set based on at least one of: content of the product images, and content of the meta-data linked to the product, by adjusting one or more instructions in the instruction set, to yield a modified product visualization instruction set; applying the modified product visualization instruction set to the product images and the meta-data linked to the product, to generate a visualization of the product, wherein the product visualization includes a sequence of frames wherein at least one of the frames includes one or more of the product images together with visual representation of the meta data. 1. A method of automatically generating a product visualization using video based on meta data obtained from a content management system (CMS) , the method comprising:obtaining one or more product images and meta-data linked to said product from said CMS;selecting a product visualization instruction set from a plurality of product visualization instruction sets;modifying the product visualization instruction set based on at least one of: content of the product images, and content of the meta-data linked to said product, by adjusting one or more instructions in the instruction set to yield a modified product visualization instruction set; andapplying said modified product visualization instruction set to the product images and the meta-data linked to said product, to generate a visualization of said product,wherein said product visualization comprises a sequence of frames, wherein at least one of the frames includes one or more of the product images together with a visual representation of one or more of the meta 
...

Publication date: 02-01-2020

VISUALIZATION OF USER INTENT IN VIRTUAL AGENT INTERACTION

Number: US20200005503A1

Generally discussed herein are devices, systems, and methods for visualization of user intent in accessing a virtual agent. A method can include receiving sentences from the respective interaction sessions, projecting the sentences to a higher-dimensional space to create respective higher-dimensional vectors, projecting the higher-dimensional vectors to a lower-dimensional space to create respective lower-dimensional vectors, the lower-dimensional space including either two dimensions or three dimensions, plotting the lower-dimensional vectors as respective points on a graph, encoding the respective points consistent with the respective intents determined to be associated with the sentences by an intent classifier to create encoded points, and generating a visual representation of the encoded points.
1. A computing device, comprising: a processor; and a memory device including instructions embodied thereon, wherein the instructions, which when executed by the processor, cause the processor to perform operations for visualization of respective intents of first users in respective interaction sessions with at least one virtual agent, the operations comprising: receiving sentences from the respective interaction sessions; projecting the sentences to a higher-dimensional space to create respective higher-dimensional vectors; projecting the higher-dimensional vectors to a lower-dimensional space to create respective lower-dimensional vectors, the lower-dimensional space including either two dimensions or three dimensions; plotting the lower-dimensional vectors as respective points on a graph; encoding the respective points consistent with the respective intents determined to be associated with the sentences by an intent classifier to create encoded points; and generating a visual representation of the encoded points.
2. The computing device of claim 1, wherein the visual representation includes respective sentences associated with the encoded points ...
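The two projection steps and the intent encoding can be sketched with the Python standard library alone. Everything specific here is an assumption made for illustration: hashed bag-of-words vectors stand in for the higher-dimensional sentence embedding, a fixed-seed random projection stands in for the dimensionality reduction, and a keyword rule stands in for the intent classifier. The sketch stops at the encoded points and leaves the actual plotting to whatever charting tool is available.

```python
import random
import zlib

HIGH_DIM = 64  # assumed size of the higher-dimensional space


def embed(sentence):
    """Assumed embedding: hashed bag-of-words counts in HIGH_DIM dimensions."""
    vector = [0.0] * HIGH_DIM
    for word in sentence.lower().split():
        # crc32 gives a hash that is stable across Python processes.
        vector[zlib.crc32(word.encode("utf-8")) % HIGH_DIM] += 1.0
    return vector


def make_projection(seed=0):
    """Fixed-seed random 2 x HIGH_DIM matrix (an assumed reduction technique)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(HIGH_DIM)] for _ in range(2)]


def project(vector, matrix):
    """Project a higher-dimensional vector to two dimensions."""
    return tuple(sum(w * v for w, v in zip(row, vector)) for row in matrix)


def classify_intent(sentence):
    """Toy stand-in for the intent classifier."""
    return "billing" if "invoice" in sentence.lower() else "other"


def encoded_points(sentences):
    """Return (x, y, intent) triples, ready to be plotted and colored by intent."""
    matrix = make_projection()
    return [(*project(embed(s), matrix), classify_intent(s)) for s in sentences]


for point in encoded_points(["Where is my invoice", "Reset my password"]):
    print(point)
```

Each triple carries the plot coordinates together with the label used to color or shape the point, which is what "encoding the points consistent with the intents" amounts to in this simplified form.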

Publication date: 03-01-2019

NATURAL LANGUAGE UNIFICATION BASED ROBOTIC AGENT CONTROL

Number: US20190005328A1
Assignee: Accenture Global Solutions Limited

In some examples, natural language unification based robotic agent control may include ascertaining, by a robotic agent, an image of an object or an environment, and ascertaining a plurality of natural language insights for the image. A semantic relatedness may be determined between each insight of the plurality of insights, and a semantic relatedness graph may be generated for the plurality of insights. For each insight of the plurality of insights, at least one central concept may be identified. Based on the semantic relatedness graph and the identified at least one central concept, the plurality of insights may be clustered to generate at least one insights cluster. For insights included in the at least one insights cluster, a unified insight may be generated. Further, an operation associated with the robotic agent, the object, or the environment may be controlled by the robotic agent and based on the unified insight.

1. A natural language unification based robotic agent control apparatus comprising: an insight analyzer, executed by at least one hardware processor, to ascertain, by a robotic agent, an image of an object or an environment, and ascertain a plurality of natural language insights for the image; a semantic relatedness analyzer, executed by the at least one hardware processor, to determine semantic relatedness between each insight of the plurality of insights; a semantic relatedness graph generator, executed by the at least one hardware processor, to generate, based on the determined semantic relatedness, a semantic relatedness graph for the plurality of insights; a central concepts identifier, executed by the at least one hardware processor, to identify, for each insight of the plurality of insights, at least one central concept; an insights cluster generator, ... cluster, based on the semantic relatedness graph and the identified at least one central concept, the plurality of insights to generate at least one insights cluster; ...

Publication date: 03-01-2019

NATURAL LANGUAGE EMINENCE BASED ROBOTIC AGENT CONTROL

Number: US20190005329A1
Assignee: Accenture Global Solutions Limited

In some examples, natural language eminence based robotic agent control may include ascertaining, by a robotic agent, an image of an object or an environment, and ascertaining a plurality of natural language insights for the image. For each insight of the plurality of insights, an eminence score may be generated, and each insight of the plurality of insights may be ranked according to the eminence scores. An operation associated with the robotic agent, the object, or the environment may be controlled by the robotic agent and based on a highest ranked insight.

1. A natural language eminence based robotic agent control apparatus comprising: an insight analyzer, executed by at least one hardware processor, to ascertain, by a robotic agent, an image of an object or an environment, and ascertain a plurality of natural language insights for the image; an eminence score generator, executed by the at least one hardware processor, to generate, for each insight of the plurality of insights, an eminence score, and rank each insight of the plurality of insights according to the eminence scores; and a robotic agent controller, executed by the at least one hardware processor, to control, by the robotic agent and based on a highest ranked insight, an operation associated with the robotic agent, the object, or the environment.
2. The apparatus according to claim 1, wherein the eminence score generator is to generate, for each insight of the plurality of insights, the eminence score by: determining, by a semantic relatedness analyzer that is executed by the at least one hardware processor, semantic relatedness between each insight of the plurality of insights, generating, based on the semantic relatedness between each insight of the plurality of insights, a semantic relatedness graph, wherein each node of the semantic relatedness graph represents an insight of the plurality of insights, and determining, for each node of the semantic ...
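A rough sketch of the score-and-rank idea in this entry, under stated assumptions: `relatedness` uses Jaccard word overlap as a stand-in for semantic relatedness, and the eminence score is taken to be the sum of a node's edge weights in the relatedness graph. The source text is truncated before it says how the per-node score is computed, so both choices are illustrative rather than the patented method.

```python
def relatedness(a, b):
    """Jaccard overlap of word sets; an assumed proxy for semantic relatedness."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def rank_by_eminence(insights):
    """Score each graph node by the sum of its edge weights, highest first."""
    scored = []
    for i, insight in enumerate(insights):
        score = sum(
            relatedness(insight, other)
            for j, other in enumerate(insights)
            if j != i
        )
        scored.append((score, insight))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


insights = [
    "a red cup on the table",
    "a cup on a wooden table",
    "a dog in the park",
]
ranked = rank_by_eminence(insights)
top_insight = ranked[0][1]  # the insight a controller would act on
print(top_insight)
```

Here the first insight ranks highest because it shares the most vocabulary with the others; a real system would presumably substitute a proper semantic similarity measure.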

Publication date: 04-01-2018

EMOTION TYPE CLASSIFICATION FOR INTERACTIVE DIALOG SYSTEM

Number: US20180005646A1
Authors: Leung Max, Un Edward

Techniques for selecting an emotion type code associated with semantic content in an interactive dialog system. In an aspect, fact or profile inputs are provided to an emotion classification algorithm, which selects an emotion type based on the specific combination of fact or profile inputs. The emotion classification algorithm may be rules-based or derived from machine learning. A previous user input may be further specified as input to the emotion classification algorithm. The techniques are especially applicable in mobile communications devices such as smartphones, wherein the fact or profile inputs may be derived from usage of the diverse function set of the device, including online access, text or voice communications, scheduling functions, etc.

1. An apparatus for an interactive dialog system, the apparatus comprising: a semantic content generation block configured to generate an output statement informationally responsive to a user dialog input, the output statement comprising a computer-generated object to be displayed on a display device; a classification block configured to select, based on at least one fact or profile input, an emotion type code to be imparted to the computer-generated object, the emotion type code specifying one of a plurality of predetermined emotion types; and a visual generation block configured to generate a digital image representation of the computer-generated object, the digital image representation generated to have the predetermined emotion type specified by the emotion type code; wherein the at least one fact or profile input is derived from usage of a mobile communications device implementing the interactive dialog system.
2. The apparatus of claim 1, the digital image representation comprising displayed text having different font or text size depending on the predetermined emotion type specified by the emotion type code.
3. The apparatus of claim 1, the digital image representation comprising an emoticon having the predetermined ...

Publication date: 04-01-2018

Modification of textual messages

Number: US20180006979A1
Assignee: International Business Machines Corp

A writing style of content of a composed message, directed to a set of recipients, may be determined. A previous message that includes a first subset of recipients in the set of recipients may be analyzed. Writing habits of the first subset of recipients may be identified based on the analyzing. A difference between the writing style of the content and the writing habits of the first subset of recipients may be identified. The content of the composed message may be modified based on the difference.

Publication date: 02-01-2020

SYSTEM FOR MULTI-PARTY CHAT TECHNICAL FIELD

Number: US20200007477A1
Author: Nair Rahul

A computer system analyzes an input in a chat window of a first device, wherein the chat window corresponds to a first chat session with a user of a second device and a second chat session with a user of a third device. The computer system determines, based on analyzing the input, whether the input corresponds to information associated with the first chat session and information associated with the second chat session. Based on determining that the input corresponds to the information associated with the first chat session and the information associated with the second chat session, the computer system transmits the input to the second device and the third device.

1. A computer system, comprising: one or more computer-readable memories storing program instructions; and a first communication user interface element displayed on a first device in association with a first chat session, wherein the first chat session is between a user of the first device and a user of a second device, wherein the first communication user interface element, when in a first state, allows a first communication to be transmitted from the first device to the second device when a transmission user interface element is selected, and wherein the first communication user interface element, when in a second state, restricts the first communication from being transmitted from the first device to the second device when the transmission user interface element is selected; and a second communication user interface element displayed on the first device in association with a second chat session, wherein the second chat session is between the user of the first device and a user of a third device, wherein the second communication user interface element, when in the first state, allows the first communication to be transmitted from the first device to the third device when the transmission user interface element is selected, and wherein the second communication user interface element, when in ...

Publication date: 02-01-2020

MID-TIER MESSAGING SYSTEM

Number: US20200007493A1

A mid-tier messaging system receives a request to initiate a communication session via a first messaging channel that includes a first messaging interface of a first messaging application provided at a user device and a first message provider of a plurality of message providers. The mid-tier messaging system establishes a first session that is mapped to the communication session. The mid-tier messaging system establishes the communication session, using the first session, between the first messaging interface and the first message provider. The mid-tier messaging system provides, through the first session, first message communications between the first message provider and the first messaging interface. The mid-tier messaging system seamlessly switches to a second message provider service when the first message provider service cannot service an intent of the first communication session.

1. A mid-tier messaging system, comprising: a non-transitory memory; and one or more hardware processors coupled to the non-transitory memory and configured to read instructions from the non-transitory memory to cause the system to perform operations comprising: receiving a request to initiate a communication session via a first messaging channel that includes a first messaging interface of a first messaging application provided at a user device and a first message provider of a plurality of message providers; establishing a first session that is mapped to the communication session; establishing the communication session, using the first session, between the first messaging interface and the first message provider; and providing, through the first session, first message communications between the first message provider and the first messaging interface.
2. The system of claim 1, wherein the operations further comprise: receiving an indication that the first message provider is unable to service an intent of the communication session; determining a second messaging channel that ...
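The provider-switching behaviour described in this entry might look roughly like the sketch below. Every name here (`Provider`, `MidTierSession`, `can_service`, the intent labels) is hypothetical and chosen for illustration; the source does not describe how intents are detected or how a second channel is chosen.

```python
class Provider:
    """Hypothetical message provider that can service a fixed set of intents."""

    def __init__(self, name, intents):
        self.name = name
        self.intents = set(intents)

    def can_service(self, intent):
        return intent in self.intents

    def reply(self, text):
        return f"{self.name}: {text}"


class MidTierSession:
    """Keeps one user-facing session while swapping the provider behind it."""

    def __init__(self, providers):
        self.providers = list(providers)
        self.current = self.providers[0]

    def send(self, text, intent):
        # Switch providers only when the current one cannot service the intent.
        if not self.current.can_service(intent):
            for candidate in self.providers:
                if candidate.can_service(intent):
                    self.current = candidate
                    break
            else:
                raise LookupError(f"no provider for intent {intent!r}")
        return self.current.reply(text)


session = MidTierSession([
    Provider("billing-bot", ["billing"]),
    Provider("shipping-bot", ["shipping"]),
])
first = session.send("hi", "billing")
second = session.send("where is my parcel", "shipping")
print(first)
print(second)
```

The caller keeps talking to the same `session` object throughout, which is one simple way to model a switch that is invisible to the user.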

Publication date: 03-01-2019

SEMANTICS BASED CONTENT SPECIFICATION OF IOT DATA

Number: US20190007513A1

An M2M entity may retrieve data such that the representation of the data may consistently be returned in a form that can be dynamically specified in order to reduce complexity and overhead required by a requestor or consumer of the data. The semantic descriptions of the data that exist in the service layer may be used in order to provide desired results to the requestor or consumer of the data.

1. An apparatus comprising: a processor; and a memory coupled with the processor, the memory comprising executable instructions that when executed by the processor cause the processor to effectuate operations comprising: receiving a message comprising a request to retrieve a first measurement in a first format; receiving the first measurement and meta-data comprising format description of the first measurement; parsing the first measurement based on the format description; and converting the first measurement to the first format based on the parsing of the first measurement.
2. The apparatus of claim 1, wherein the first measurement is in a second format before the receiving of the message.
3. The apparatus of claim 1, wherein the message comprises a link to the first format.
4. The apparatus of claim 1, wherein the format description comprises a value and a label.
5. The apparatus of claim 1, wherein the format description comprises information that identifies the value can be interpreted.
6. The apparatus of claim 1, wherein the format description comprises information that identifies the value as an integer.
7. The apparatus of claim 1, wherein the apparatus is a machine-to-machine gateway device.
8. A method comprising: receiving, by a device, a message comprising a request to retrieve a first measurement in a first format; receiving the first measurement and meta-data comprising format description of the first measurement; parsing the first measurement based on the format description; and converting the first measurement to the first format based on the parsing of ...
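The parse-then-convert steps in these claims can be illustrated with a small sketch. The shape of the format description (a `separator` and a `value_type` key), the "value label" string layout, and the Celsius/Fahrenheit conversion are all assumptions made for the example; the source only says the description comprises a value and a label and may identify the value as an integer.

```python
def parse_measurement(raw, description):
    """Split a raw reading into (value, label) using the format description."""
    value_text, label = raw.split(description["separator"])
    cast = int if description["value_type"] == "integer" else float
    return cast(value_text), label


def convert_to_format(raw, description, target_label):
    """Parse the stored reading, then convert it to the requested format."""
    value, label = parse_measurement(raw, description)
    if label == target_label:
        return value, label
    if (label, target_label) == ("C", "F"):
        return value * 9 / 5 + 32, "F"
    if (label, target_label) == ("F", "C"):
        return (value - 32) * 5 / 9, "C"
    raise ValueError(f"no conversion from {label!r} to {target_label!r}")


# Hypothetical meta-data stored alongside the measurement.
description = {"separator": " ", "value_type": "integer"}
converted = convert_to_format("20 C", description, "F")
print(converted)
```

Keeping parsing separate from conversion mirrors the claim structure: the format description drives parsing, and the requested format drives conversion.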

Publication date: 03-01-2019

Named Entity Disambiguation for providing TV content enrichment

Number: US20190007711A1
Authors: GEVA Guy, LASSER MENAHEM

Methods and systems are disclosed for enriching a viewing experience of a user watching video content on a screen of a client terminal by increasing the relevance of additional media content proposed or provided to the user. Disambiguation of named entities detected in a video content item being played is performed by identifying and accessing an information source directly associated with the video content item, and/or by analyzing visual content of a segment of the video content item. Selecting, proposing and/or providing an additional media content item is based on the information source and/or on the analyzing.

1. A method for enriching a viewing experience of a user watching video content on a screen of a client terminal by increasing the relevance of additional media content proposed to the user, the method comprising: a. providing at least a portion of a video content item to the client terminal, thereby causing playing the at least a portion of the video content item on the screen of the client terminal; b. obtaining a segment of text that is spoken in a sound track of the video content item; c. identifying an occurrence of an ambiguous reference to a named entity in the segment of text, the ambiguous reference matching multiple candidate named entities; d. disambiguating the ambiguous reference, the disambiguating comprising: i. identifying the video content item; ii. identifying an information source, the information source being directly associated with the identified video content item; and iii. assigning one candidate named entity of the multiple candidate named entities to the ambiguous reference, the assigning being based on information from the identified information source; e. selecting one or more media content items that are related to the video content item, the selecting being based on the assigned one candidate named entity; and f. providing one media content item of the one or more media content items, thereby causing displaying the one ...

Подробнее
12-01-2017 дата публикации

Reducing power consumed by a head-mounted system that measures affective response to content

Номер: US20170010647A1
Принадлежит:

Some aspects of this disclosure include systems, methods, and/or computer programs that may be used to reduce power consumed to measure affective response of a user to content presented via a display of a head-mounted system such as an augmented reality system or a virtual reality system. In one embodiment, a controller selects a mode of operation for a sensor coupled to the head-mounted system based on a received indication. The mode of operation is selected from a set comprising first and second modes of operation, and the sensor is configured to consume significantly less power while operating at the first mode of operation compared to the power it is configured to consume while operating at the second mode of operation.

1. A head-mounted system, comprising: a display configured to present segments of content to a user; a sensor configured to measure affective response of the user to the segments of content; and a controller configured to receive an indication indicating a mode of operation for the sensor from among a set comprising first and second modes of operation, and to configure the sensor to operate in the indicated mode of operation; wherein the sensor consumes at least 50% less power while operating at the first mode of operation compared to the power it consumes while operating at the second mode of operation.
2. The head-mounted system of claim 1, further comprising a processor configured to generate the segments of content as part of an interactive computer game, and to generate the indication; wherein the indication is provided in a form of a tag that describes a mode of operation in which to operate the sensor while the user consumes a certain segment of content.
3. The head-mounted system of claim 1, wherein the indication is received from a content emotional response analyzer (content ERA) configured to: receive a segment of content, analyze the segment, and output an indication indicating whether a value related to an ...

Подробнее
08-01-2015 дата публикации

SYSTEM AND METHOD FOR SEMANTIC ANALYSIS OF CANDIDATE INFORMATION TO DETERMINE COMPATIBILITY

Номер: US20150012263A1
Принадлежит:

A computer includes a taxonomy, mapping grammatical patterns to qualities. A scanner on the computer can scan content to identify phrases that correspond to the grammatical patterns in the taxonomy. The computer can then calculate percentages of occurrences for the grammatical patterns, and also for combinations of grammatical patterns. The calculated percentages of occurrences can then be output.

1. A system, comprising: a computer (105); a memory (130) in the computer (105); a taxonomy (135) stored in the memory (130) of the computer (105); a scanner (140) in the computer (105) to identify phrases in a content (305, 310, 315) that correspond to grammatical patterns (205) in the taxonomy (135); a percentage calculator (145) to calculate percentages of occurrences for each grammatical pattern (205) in the scanned content (305, 310, 315) and to calculate a percentage of occurrences for each combination of grammatical patterns (205) in the scanned content (305, 310, 315) relative to all grammatical patterns (205) in the scanned content (305, 310, 315); and an outputter (150) to output the percentages of occurrences for each grammatical pattern (205) in the scanned content (305, 310, 315) and the percentage of occurrences for each combination of grammatical patterns (205) in the scanned content (305, 310, 315) relative to all grammatical patterns (205) in the scanned content (305, 310, 315).
2. A system according to claim 1, wherein the outputter (150) is operative to output all phrases in the content (305, 310, 315) that correspond to at least one of the grammatical patterns (205).
3. A system according to claim 2, wherein: the system further comprises a comparator (155) to compare the calculated ...
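The scan-and-count step in this entry can be sketched as below. The regular expressions and quality names in `TAXONOMY` are invented for illustration, and the sketch covers only per-pattern percentages relative to all matched patterns; the combinations of patterns mentioned in the claims are not modelled.

```python
import re
from collections import Counter

# Hypothetical taxonomy: grammatical pattern (as a regex) -> quality.
TAXONOMY = {
    r"\bI (?:led|managed)\b": "leadership",
    r"\bwe (?:built|shipped)\b": "teamwork",
}


def pattern_percentages(content, taxonomy):
    """Percentage of matched phrases per quality, relative to all matches."""
    counts = Counter()
    for pattern, quality in taxonomy.items():
        counts[quality] += len(re.findall(pattern, content, flags=re.IGNORECASE))
    total = sum(counts.values())
    return {
        quality: (100.0 * n / total if total else 0.0)
        for quality, n in counts.items()
    }


content = (
    "I led the project. We built the API and we shipped it. "
    "I managed the rollout."
)
percentages = pattern_percentages(content, TAXONOMY)
print(percentages)
```

Regular expressions are used here only as a convenient stand-in for grammatical patterns; a production system would more likely match on parsed structure.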

Подробнее
12-01-2017 дата публикации

System for Natural Language Understanding

Номер: US20170011023A1
Автор: GHANNAM Maan, GHANNAM Rima
Принадлежит:

A general-purpose apparatus for analyzing natural language text that allows for the implementation of a broad range of natural language understanding applications. The apparatus for natural language understanding analyzes a source text and transforms the source text into a semantically-interpretable syntactic representation (SISR), comprising a syntax template and semantic clause annotations. The general-purpose apparatus for natural language understanding is adaptable to various source text natural languages and is adaptable to various natural language understanding applications, such as query answering, translation, summarization, information extraction, disambiguation, and parsing. A natural language query answering apparatus for answering questions about a source text, whereby the query answering apparatus utilizes the general-purpose apparatus for transforming the natural language query into SISR format.

2. The query answering method according to claim 1, further comprising a lexicon of question prototype categories wherein the said lexicon consists of left side entries of the question prototypes and the right side value of either: ‘cause’, ‘effect’, ‘goal’, ‘time’, ‘number’, ‘amount’, ‘subject’, ‘object’, ‘manner’, ‘location’, ‘proposition truth’, ‘preposition’, ‘adjective’ or ‘the entire proposition’.
3. The query answering method according to claim 1, further comprising a fact-act lexicon, wherein each left side entry of verb or verb occurrence prototype corresponds to a right side value of either ‘fact’ or ‘act’ depending on whether the said entry relates to an intentional act.
4. The query answering method according to claim 1, further comprising generating a semantically expanded SISR upon receiving the said digitally encoded query in a SISR and the said digitally encoded natural language source text in a SISR.
5. The semantically expanded ...

Подробнее
12-01-2017 дата публикации

SYSTEM AND METHOD FOR CONTEXTUALISING A STREAM OF UNSTRUCTURED TEXT REPRESENTATIVE OF SPOKEN WORD

Номер: US20170011024A1
Автор: CANNINGS Nigel Henry
Принадлежит:

A system for contextualising an unstructured stream of text, representative of spoken word, including a grammar processor, a sentence processor, a frequency processor, a summer and an emotion processor. The unstructured stream of text is processed and outputs an audio file total for each matched phrase, word and proper noun, determined from the unstructured text, to a data significance processor. The data significance processor receives an audio file total for each name, proper noun and matched real phrase, determined from the unstructured text, and outputs a list including the names, proper nouns and matched real phrases in order of contextual significance.

1. A system for contextualising an unstructured stream of text representative of spoken word, the system comprising: a grammar processor operable to receive an indexed and time-recorded unstructured stream of text, and identify and extract sentences, names and proper nouns from the unstructured stream of text and output a sentence list, a names list and a proper nouns list; a sentence processor comprising a natural language processor, a real phrase database and a phrase comparator, wherein the natural language processor is operable to receive the sentence list from the grammar processor and segment each sentence into possible phrases, and the phrase comparator is operable to receive the possible phrases and compare each possible phrase with a plurality of real phrases stored in the real phrase database to provide a matched phrase list derived from the possible phrases; a frequency processor operable to receive the word list, proper noun list and matched phrase list and determine the number of occurrences of each word, proper noun and matched phrase in the unstructured stream of text, in a system word and phrase corpus and in a real-world word and phrase database and for each name, proper noun and matched phrase, provide a text frequency total, a system corpus total and a real world total; a summer operable to sum ...

Подробнее
12-01-2017 дата публикации

SENTENCE SIMPLIFICATION FOR SPOKEN LANGUAGE UNDERSTANDING

Номер: US20170011025A1
Принадлежит: Microsoft Technology Licensing, LLC

Sentence simplification may be provided. A spoken phrase may be received and converted to a text phrase. An intent associated with the text phrase may be identified. The text phrase may then be reformatted according to the identified intent and a task may be performed according to the reformatted text phrase.

1. A method for providing sentence simplification, the method comprising: receiving a spoken utterance; converting the spoken utterance to a text phrase; identifying a top level predicate associated with the text phrase; reformatting the text phrase according to the identified predicate; and performing a task according to the reformatted text phrase.
2. The method of claim 1, wherein identifying the top level predicate associated with the text phrase comprises performing a dependency parse on the text phrase.
3. The method of claim 2, wherein performing a dependency parse comprises: identifying a top level predicate; and excluding at least one auxiliary word in the text phrase.
4. The method of claim 3, wherein the at least one auxiliary word comprises a dependent of the top level predicate.
5. The method of claim 3, wherein the at least one auxiliary word comprises at least one predefined auxiliary keyword.
6. The method of claim 3, wherein identifying the top level predicate comprises evaluating a weighting criterion associated with each word of the text phrase.
7. The method of claim 1, wherein reformatting the text phrase according to the identified predicate comprises defining a domain associated with the task.
8. The method of claim 7, further comprising filling at least one semantic slot associated with the defined domain.
9. The method of claim 8, wherein the slot is filled with at least one word of the text phrase.
10. The method of claim 9, wherein the at least one word of the text phrase is not associated with the reformatted text phrase.
11. A computer-readable medium which stores a set of instructions which when executed performs a method for providing ...

Подробнее
12-01-2017 дата публикации

ANSWERING TIME-SENSITIVE QUESTIONS

Номер: US20170011036A1
Автор: MUNGI Ashish, Mustafi Joy
Принадлежит:

A method for providing an answer to an input question containing at least one time-sensitive word or at least one time-sensitive phrase using natural language processing (NLP) is provided. The method may include receiving the input question. The method may also include performing natural language processing (NLP) analysis on the input question to extract a required value phrase. The method may further include forming at least one mathematical equation based on the extracted required value phrase. Additionally, the method may include forming at least one interim question based on the extracted required value phrase. The method may further include solving the at least one formed mathematical equation and the at least one formed interim question. The method may also include narrating the answer to the input question in natural language based on the solved at least one interim question or the solved at least one mathematical equation.

1. A method for providing an answer to an input question containing at least one time-sensitive word or at least one time-sensitive phrase using natural language processing (NLP), the method comprising: creating and maintaining an online T-Word Dictionary, wherein creating and maintaining the online T-Word Dictionary comprises: determining a relationship between a plurality of T-Words and a plurality of corresponding values, wherein the plurality of corresponding values include a plurality of related lookup phrases and a plurality of concept terms; mapping the plurality of T-Words to the plurality of corresponding values based on the determined relationship; and storing the mapped plurality of T-Words to the plurality of corresponding values in the online T-Word Dictionary; receiving the input question, wherein the input question is entered by a user via a graphical user interface associated with a first computer; performing natural language processing (NLP) analysis on the input question to extract a required value phrase; identifying ...

Подробнее
12-01-2017 дата публикации

Help Processing Method and Device Based on Semantic Recognition

Номер: US20170011117A1
Автор: He Shan, JIANG Qiang, Li Hang
Принадлежит:

A help processing method and device based on semantic recognition are presented. The method includes receiving, by user equipment, a search request entered by a user, where the search request includes information about a problem statement described in a natural language; performing semantic recognition processing on the information about the problem statement, to obtain information about a search intention of the user; and searching a database using the information about the search intention as a search term, to obtain help content needed by the user.

1. A help processing method based on semantic recognition, comprising: receiving a search request entered by a user, wherein the search request comprises information about a problem statement described in a natural language; performing semantic recognition processing on the information about the problem statement in order to obtain information about a search intention of the user; and searching a database using the information about the search intention as a search term in order to obtain help content needed by the user.
2. The method according to claim 1, wherein performing the semantic recognition processing on the information about the problem statement in order to obtain the information about the search intention of the user comprises: sending the information about the problem statement to a network server, so that the network server performs semantic recognition processing on the information about the problem statement in order to obtain the information about the search intention; and receiving the information about the search intention, which is fed back by the network server.
3. The method according to claim 1, wherein searching the database using the information about the search intention as the search term in order to obtain help content needed by the user comprises at least one of: searching, using the information about the search intention as the search term, a help database pre-stored in user equipment in order ...
