Total found: 50844. Displayed: 200.
Publication date: 23-06-2020

Automatic conversation reply method and device, storage medium and computer equipment

Number: CN0111324713A

Publication date: 30-10-2020

Method and device for building subject model and computer-readable storage medium

Number: CN0111859917A

Publication date: 18-12-2020

Conversation abstract generation method and system based on comprehensive feature extraction

Number: CN0112100327A

Publication date: 26-05-2020

Spoken language, face and gesture communication device and computing architecture for interacting with digital media content

Number: CN0111201567A

Publication date: 18-12-2020

Knowledge graph-based financial domain knowledge question-answering method

Number: CN0112100344A

Publication date: 10-04-2020

Word segmentation method and system

Number: CN0110991173A

Publication date: 24-11-2020

Number: CN0111985215A

Publication date: 29-12-2020

Number: CN0112149407A

Publication date: 12-05-2023

Text emotion recognition method and device, computer equipment and readable storage medium

Number: CN116108836A
Author: LI TAIHAO, RUAN YUPING

The invention relates to a text emotion recognition method and device, computer equipment and a readable storage medium. The method comprises the steps that a to-be-recognized text and a text corpus set are obtained, and the text corpus set comprises a plurality of text statements and a plurality of expression tags matched with the text statements; determining a similarity value between the representation vector of the to-be-recognized text and the representation vector of each text statement, and determining an expression feature vector of the to-be-recognized text based on the similarity value and the plurality of expression tags; and obtaining a semantic feature vector of the to-be-recognized text, and determining an emotion type of the to-be-recognized text based on the semantic feature vector and the expression feature vector. According to the text emotion recognition method provided by the invention, the expression feature vector is used as a tool for assisting in emotion recognition ...

Publication date: 16-05-2023

Text classification method and device, equipment and storage medium

Number: CN116127068A

The invention discloses a text classification method which is used for solving the problems that in the prior art, when a text classification model is used for text classification, the text classification effect is poor, a large number of human computing power resources need to be consumed in the model training process, and the model training period is long. The method comprises the following steps: generating a second query word library according to a sorting result of query words in a pre-constructed first query word library; inputting the obtained to-be-classified text into the second query word bank, and judging whether a keyword matched with the to-be-classified text exists in the second query word bank or not; when the judgment result is no, inputting the to-be-classified text into a first query word bank, and determining at least one keyword corresponding to the to-be-classified text in the first query word bank; and according to the keyword, classifying the to-be-classified text ...

Publication date: 14-07-2023

Intelligent question and answer method and system based on process automation robot technology

Number: CN116431794A

The invention provides an intelligent question and answer method and system based on a process automation robot technology, and relates to the technical field of intelligent question and answer, and the method comprises the steps: reading customer service work chat records, carrying out the accurate identification of chat responses, extracting the keyword features of the chat records, carrying out the part-of-speech classification of the keyword features, and constructing a channel set of a plurality of part-of-speech, performing association of strong and weak identifiers, interaction of message contents of real-time clients, content segmentation extraction and matching identification of the message contents, output of response contents of each channel, extraction of common response contents of all channels, and calculation of strong and weak characteristic values of a combined channel; and performing response processing on the message content according to the common response content corresponding ...

Publication date: 19-05-2023

Document processing method and device, computer program product and readable storage medium

Number: CN116136958A

The invention relates to a document processing method and device, computer equipment, a storage medium and a computer program product, and is applied to the technical field of data processing. The method comprises the following steps: acquiring a to-be-processed document; obtaining a target chapter title dictionary tree according to the to-be-processed document and a regular expression corresponding to a preset title style; the target chapter title dictionary tree comprises at least one sub-level, and each sub-level comprises at least one node; obtaining a document tree based on statistical information and feature information of each node of each sub-level in the target chapter title dictionary tree; and according to statistical information and feature information between each node in the document tree and a brother node corresponding to each node, performing mode mining on each node of each sub-level in the document tree to obtain a document mode corresponding to the to-be-processed document ...

Publication date: 23-05-2023

Model training method and device, text classification method and device, equipment and storage medium

Number: CN116150370A

The invention relates to a model training method and device, a text classification method and device, equipment and a storage medium. According to the method, the first information and the second information are determined through the sentences and the context information of the sentences, so that the first information and the second information both comprise the sentences, and the preceding text information or the following text information in the context information appears in the first information or the second information according to the preset probability. As the flexibility of the difference between the first information and the second information is high, the contrast ratio between the first prediction probability distribution and the second prediction probability distribution obtained after the first information and the second information are processed by the to-be-trained model is flexible. Due to the flexibility, the trained machine learning model can pay more attention to sentences ...

Publication date: 13-06-2023

Text error correction method and device, equipment and storage medium

Number: CN116258137A

The embodiment of the invention provides a text error correction method and device, equipment and a storage medium, and relates to the technical field of artificial intelligence. The method comprises the steps of performing word segmentation on an obtained to-be-corrected text to obtain a plurality of text words and a text candidate set of the text words, calculating embedding vectors of the text words based on the to-be-corrected text and character pronunciation and character pattern information of the text words, and then obtaining a Chinese character structure weight matrix of the text words based on a character structure mapping table and the text candidate set, and inputting the embedded vector and the Chinese character structure weight matrix into a text error correction model to output the character probability of each candidate word in the text candidate set, thereby obtaining a prediction word of the text word, and performing error correction on the text to be subjected to error ...

Publication date: 08-08-2023

Quality control method and device of electronic medical record, storage medium and electronic equipment

Number: CN116562271A

The invention discloses a quality control method and device for an electronic medical record, a storage medium and an electronic device. According to the embodiment of the invention, a medical record text input by a user is obtained in real time, standard medical terms are adopted, and real-time quality control is conducted on key fields in the medical record text, so that the problem that the terms in the electronic medical record are not standard and not accurate is solved, and the quality control accuracy of the electronic medical record is improved. In addition, the quality of the electronic medical record is controlled in real time, so that the time difference generated in the process from the generation of the electronic medical record to the modification of the data in the electronic medical record can be avoided, and the probability of diagnosis and treatment errors is reduced.

Publication date: 15-08-2023

Aircraft repetitive fault safety management and control method for unstructured data

Number: CN116596502A
Author: HE ZIFAN, XU JUAN

The invention discloses an unstructured data-oriented aircraft repetitive fault safety management and control method, which comprises the steps of taking a Chinese text of a new fault of an aircraft as a data sample, extracting a training corpus, obtaining a Doc2Vec optimal word vector dimension and iteration times through training, and compressing a feature matrix into a feature vector training neural network model; the similarity between the existing fault and the new fault is calculated by using a deep learning algorithm to obtain an existing fault case list with relatively high similarity with the new fault, so that whether the new fault is a repetitive fault is judged; finally, for a newly occurring aircraft repetitive fault, manually verifying the accuracy of similar existing fault cases by experts; on the basis of similar fault retrieval results ...

Publication date: 10-12-2020

LINGUISTICALLY RICH CROSS-LINGUAL TEXT EVENT EMBEDDINGS

Number: WO2020247616A1

A machine accesses a preexisting set of natural language text documents in multiple natural languages. Each natural language text document in at least a portion of the preexisting set is associated with an event. The machine trains, using the preexisting set of natural language text documents and the associated events, an event encoder to learn associations between texts and event annotations. The event encoder leverages a parser in each of the two or more natural languages. The machine generates, using the event encoder, new event annotations for texts. The machine trains, using the preexisting set of natural language text documents and the new event annotations for the texts generated by the event encoder, an event extraction engine to extract events from natural language texts in the two or more natural languages. The event extraction engine leverages the parser in each of the two or more natural languages.

Publication date: 25-03-2021

TEXT SENTIMENT ANALYSIS MODEL TRAINING METHOD, APPARATUS AND DEVICE, AND READABLE STORAGE MEDIUM

Number: WO2021051598A1
Author: JIN, Ge, XU, Liang

A text sentiment analysis model training method, apparatus and device, and a readable storage medium, relating to the technical field of artificial intelligence. The method comprises: obtaining a text sample to be trained (S10); performing word segmentation processing on the text sample by means of a preset word segmentation method, and dividing the text sample into a plurality of different words (S20); performing encoding processing on the plurality of different words on the basis of a preset encoding method to obtain word vectors (S30); inputting the word vectors into a preset deep neural network, and performing dimension reduction processing on the word vectors on the basis of an embedded layer (S40); calculating word vectors after dimension reduction on the basis of a hidden layer in the deep neural network to obtain corresponding features (S50); classifying the features corresponding to the text sample by means of a multi-classification SVM support vector machine, and determining a ...

Publication date: 08-04-2021

QUASI-RECURRENT NEURAL NETWORK BASED ENCODER-DECODER MODEL

Number: US20210103816A1

The technology disclosed provides a quasi-recurrent neural network (QRNN) encoder-decoder model that alternates convolutional layers, which apply in parallel across timesteps, and minimalist recurrent pooling layers that apply in parallel across feature dimensions.
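
As a rough illustration of the "minimalist recurrent pooling" mentioned in the abstract above, the sketch below applies element-wise f-pooling, h_t = f_t * h_(t-1) + (1 - f_t) * z_t, as described in the published QRNN architecture rather than in this abstract. The candidate sequence `z` and the forget-gate sequence `f` are assumed to have been produced already by the convolutional layers; the function name `f_pooling` and the zero initial state are illustrative assumptions.

```python
from typing import List


def f_pooling(z: List[List[float]], f: List[List[float]]) -> List[List[float]]:
    """Element-wise recurrent pooling over timesteps.

    z[t][i] is the candidate value and f[t][i] the forget gate (0..1)
    for feature i at timestep t. Each feature dimension is updated
    independently of the others, so the inner step is parallel across
    features; only the loop over timesteps is sequential.
    """
    hidden = [0.0] * len(z[0])  # assumed initial state h_0 = 0
    outputs = []
    for z_t, f_t in zip(z, f):
        hidden = [
            f_i * h_i + (1.0 - f_i) * z_i
            for z_i, f_i, h_i in zip(z_t, f_t, hidden)
        ]
        outputs.append(hidden)
    return outputs


# Two timesteps, two features.
states = f_pooling(
    z=[[1.0, 2.0], [3.0, 4.0]],
    f=[[0.5, 0.0], [0.5, 1.0]],
)
print(states)
```

A gate of 1.0 keeps the previous state for that feature and a gate of 0.0 overwrites it with the new candidate, which is why the pooling step needs no trainable weights of its own.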

Publication date: 05-01-2023

WHITELISTING REDACTION SYSTEMS AND METHODS

Number: US20230004710A1

A whitelisting approach to redaction may include receiving a normalization request containing a binary file with complex structured data, such as a workbook or spreadsheet file, from an application on a user device through a redaction service provided by an e-discovery platform. In response to the normalization request, a normalization process starts and runs asynchronously to apply a whitelist to the binary file. The whitelist is configurable and specifies features (e.g., data types, workbook object types, etc.) of the binary file that are considered safe by a party to a lawsuit or investigation. The normalization process includes a pre-processing stage, a transferring stage, and a post-processing stage to push all the data to a visible level and produce a normalized document with the whitelisted features. An artifact containing the normalized document can then be generated and used to render a normalized representation for interactive redaction.

Publication date: 21-12-2023

SYSTEM FOR FINE-GRAINED SENTIMENT ANALYSIS USING A HYBRID MODEL AND METHOD THEREOF

Number: US20230409834A1

The present invention discloses a system and method for fine-grained sentiment analysis using a hybrid model which is extensible to multiple languages, wherein the system (100) comprises a polarity detection module (101) for continuously detecting the positive sentiments, negative sentiments, and neutral sentiments of one or more sentences in a document. Further, a sentiment classification module (102) predicts the intensity and classifies positive sentiments and negative sentiments into one or more pre-defined sentiment classes, wherein the sentiment classification module (102) provides a reference sentiment interval for the classified sentences in the document. Furthermore, a length-based sentiment scoring module (103) continuously assigns a score to the classified sentences in the document ranging between −s to +s, wherein −s indicates extremely negative sentiment of the sentence and +s indicates extremely positive sentiment of the sentence.

Publication date: 02-03-2022

LISTEN, INTERACT, AND TALK: LEARNING TO SPEAK VIA INTERACTION

Number: EP3407264B1
Assignee: Baidu USA LLC

Publication date: 14-06-2023

METHOD AND APPARATUS FOR EXTRACTING TEXT INFORMATION, ELECTRONIC DEVICE AND STORAGE MEDIUM

Number: EP4123496A3

A method for extracting text information includes: acquiring a text to be extracted and a target field name; extracting candidate text information matching the target field name from the text to be extracted based on the text to be extracted and the target field name; and acquiring target text information matching fusion semantics of the text to be extracted, the target field name and the candidate text information by filtering the candidate text information based on the fusion semantics. Therefore, when the candidate text information matching the target field name is extracted from the text to be extracted, the candidate text information is filtered based on the fusion semantics of the text to be extracted, the target field name and the candidate text information, which improves the accuracy of extracting text information.

Publication date: 23-06-2020

Information filtering method and device and electronic equipment

Number: CN0111324810A

Publication date: 17-01-2020

Number: CN0110705261A

Publication date: 28-02-2020

Statement hotspot extraction method and system

Number: CN0110852095A
Author: FEI ZHIJUN, QIU XUETAO, WANG YU

Publication date: 27-11-2020

Number: CN0112000970A

Publication date: 18-02-2020

Multi-mode-based conference spokesman identity non-inductive confirmation method

Number: CN0110807370A

Publication date: 12-05-2020

Knowledge graph-based professional specialist recommendation method

Number: CN0111143672A

Publication date: 27-10-2020

Number: CN0111831847A

Publication date: 19-06-2020

Number: CN0111310477A

Publication date: 23-06-2023

Event element generation method and system, terminal and storage medium

Number: CN116306586A

The invention provides an event element generation method and system, a terminal and a storage medium, and the method comprises the steps: constructing a word prefix tree and a word prefix vector matrix according to a vocabulary set, carrying out the coding processing of a sample text, obtaining a text representation vector, carrying out the decoding processing of the text representation vector, and obtaining a text decoding result, determining word probability distribution of each word according to a text decoding result, and fusing the word prefix vector matrix and the word probability distribution to obtain word list probability distribution; and constructing an event element set according to the word list probability distribution. According to the method, the word prefix vector matrix and the word probability distribution are fused, vocabulary priori knowledge can be effectively added in the word probability distribution, the accuracy of word list probability distribution is further ...

Publication date: 02-05-2023

Method for matching power distribution network project quota based on material description

Number: CN116049344A

The invention relates to a method for matching power distribution network project quota based on material description, which comprises the following steps: constructing a word segmentation model, and establishing a word segmentation library; acquiring material data information; analyzing and processing the obtained material data; deconstructing the material data, and storing the processed data into a database; performing word segmentation processing on the engineering material description and the quota description according to a word segmentation library; searching the segmented words of the material in a quota library file, matching the quota with the highest score, and displaying in a system; and warehousing the matching result, and establishing a material description and quota matching relationship for next accurate retrieval. According to the method, the defects of inaccuracy of manual quota matching and incompleteness of a rule passing coding matching are effectively overcome; according ...

Publication date: 23-06-2023

Low-frequency phrase extraction method and system based on comment usefulness voting

Number: CN116306626A
Author: LIU SHA

The invention relates to the technical field of neural networks, and discloses a low-frequency phrase extraction method based on comment usefulness voting, which comprises the following steps: acquiring a comment data set, segmenting the comment data set to obtain candidate phrases, and filtering the candidate phrases to obtain low-frequency candidate phrases; phrases related to the subject in the low-frequency candidate phrases are extracted, the low-frequency candidate phrases are scored according to context information of the phrases related to the subject in the comment data set, the correlation between the low-frequency candidate phrases and the subject is judged according to the score, and low-frequency phrase extraction of comment usefulness voting is achieved. According to the method, low-frequency and valuable phrase information can be found from a large number of comments by effectively utilizing context information.

Publication date: 23-05-2023

Data annotation and generation method, model training method, equipment and medium

Number: CN116151233A

The embodiment of the invention provides a data annotation method, a data generation method, a model training method, equipment and a medium. The method can comprise the steps that a to-be-labeled sample is obtained, the to-be-labeled sample comprises at least one text, the text comprises at least one event, and each text comprises at least one text statement; the text statement in each text is recognized to obtain a recognition result, if the recognition result comprises to-be-labeled information, the corresponding text statement is labeled according to the to-be-labeled information, the to-be-labeled information comprises an information type and an argument, the information type is a viewpoint type expressed by the text statement for an event, and the argument is a viewpoint type expressed by the text statement for the event; the argument comprises a target event corresponding to the viewpoint and/or entity information of a text related to the viewpoint. The comprehensiveness of content ...

Publication date: 02-06-2023

Small sample relation extraction method and system, electronic equipment and storage medium

Number: CN116205217A

The invention discloses a small sample relation extraction method and system, electronic equipment and a storage medium, and relates to the technical field of data processing. The method comprises the steps of obtaining a target text; determining entity relationship representation according to the small sample relationship extraction model and the target text; the entity relationship representation comprises entity texts and corresponding concepts and relationships; wherein the small sample relation extraction model is trained by comparing learning loss and cross entropy loss; the small sample relation extraction model comprises a concept coding module, a sentence coding module and a text concept fusion module; the concept coding module and the sentence coding module are both connected with the text concept fusion module; the concept coding module is constructed on the basis of a skip-gram model; the sentence coding module is constructed based on a Bert embedding model; the text concept ...

Publication date: 25-04-2023

Disease emotion information determination method and device, computer equipment and storage medium

Number: CN116010593A
Author: LI JIA, ZHANG YUNYAN

The invention relates to a disease emotion information determination method and device, computer equipment and a storage medium. The method comprises the following steps: acquiring a medical record text; through a pre-training model, extracting a first text feature and a second text feature about a concerned object in the medical record text, and based on the first text feature, determining object information of the concerned object and abnormal performance information when an abnormality is presented; the pre-training model is obtained by training a medical record sample which masks medical keywords; the medical keyword comprises the object information and the abnormal performance information; performing abnormal polarity identification on the second text feature through a graph neural network; on the basis of the abnormal polarity, the object information and the abnormal performance information obtained through recognition, disease emotion information of the concerned object is generated ...

Publication date: 22-07-2021

DOCUMENT DISPLAY ASSISTANCE SYSTEM, DOCUMENT DISPLAY ASSISTANCE METHOD, AND PROGRAM FOR EXECUTING SAID METHOD

Number: WO2021145146A1

The present invention is a document display assistance system that, for a document of a particular field, estimates important words and displays same highlighted. The system comprises: a database in which are registered words to be selected and words not to be selected; a trained word selection model that has undergone machine learning for estimating whether a given word is a word to be selected; a text pre-processing unit that extracts each word from a received document to be displayed; a word classifying unit that, on the basis of the database, classifies each word as a word to be selected, a word not to be selected, or an uncertain word; a text post-processing unit that generates output data by assigning a prescribed attribute to a prescribed word in the document to be displayed; and an output unit that outputs the output data. If a label is estimated indicating that an uncertain word classified by the word classifying unit is a word to be selected, the word selection model classifies ...

Publication date: 04-02-2021

NEURAL NETWORK SYSTEM FOR TEXT CLASSIFICATION

Number: WO2021021330A1

A computer-implemented method is provided to perform text classification with a neural network system. The method includes providing a computing device to receive input datasets including user input question text and feed the datasets to the neural network system. The neural network system includes one or more neural networks configured to extract and concatenate character-based features, word-based features from the question datasets and clickstream embeddings of clickstream data to form a representation vector indicative of the question text and user behavior. A representation vector is fed into fully connected layers of a feed-forward network. The feed-forward network is configured to predict a first class and a second class associated with respective user input questions based on the representation vector.

Publication date: 07-10-2021

INFORMATION PROCESSING DEVICE, METHOD OF CONTROLLING SAME, PROGRAM, AND LEARNED MODEL

Number: WO2021199657A1
Author: ISHII Kumiko, DU Xin

The purpose of the present invention is to obtain an embedding vector in which a feature of an object that fluctuates in price according to the date is embedded. In response to input of a set of texts ni published from a past date through a reference date, a neural network (11) outputs a classification y^jt indicating whether the price of each object has increased or decreased from a date immediately before the reference date through the reference date. An information processing device that implements the neural network (11) learns a model (19) including an embedding vector. Specifically, the information processing device extracts feature vectors nKi, nVi of two different levels from each text ni published on each date. Then, the information processing device determines a weight αji on the basis of the inner product of the feature vector nKi and an embedding vector sj. Then, the information processing device determines a state mjτ by multiplying the weight αji by the other of the feature ...

Publication date: 22-03-2022

Generative adversarial network based modeling of text for natural language processing

Number: US0011281976B2

Mechanisms are provided to implement a generative adversarial network (GAN) for natural language processing. With these mechanisms, a generator neural network of the GAN is configured to generate a bag-of-ngrams (BoN) output based on a noise vector input and a discriminator neural network of the GAN is configured to receive a BoN input, where the BoN input is either the BoN output from the generator neural network or a BoN input associated with an actual portion of natural language text. The mechanisms further configure the discriminator neural network of the GAN to output an indication of a probability as to whether the input BoN is from the actual portion of natural language text or is the BoN output of the generator neural network. Moreover, the mechanisms train the generator neural network and discriminator neural network based on a feedback mechanism that compares the output indication from the discriminator neural network to an indicator of whether the input BoN is from the actual ...

Publication date: 26-05-2020

Method and device for extracting entity relation based on deep learning, and server

Number: US0010664660B2

A method and device for extracting entity relation based on deep learning and a server are provided. The method includes: preprocessing a text to be mined, to obtain a sentence with entities in the text to be mined; determining an entity pair in the sentence according to the entities, wherein the entity pair includes at least two entities, and determining candidate relations between entities in the entity pair; and determining an entity relation between the entities in the entity pair from the candidate relations.

Publication date: 29-09-2020

Generating a response to a natural language command based on a concatenated graph

Number: US0010789425B2

For generating a response to a natural language command based on a concatenated graph, a processor identifies one or more relevant data sets in response to a natural language command received from an input device. Each relevant data set includes one of a subject of the natural language command and a subject of another relevant data set. The processor further generates a graph for each of the one or more relevant data sets and concatenates the graphs into a concatenated graph. In addition, the processor generates a response to the natural language command based on the concatenated graph.
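
The selection and concatenation steps in the abstract above can be pictured with a small Python sketch. Everything concrete here is an assumption made for illustration: data sets are modelled as dictionaries with a `subject` and a list of `(node, relation, node)` edges, "includes" means the term appears as a node, and concatenation is a plain union of adjacency sets. The patent text does not specify any of these representations.

```python
def includes(dataset, term):
    """True if the term appears as a node in any edge of the data set."""
    return any(term in (a, b) for a, _, b in dataset["edges"])


def find_relevant(datasets, command_subject):
    """Keep data sets that include the command's subject, or the subject
    of a data set that has already been selected as relevant."""
    terms = {command_subject}
    relevant, remaining = [], list(datasets)
    progress = True
    while progress:
        progress = False
        for ds in list(remaining):  # iterate over a copy while removing
            if any(includes(ds, t) for t in terms):
                relevant.append(ds)
                remaining.remove(ds)
                terms.add(ds["subject"])
                progress = True
    return relevant


def build_graph(dataset):
    """Adjacency mapping: node -> set of (relation, neighbour)."""
    graph = {}
    for a, rel, b in dataset["edges"]:
        graph.setdefault(a, set()).add((rel, b))
    return graph


def concatenate(graphs):
    """Union of the per-data-set graphs into one graph."""
    merged = {}
    for g in graphs:
        for node, out in g.items():
            merged.setdefault(node, set()).update(out)
    return merged


# Hypothetical data: the command is about a "meeting".
datasets = [
    {"subject": "calendar", "edges": [("meeting", "scheduled_in", "calendar")]},
    {"subject": "rooms", "edges": [("calendar", "books", "Room 4")]},
    {"subject": "weather", "edges": [("Paris", "forecast", "rain")]},
]
relevant = find_relevant(datasets, "meeting")
graph = concatenate(build_graph(ds) for ds in relevant)
print(sorted(graph))
```

In this toy data the "rooms" set is pulled in only because it mentions the subject of the already selected "calendar" set, which mirrors the chaining condition in the abstract; the "weather" set is never reached.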

Publication date: 15-07-2021

Bias Detection in Conversational Agent Platforms

Number: US20210216720A1

A mechanism is provided for implementing a bias detection mechanism that mitigates unintended bias in a conversational agent by leveraging conversational agent definitions, a conversational agent chat logs, and user satisfaction statistics. One or more protected attributes are identified within an utterance from the conversational agent chat logs. Using the identified protected attributes, a replacement utterance with a replacement term is generated for at least one of the identified protected attributes in the utterance. A score is generated for the utterance and the replacement utterance using utterance level relative term importance for protected attributes and regular terms in the utterance and the replacement utterance. Utilizing the scoring, a determination is made as to whether unintended bias exists within the utterance. Responsive to unintended bias being detected, an action is implemented that causes a change to a machine learning model used by the conversational agent.

Publication date: 05-07-2022

Data accuracy using natural language processing

Number: US0011379466B2
Assignee: ACCENTURE GLOBAL SOLUTIONS LIMITED

Examples for enhancing veracity of data are described herein. Data from a repository may be received based on a data receiving rule. From the received data, a first dataset may be generated using statistical modeling. A first data veracity score, indicative of a degree of usability of the first dataset, is also generated. Another aspect relates to identifying anomalies in the first dataset and, for each anomaly, identifying a correction technique from amongst a plurality of correction techniques. Further, a second dataset is generated using the identified correction technique, the second dataset having a second data veracity score higher than the first data veracity score.

Publication date: 07-02-2023

Encoding entity representations for cross-document coreference

Number: US0011573994B2

A computer-implemented method for performing cross-document coreference for a corpus of input documents includes determining mentions by parsing the input documents. Each mention includes a first vector for spelling data and a second vector for context data. A hierarchical tree data structure is created by generating several leaf nodes corresponding to respective mentions. Further, for each node, a similarity score is computed based on the first and second vectors of each node. The hierarchical tree is populated iteratively until a root node is created. Each iteration includes merging two nodes that have the highest similarity scores and creating an entity node instead at a hierarchical level that is above the two nodes being merged. Further, each iteration includes computing the similarity score for the entity node. The nodes with the similarity scores above a predetermined value are entities for which coreference has been performed in input documents.
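
The bottom-up merging described in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the equal weighting of spelling and context similarity, the averaging of child vectors into the entity node, the leaf score of 1.0, and the names `build_tree` and `entities` are all assumptions made for illustration.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def similarity(a, b):
    # Assumption: spelling and context similarity are weighted equally.
    return 0.5 * cosine(a["spelling"], b["spelling"]) + 0.5 * cosine(a["context"], b["context"])

def mean(vectors):
    """Element-wise mean of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def build_tree(mentions):
    """mentions: list of (label, spelling_vector, context_vector). Returns the root node."""
    nodes = [
        {"mentions": [label], "spelling": s, "context": c, "score": 1.0, "children": []}
        for label, s, c in mentions
    ]
    while len(nodes) > 1:
        # Merge the two most similar nodes under a new entity node one level up.
        a, b = max(combinations(nodes, 2), key=lambda pair: similarity(*pair))
        parent = {
            "mentions": a["mentions"] + b["mentions"],
            "spelling": mean([a["spelling"], b["spelling"]]),  # assumption: plain average
            "context": mean([a["context"], b["context"]]),
            "score": similarity(a, b),
            "children": [a, b],
        }
        nodes = [n for n in nodes if n is not a and n is not b] + [parent]
    return nodes[0]

def entities(node, threshold):
    """Mention groups of the highest nodes whose score reaches the threshold."""
    if node["score"] >= threshold:
        return [node["mentions"]]
    return [group for child in node["children"] for group in entities(child, threshold)]

# Toy two-dimensional vectors, invented purely for the demonstration.
root = build_tree([
    ("J. Smith", [1.0, 0.0], [1.0, 0.0]),
    ("John Smith", [0.9, 0.1], [1.0, 0.1]),
    ("Paris", [0.0, 1.0], [0.0, 1.0]),
])
print(entities(root, threshold=0.8))
```

With these toy vectors the two "Smith" mentions merge first, and the root that joins them with "Paris" scores far below the threshold, so the two groups are reported as separate entities.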

Publication date: 27-06-2023

Single input graphical user interface control element and method

Number: US0011687529B2
Assignee: Deephaven Data Labs LLC

Described are methods, systems and computer readable media for GUI control elements and associated processing methods.

Publication date: 19-12-2023

Enforcing data ownership at gateway registration using natural language processing

Number: US0011847412B2
Assignee: Capital One Services, LLC

Enforcing data ownership may include receiving a request to register an application programming interface (API) endpoint. A plurality of elements of the API endpoint and a target API endpoint may be preprocessed. A distance may be computed for each element of the API endpoint relative to at least one of the elements of the target API endpoint. A distance score for the API endpoint may be computed based on the distances. A term frequency-inverse document frequency (TF-IDF) value may be computed for a plurality of metadata terms of the API endpoint and the target API endpoint. A similarity score between the TF-IDF values of the metadata terms may be computed. An adjusted score may be computed for the API endpoint based on the distance score and the similarity score. The API endpoint may be registered based on the adjusted score being below a permissions threshold.
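
The scoring pipeline in this abstract can be sketched in plain Python. The abstract does not name the distance measure, the TF-IDF weighting, or the way the two scores are combined, so the following is only one plausible reading: Levenshtein distance on path segments, a smoothed TF-IDF with cosine similarity, and the product `distance * (1 - similarity)` as the adjusted score are assumptions, as are all function names and the threshold value.

```python
import math
from collections import Counter

def preprocess(path):
    """Lowercase an endpoint path and split it into its segments (the 'elements')."""
    return path.lower().strip("/").split("/")

def edit_distance(a, b):
    """Levenshtein distance via the classic dynamic-programming row update."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def distance_score(elements, target_elements):
    """Mean normalized distance from each element to its closest target element."""
    per_element = [
        min(edit_distance(e, t) / max(len(e), len(t), 1) for t in target_elements)
        for e in elements
    ]
    return sum(per_element) / len(per_element)

def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) with a smoothed IDF; one dict per term list."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors

def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0

def adjusted_score(elements, target_elements, metadata, target_metadata):
    distance = distance_score(elements, target_elements)
    u, v = tfidf_vectors([metadata, target_metadata])
    # Assumption: high metadata similarity scales the distance score down.
    return distance * (1.0 - cosine(u, v))

new = preprocess("/v1/accounts/balance")
target = preprocess("/v1/accounts/balances")
score = adjusted_score(new, target,
                       ["account", "balance", "owner"],
                       ["account", "balance", "ledger"])
PERMISSIONS_THRESHOLD = 0.2  # hypothetical value
print("register" if score < PERMISSIONS_THRESHOLD else "reject")
```

The sketch assumes non-empty element and metadata lists; an empty list would need explicit handling before the averages are taken.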

Publication date: 12-05-2020

Entity identification method for improving knowledge migration

Number: CN0111144119A

Publication date: 15-05-2020

Man-machine verification method and system based on natural language processing

Number: CN0111159686A

Publication date: 13-11-2020

Text retrieval result scoring method, retrieval method and device

Number: CN0111930928A

Publication date: 16-10-2020

Number: CN0111782785A

Publication date: 01-12-2020

Prosody prediction model training method, prosody prediction method and related device

Number: CN0112017628A

Publication date: 13-10-2020

Intelligent city space-time big data spatialization engine construction method based on an HMM model

Number: CN0111767476A

Publication date: 13-10-2020

Retrieval method and device, electronic equipment and storage medium

Number: CN0111767477A

Publication date: 23-06-2023

Word recommendation model training method and device

Number: CN116306629A
Author: CHEN YINGYING, HUANG YUYAN

The invention discloses a word recommendation model training method and device, and the method comprises the steps: building a text data set, determining the entity word of each text data, determining a first text vector and a first graph vector corresponding to each entity word, determining a loss function corresponding to a word recommendation model according to the first text vector and the first graph vector, and carrying out the training of the word recommendation model. The universality of the word recommendation model in different scenes is improved, and the training efficiency of the word recommendation model is improved.

Publication date: 30-06-2023

Construction method and device of retrieval model, electronic equipment and storage medium

Number: CN116361339A

The invention discloses a retrieval model construction method and device, electronic equipment and a storage medium. The method comprises the steps of obtaining at least one fault record text, and determining at least one to-be-processed word corresponding to the fault record text; performing matching processing on the to-be-processed words based on the standard term vocabulary, and determining at least one to-be-used word; the determination unit is used for determining an association relationship of at least two content words associated with the relational words based on the relational attributes of the relational words; and based on the at least two content words and the corresponding association relationship, determining a target retrieval model, so as to determine a fault associated word corresponding to the to-be-retrieved information based on the target retrieval model. The problem that in the prior art, due to the fact that fault data are analyzed through language recognition to ...

Publication date: 23-05-2023

Method and device for identifying article catalog in financial field

Number: CN116151224A
Author: DENG YONG, LI WEI

The invention provides a method and a device for identifying article directories in the financial field, and relates to the technical field of data processing. The method and the device for recognizing the article catalog in the financial field comprise a part-of-speech library module, a preprocessing module, a processing module and a generation module, the part-of-speech library module is a supporting module of the device, and the preprocessing module and the processing module are used for extracting the catalog based on the part-of-speech library module. The generation module is used for outputting a result generated by the processing module; and the part-of-speech library module is used for initializing a directory part-of-speech, initializing an nshort part-of-speech, initializing a user-defined part-of-speech, exhaustion first-level directory features of three types of stock inviting books, user-defined directory features and the like. According to the method, the special catalog is ...

Publication date: 11-07-2023

Space gene recognition and extraction method based on social media text data

Number: CN116414985A
Author: DUAN JIN, LI YIGE, GUO CHUYI

The invention discloses a social media text data-based space gene identification and extraction method, which comprises the following steps of: acquiring network text data about a city, and then preprocessing the data to obtain a data set D1; constructing a dictionary and a vector space in analysis software, introducing an LDA topic model, and performing topic classification on an obtained data set D1; synonyms are merged in all the topics, synonym replacement is carried out in the data set D1, and a data set D2 is obtained; counting the number of co-occurrence times of every two keywords in the data set D2, and constructing a co-word matrix M; and clustering semantic network analysis results by using a hierarchical clustering model to obtain a spatial combination mode, namely a spatial gene. According to the method, web text data about a certain research city is collected from a multi-source social media platform, and a practical technical means is provided for city researchers to identify ...
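
The co-word matrix step mentioned in this abstract (counting how often every two keywords co-occur in data set D2) can be sketched as follows. The abstract does not say what unit of text counts as one co-occurrence, so counting once per document is an assumption, and `co_word_matrix` and the sample data are illustrative only.

```python
from collections import Counter
from itertools import combinations

def co_word_matrix(documents, keywords):
    """Symmetric matrix M where M[i][j] is the number of documents
    that contain both keywords[i] and keywords[j]."""
    index = {word: i for i, word in enumerate(keywords)}
    counts = Counter()
    for tokens in documents:
        # Each keyword counts at most once per document.
        present = sorted({t for t in tokens if t in index}, key=index.get)
        for a, b in combinations(present, 2):
            counts[(a, b)] += 1
    size = len(keywords)
    matrix = [[0] * size for _ in range(size)]
    for (a, b), n in counts.items():
        matrix[index[a]][index[b]] = n
        matrix[index[b]][index[a]] = n
    return matrix

keywords = ["park", "river", "market", "temple"]
documents = [
    ["park", "river", "market"],
    ["park", "river"],
    ["market", "temple"],
]
for row in co_word_matrix(documents, keywords):
    print(row)
```

A matrix of this shape is what a hierarchical clustering step would then consume, typically after converting the counts into distances.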

Publication date: 18-07-2023

Song imitation writing method and device, electronic equipment and storage medium

Number: CN116453489A

The invention provides a song imitation writing method and device, electronic equipment and a storage medium, applied to the technical field of computers. The method comprises the following steps: acquiring a source song, and segmenting the source song into source lyrics and source tunes; generating imitated lyrics of the source lyrics according to picture words in the source lyrics, wherein the picture words are used for representing pictures expressed by the source lyrics; performing autoregression according to the source tune to generate an imitated tune of the source tune; and combining the imitated lyrics and the imitated tunes to obtain a target song. According to the method, the source song is divided into the source lyrics and the source tunes, then the imitated lyrics of the source lyrics are generated according to the picture words, the imitated tunes of the source tunes are generated in an autoregression mode according to the source tunes, and then the imitated lyrics ...

Publication date: 30-05-2023

Automatic chart generation system

Number: CN116186125A

The invention provides an automatic chart generation system which comprises a database retrieval module used for receiving an NLP query for a database, constructing a query statement according to the NLP query and automatically generating form data of a query result; the form serialization module is used for obtaining form data, determining the type of the form data and generating a serialized form based on the type of the form data; and the chart presentation module is used for receiving rendering parameters associated with the serialized form, determining a to-be-presented data element set based on a plurality of rendering parameters in the chart creation interface, and rendering by utilizing a specific chart style to obtain a target chart. Through the scheme of the invention, the user can freely select the data in the NLP query content and autonomously select the rich chart display form, the real-time visual display of the selected data is realized, the chart content is realized according ...

Publication date: 04-07-2023

Text enhancement method based on N-gram algorithm and using downstream task to screen text

Number: CN116384375A

The invention relates to an N-gram algorithm-based text enhancement method for screening texts by using downstream tasks, which comprises the following steps of: selecting a text generator to generate texts similar to original texts; setting a scoring function for evaluating the text generation quality, and scoring the texts generated in each batch; evaluating the screened text on the downstream task; and selecting texts with smooth semantics from the obtained texts as the input of the next batch of text generation process, and repeatedly generating the texts to obtain a required number of new texts. According to the method, the scoring mode based on the N-gram algorithm and the weighted average is adopted at the same time, and the quality of the generated text is evaluated in combination with downstream tasks, so that the text generated by the text enhancement method is smoother and more coherent, and training of an intelligent language model is facilitated.

Publication date: 11-08-2023

PowerPoint generation method and device

Number: CN116579308A

The invention discloses a PowerPoint generation method and device, and the method comprises the steps: obtaining a theme for generating a PowerPoint, and obtaining secondary titles of the PowerPoint and text contents under each secondary title based on a pre-constructed and trained text generation module; structuring the theme of the presentation file, the secondary titles and the text content under each secondary title to obtain a plurality of parts, taking each part as a page of the presentation file, and performing keyword extraction on other pages except a home page and a directory page; on the basis of the extracted keywords, generating illustration images corresponding to all the pages of the presentation manuscripts through a text generation image module; and carrying out automatic typesetting on the divided text content and the illustration image of the corresponding page to obtain a complete presentation file.

Publication date: 11-08-2023

Text abnormal word recognition method and system based on neural language model

Number: CN116579330A

The invention relates to the field of text semantic understanding, in particular to a text abnormal word recognition method and system based on a neural language model, and the method and system are used for recognizing network terms, new words and wrong words. 2) determining a context word sequence of each word of the text to be recognized in a sliding window mode; 3) respectively inputting the context word sequence of each word into a pre-established and trained recognition model to obtain the probability of occurrence of each word in the context word sequence; and 4) comparing the occurrence probability of each word with a set threshold value, and judging whether the word is an abnormal word or not. According to the method, abnormal words do not need to be marked, and training can be carried out on a large amount of unsupervised data; the prediction probability threshold values of the abnormal words and the normal words, which are determined in a mode of predicting probability statistics ...
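
Steps 2 to 4 of this abstract (sliding-window contexts, a per-word probability from a trained model, and a threshold comparison) can be sketched as below. The neural language model itself is out of scope here, so `stub_probability` is a hypothetical lookup-table stand-in, and the window radius and threshold are arbitrary illustrative values.

```python
def context_windows(words, radius=2):
    """Yield (index, word, context), where context holds up to `radius`
    neighbours on each side of the word."""
    for i, word in enumerate(words):
        left = words[max(0, i - radius):i]
        right = words[i + 1:i + 1 + radius]
        yield i, word, tuple(left + right)

def find_abnormal_words(words, probability, threshold, radius=2):
    """Flag every word whose probability in its context falls below the threshold.

    `probability(word, context)` is any callable returning a value in [0, 1];
    in the abstract this role is played by the trained recognition model.
    """
    flagged = []
    for index, word, context in context_windows(words, radius):
        if probability(word, context) < threshold:
            flagged.append((index, word))
    return flagged

# Hypothetical stand-in for the trained model: known words get a fixed
# probability, unseen words get a very small one. The context is ignored.
KNOWN = {"the": 0.6, "cat": 0.4, "sat": 0.3}

def stub_probability(word, context):
    return KNOWN.get(word, 0.01)

print(find_abnormal_words(["teh", "cat", "sat"], stub_probability, threshold=0.1))
# prints [(0, 'teh')]
```

Passing the model in as a callable keeps the windowing and thresholding logic independent of whichever language model supplies the probabilities.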

Publication date: 04-04-2023

Policy information identification method

Number: CN115906842A

The invention discloses a policy information identification method, and belongs to the technical field of natural language processing. A complete policy document element system is constructed, in which the different elements of a policy document are clearly divided. Based on the system, classification of each paragraph in the policy document and key information extraction of entity-level text paragraphs can be realized more accurately. In addition, the policy information recognizer provided by the invention reduces the difficulty of recognizing text entities by predicting vacant content tags under the constructed policy text element system, can more accurately extract useful key information from texts, and improves the recognition efficiency. The method also performs well when the labeled training data set is relatively small.

Publication date: 04-04-2023

Service keyword prediction method based on text disambiguation, storage medium and equipment

Number: CN115906864A
Author: HAN SONGTAO, ZHAO SI

The invention discloses a business keyword prediction method based on text disambiguation, a storage medium and equipment, and the method comprises the steps: obtaining massive text data generated in a business process, carrying out the disambiguation of the text data, dividing the text data into metadata and meta-process data, and carrying out the classification of the metadata and the meta-process data through a classification model constructed through a Pareto analysis method, a Pareto optimal solution is obtained; obtaining a semantic context probability that the business keyword belongs to the current business through the business keyword, and generating a semantic identifier of the business keyword; and inputting the semantic context probability of the business keyword into the Markov chain hot word prediction model, predicting the hot-cold transition probability of the business keyword initiated next time, updating the semantic context probability in the semantic identifier, and ...

Publication date: 11-12-2020

BOOK DISTRIBUTION SYSTEM INCLUDING APPARATUS FOR PROVIDING CLIENT CUSTOMIZED BOOK SERVICE

Number: KR0102190376B1

Publication date: 04-11-2021

METHOD AND APPARATUS FOR EXTRACTING TERMINOLOGY IN INTELLIGENT INTERVIEW, DEVICE, AND MEDIUM

Number: WO2021218027A1
Author: DENG, Yue; JIN, Ge; XU, Liang

A method and apparatus for extracting a terminology in an intelligent interview, a computer device, and a storage medium. The method comprises: performing segmentation for each character in an answer statement according to preset N dimensions to obtain N word group sets corresponding to the character, and using all word groups corresponding to the character as candidate word groups; determining weight information of the word group sets according to the word frequency of each word group in a historical interview word bank; assigning weights more conforming to the interview scene to the candidate word groups; determining characterization information of each character by the weight information; and determining, in combination with the characterization information of the characters and by adopting a preset sequence marking model having a simple structure, a named entity comprised in the answer statement, and storing the named entity in a blockchain network as a terminology. There is no need ...

Publication date: 22-07-2021

INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND INFORMATION PROCESSING PROGRAM

Number: WO2021144861A1

When training a model for outputting results of estimating neighboring words of a word from the word, this information processing device trains the model using words extracted from document data, neighboring words of the extracted words, first words each constituting a compound word, words (compound co-occurrence words) following the first words, and neighboring words of the first words. The information processing device then uses the trained model to create a vector representing features of each word, and calculates the degrees of similarity between words using the created vector for each word.

Publication date: 15-10-2020

DIALOG ACTION ESTIMATION DEVICE, DIALOG ACTION ESTIMATION METHOD, DIALOG ACTION ESTIMATION MODEL LEARNING DEVICE, AND PROGRAM

Number: WO2020209072A1

The present invention enables a dialog action type to be accurately estimated in consideration of an utterance subject. A feature amount extraction unit 130 extracts, from a first utterance sentence and a second utterance sentence that is an utterance sentence before the first utterance sentence and includes an utterance sentence at least right before the first utterance sentence, feature amounts respectively including utterance subject feature amounts, which are feature amounts pertaining to the utterance subject of the utterance sentences, and a dialog action estimation unit 260 estimates a dialog action type of the first utterance sentence by using an integrated feature amount obtained by integrating the extracted feature amounts for the first utterance sentence and the second utterance sentence and a dialog action estimation model that is trained in advance and is for estimating a dialog action type that indicates the kind of a dialog action in consideration of an utterance subject ...

Publication date: 02-12-2021

GENERATIVE-DISCRIMINATIVE LANGUAGE MODELING FOR CONTROLLABLE TEXT GENERATION

Number: US20210374341A1

The embodiments describe generative-discriminative (GeDi) language modeling for determining a next token in a text sequence. A class conditional language model and a positive control code determine a first class conditional probability for each token candidate. The class conditional language model and a negative control code determine a second class conditional probability for each token candidate. A logarithmic probability difference between the first class conditional probability and the second class conditional probability is determined for each token candidate. An unconditional language model determines an unconditional probability for each token candidate. A combined probability is determined by combining the unconditional probability and the logarithmic probability difference for each token candidate. The next token is selected from the token candidates based on the combined probabilities of the token candidates.
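
The token selection described in this abstract can be illustrated with a small numeric sketch. The abstract only says the unconditional probability and the logarithmic difference are "combined"; adding them in log space, the `weight` parameter, greedy selection of the best-scoring candidate, and the toy probabilities below are assumptions for illustration.

```python
import math

def gedi_next_token(uncond_logp, pos_logp, neg_logp, weight=1.0):
    """Score each candidate token and return the best one with all scores.

    uncond_logp: log p(token) from the unconditional language model
    pos_logp:    log p(token | positive control code)
    neg_logp:    log p(token | negative control code)
    weight:      hypothetical knob scaling the class signal (not in the abstract)
    """
    combined = {}
    for token in uncond_logp:
        log_diff = pos_logp[token] - neg_logp[token]
        # Assumption: combine by adding the weighted difference in log space.
        combined[token] = uncond_logp[token] + weight * log_diff
    best = max(combined, key=combined.get)
    return best, combined

# Toy probabilities for three candidate tokens, invented for the demonstration.
uncond = {"great": math.log(0.3), "awful": math.log(0.5), "fine": math.log(0.2)}
pos = {"great": math.log(0.6), "awful": math.log(0.1), "fine": math.log(0.3)}
neg = {"great": math.log(0.1), "awful": math.log(0.7), "fine": math.log(0.2)}

token, scores = gedi_next_token(uncond, pos, neg)
print(token)  # prints great
```

Here "awful" is the most likely token under the unconditional model alone, but the difference between the positive and negative class conditional log-probabilities pushes the selection to "great".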

Publication date: 04-03-2021

SERVER

Number: US20210064819A1
Assignee: LG ELECTRONICS INC.

The present disclosure relates to a server. The server includes a communicator to receive and transmit data from and to an external network, and a processor to detect and monitor an online issue in the external network through the communicator, wherein the processor collects text from a plurality of external servers, performs learning on the collected text, performs an issue detection, and monitors text corresponding to a confirmed issue. Accordingly, an online issue in a network is effectively detected and monitored.

Publication date: 29-11-2022

Information extraction from open-ended schema-less tables

Number: US0011514235B2

Systems and methods for generating and annotating cell documents include extracting tables from a document using a table extraction engine. Headers are extracted for each of the tables using a header detection engine. Cells are extracted from each of the tables using a cell extraction engine. A cell document is generated for each of the cells which are each correlated to corresponding portions of the headers, each cell document recording the correlation between the cells and the headers. Each cell document is annotated to generate annotated cell documents with a cell recognition model trained to perform natural language processing on the cell documents by classifying each term in each of the cell documents and extracting relationships between the terms of each of the cell documents.

Publication date: 16-03-2023

AUTOMATIC HANDLING OF SECURITY DRIFT IN CLOUD ENVIRONMENTS

Number: US20230081915A1

Security drift can be automatically handled in cloud environments. A security audit engine can be configured to extract security configuration datasets from cloud resources and create text sentences from the datasets as well as from a golden configuration. These text sentences can be encoded as vectors in an n-dimensional space. Probability distributions can then be generated using the vectors such as by using an unsupervised clustering algorithm. Distance matrixes can then be generated from the probability distributions. A probability distribution pertaining to a dataset and a probability distribution pertaining to the golden configuration can then be compared and normalized using a transport to thereby yield a security drift score representing a divergence of the corresponding security settings from the golden configuration. When a security drift score exceeds a threshold, the security audit engine can take appropriate action.

Publication date: 02-02-2023

GENERATION DEVICE, GENERATION METHOD, AND GENERATION PROGRAM

Number: US20230032372A1

The extraction unit 132 extracts a second word corresponding to a first word included in a first text from among a plurality of words belonging to a predetermined domain. The determination unit 133 determines whether a predetermined condition for the word class of the first word is satisfied or not. When it is determined by the determination unit 133 that the condition is satisfied, the generation unit 134 generates a second text in which the first word of the first text is exchanged with the second word.

Publication date: 17-10-2023

Data extraction and duplicate detection

Number: US0011790679B2

A system provides an end-to-end solution for invoice processing which includes reading files (such as pdfs and images), extracting key relevant information from the files, organizing the relevant information in a structured template as a key-value pair, and comparing files based on the similarities between different file fields to identify potential duplicate files.
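
The duplicate-detection step in this abstract (comparing files field by field once they are organized as key-value pairs) can be sketched with the standard library. The abstract does not specify the similarity measure, so the use of `difflib.SequenceMatcher`, the unweighted averaging across fields, the 0.9 threshold, and the field names are all assumptions.

```python
from difflib import SequenceMatcher

def field_similarity(a, b):
    """String similarity in [0, 1] after trimming whitespace and lowercasing."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

def duplicate_score(invoice_a, invoice_b, fields):
    """Unweighted mean of the per-field similarities (missing fields count as empty)."""
    sims = [field_similarity(invoice_a.get(f, ""), invoice_b.get(f, "")) for f in fields]
    return sum(sims) / len(sims)

def find_duplicates(invoices, fields, threshold=0.9):
    """Return index pairs (i, j), i < j, whose score reaches the threshold."""
    pairs = []
    for i in range(len(invoices)):
        for j in range(i + 1, len(invoices)):
            if duplicate_score(invoices[i], invoices[j], fields) >= threshold:
                pairs.append((i, j))
    return pairs

# Hypothetical extracted key-value records.
FIELDS = ["vendor", "number", "total"]
invoices = [
    {"vendor": "Acme Corp", "number": "INV-1001", "total": "250.00"},
    {"vendor": "ACME CORP ", "number": "inv-1001", "total": "250.00"},
    {"vendor": "Globex", "number": "INV-2047", "total": "99.10"},
]
print(find_duplicates(invoices, FIELDS))  # prints [(0, 1)]
```

The pairwise loop is quadratic in the number of invoices, which is fine for a sketch; a production system would likely narrow the candidates first, for example by grouping on vendor.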

Publication date: 16-01-2024

Computerized system and method for controlling electronic messages and their responses after delivery

Number: US0011876764B2
Assignee: YAHOO ASSETS LLC

Disclosed are systems and methods for improving interactions with and between computers in content searching, hosting and/or providing systems supported by or configured with devices, servers and/or platforms. The disclosed systems and methods provide for efficiently monitoring and following up on delivered messages for which a user expects and/or requires a reply. The disclosed functionality provides a fully automated, personalized, easy and efficient way to identify and manage outgoing mail messages that require a reply by marking outbound messages as RSVP messages, which are those messages determined to require a reply. Such functionality is based on the ability of the disclosed framework to distinguish between a “satisfactory response” (i.e., a response that includes the required information) and a response that is not.

Publication date: 20-07-2022

DETECTING UNRELATED UTTERANCES IN A CHATBOT SYSTEM

Number: EP4028931A1

Publication date: 02-12-2021

Machine language and natural language processing system for conducting a conversation via textual methods

Number: AU2021106667A4

The present invention conducts a conversation via textual methods. The proposed invention uses chatbot technology to develop a chatbot for a bank website as an additional, economical channel of communication between the client and management. Clients can use the bank website chatbot for queries rather than going to the office or approaching the authorities directly for information. The chatbot provides virtual assistance to users by guiding them to apply for loans, credit cards, opening of savings accounts, etc., using tools such as natural language processing and machine learning. The present invention creates APIs for the NLP and deep learning models to allow easy integration with the bank website, and containerizes the chatbot tool to maintain a microservice architecture.

Publication date: 02-09-2021

Neural network system for text classification

Number: AU2020321751A1

A computer-implemented method is provided to perform text classification with a neural network system. The method includes providing a computing device to receive input datasets including user input question text and feed the datasets to the neural network system. The neural network system includes one or more neural networks configured to extract and concatenate character-based features, word-based features from the question datasets and clickstream embeddings of clickstream data to form a representation vector indicative of the question text and user behavior. A representation vector is fed into fully connected layers of a feed-forward network. The feed-forward network is configured to predict a first class and a second class associated with respective user input questions based on the representation vector.

Publication date: 30-10-2020

Method and device for detecting short message unword sensitive words and computer storage medium

Number: CN0111859032A

Publication date: 17-04-2020

Hotspot vocabulary extraction method, device, terminal and medium based on web crawler

Number: CN0111026942A

Publication date: 18-12-2020

Weighting LDA-based veterinary drug residue knowledge graph construction method

Number: CN0112100405A

Publication date: 18-12-2020

Protection system and method for preventing data leakage

Number: CN0112104655A

Publication date: 18-08-2020

Topic mining method and device based on artificial intelligence and electronic equipment

Number: CN0111553144A

Publication date: 08-09-2020

Intelligent auxiliary case handling method based on deep learning

Number: CN0111639479A

Publication date: 17-07-2020

Error correction method, device, device and storage medium based on language model

Number: CN0111428474A

Publication date: 13-10-2020

Keyword extraction method and device, electronic equipment and storage medium

Number: CN0111767713A

Publication date: 07-07-2020

Pattern string allocation method and device and device for pattern string allocation

Number: CN0111382325A

Publication date: 28-08-2020

Auction recommendation method, system, device and medium based on user portrait

Number: CN0111597434A

Publication date: 25-12-2020

Traditional Chinese medicine case data processing method and device and electronic equipment

Number: CN0112131862A

Publication date: 07-02-2020

Contract sensitive word verification method and device based on artificial intelligence and storage medium

Number: CN0110765761A

Publication date: 13-11-2020

Data resource labeling method and device, storage medium and electronic equipment

Number: CN0111930792A

Publication date: 02-10-2020

Voice emotion recognition method and system for enhancing anger and pistaxis recognition

Number: CN0108597541B

Publication date: 12-05-2023

Multimodal knowledge graph construction method and information retrieval method based on media field

Number: CN116108192A

The invention relates to the technical field of news media, discloses a multi-modal knowledge graph construction method and an information retrieval method based on the media field, and aims at solving the problem that an existing retrieval engine constructed based on a knowledge graph is poor in retrieval effect and user experience. Performing sentence segmentation processing on a news text corresponding to the news data, and extracting entity words and field event triples in sentences; training a CLIP fine tuning model, performing matching prediction on the corresponding entities and events according to the CLIP fine tuning model, and expanding the multi-modal attribute set of the entities and the events according to a matching prediction result; and constructing a knowledge base according to the extracted data, performing anaphora resolution and entity disambiguation processing on the data in the knowledge base, constructing ontologies of domain entities and domain events, and writing ...

Publication date: 23-06-2023

Conversation flow chart generation method and related device

Number: CN116308103A

The invention discloses a dialogue flow chart generation method and a related device, a dialogue text can be divided into a plurality of text segments, and a corresponding label is set for each text segment, so that a label sequence is obtained, the label sequence comprises a plurality of text nodes, and one text node corresponds to one text segment; the label sequence is input into a hidden Markov model for unsupervised learning, so that a set of model parameters of the hidden Markov model are obtained, the model parameters comprise a plurality of hidden states and transition probability matrixes among the hidden states, and the transition probability matrixes are used for obtaining the hidden Markov model; each hidden state comprises at least one display state and an emission probability matrix between the hidden state and the included display states, and one display state corresponds to one text node; and generating a corresponding dialogue flow chart based on a graph algorithm according ...

Publication date: 12-01-2012

Electronic dictionary and dictionary writing system

Number: US20120010870A1
Assignee: ABBYY InfoPoisk LLC

Described herein is a computer implemented method for creating content for electronic dictionaries. An exemplary system includes a user interface, entry filtration system, and interface tools for dictionary entry comparison, entry merge, and visual markup of changes. Many dictionaries may be accessed and used in one user interface window. A user may enter a grammatical, syntactic and semantic markup which may be helpful when the user translates a word or a text directly from an electronic document. An appropriate lexical meaning may be selected during translation from among several lexical meanings depending on a grammatical, syntactic and/or semantic context of a word or phrase.

Publication date: 19-01-2012

Parsing culturally diverse names

Number: US20120016660A1
Assignee: International Business Machines Corp

Provided are techniques for parsing a name. A name to be parsed is received. A culture of the name is identified. One or more name phrases from the name are identified. Statistics for the one or more name phrases are identified. It is determined whether to perform a first parsing technique that parses different types of name elements within at least one field of the name. In response to determining that the first parsing technique is to be performed, the name is parsed using the statistics and the first parsing technique. In response to determining that the first parsing technique is not to be performed, the name is parsed using the statistics and a second parsing technique.

Publication date: 29-03-2012

System and Method for Increasing Recognition Rates of In-Vocabulary Words By Improving Pronunciation Modeling

Number: US20120078617A1
Assignee: AT&T INTELLECTUAL PROPERTY I LP

The present disclosure relates to systems, methods, and computer-readable media for generating a lexicon for use with speech recognition. The method includes receiving symbolic input as labeled speech data, overgenerating potential pronunciations based on the symbolic input, identifying potential pronunciations in a speech recognition context, and storing the identified potential pronunciations in a lexicon. Overgenerating potential pronunciations can include establishing a set of conversion rules for short sequences of letters, converting portions of the symbolic input into a number of possible lexical pronunciation variants based on the set of conversion rules, modeling the possible lexical pronunciation variants in one of a weighted network and a list of phoneme lists, and iteratively retraining the set of conversion rules based on improved pronunciations. Symbolic input can include multiple examples of a same spoken word. Speech data can be labeled explicitly or implicitly and can include words as text and recorded audio.
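
The overgeneration step can be sketched as follows, assuming a tiny hand-written rule table: each short letter sequence maps to one or more phonemes, and every combination is emitted as a candidate pronunciation. The rules, the phoneme symbols, and the greedy segmentation are illustrative assumptions, and the iterative retraining of the rules described in the abstract is not shown.

```python
from itertools import product

# Hypothetical letter-sequence -> phoneme rules; real rule sets are far larger.
RULES = {
    "ph": ["F"],
    "o": ["OW", "AA"],
    "t": ["T"],
    "e": ["EH", "IY", ""],
}

def segment(word, rules):
    """Greedy longest-match split of a word into rule keys."""
    i, parts = 0, []
    longest = max(map(len, rules))
    while i < len(word):
        for size in range(min(longest, len(word) - i), 0, -1):
            chunk = word[i:i + size]
            if chunk in rules:
                parts.append(chunk)
                i += size
                break
        else:
            raise ValueError(f"no rule covers {word[i]!r}")
    return parts

def overgenerate(word, rules=RULES):
    """Return every pronunciation the rules allow, as phoneme lists."""
    options = [rules[part] for part in segment(word, rules)]
    # Empty strings model silent letters and are dropped from the output.
    return [[p for p in combo if p] for combo in product(*options)]

variants = overgenerate("photo")
```

A recognizer would then score these candidates against the labeled audio and store only the ones that are actually observed.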

Publication date: 26-04-2012

Formalization of a natural language

Number: US20120101803A1
Assignee: Individual

A method is disclosed for formalizing a natural language so that an unambiguous model of a natural-language text can be created. Basic notions are determined for the entities that a natural language names; each basic notion is assigned a unique number or name and a description, together with a list of the words that can name it in each natural language used. The unambiguous model uses only basic notions. A machine can therefore interpret the unambiguous model to enter knowledge and data into a base, or to generate text in another natural language from the model. Text can also be generated in an artificial language, such as a programming language.

Publication date: 24-05-2012

FSTP expert system

Number: US20120131055A1
Author: Sigram Schindler
Assignee: Sigram Schindler Beteiligungs GmbH

Let TT.p be the “technique teaching” of a patent or venture, RS a “reference set” of prior-art “technique teachings TT.i”, and let every “element” of any TT be described by its properties, with all of this information presented as meaningful items. The FSTP Expert System then supports managing an analysis of TT.p over RS so that it can reply automatically and instantly to any query about any item in this information. These answers may describe any interrelation between items or properties/facts, or comment on such interrelations or on insights into them gained while the items were generated by, or interactively with, the FSTP Expert System. By formalizing these properties, it also supports determining the value of q, which dependably indicates that TT.p is trivial/obvious over RS iff q=0 and, for q>0, shows the “creative height of TT.p over RS” and quantifies the “power” of this indication.

Publication date: 31-05-2012

Natural Language Interface

Number: US20120136649A1
Author: Marit Rams, Uwe Freising
Assignee: SAP SE

The present disclosure involves systems, software, and computer implemented methods for providing a natural language interface for searching a database. One process includes operations for receiving a natural language query. One or more tokens contained in the natural language query are identified. A set of sentences is generated based on the identified tokens, each sentence representing a possible logical interpretation of the natural language query and including a combination of at least one of the identified tokens. At least one sentence in the set of sentences is selected for searching a database based on the identified tokens.

Publication date: 05-07-2012

Method and system for managing semantic and syntactic metadata

Number: US20120173556A1
Assignee: International Business Machines Corp

A method and system for managing semantic and syntactic metadata. Heterogeneous data is received. After the heterogeneous data is received, the semantic metadata associated with the received heterogeneous data is captured and syntactic metadata associated with the received heterogeneous data is captured. The semantic metadata describes contextually relevant or domain-specific information about data based on an industry-specific or enterprise-specific metadata model or ontology. The syntactic metadata includes grammatical rules and structural patterns governing an ordered use of formats and arrangement pertaining to specified data. The received heterogeneous data and said captured semantic metadata and said syntactic metadata are logically linked. The heterogeneous data is stored in a repository.

Publication date: 12-07-2012

Word pair acquisition apparatus, word pair acquisition method, and program

Number: US20120179682A1
Assignee: Individual

Conventionally, it has been impossible to appropriately acquire word pairs having a prescribed relationship. Such word pairs can be appropriately acquired with a word pair acquisition apparatus including: a word class information storage unit in which word class information can be stored; a class pair favorableness degree storage unit in which a class pair favorableness can be stored; a seed pattern storage unit in which can be stored one or more seed patterns; a word pair acquisition unit that acquires one or more word pairs co-occurring with the seed pattern from sentence groups; a class pair favorableness degree acquisition unit that acquires a class pair favorableness degree; a score determination unit that uses the class pair favorableness degree to determine a score of each of the word pairs; and a word pair selection unit that acquires one or more word pairs having a high score.

Publication date: 12-07-2012

Methods and apparatus related to document processing based on a document type

Number: US20120179961A1
Author: Jeff Stollman
Assignee: Individual

In one embodiment, a method includes receiving a portion of text from a document. A document type is associated with the document based on at least one of the portion of text or an identifier associated with the document. The method also includes selecting, based on the document type, a document template having a plurality of sections. Each section from the plurality of sections is associated with a document category. At least one section from the plurality of sections includes at least one policy preference.

Publication date: 02-08-2012

System for Identifying Textual Relationships

Number: US20120197631A1
Assignee: Accenture Global Services Ltd

A computer-implemented method identifies textual statement relationships. Textual statement pairs including a first and second textual statement are identified, and parsed word group pairs are extracted from first and second textual statements. The parsed word groups are compared, and a parsed word score for each statement pair is calculated. Word vectors for the first and second textual statements are created and compared. A word vector score is calculated based on the comparison of the word vectors for the first and second textual statements. A match score is determined for the textual statement pair, with the match score being representative of at least one of the parsed word score and the word vector score.
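
A minimal sketch of the two scores and how they might be combined, under stated assumptions: the word vector score is cosine similarity over bag-of-words counts, the parsed word score is Jaccard overlap of word groups supplied by the caller, and the match score is the larger of the two. The abstract only says the match score is representative of at least one of the scores, so taking the maximum is an illustrative choice, and all function names are hypothetical.

```python
import math
from collections import Counter

def word_vector(statement):
    """Bag-of-words term-frequency vector for one statement."""
    return Counter(statement.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_overlap(groups_a, groups_b):
    """Jaccard overlap of parsed word groups (e.g. verb-object pairs)."""
    a, b = set(groups_a), set(groups_b)
    return len(a & b) / len(a | b) if a | b else 0.0

def match_score(stmt_a, stmt_b, groups_a, groups_b):
    """Take the larger of the two scores as the pair's match score."""
    return max(
        group_overlap(groups_a, groups_b),
        cosine(word_vector(stmt_a), word_vector(stmt_b)),
    )
```

Producing the word groups themselves requires a parser, which is outside the scope of this sketch.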

Publication date: 02-08-2012

Objective-function based sentiment

Number: US20120197903A1
Assignee: Individual

A system and article are disclosed for objective-function based sentiment. In one example, the system includes a set of domain information, and a computer programmed with executable instructions which operate a set of modules. The modules include a sentiment polarization module for identifying a domain-aspect opinion-word pair within a set of domain data, and assigning a sentiment polarity score to the domain-aspect opinion-word pair based on an objective function which includes sentiment data from the domain information.

Publication date: 23-08-2012

Semantic web technologies in system automation

Number: US20120215733A1
Assignee: International Business Machines Corp

A method includes maintaining descriptions of a plurality of information technology resources in a computer-readable storage medium. The method includes maintaining a plurality of evaluation strategies, wherein the evaluation strategies associate a plurality of rules with forms of changes to the plurality of information technology resources. Responsive to detecting a command to change a first property of the set of properties of a first information technology resource of the plurality of information technology resources, the method determines that a first of the evaluation strategies associates at least one of the plurality of rules with a form of the change to the first property of the first information technology resource. Also, responsive to detecting the command, the method evaluates the at least one of the plurality of rules and performs the operation of the at least one rule.

Publication date: 22-11-2012

Taxonomy and application of language analysis and processing

Number: US20120296636A1
Assignee: DW ASSOC LLC

Words can be identified in text. Membership numerical values for the words can be determined in categories, or in communication types generated using those categories. The membership numerical values for the words can then be used to generate a signature. The signature can then be used to identify documents with a similar attitude.

Publication date: 10-01-2013

Method of Extracting Experience Sentence and Classifying Verb in Blog

Number: US20130013289A1

Provided are a method of extracting an experience-revealing sentence from a blog document and a method of classifying verbs into activity verbs and state verbs in a sentence recorded in a blog document. The method of extracting an experience sentence from a blog document includes generating a sentence classifier using a machine learning algorithm based on grammatical features, and classifying experience sentences that represent actual experiences of users and non-experience sentences that represent no experience in the blog document using the sentence classifier. By classifying sentences in a blog document into experience sentences and non-experience sentences, it is possible to extract experiences that a user has actually had or that have actually happened to a user from the document.

Publication date: 07-03-2013

Information processing apparatus, natural language analysis method, program and recording medium

Number: US20130060562A1
Author: Yuta Tsuboi, Yuya Unno
Assignee: International Business Machines Corp

An apparatus and method for calculating a score of matching a sentence with a query pattern having a dependency structure. The apparatus includes: an input unit acquiring an analysis target sentence, a query pattern and an index value indexing how a linguistic unit in the sentence tends to modify another; and a score calculation unit calculating a matching score indexing the degree of matching of the sentence with the query pattern. The matching score is represented by a function having an index value with which a dependency relation included in the query pattern is associated. The score is calculated by attempting association between a substructure of the query pattern and a range in the sentence and by performing recursive calculation in the substructure and the range while storing partial calculation result of the function in a memory area for reuse.

Publication date: 09-05-2013

Knowledge based parsing

Number: US20130117012A1
Assignee: Microsoft Corp

The subject disclosure generally relates to parsing unstructured data based on knowledge of domains related to the unstructured data. A domain identification component can identify a set of domains related to a term in a data set. An inspection component can identify unmatched words, and unmatched related domains. A correlation component can compare the unmatched words to known values for the unmatched domains, and a manager component can match the unmatched words with the unmatched domains based on the comparison. In addition, combinations of the words can be generated based on a set of predetermined rules, and compared to the unmatched domains. Furthermore, delimiter based parsing can be employed to augment the knowledge based parsing.

Publication date: 30-05-2013

Construction of text classifiers

Number: US20130138641A1
Assignee: Google LLC

Methods, systems, and apparatus, including computer program products, for constructing text classifiers. The method includes receiving a collection of candidate phrases for a given topic; filtering the received candidate phrases to remove erroneously included candidate phrases; assigning weights to the candidate phrases including scoring each candidate phrase using an initial classifier and assigning weights to the candidate phrases based on the scores; and generating a linear classifier using the filtered and weighted candidate phrases, where the linear classifier varies the weights for each phrase candidate depending on the length of the document being classified.
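
The idea of a linear classifier whose phrase weights depend on document length can be sketched as below. The scaling rule (full weight up to `reference_length` words, then proportionally smaller), the threshold, and the example weights are all assumptions made for illustration; the abstract does not say how the weights vary.

```python
def classify(document, phrase_weights, threshold=1.0, reference_length=100):
    """Score a document with phrase weights damped for long documents."""
    text = document.lower()
    n_words = max(len(text.split()), 1)
    # Assumed scaling: full weight up to reference_length words, then
    # weights shrink in proportion so long documents need more evidence.
    scale = min(1.0, reference_length / n_words)
    score = sum(w * scale for phrase, w in phrase_weights.items() if phrase in text)
    return score >= threshold, score

# Hypothetical weighted phrases for a "machine learning" topic.
weights = {"neural network": 0.8, "gradient descent": 0.6}
short_doc = "We train a neural network with gradient descent."
long_doc = short_doc + " filler" * 292  # 300 words in total
```

With these values the short document is accepted, while the same two phrases buried in a 300-word document fall below the threshold.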

Publication date: 27-06-2013

Method and apparatus for rating documents and authors

Number: US20130166282A1
Author: Peter Ridge, Tim Musgrove
Assignee: Federated Media Publishing LLC

Methods and apparatus for determining a competence rating of an author relating to one or more topics are disclosed. An exemplary method comprises determining semantic information associated with one or more documents related to the one or more topics, determining amplification information associated with the one or more documents, determining occurrence information associated with the author, and determining a competence rating for the author based at least in part on the semantic information associated with the one or more documents, the amplification information associated with the one or more documents, and the occurrence information associated with the author. A document rating for at least one of the one or more documents may also be determined based at least in part on the one or more weighted semantic features and the amplification information.

Publication date: 18-07-2013

Format for displaying text analytics results

Number: US20130185058A1
Assignee: DW ASSOC LLC

A system can receive text. The text can be divided into various portions. One or more significance indicators can be associated with each portion of text: these significance indicators can also be received by the system. The system can then display a portion of text and the associated significance indicators to the user.

Publication date: 15-08-2013

Speech recognition apparatus, speech recognition method, and computer-readable recording medium

Number: US20130211822A1
Author: Atsunori SAKAl
Assignee: NEC Corp

A speech recognition apparatus 20 includes: an identification language model creation unit 21 that selects, from learning texts 27 for various fields for generating language models 26 for the fields, a phrase that includes a word whose appearance frequency satisfies a set condition on a field-by-field basis, and generates an identification language model 25 for identifying the field of speech using the selected phrases; a speech recognition unit 22 that executes speech recognition on the speech using the identification language model 25, and outputs text data and word confidences as a recognition result; and a field determination unit 23 that specifies a field that includes the most words whose confidences are greater than or equal to a set value based on the text data, the word confidences, and the words in the learning texts for the fields, and determines that the specified field is the field of the speech.
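
The field determination step lends itself to a short sketch: keep the recognized words whose confidence meets a set value, count how many of them occur in each field's learning text, and pick the field with the most hits. The function name, the data layout, and the 0.8 cut-off are assumptions for illustration only.

```python
from collections import Counter

def determine_field(recognized, field_vocab, min_confidence=0.8):
    """Pick the field whose learning-text vocabulary covers the most
    confidently recognized words; return None if nothing matches."""
    confident = {w for w, conf in recognized if conf >= min_confidence}
    hits = Counter({field: len(confident & vocab) for field, vocab in field_vocab.items()})
    field, count = hits.most_common(1)[0]
    return field if count else None

# Hypothetical recognition result: (word, confidence) pairs.
recognized = [("dosage", 0.93), ("tablet", 0.88), ("contract", 0.41), ("patient", 0.90)]
# Hypothetical per-field vocabularies taken from the learning texts.
field_vocab = {
    "medical": {"dosage", "tablet", "patient"},
    "legal": {"contract", "clause"},
}
field = determine_field(recognized, field_vocab)
```

Here the low-confidence word "contract" is ignored, so the legal field gets no votes.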

Publication date: 15-08-2013

Automated interpretation of clinical encounters with cultural cues

Number: US20130211834A1
Author: Daniel T. Heinze
Assignee: A Life Medical LLC

A method, system and a computer program product for an automated interpretation and/or translation are disclosed. An automated interpretation and/or translation occurs by receiving language-based content from a user. The received language-based content is processed to interpret and/or translate the received language-based content into a target language. Also, a presence of a cultural sensitivity in the received language-based content is detected. Further, an appropriate guidance for dealing with the detected cultural sensitivity is provided.

Publication date: 22-08-2013

Multi-Concept Latent Semantic Analysis Queries

Number: US20130218554A1
Author: Paul A. Jakubik
Assignee: Individual

A method includes accessing text, identifying a plurality of terms from the text, determining a plurality of term vectors associated with the identified plurality of terms, and clustering the determined plurality of term vectors into a plurality of clusters, the plurality of clusters comprising a first and a second cluster, the first and second clusters each comprising two or more of the determined term vectors. The method further includes creating a first pseudo-document according to the first cluster, creating a second pseudo-document according to the second cluster, identifying a first set of terms associated with the first cluster using latent semantic analysis (LSA) of the first pseudo-document, identifying a second set of terms associated with the second cluster using LSA of the second pseudo-document, and combining the first and second sets of terms into a list of output terms.

Publication date: 05-09-2013

Disambiguating system and method

Number: US20130231919A1
Author: Xin-Hua Li, Yu-Kai Xiong

A disambiguating method includes providing a storage unit storing a first database and a second database. The first database includes a dictionary of ambiguous language data, the second database includes a collection of disambiguating algorithms, and each piece of ambiguous language data in the dictionary is associated with at least one of the disambiguating algorithms. A sentence input is received from the application system via the interface, and it is determined whether the sentence comprises a piece of ambiguous language data that is defined in the dictionary. The recognized piece of ambiguous language data in the sentence is disambiguated using the at least one associated disambiguating algorithm, and results of disambiguating are generated. An interpretation is selected from the results and output to the application system via the interface. A disambiguating system is also provided.

Publication date: 03-10-2013

Handheld electronic device including indication of a selected data source, and associated method

Number: US20130262094A1
Assignee: BlackBerry Ltd

A method of enabling input into a handheld electronic device having stored therein a number of language objects includes detecting a selection of a language, making a determination that the language is a default language or a non-default language, detecting as an ambiguous input an actuation of one or more input members, outputting at least a portion of a number of the language objects that corresponds to the ambiguous input, and outputting an indication representative of the language.

Publication date: 10-10-2013

Apparatus for automatic theme detection from unstructured data

Number: US20130268534A1
Assignee: Clarabridge Inc

This apparatus provides a system and method of determining significant repeating themes in a collection of documents. The apparatus operates unsupervised and leverages a natural language processing mechanism supported with lexicon, synonym and taxonomy dictionaries to determine themes and establish their relevance using a two-level hierarchical structure. The apparatus also assigns meaningful names to identified themes and determines a set of rules that describe the theme such that it can be applied to categorize other documents outside of the collection as well.

Publication date: 31-10-2013

Automated self-service user support based on ontology analysis

Number: US20130290366A1
Assignee: International Business Machines Corp

A system for implementing a method that provides information to a user in response to a received user query. A natural language analysis generates substrings relevant to the user query. An ontology analysis outputs: terms of an ontology matching the relevant generated substrings; and relationships between the terms. A query analysis analyzes the user query regarding the outputted terms and relationships, including ascertaining whether the user query is more suitable for service than for an information search. If it is so ascertained, then service actions for the user to perform are identified to the user. If it is not so ascertained, then: the user query is refined based on the outputted terms and relationships; a search query is generated based on the refined user query, a search is initiated based on the search query, and results of the search are provided to the user.

Publication date: 21-11-2013

System for advanced security management

Number: US20130307682A1
Assignee: Honeywell International Inc

A system receives input from a plurality of sensors in a security management system. The input relates to two or more events. The input is stored in a database. A correlation between the two or more events is determined. A priority is dynamically assigned to the two or more events, and the correlation, the priority, and information relating to the two or more events are reported to a system user.

Publication date: 28-11-2013

Entity variant generation and normalization

Number: US20130317807A1
Assignee: International Business Machines Corp

Determining variants of a text entity comprises parsing the text entity into semantic components and generating variants for each of the semantic components. The entity is recomposed in different morphological forms from the different variants of the semantic components.
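
The parse, expand, and recompose steps can be sketched with `itertools.product`. Splitting on whitespace stands in for the semantic parsing the abstract describes, and the variant table is hypothetical.

```python
from itertools import product

# Hypothetical per-component variant tables.
COMPONENT_VARIANTS = {
    "international": ["International", "Intl", "Int'l"],
    "business": ["Business", "Bus."],
    "machines": ["Machines", "Mach."],
    "corporation": ["Corporation", "Corp", "Corp."],
}

def generate_variants(entity):
    """Split an entity into components, expand each, and recompose."""
    components = entity.lower().split()
    options = [COMPONENT_VARIANTS.get(c, [c.capitalize()]) for c in components]
    return [" ".join(combo) for combo in product(*options)]

def normalize(variant, known_entities):
    """Map a surface variant back to the canonical entity, if any."""
    for entity in known_entities:
        if variant in generate_variants(entity):
            return entity
    return None

canonical = "International Business Machines Corporation"
variants = generate_variants(canonical)
```

Normalization here simply regenerates the variants of each known entity and checks membership, which is fine for a small list but would need an index for a large one.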

Publication date: 26-12-2013

Real-time message sentiment awareness

Number: US20130346067A1
Author: Dhruv A. Bhatt
Assignee: International Business Machines Corp

Provided are techniques for determining a sentiment of an electronic message. The electronic message is parsed to identify one or more sub-constructs. For at least one of the sub-constructs that is not false-positive, a sentiment indicator is assigned from a set of types of sentiment indicators, and a score is assigned for the sentiment indicator. A final score is obtained for at least one type of sentiment indicator in the electronic message by summing scores for that type of sentiment indicator. Based on the final score for the at least one type of sentiment indicator, a sentiment of the electronic message is identified.
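
The scoring scheme in this abstract (assign an indicator type and a score to each sub-construct that is not a false positive, sum per type, pick a sentiment from the totals) can be sketched as follows. The lexicon, the false-positive list, and the rule of treating sentence-like chunks as sub-constructs are illustrative assumptions.

```python
from collections import defaultdict

# Hypothetical lexicon: word -> (sentiment indicator type, score).
LEXICON = {
    "asap": ("urgency", 2),
    "immediately": ("urgency", 3),
    "unacceptable": ("anger", 3),
    "thanks": ("gratitude", 2),
}
# Hypothetical false positives: contexts where a lexicon hit should be ignored.
FALSE_POSITIVES = {"no thanks"}

def message_sentiment(message):
    """Sum scores per indicator type and return the top type with totals."""
    totals = defaultdict(int)
    # Treat each sentence-like chunk as a sub-construct.
    for chunk in message.lower().replace("!", ".").split("."):
        if any(fp in chunk for fp in FALSE_POSITIVES):
            continue  # skip sub-constructs flagged as false positives
        for word in chunk.split():
            token = word.strip(",;:")
            if token in LEXICON:
                kind, score = LEXICON[token]
                totals[kind] += score
    top = max(totals, key=totals.get) if totals else None
    return top, dict(totals)

result = message_sentiment("This is unacceptable. Fix it immediately, ASAP! No thanks.")
```

In this example "No thanks." contributes nothing to the gratitude total because the whole chunk is treated as a false positive.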

Publication date: 09-01-2014

System and method for topic extraction and opinion mining

Number: US20140012863A1
Assignee: eBay Inc

Techniques for topic extraction and opinion mining are described. For example, a document that is pertinent to a topic is selected based on searching, using a key phrase, a plurality of documents. A subtopic referenced in the document is identified. A feature of the subtopic is identified based on the document. A rating of the feature of the subtopic is determined based on the document. Using at least one processor, a sentiment of the document is determined based in part on the feature and the rating of the feature.

Publication date: 30-01-2014

Open information extraction

Number: US20140032209A1

A system for identifying relational tuples is provided. The system extracts a relation phrase from a sentence by identifying a verb in the sentence and then identifying a relation phrase of the sentence as a phrase in the sentence starting with the identified verb that satisfies both a syntactic constraint and a lexical constraint. The system also identifies arguments for a relation phrase. To extract the arguments, the system applies a left-argument-left-bound classifier, a left-argument-right-bound classifier, and a right-argument-right-bound classifier to identify a left argument and right argument for the relation phrase such that the left argument, the relation phrase, and the right argument form a relational tuple.
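
A rough sketch of the relation-phrase step, under several assumptions: the input is already tagged with one-letter coarse tags, the syntactic constraint is a regular expression over those tags (verbs, optionally followed by other words and ending in a preposition), and the lexical constraint is membership in a set of known relation phrases. The argument step uses the nearest noun on each side as a crude stand-in for the three boundary classifiers the abstract describes.

```python
import re

# One-letter coarse tags so regex offsets line up with token positions:
# N noun, V verb, P preposition/particle, W any other word.
SYNTACTIC = re.compile(r"V+([NW]*P)?")

def extract_tuple(tagged, relation_lexicon):
    """Return (left argument, relation phrase, right argument) or None."""
    words = [w for w, _ in tagged]
    tags = "".join(t for _, t in tagged)
    for m in SYNTACTIC.finditer(tags):
        phrase = " ".join(words[m.start():m.end()]).lower()
        if phrase not in relation_lexicon:  # lexical constraint
            continue
        # Nearest noun to the left and to the right of the relation phrase.
        left = next((words[i] for i in range(m.start() - 1, -1, -1) if tags[i] == "N"), None)
        right = next((words[i] for i in range(m.end(), len(words)) if tags[i] == "N"), None)
        if left and right:
            return left, phrase, right
    return None

# Hypothetical pre-tagged sentence.
sentence = [("Edison", "N"), ("was", "V"), ("born", "V"), ("in", "P"), ("Ohio", "N")]
relation = extract_tuple(sentence, {"was born in"})
```

Encoding tags as single characters keeps regex match offsets identical to token indices, which avoids a separate alignment step.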

Publication date: 06-02-2014

Real-time and adaptive data mining

Number: US20140040301A1
Assignee: Rule 14

A method of analyzing data is presented. The method includes generating a query based on a topic of interest, expanding search terms of the query, executing the query on one or more data sources, monitoring a specific data source selected from the one or more data sources. The monitoring is performed to monitor for matches to the query.

Publication date: 06-03-2014

Methods and systems for acquiring user related information using natural language processing techniques

Number: US20140067369A1
Assignee: Xerox Corp

Systems and methods for acquiring information associated with a user by using NLP techniques are disclosed. One or more phrases are classified in one or more categories at least partly on the basis of a period for which a product has been used by the user, the user's experience with the product, preferences of the user, or needs of the user by applying one or more natural language processing (NLP) techniques. The one or more phrases are extractable from an electronic publication at least partly on the basis of a predefined set of verbs, a predefined set of domain-specific terms, and terms indicative of temporal information. One or more terms from the classified phrases are extracted, in which the one or more terms are indicative of the information about the user.

Publication date: 06-03-2014

Establishing "is a" relationships for a taxonomy

Number: US20140067832A1
Assignee: Wal Mart Stores Inc

Disclosed are methods for returning to a user an answer to the question “what is <string>.” Concepts and classes to which the concepts belong are determined from a corpus, such as a taxonomy. The concepts are mapped to categories according to the structure of the taxonomy. Homonyms for words are collected and scored according to likelihood of use. Concept vectors are assembled for the identified concepts based on articles in the corpus and social media usage. Words are evaluated for generic-ness and a generic score is associated therewith. In responding to a query, the generic-ness of the terms of the query is evaluated and additional context solicited if the terms are generic. Candidate homonym concepts for a string in the query are selected according to context vectors for the homonym concepts. One or more homonym concepts are selected and the one or more categories corresponding to these concepts are returned.

Publication date: 20-03-2014

Natural Language Vocabulary Generation and Usage

Number: US20140081626A1
Assignee: Adobe Systems Inc

Natural language vocabulary generation and usage techniques are described. In one or more implementations, one or more search results are mined for a domain to determine a frequency at which words occur in the one or more search results, respectively. A set of the words is selected based on the determined frequency. A sense is assigned to each of the selected set of the words that identifies a part-of-speech for a respective word. A vocabulary is then generated that includes the selected set of the words and a respective said sense, the vocabulary configured for use in natural language processing associated with the domain.
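
The mining and selection steps reduce to a frequency count followed by a lookup. In the sketch below the part-of-speech table is a hypothetical stand-in for a real tagger, and `min_count` is an assumed selection rule.

```python
from collections import Counter

# Hypothetical part-of-speech lookup; a real system would use a tagger.
POS = {"crop": "verb", "resize": "verb", "image": "noun", "layer": "noun"}

def build_vocabulary(search_results, min_count=2):
    """Keep frequent words and attach a part-of-speech sense to each."""
    counts = Counter(word for text in search_results for word in text.lower().split())
    return {w: POS[w] for w, n in counts.items() if n >= min_count and w in POS}

# Hypothetical search results for an image-editing domain.
results = ["crop the image", "resize the image layer", "crop a layer"]
vocabulary = build_vocabulary(results)
```

Words that are frequent but have no sense in the lookup, such as "the", are left out of the vocabulary.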

Publication date: 03-04-2014

Emotion identification system and method

Number: US20140095148A1
Assignee: Kanjoya Inc

A system and method for identifying emotion in text that connotes authentic human expression, and training an engine that produces emotional analysis at various levels of granularity and numerical distribution across a set of emotions at each level of granularity. The method may include determining similarity between textual data and an emotion, and classifying emotions as similar emotions.

Publication date: 03-04-2014

Establishing "is a" relationships for a taxonomy

Number: US20140095411A1
Assignee: Wal Mart Stores Inc

Disclosed are methods for returning to a user an answer to the question “what is <string>.” Concepts and classes to which the concepts belong are determined from a corpus, such as a taxonomy. The concepts are mapped to categories according to the structure of the taxonomy. Homonyms for words are collected and scored according to likelihood of use. Concept vectors are assembled for the identified concepts based on articles in the corpus and social media usage. Words are evaluated for generic-ness and a generic score is associated therewith. In responding to a query, the generic-ness of the terms of the query is evaluated and additional context solicited if the terms are generic. Candidate homonym concepts for a string in the query are selected according to context vectors for the homonym concepts. One or more homonym concepts are selected and the one or more categories corresponding to these concepts are returned.

Publication date: 06-01-2022

METHOD, APPARATUS, SYSTEM, DEVICE, AND STORAGE MEDIUM FOR ANSWERING KNOWLEDGE QUESTIONS

Number: US20220004547A1
Assignee:

Embodiments of the present specification disclose a method, an apparatus, a system, a device, and a storage medium for answering user questions, including: obtaining a user question; encoding the user question and a schema level of pre-constructed structured data to obtain a first feature vector, wherein the structured data further comprises a data level, wherein the data level comprises knowledge for answering questions structured according to the schema level; retrieving one or more candidate sub-graphs related to the user question from the structured data; encoding the one or more candidate sub-graphs to obtain a second feature vector; performing multi-task classification for the user question based on the first feature vector and the second feature vector; and obtaining answer content for the user question based on a result of the multi-task classification.

1. A method for answering user questions, comprising: obtaining a user question; encoding the user question and a schema level of pre-constructed structured data to obtain a first feature vector, wherein the structured data further comprises a data level, wherein the data level comprises knowledge for answering questions structured according to the schema level; retrieving one or more candidate sub-graphs related to the user question from the structured data; encoding the one or more candidate sub-graphs to obtain a second feature vector; performing multi-task classification for the user question based on the first feature vector and the second feature vector; and obtaining answer content for the user question based on a result of the multi-task classification.

2. The method of claim 1, wherein the encoding the user question and a schema level of pre-constructed structured data to obtain a first feature vector comprises: constructing a standard input text based on the user question and the schema level; and encoding the standard input text using a self-encoding language model to obtain the first feature vector, ...

More details
Publication date: 06-01-2022

Information processing apparatus, information processing method, and non-transitory storage medium

Number: US20220005365A1
Assignee: Toyota Motor Corp

There is provided a technology that can improve the convenience of users who are reading books. An information processing apparatus has a controller that determines a focus sentence in a subject book that a user is reading. The focus sentence is a sentence on which the user is focusing attention. The controller analyzes what the focus sentence describes and obtains related information relating to what the focus sentence describes. Moreover, the controller executes the processing for causing a user's terminal that the user is using to display the related information.

More details
Publication date: 07-01-2016

Method of providing relevant information and electronic device adapted to the same

Number: US20160004784A1
Assignee: SAMSUNG ELECTRONICS CO LTD

Various embodiments of the present disclosure provide a method of providing relevant information and an electronic device adapted to the method. The method includes: displaying first information; extracting one or more retrieval words, in the form of keywords or phrases, from the first information; obtaining second information as relevant information related to the first information by using the one or more extracted retrieval words; emphasizing objects corresponding to the one or more retrieval words in the first information; and displaying, upon detecting the selection of at least one of the emphasized objects, the second information including the selected objects.

More details
Publication date: 07-01-2021

FACILITATING QUERIES OF ENCRYPTED SENSITIVE DATA VIA ENCRYPTED VARIANT DATA OBJECTS

Number: US20210004373A1
Assignee:

Various aspects of this disclosure provide digital data processing systems for using encrypted variant data objects to facilitate queries of sensitive data. In one example, a digital data processing system can receive sensitive data about an entity. The digital data processing system can create, in an identity data repository and from the sensitive data, a searchable secure entity data object for the entity. The searchable secure entity data object is usable for servicing a query regarding the entity. For instance, a transformed query parameter can be generated from a query parameter in the query. The query can be serviced by matching the transformed query parameter to tokenized variant data in the searchable secure entity data object and retrieving tokenized sensitive data from the searchable secure entity data object.
1. A secure data processing system comprising: a processing device; an identity data repository; a non-transitory computer-readable memory coupled to the processing device and storing instructions, receiving sensitive data about an entity; creating, in the identity data repository and from the sensitive data, a searchable secure entity data object for the entity, wherein creating the searchable secure entity data object comprises: generating variant data having a modified version of the sensitive data, tokenizing the sensitive data and the variant data, associating a common entity identifier with the tokenized sensitive data and the tokenized variant data, and storing the tokenized sensitive data, the tokenized variant data, and the common entity identifier in the searchable secure entity data object; receiving a query regarding the entity; generating a transformed query parameter from a query parameter in the query; and servicing the query by matching the transformed query parameter to the tokenized variant data in the searchable secure entity data object and retrieving the tokenized sensitive data from the searchable secure entity ...
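
The store-then-query flow described in this abstract can be sketched roughly as follows. This is a minimal illustration only: the SHA-256 `tokenize` function, the `make_variant` normalization, and the `IdentityRepository` class are all hypothetical stand-ins, not the tokenization scheme or data model of the patent.

```python
import hashlib


def tokenize(value):
    # Assumption: a plain SHA-256 digest stands in for tokenization.
    # A production system would more likely use a keyed or vaulted scheme.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def make_variant(value):
    # Hypothetical "variant": lowercase, with non-alphanumeric characters removed.
    return "".join(ch for ch in value.lower() if ch.isalnum())


class IdentityRepository:
    """Toy identity data repository holding searchable secure entity data objects."""

    def __init__(self):
        self._objects = {}  # entity identifier -> data object

    def store(self, entity_id, sensitive):
        # Tokenize both the sensitive data and its variant, and keep them
        # together under a common entity identifier.
        self._objects[entity_id] = {
            "entity_id": entity_id,
            "tokenized_sensitive": tokenize(sensitive),
            "tokenized_variant": tokenize(make_variant(sensitive)),
        }

    def query(self, parameter):
        # Transform the query parameter the same way as the variant data,
        # then match on tokens only; plaintext is never compared.
        transformed = tokenize(make_variant(parameter))
        return [
            obj["tokenized_sensitive"]
            for obj in self._objects.values()
            if obj["tokenized_variant"] == transformed
        ]
```

In this sketch a query for `"obrien pat"` would match an entity stored as `"O'Brien, Pat"`, because both reduce to the same variant before tokenization.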

More details
Publication date: 07-01-2021

Toxic vector mapping across languages

Number: US20210004440A1
Assignee: Spectrum Labs Inc, Superset Partners Inc

Methods, systems, and devices for language mapping are described. Some machine learning models may be trained to support multiple languages. However, word embedding alignments may be too general to accurately capture the meaning of certain words when mapping different languages into a single reference vector space. To improve the accuracy of vector mapping, a system may implement a supervised learning layer to refine the cross-lingual alignment of particular vectors corresponding to a vocabulary of interest (e.g., toxic language). This supervised learning layer may be trained using a dictionary of toxic words or phrases across the different supported languages in order to learn how to weight an initial vector alignment to more accurately map the meanings behind insults, threats, or other toxic words or phrases between languages. The vector output from this weighted mapping can be sent to supervised models, trained on the reference vector space, to determine toxicity scores.

More details
Publication date: 01-01-2015

Electronically based thesaurus leveraging context sensitivity

Number: US20150006149A1
Assignee: International Business Machines Corp

Arrangements described herein relate to language enhancement. Source text can be automatically gathered from a plurality of text sources, the plurality of text sources including at least one social media website, and stored in a thesaurus data infrastructure. Subject text being exposed to thesaurus processing can be received, a context of the subject text can be identified, and the thesaurus data infrastructure can be accessed to identify source text having context similar to the context of the subject text. The identified source text can be analyzed to identify at least one candidate word or phrase contained in the source text to recommend as a replacement for at least one word or phrase contained in the subject text. The identified at least one candidate word or phrase can be recommended as the replacement for the at least one word or phrase contained in the subject text.

More details
Publication date: 01-01-2015

Business intelligence data models with concept identification using language-specific clues

Number: US20150006160A1
Assignee: International Business Machines Corp

Techniques are described for modeling information from a data source. In one example, a method for modeling information from a data source includes comparing, with one or more computing devices, a data item heading from the data source with concept keywords in a concept library, the concept library comprising a plurality of concepts and one or more of the concept keywords in at least one language associated with each of one or more of the concepts. The method further includes identifying, with one or more computing devices, one or more matches between the data item heading and one or more concept keywords associated with a particular concept from among the concepts comprised in the concept library. The method further includes identifying, with one or more computing devices, the data item heading as being associated with the particular concept.
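
The heading-to-concept matching summarized above might look something like the sketch below. The `CONCEPT_LIBRARY` contents and the `match_concept` function are hypothetical, and the whole-word matching rule is an assumption; the abstract does not say how headings and keywords are compared.

```python
import re

# Hypothetical concept library: concept name -> single-word keywords in several languages.
CONCEPT_LIBRARY = {
    "revenue": {"revenue", "sales", "umsatz"},
    "date": {"date", "datum", "fecha"},
}


def match_concept(heading, library=CONCEPT_LIBRARY):
    """Return the first concept with a keyword that appears as a whole word
    in the data item heading, or None when nothing matches."""
    # Split the heading into lowercase alphabetic words ("Total_Sales_2014" -> total, sales).
    words = set(re.findall(r"[a-z]+", heading.lower()))
    for concept, keywords in library.items():
        if words & keywords:
            return concept
    return None
```

Matching on whole words rather than substrings avoids false hits such as "update" matching the keyword "date".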

More details
Publication date: 02-01-2020

Computer Implemented Method for Extracting and Reasoning with Meaning from Text

Number: US20200004832A1
Assignee: Babylon Partners Ltd

A text processing method for improving the accuracy of a response to a query directed to a system comprising concepts and relations defined by a knowledge base, wherein the method comprises: (i) producing a dependency tree from the query, wherein the dependency tree has at least one branch containing nodes and at least one connection between those nodes, wherein each node has a node label which corresponds to a term within the query, and wherein each connection has a label which corresponds to the linguistic relationship between terms within the query; (ii) from the dependency tree, generating a query concept using concepts and relations defined by the knowledge base; (iii) checking if the query concept has a subsumption relationship with a candidate concept retrieved from the system, and if no subsumption relationship is initially identified, optimising the dependency tree by changing the nodes, followed by repeating steps (ii) and (iii); and wherein the query concept and the candidate concept comprise at least one atomic concept.

More details
Publication date: 07-01-2021

AUTOMATIC COMPLIANCE TOOLS

Number: US20210004535A1
Assignee:

A facility for representing a mandate occurring in an authority document with a control is described. For each of one or more controls in a set of existing controls, the facility determines a similarity score measuring the similarity of the mandate and the control; where the similarity score exceeds a similarity threshold, the facility links the mandate to the control. Where the mandate is not linked to any control in the set of controls, the facility adds a control to the set of controls that is based on the mandate, and links the mandate to the added control.
1. A computer-readable medium storing instructions that, when executed by a computing system, cause the computing system to perform a method, the method comprising: accessing at least one source authority document comprising mandates; determining a group of controls representing the mandates of the at least one source authority document; and generating a substitute authority document containing one copy of each control of the determined group of controls.
2. The computer-readable medium of claim 1, wherein determining the group of controls comprises constructing at least one of the controls, in the group of controls, based on one or more of the mandates of the at least one source authority document.
3. The computer-readable medium of claim 1, wherein the method further comprises: accessing a set of contextual authority documents; and determining controls representing the mandates from the contextual authority documents; wherein at least two controls in the group of controls are controls from the contextual authority documents.
4. The computer-readable medium of claim 1, wherein at least one control in the group of controls represents multiple of the mandates; and wherein each of the multiple mandates represented by a corresponding control is represented by that control based on a determination that each of those multiple mandates can be satisfied by an action that satisfies the control.
5. The ...
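
The threshold-based linking in the abstract can be illustrated with a short sketch. The Jaccard word-overlap score, the `0.5` threshold, and the function names are assumptions chosen for illustration; the abstract does not specify how the similarity score is computed.

```python
def jaccard(a, b):
    # Assumed similarity measure: word-set overlap, standing in for
    # whatever score the facility actually computes.
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def link_mandates(mandates, controls, threshold=0.5):
    """Link each mandate to every sufficiently similar control; when none
    qualifies, add a new control based on the mandate and link to that."""
    controls = list(controls)  # work on a copy so the caller's list is untouched
    links = {}
    for mandate in mandates:
        matched = [c for c in controls if jaccard(mandate, c) > threshold]
        if not matched:
            controls.append(mandate)  # the new control is derived from the mandate
            matched = [mandate]
        links[mandate] = matched
    return links, controls
```

Here the new control is simply the mandate text itself; a real facility would presumably normalize or rewrite it first.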

More details
Publication date: 07-01-2021

SYSTEM AND METHOD FOR PERFORMING A MEANING SEARCH USING A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK

Number: US20210004537A1
Assignee:

The present disclosure is directed to an agent automation framework that is capable of extracting meaning from user utterances and suitably responding using a search-based natural language understanding (NLU) framework. The NLU framework includes a meaning extraction subsystem capable of detecting multiple alternative meaning representations for a given natural language utterance. Furthermore, the NLU framework includes a meaning search subsystem that enables elastic confidence thresholds (e.g., elastic beam-width meaning searches), forced diversity, and cognitive construction grammar (CCG)-based predictive scoring functions to provide an efficient and effective meaning search. As such, the disclosed meaning extraction subsystem and meaning search subsystem improve the performance, the domain specificity, the inference quality, and/or the efficiency of the NLU framework.
1. An agent automation system, comprising: a memory configured to store a natural language understanding (NLU) framework and a search space, wherein the NLU framework includes a meaning extraction subsystem and a meaning search subsystem, and wherein the search space includes sample meaning representations; and generating, via the meaning extraction subsystem, an utterance meaning model from a user utterance, wherein the utterance meaning model includes a set of search key meaning representations; performing, via the meaning search subsystem, an elastic beam-width cognitive construction grammar (CCG) form-based search to compare the set of search key meaning representations of the utterance meaning model to the sample meaning representations of the search space and select a set of contender matching meaning representations; performing, via the meaning search subsystem, an elastic beam-width intra-artifact search to compare the set of contender matching meaning representations to sample meaning representations of the search space and select a final set of matching meaning representations; and ...

More details
Publication date: 07-01-2021

METHOD FOR PROVIDING RICH-EXPRESSION NATURAL LANGUAGE CONVERSATION BY MODIFYING REPLY, COMPUTER DEVICE AND COMPUTER-READABLE RECORDING MEDIUM

Number: US20210004538A1
Assignee:

A method for providing a natural language conversation is implemented by an interactive agent system. The method for providing a natural language conversation according to an embodiment of the present invention includes receiving a natural language input, determining a user intent based on the natural language input by processing the natural language input, and providing a natural language response corresponding to the natural language input, based on at least one of the natural language input and the determined user intent. The natural language response may be changed according to the speaking characteristics of the user before it is provided.
1. A method for providing a natural language conversation, which is implemented by an interactive agent system, the method comprising: receiving a natural language input; determining a user intent based on the natural language input by processing the natural language input; and providing a natural language response corresponding to the natural language input, based on at least one of the natural language input and the determined user intent, wherein the providing the natural language response comprises changing and providing the natural language response according to speaking characteristics of a user.
2. The method of claim 1, wherein the changing and providing the natural language response according to the speaking characteristics of the user comprises analyzing the natural language response and changing the natural language response based on a predetermined response conversion database associated with the natural language response.
3. The method of claim 2, wherein the response conversion database includes at least one of a user database that stores user-specific characteristic data and a vocabulary database, wherein the user-specific characteristic data may include information on at least one of a record of user's previous conversation, pronunciation feature information, ...

More details
Publication date: 13-01-2022

DETERMINING SEMANTIC SIMILARITY OF TEXTS BASED ON SUB-SECTIONS THEREOF

Number: US20220012431A1
Assignee:

Systems and methods are provided to compare a target sample of text to a set of textual records, each textual record including a sample of text and an indication of one or more segments of text within the sample of text. Semantic similarity values between the target sample of text and each of the textual records are determined. Determining a particular semantic similarity value between the target sample of text and a particular textual record of the corpus includes: (i) determining individual semantic similarity values between the target sample of text and each of the segments of text indicated by the particular textual record, and (ii) generating the particular semantic similarity value between the target sample of text and the particular textual record based on the individual semantic similarity values. A textual record is then selected based on the semantic similarities.
1. A system comprising: a processor; and accessing a corpus comprising a plurality of textual records; generating, via a machine learning model, indications of one or more respective segments of text within each of the textual records in the corpus; obtaining, from a client device, a target sample of text; generating respective record semantic similarity values between the target sample of text and each of the textual records in the corpus, comprising, for each of the textual records in the corpus: determining one or more respective segment semantic similarity values between the target sample of text and the one or more segments of text within the textual record; and generating the respective record semantic similarity value between the target sample of text and the textual record based on the one or more respective segment semantic similarity values; selecting from the corpus, based on the generated record semantic similarity values, a particular textual record having the highest respective record semantic similarity value for the target sample of text; and providing, to the ...
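
A rough sketch of the two-level scoring described here follows. Two things are assumptions: the word-overlap `similarity` function stands in for a real semantic model, and the record score is taken as the maximum of the segment scores, whereas the abstract only says the record value is "based on" the individual values.

```python
def similarity(a, b):
    # Assumed stand-in for a semantic similarity model: Jaccard word overlap.
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def record_score(target, segments):
    # Step (i): score the target against every segment of the record.
    # Step (ii): aggregate; taking the maximum is an illustrative choice.
    return max((similarity(target, segment) for segment in segments), default=0.0)


def best_record(target, records):
    """Return the record with the highest aggregated similarity to the target."""
    return max(records, key=lambda record: record_score(target, record["segments"]))
```

Scoring per segment lets one highly relevant passage surface a record even when the rest of its text is unrelated to the target.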

More details
Publication date: 07-01-2021

HIERARCHICAL SELF-ATTENTION FOR MACHINE COMPREHENSION

Number: US20210005195A1
Assignee:

A method for determining the answer to a query in a document, including: encoding, by an encoder, the query and the document; generating query-aware context encodings G by a bidirectional attention system using the encoded query and the encoded document; performing a hierarchical self-attention on the query-aware document by a hierarchical self-attention system by applying a word-to-word attention and a word-to-sentence attention mechanism, resulting in a matrix M; and determining the starting word and the ending word of the answer in the document by a span detector based upon the matrix M.
1. A method for determining the answer to a query in a document, comprising: encoding, by an encoder, the query and the document; generating a query-aware context encodings G by a bidirectional attention system using the encoded query and the encoded document; performing a hierarchical self-attention on the query aware document by a hierarchical self-attention system by applying a word to word attention and a word to sentence attention mechanism resulting in a matrix M; and determining the starting word and the ending word of the answer in the document by a span detector based upon the matrix M.
2. The method of claim 1, wherein performing a hierarchical self-attention on the query aware document further includes: applying a bidirectional recurrent neural network (BiRNN) on the query-aware context encoding G to produce a matrix G′; extracting sentence-level encodings S′ from G′; producing a word-word self-attention matrix A_w by comparing each word in G′ with each other word in G′; and producing a word-sentence self-attention matrix A_s by comparing each word in G′ to each sentence in the extracted sentence-level encodings S′, wherein the matrix M is based upon A_w and A_s.
3. The method of claim 2, wherein producing a word-word self-attention matrix A_w further includes using a trilinear function to compute similarity scores for each word-word comparison and normalizing the resulting ...

More details
Publication date: 07-01-2021

Systems and methods for providing dynamic and interactive content in a chat session

Number: US20210006609A1
Assignee: PayPal Inc

Methods and systems are presented for providing dynamic and interactive content in an online chat session. When it is determined that an online chat session is established between two devices, an interactive chat engine may begin monitoring chat messages exchanged between the devices to derive a context for the online chat session. A particular workflow from multiple workflows may be selected and initiated for the online chat session based on the derived context. The interactive chat engine may generate a chat object for presenting dynamic and interactive content to the user related to the workflow within the online chat session. The chat object may be inserted into the chat flow of the online chat session. Once inserted, the chat object may generate and subsequently modify presentations for the devices participating in the online chat session.

More details
Publication date: 20-01-2022

USING STORED EXECUTION PLANS FOR EFFICIENT EXECUTION OF NATURAL LANGUAGE QUESTIONS

Number: US20220019576A1
Author: Lai Kaycee
Assignee:

An analysis system connects to a set of data sources and executes natural language questions based on the data sources. The analysis system connects with the data sources and retrieves metadata describing data assets stored in each data source. The analysis system generates an execution plan for the natural language question. The analysis system finds data assets that match the received question based on the metadata. The analysis system ranks the data assets and presents the ranked data assets to users, allowing users to modify the execution plan. The analysis system may use execution plans of previously stored questions for executing new questions. The analysis system supports selective preprocessing of data to increase the data quality.
1. (canceled)
2. A computer-implemented method comprising: storing a plurality of natural language questions, each natural language question processing data obtained from one or more data sources; for each of the plurality of natural language questions: generating an execution plan comprising instructions for accessing data from the one or more data sources for answering the natural language question; storing the plurality of natural language questions and their execution plans; receiving a new natural language question; matching the new natural language question with one or more stored natural language questions; identifying a subset of stored natural language questions matching the new natural language question; sending information describing the subset of stored natural language questions; receiving a selection of a stored natural language question; and executing an execution plan for the new natural language question, the execution plan determined from the execution plan of the selected stored natural language question.
3. The computer-implemented method of claim 2, wherein identifying the subset of stored natural language questions matching the new natural language question comprises: ranking the stored natural language ...
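
The reuse of stored execution plans can be sketched as a small lookup structure. Everything named here is hypothetical: the `PlanStore` class, the word-overlap ranking, and the representation of a plan as a list of step strings are illustrative assumptions rather than details from the patent.

```python
def token_overlap(a, b):
    # Assumed matching score between two questions: Jaccard word overlap.
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


class PlanStore:
    """Toy store of natural language questions and their execution plans."""

    def __init__(self):
        self._plans = {}  # question text -> list of plan steps

    def save(self, question, plan):
        self._plans[question] = list(plan)

    def candidates(self, new_question, top_k=3):
        # Rank stored questions by overlap with the new one and drop non-matches;
        # the resulting subset is what would be shown to the user for selection.
        scored = [(token_overlap(new_question, q), q) for q in self._plans]
        scored = [(score, q) for score, q in scored if score > 0]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [q for _, q in scored[:top_k]]

    def plan_for(self, selected_question):
        # Return a copy so the caller can adapt it for the new question.
        return list(self._plans[selected_question])
```

A caller would present `candidates(...)` to the user, then start from `plan_for(...)` of the selected question instead of planning from scratch.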

More details
Publication date: 20-01-2022

SELECTIVE ANONYMIZATION OF DATA MAINTAINED BY THIRD-PARTY NETWORK SERVICES

Number: US20220019695A1
Assignee:

A network computer system to selectively anonymize data items of a collection of data structures, where the network computer system is implemented as a backend web service, intercept service, or combination thereof.
1. A data anonymization computer system comprising: one or more processors; memory resources storing (i) a set of instructions, and (ii) a pool of tokens, each token of the pool comprising a string of characters having a predetermined format; wherein the one or more processors execute the set of instructions to selectively anonymize individual data items of a plurality of data structures, by, for each data structure of the plurality of data structures: identifying at least a respective data item of the data structure that meets a set of conditions, including at least a first condition in which at least a portion of the respective data item has a format that coincides with the predetermined format; replacing a set of characters of the respective data item having the format with the string of characters of a respective token of the pool; and causing the respective data item to be stored with a third-party network service as part of the data structure, the respective data item having the string of characters of the respective token in place of the replaced set of characters.
This application is a continuation of U.S. patent application Ser. No. 16/221,261 filed on Dec. 14, 2018; the aforementioned priority application being hereby incorporated by reference in its entirety for all purposes.
Examples relate to data anonymization, and more specifically, to selective anonymization of data maintained by third-party network services.
Third-party network services, such as cloud, software as a service (“SaaS”), and web services, are used by enterprises and other customers to retain vast quantities of data of which some can often be subject to privacy or other confidentiality restrictions. For example, governmental regulations can specify that certain types of ...
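
The format-preserving replacement in claim 1 can be illustrated with a minimal sketch. The `NNN-NN-NNNN` pattern, the `Anonymizer` class, and the in-memory mapping are assumptions made for the example; the claim itself only requires a pool of tokens that share a predetermined format.

```python
import re

# Assumed predetermined format: three digits, two digits, four digits.
TOKEN_FORMAT = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


class Anonymizer:
    """Replaces format-matching substrings with tokens drawn from a pool."""

    def __init__(self, token_pool):
        self._pool = list(token_pool)  # unused tokens, all in the same format
        self._assigned = {}            # original value -> token already issued

    def _token_for(self, original):
        # Reuse the same token for a repeated value; otherwise take the next one.
        # An exhausted pool raises IndexError in this sketch.
        if original not in self._assigned:
            self._assigned[original] = self._pool.pop(0)
        return self._assigned[original]

    def anonymize(self, data_item):
        # Only the portion that matches the format is replaced; the rest of the
        # data item is passed through unchanged to the third-party service.
        return TOKEN_FORMAT.sub(lambda match: self._token_for(match.group(0)), data_item)
```

Because each token has the same format as the value it replaces, the third-party service can store and validate the field without seeing the original characters.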

More details
Publication date: 20-01-2022

ACTIVE INTERACTION METHOD, ELECTRONIC DEVICE AND READABLE STORAGE MEDIUM

Number: US20220019847A1
Author: He Jingzhou, WANG Fan, XUE Yang
Assignee:

An active interaction method, an electronic device and a readable storage medium, relating to the field of deep learning and image processing technologies, are disclosed. According to an embodiment, the active interaction method includes: acquiring a video shot in real time; extracting a visual target from each image frame of the video, and generating a first feature vector of each visual target; for each image frame of the video, fusing the first feature vector of each visual target and identification information of the image frame to which the visual target belongs to generate a second feature vector of each visual target; aggregating the second feature vectors with the same identification information respectively to generate a third feature vector corresponding to each image frame; and initiating active interaction in response to determining that the active interaction is to be performed according to the third feature vector of a preset image frame.
1. An active interaction method, comprising: acquiring a video shot in real time; extracting a visual target from each image frame of the video, and generating a first feature vector of each visual target; for each image frame of the video, fusing the first feature vector of each visual target and identification information of the image frame to which the visual target belongs, to generate a second feature vector of each visual target; aggregating the second feature vectors with the same identification information respectively to generate a third feature vector corresponding to each image frame; and initiating active interaction in response to determining that the active interaction is to be performed according to the third feature vector of a preset image frame.
2. The method according to claim 1, wherein extracting the visual target from each image frame of the video comprises: extracting a specific target from each image frame of the video as the visual target.
3. The method according to claim 1, wherein extracting the ...

More details
Publication date: 27-01-2022

DISCOVERY AND RECOMMENDATION OF ONLINE LEARNING RESOURCES

Number: US20220027562A1
Assignee:

A system models web content including learning resources available on a website, and makes suggestions of potentially useful learning resources when a user highlights text of interest within the website. In order to facilitate these suggestions, a neural network-based system is trained on learning resources and other content available on the website to create a common word embedding for learning resources and other website text. A student model may then be created to facilitate real time or near real time suggestions of relevant learning resources in response to selections of text from the website.
1. A method comprising: training a site-wide neural language model for a website, the site-wide neural language model including a domain-specific neural language model trained over content for the website, the content including web pages and a number of learning resources; generating an embedding for the learning resources; clustering the learning resources into a number of clusters within the embedding; tuning the site-wide neural language model for each cluster based on one or more learning resources associated with the cluster, thereby providing a group of cluster-specific models including a model for each of the clusters; feeding content of each of the learning resources to the site-wide neural language model and a corresponding one of the cluster-specific models, thereby providing a site-wide embedding and a cluster-specific embedding for each learning resource; concatenating the site-wide embedding and the cluster-specific embedding for each of the learning resources, thereby providing an embedded space for a transformer-based neural model configured to receive a selection of content from the website and provide a probability distribution that one or more of the learning resources are relevant to the selection of content; configuring a student model with fewer layers than the transformer-based neural model to output the probability distribution in response to the selection ...

More details
Publication date: 27-01-2022

DOCUMENT TEXT EXTRACTION TO FIELD-SPECIFIC COMPUTER EXECUTABLE OPERATIONS

Number: US20220027564A1
Assignee: INTUIT INC.

This disclosure describes converting computer-executable predicate-argument structures for a specific field to field-specific predicate-argument structures to improve execution. In some implementations, a method can be performed by one or more processors of a computing device, and can include receiving one or more predicate-argument structures (PASs) associated with taxation-specific text and converting the one or more PASs into one or more tax-specific predicate-argument structures (TPASs). Converting the one or more PASs to one or more TPASs may include one or more of: defining terms in a segment based on a definition of the term from a different segment or line description (including from a different document); reordering nodes, replacing nodes, or removing nodes of a segment (such as based on one or more single segment tree traversal rules); or combining multiple PASs for multiple segments of a single line description based on one or more multiple segment tree traversal rules.
1. A method of generating one or more computer-executable tax-specific predicate-argument structures (TPASs) for text from one or more tax-specific documents, the method performed by one or more processors of a computing device and comprising: receiving one or more computer-executable predicate-argument structures (PASs) generated from the text from the one or more tax-specific documents; and converting the one or more PASs to one or more TPASs.
2. The method of claim 1, further comprising deserializing the one or more PASs before converting the one or more PASs to one or more TPASs.
3. The method of claim 2, further comprising categorizing the content of the one or more deserialized PASs before converting the one or more deserialized PASs to one or more TPASs.
4. The method of claim 3, wherein categorizing the content of the one or more deserialized PASs includes defining an undefined term in the one or more deserialized PASs based on a defined reference for the undefined term in a label ...

More details
Publication date: 27-01-2022

BEHAVIOR-MODIFICATION MESSAGING WITH PANDEMIC-BIO-SURVEILLANCE MULTI PATHOGEN SYSTEMS

Number: US20220027565A1
Author: Klasson Eric
Assignee:

Provided is a process, including: obtaining, with a computer system, a next geolocation to be visited by a user or at which the user has arrived; obtaining, with the computer system, a geolocation-pathogen-risk score of the next geolocation; determining, with the computer system, a topic of message based on the geolocation-pathogen-risk score; determining, with the computer system, by a natural-language-text generation model, natural language text within the topic of the message; and causing, with the computer system, the natural language text of the message to be presented to the user.
1. A tangible, non-transitory, machine-readable medium storing instructions that, when executed by one or more processors, effectuate operations comprising: obtaining, with a computer system, a next geolocation to be visited by a user or at which the user has arrived; obtaining, with the computer system, a geolocation-pathogen-risk score of the next geolocation; determining, with the computer system, a topic of message based on the geolocation-pathogen-risk score; determining, with the computer system, by a natural-language-text generation model, natural language text within the topic of the message; and causing, with the computer system, the natural language text of the message to be presented to the user.
2. The medium of claim 1, wherein: the topic is a warning of risk of infection from a pathogen at the geolocation; and the natural-language-text is different from natural-language-text of a previous message with the same topic presented to the user.
3. The medium of claim 1, the operations further comprising: obtaining, from a mobile computing device of the user, feedback indicating whether the person complied with an instruction expressed by the natural language text; and adjusting parameters of the natural-language-text generation model based on the feedback to reduce a likelihood of the natural language text being presented again.
4. The medium of claim 1, the operations further ...

Publication date: 27-01-2022

METHOD AND APPARATUS FOR SUMMARIZATION OF DIALOGS

Number: US20220030110A1
Assignee:

A method for summarizing dialogs may include obtaining an agent text stream and a customer text stream, segmenting the agent text stream and customer text stream into sentences, and labeling sentences associated with the segmented agent text stream and the segmented customer text stream. The method may further include extracting sentences from the agent text stream and the customer text stream based upon frequencies of appearance of words and terms of interest, generating an agent summary paragraph based on the extracted sentences from the agent text stream, and generating a customer summary paragraph based on the extracted sentences from the customer text stream. The method may identify keywords associated with each of the agent summary paragraph and the customer summary paragraph. 1. A method comprising: obtaining an agent text stream and a customer text stream; segmenting the agent text stream and customer text stream into sentences; labeling sentences associated with the segmented agent text stream and the segmented customer text stream; extracting sentences from the agent text stream and the customer text stream based upon frequencies of appearance of words and terms of interest; generating an agent summary paragraph based on the extracted sentences from the agent text stream; generating a customer summary paragraph based on the extracted sentences from the customer text stream; and identifying keywords associated with the agent summary paragraph and the customer summary paragraph. 2. The method of claim 1, wherein obtaining an agent text stream and customer text stream comprises: receiving a transcript of a dialog between the agent and the customer; identifying a first text stream within the transcript associated with the agent; and identifying a second text stream within the transcript associated with the customer. 3. The method of claim 2, wherein segmenting the agent text stream and customer text stream into sentences comprises: assigning punctuation to the first text ...
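
The frequency-based extraction step described in this abstract can be illustrated with a minimal Python sketch. Everything below is an assumption made for illustration: the stop-word list, the regex-based sentence splitter, the double weight for terms of interest, and the sample dialog are hypothetical and are not taken from the patent.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the source does not specify one.
STOP_WORDS = {"the", "a", "an", "is", "to", "and", "of", "i", "my",
              "you", "it", "for", "on", "can", "your"}

def split_sentences(text):
    """Naive segmentation on '.', '!' and '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lower-case word tokens with stop words removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]

def summarize(stream, terms_of_interest=(), top_k=2):
    """Return (summary paragraph, keywords) for one speaker's text stream."""
    sentences = split_sentences(stream)
    freq = Counter(w for s in sentences for w in tokenize(s))
    boost = {t.lower() for t in terms_of_interest}

    def score(sentence):
        words = tokenize(sentence)
        if not words:
            return 0.0
        # Terms of interest count double (an illustrative weighting).
        return sum(freq[w] * (2 if w in boost else 1) for w in words) / len(words)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:top_k])  # restore original sentence order
    keywords = [w for w, _ in freq.most_common(3)]
    return " ".join(sentences[i] for i in chosen), keywords

agent = ("Thanks for calling. I can reset your router remotely. "
         "The router reset takes two minutes.")
customer = ("Hello. My router keeps dropping the connection. "
            "The connection drops every hour.")

agent_summary, agent_keywords = summarize(agent, terms_of_interest=["router"])
customer_summary, customer_keywords = summarize(customer, terms_of_interest=["connection"])
```

Keeping the agent and customer streams separate until the end mirrors the two summary paragraphs named in claim 1; a production system would likely replace the naive splitter and scoring with trained components.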

Publication date: 11-01-2018

Method, apparatus, and server for generating hotspot content

Number: US20180011933A1
Author: Huijuan Chen

A method, an apparatus and a server for generating hotspot content are provided. The method for generating hotspot content includes: acquiring a first keyword based on data relating to usage of a browser by a user; acquiring a second keyword based on information content crawled from an external website; and generating hotspot content based on the first keyword and the second keyword.

Publication date: 14-01-2021

SELECTIVELY GENERATING WORD VECTOR AND PARAGRAPH VECTOR REPRESENTATIONS OF FIELDS FOR MACHINE LEARNING

Number: US20210011936A1
Assignee:

Word vectors are multi-dimensional vectors that represent words in a corpus of text and that are embedded in a semantically-encoded vector space; paragraph vectors extend word vectors to represent, in the same semantically-encoded space, the overall semantic content and context of a phrase, sentence, paragraph, or other multi-word sample of text. Word and paragraph vectors can be used for sentiment analysis, comparison of the topic or content of samples of text, or other natural language processing tasks. However, the generation of word and paragraph vectors can be computationally expensive. Accordingly, word and paragraph vectors can be determined only for user-specified subsets of fields of incident reports in a database. 1. A system, comprising: a processor; and receiving an incident report comprising a plurality of input fields associated with an information technology (IT) issue; identifying a first subset of input fields of the plurality of input fields; identifying a second subset of input fields of the plurality of input fields; generating a word vector representation of the first subset of input fields using a trained artificial neural network (ANN); generating a paragraph vector representation of the second subset of input fields using the trained ANN; generating an aggregate vector representation of the incident report by combining the word vector representation and the paragraph vector representation; comparing the aggregate vector representation of the incident report with a plurality of additional aggregate vector representations for a plurality of additional incident reports relating to a managed network, wherein the plurality of additional aggregate vector representations are stored in a database; identifying one or more similar aggregate vector representations of the plurality of additional aggregate vector representations based on the comparison; and returning one or more incident reports of the plurality of additional ...

Publication date: 14-01-2021

System and method for the automated tracking of personal and emotional information of individuals

Number: US20210011975A1
Author: Pegah AARABI
Assignee: Individual

The present disclosure provides a system and method for collecting personal information about an individual. The system and method comprises memory for storing personal information schemas, personal data, emotional data, a communication interface to send a plurality of questions to a user interface and to receive a plurality of responses from the user interface, and a processor: to translate the personal information schemas into a plurality of questions, to translate the responses into personal data or emotional data mapped to the personal information schemas and to analyze the personal information schemas, personal data and emotional data to provide an action.

Publication date: 14-01-2021

SUPERVISED CROSS-MODAL RETRIEVAL FOR TIME-SERIES AND TEXT USING MULTIMODAL TRIPLET LOSS

Number: US20210012061A1
Assignee:

A system for cross-modal data retrieval is provided which includes a neural network having a time series encoder and text encoder jointly trained based on a triplet loss relating to two different modalities of (i) time series and (ii) free-form text comments. A database stores training sets with feature vectors extracted from encodings of the training sets. The encodings are obtained by encoding the time series using the time series encoder and encoding the text comments using the text encoder. A processor retrieves the feature vectors corresponding to at least one of the modalities from the database for insertion into a feature space together with a feature vector corresponding to a testing input relating to at least one of a testing time series and a testing free-form text comment, determines a set of nearest neighbors from among the feature vectors based on distance criteria, and outputs testing results. 1. A computer processing system for cross-modal data retrieval, comprising: a neural network having a time series encoder and text encoder which are jointly trained based on a triplet loss, the triplet loss relating to two different modalities of (i) time series and (ii) free-form text comments, which respectively correspond to a training set of time series and a training set of free-form text comments; a database for storing the training sets with feature vectors extracted from encodings of the training sets, the encodings obtained by encoding the time series in the training set of time series using the time series encoder and encoding the free-form text comments in the training set of free-form text comments using the text encoder; and a hardware processor for retrieving the feature vectors corresponding to at least one of the two different modalities from the database for insertion into a feature space together with at least one feature vector corresponding to a testing input relating to at least one of a testing time series and a testing free-form text comment, ...
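
As a rough illustration of the two ideas named in this abstract, the sketch below computes a triplet loss and a nearest-neighbor lookup over a shared feature space. It assumes one common margin-based form of the loss with Euclidean distance; the patent's exact loss, encoders, and distance criteria are not given in this excerpt, and the embeddings and comment names are made up.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=1.0):
    """max(0, d(anchor, positive) - d(anchor, negative) + margin)."""
    return max(0.0, euclidean(anchor, positive)
               - euclidean(anchor, negative) + margin)

def nearest_neighbors(query, database, k=1):
    """Return the k database keys whose vectors lie closest to the query."""
    return sorted(database, key=lambda name: euclidean(query, database[name]))[:k]

# Hypothetical encoder outputs: one time-series embedding and two
# text-comment embeddings living in the same 2-D feature space.
time_series_vec = [0.9, 0.1]
text_vectors = {
    "comment_spike": [1.0, 0.0],   # comment that describes the series
    "comment_flat": [0.0, 1.0],    # unrelated comment
}

loss_good = triplet_loss(time_series_vec, text_vectors["comment_spike"],
                         text_vectors["comment_flat"])
loss_bad = triplet_loss(time_series_vec, text_vectors["comment_flat"],
                        text_vectors["comment_spike"])
closest = nearest_neighbors(time_series_vec, text_vectors, k=1)
```

The loss is zero when the matching comment is already closer than the unrelated one by at least the margin, and positive otherwise, which is what pushes paired time series and comments together during joint training.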

Publication date: 14-01-2021

COMPUTERIZED ANALYSIS OF TEAM BEHAVIOR AND COMMUNICATION TO QUANTIFY AND OPTIMIZE TEAM FUNCTION

Number: US20210012068A1
Assignee: QNTFY CORP.

A computer evaluates free-form text messages among members of a team, using natural language processing techniques to process the text messages and to assess psychological state of the team members as reflected in the text messages. The computer assembles the psychological state as reflected in the messages to evaluate team collective psychological state. The computer reports a trend of team collective psychological state in natural language text form. 1. A method, comprising the steps of: by computer, evaluating free-form text messages among team members of a team, using natural language processing techniques to process the text messages and to assess psychological states of the team members as reflected in the text messages; by computer, assembling a psychological state of each of the team members as reflected in the messages to evaluate team collective psychological state; in the memory of a computer, modeling the team with a multiplex graph to obtain the team collective psychological state, wherein nodes of the graph correspond to team members, layers of the multiplex graph correspond to respective psychological attributes of the assembled psychological states of the team members, and edges of the graph within each layer represent the respective psychological attributes as reflected between the team members corresponding to the nodes of the edges; and by computer, reporting a trend of the team collective psychological state in natural language text form. 2. The method of claim 1, further comprising the step of: using natural language processing techniques to process the text messages and to assess at least three emotions of the team members as reflected in the text messages, the three emotions drawn from the group consisting of anger, disgust, fear, happiness, sadness, love, surprise, trust, distrust, surprise, or anticipation. 3. The method of claim 1, further comprising the step of: using natural language processing techniques to process the text messages and to ...

Publication date: 14-01-2021

METHOD OF TRAINING A NEURAL NETWORK AND RELATED SYSTEM AND METHOD FOR CATEGORIZING AND RECOMMENDING ASSOCIATED CONTENT

Number: US20210012200A1
Assignee:

A property vector representing extractable measurable properties, such as musical properties, of a file is mapped to semantic properties for the file. This is achieved by using artificial neural networks “ANNs” in which weights and biases are trained to align a distance dissimilarity measure in property space for pairwise comparative files back towards a corresponding semantic distance dissimilarity measure in semantic space for those same files. The result is that, once optimised, the ANNs can process any file, parsed with those properties, to identify other files sharing common traits reflective of emotional perception, thereby rendering a more reliable and true-to-life result of similarity/dissimilarity. This contrasts with simply training a neural network to consider extractable measurable properties that, in isolation, do not provide a reliable contextual relationship into the real-world. 1. A method of training an artificial neural network “ANN” (NN 310, NN 312, NN 314, NN 318) in a system (300) configured to identify similarity or dissimilarity in content of a first data file (302) relative to content in a different data file (304), the method comprising: for a selected pair of different data files, extracting measurable signal qualities from each of the first data file and the different data file to define one property for each file; at an output of the ANN tasked with processing said one property, generating a corresponding property vector (OR_x, OTO_x, OTI_x and OTX_x) in property space for said one property of both the first data file and the different data file of the selected pair; assembling a first multi-dimensional vector (350) for the first data file and a distinct second multi-dimensional vector (352) for the different data file; determining a distance measure (330) between the first multi-dimensional vector (350) and the second multi-dimensional vector (352); ...

Publication date: 14-01-2021

Answering Questions During Video Playback

Number: US20210012222A1
Author: Seokhwan Kim
Assignee: Adobe Inc

In implementations of answering questions during video playback, a video system can receive a question related to a video at a timepoint of the video during playback of the video, and determine audio sentences of the video that occur within a segment of the video that includes the timepoint. The video system can generate a classification vector from words of the question and the audio sentences, and determine an answer to the question utilizing the classification vector. The video system can obtain answer candidates, and the answer to the question can be selected as one of the answer candidates based on matching the classification vector to one of the answer vectors.
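
The two selection steps in this abstract (finding the audio sentences in the segment that contains the question's timepoint, then choosing the answer candidate whose vector best matches the classification vector) can be sketched as follows. The segment boundaries, timestamps, vectors, and the use of cosine similarity as the matching measure are all assumptions made for illustration; the abstract does not say how matching is scored.

```python
import math
from bisect import bisect_right

def segment_for(timepoint, boundaries):
    """Return (start, end) of the segment containing timepoint.

    boundaries is a sorted list of segment start times in seconds.
    """
    i = max(bisect_right(boundaries, timepoint) - 1, 0)
    end = boundaries[i + 1] if i + 1 < len(boundaries) else math.inf
    return boundaries[i], end

def sentences_in_segment(sentences, segment):
    """Keep the text of sentences whose timestamp falls inside the segment."""
    start, end = segment
    return [text for t, text in sentences if start <= t < end]

def cosine(u, v):
    """Cosine similarity; 0.0 if either vector has zero length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def pick_answer(classification_vector, answer_vectors):
    """Select the answer candidate whose vector best matches the query vector."""
    return max(answer_vectors,
               key=lambda name: cosine(classification_vector, answer_vectors[name]))

# Hypothetical data: segment start times and timestamped audio sentences.
boundaries = [0, 60, 120]
audio = [(5, "Open the file menu."), (70, "Select the crop tool."),
         (95, "Drag to set the crop area."), (130, "Export the image.")]

segment = segment_for(80, boundaries)
context = sentences_in_segment(audio, segment)
answer = pick_answer([0.9, 0.1], {"crop tool": [1.0, 0.0], "export": [0.0, 1.0]})
```

In a real system the classification vector would come from a model that reads both the question and the selected sentences; here it is simply given.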

Publication date: 09-01-2020

Method and system for generating vehicle service content

Number: US20200013022A1
Assignee: Snap On Inc

Methods and systems for using natural language processing and machine-learning algorithms to process vehicle-service data to generate metadata regarding the vehicle-service data are described herein. A processor can discover vehicle-service data that can be clustered together based on the vehicle-service data having common characteristics. The clustered vehicle-service data can be classified (e.g., categorized) into any one of a plurality of categories. One of the categories can be for clustered vehicle-service data that is tip-worthy (e.g., determined to include data worthy of generating vehicle-service content, such as a repair hint). Another category can track instances of vehicle-service data that are considered to be common to an instance of vehicle-service data classified into the tip-worthy category. The vehicle-service data can be collected from repair orders from a plurality of repair shops. The vehicle-service content generated by the systems can be provided to those or other repair shops.

Publication date: 03-02-2022

SYSTEM FOR DISCOVERING SEMANTIC RELATIONSHIPS IN COMPUTER PROGRAMS

Number: US20220035728A1
Assignee: The Ultimate Software Group, Inc.

A system for discovering semantic relationships in computer programs is disclosed. In particular, the system may synergistically identify and validate semantic relationships, concepts, and groupings associated with data elements within a static or dynamic, time varying, source input. The system may utilize feature extractors to extract features from the input and reasoners to develop associations using data from multiple feature set types, and can thus generate reliable, robust, and complete sets of semantic relationships from the input. The system may generate hypotheses associated with the relationships, concepts, and groupings, and validate the hypotheses by testing an application under evaluation by the system and observing the outputs generated from the testing. Information pertaining to validated or invalidated hypotheses may be provided to a learning engine to maximize reasoning and performance in subsequent discovery processes by adjusting models, vocabularies, dictionaries, and parameters utilized by the system in identifying the relationships, concepts, and groupings. 1. A system, comprising: a memory that stores instructions; and analyzing, for a discovery process, information provided by a source, wherein the information is associated with an application under evaluation by the system and is extracted from an interaction conducted with the application under evaluation, from the source, or a combination thereof; generating, based on a data element in the information, a hypothesis associated with a first concept associated with the data element, a first relationship associated with the data element, a first grouping associated with the data element, or a combination thereof; testing the hypothesis against the application under evaluation to confirm or reject the hypothesis; and training, based on the testing of the hypothesis and based on a confirmation or a rejection of the hypothesis, a model to enhance determination of a second concept, a second ...

Publication date: 03-02-2022

REPLACING MAPPINGS WITHIN A SEMANTIC SEARCH APPLICATION OVER A COMMONLY ENRICHED CORPUS

Number: US20220035817A1
Assignee:

Techniques include integrating a custom ontology into a semantic search function, the semantic search function being configured to perform a semantic search over a corpus enriched with a separate ontology. The semantic search function is executed using the custom ontology to perform the semantic search of the corpus. Results are generated from the semantic search of the corpus based on input received by the semantic search function. 1. A computer-implemented method comprising: integrating a custom ontology into a semantic search function, the semantic search function being configured to perform a semantic search over a corpus enriched with a separate ontology; executing the semantic search function using the custom ontology to perform the semantic search of the corpus; and generating results from the semantic search of the corpus based on input received by the semantic search function. 2. The computer-implemented method of claim 1, wherein the semantic search function uses a typeahead function associated with the custom ontology. 3. The computer-implemented method of claim 1, wherein the semantic search function uses a typeahead function to generate suggestions based on the custom ontology as an alternative to the separate ontology. 4. The computer-implemented method of claim 1, wherein the semantic search function uses a typeahead function to generate suggestions based on the custom ontology in addition to the separate ontology. 5. The computer-implemented method of claim 1, wherein the separate ontology is used to enrich the corpus. 6. The computer-implemented method of claim 1, further comprising indexing the custom ontology to create an index, wherein the semantic search function uses the index of the custom ontology to generate suggestions for a user entering the input as a search query. 7. The computer-implemented method of claim 1, wherein integrating the custom ontology into the semantic search function comprises determining congruences between ...

Publication date: 03-02-2022

CASE SEARCH METHOD AND CASE SEARCH SYSTEM

Number: US20220035846A1
Author: MORIMOTO Kentaro
Assignee:

In order to provide a case search method and a case search system for providing a case of an analysis device, preprocessing, etc., associated with a sample that a user, etc., wishes to analyze, with respect to an analysis case for a presented case, a search unit interprets the semantic content contained in a search string input by a user, and retrieves, based on the interpretation result, an analysis case relating to a sample which coincides with or is similar to the presented sample, out of analysis cases stored in advance in databases A, B, and outputs the retrieved case. 1. A case search method for retrieving an analysis case relating to a sample presented by a user , the method comprising the steps of:receiving a search string input by the user;interpreting a semantic content contained in the search string, with respect to an analysis case of the presented sample;retrieving an analysis case relating to a sample that coincides with or is similar to the presented sample, out of analysis cases stored in advance in a storage unit, based on a result of the interpretation; andoutputting the retrieved analysis case,wherein the step of retrieving includes retrieving an analysis case of a compound similar to a compound contained in the sample.2. The case search method as recited in claim 1 ,wherein the step of interpreting performs a semantic search on the received search string.3. The case search method as recited in claim 1 ,wherein the storage unit is a database composed of a set of a word vector, a sentence vector, and a document vector relating to an analysis case.4. The case search method as recited in claim 3 ,wherein the step of retrieving compares a feature vector acquired by morphologically analyzing the search string with a word vector, a sentence vector, and a document vector in the database within a multidimensional space.5. The case search method as recited in claim 1 ,wherein the retrieved analysis case includes at least one of information about ...

Publication date: 03-02-2022

Custom semantic search experience driven by an ontology

Number: US20220035866A1
Assignee: International Business Machines Corp

Techniques include updating a semantic search function with a custom ontology, the semantic search function initially supporting a separate ontology having been used to enrich a corpus. The custom ontology is used to augment input of a search query for the semantic search function, thereby providing a custom user experience for searching the corpus.

Publication date: 03-02-2022

Determining feasible itinerary solutions

Number: US20220035880A1
Assignee: Amgine Technologies US Inc

A method for changing an itinerary based on a user change request is disclosed. The method may commence with receiving an itinerary request associated with one or more passengers. The method may continue with receiving a user itinerary change request associated with the itinerary network. The method may continue with generating an itinerary object associated with the user itinerary change request. The method may continue with modifying the itinerary network based on the itinerary object. The method may continue with processing the itinerary network using a topology of the itinerary network to create a plurality of tuples, the plurality of tuples including at least flight tuples and hotel tuples. The method may continue with performing a content search for the plurality of tuples for each of the one or more passengers. The method may continue with generating feasible itinerary solutions based on results of the content searches.

Publication date: 03-02-2022

Statistical analysis system and statistical analysis method using conversational interface

Number: US20220035892A1
Author: Jin ho YOO, Jin Tae YOU
Assignee: Yoojinbiosoft Co Ltd

The present disclosure relates to a statistical analysis system. More particularly, the present disclosure relates to a statistical analysis system capable of inferring the purpose of a user's analysis, etc. through questions and answers with the user so that ordinary people easily acquire clinical and statistical analysis information, the system using a conversational interface adapted to a statistical analysis for clinical data. The present disclosure provides a statistical analysis system using a conversational interface adaptive to a statistical analysis method using clinical data, and the statistical analysis method using the system, wherein a conversational interface is applied to extract variable characteristic information required for a statistical analysis according to the purpose of the statistical analysis that a user wants and to select and set a statistical analysis algorithm according to the extracted information so that the statistical analysis is performed and statistical analysis data that the user wants is generated.

Publication date: 03-02-2022

FEATURE VECTOR GENERATION FOR PROBABALISTIC MATCHING

Number: US20220036006A1
Assignee:

A computer-implemented method increases the efficiency of matching records from two sources. The method includes identifying a first source and a second source, wherein each of the sources includes one or more records and each record includes one or more attributes. The method further includes determining, based on a corpus, the one or more attributes and generating, based on the attributes, a set of feature vectors that represent the one or more attributes. The method includes comparing each record in the first source against each record in the second source. The method further includes generating, in response to the comparing, a link confidence. The method also includes linking, in response to the link confidence being above a linking threshold, the associated records. The method includes determining a first feature vector of the set of feature vectors used in the linking, and outputting a set of results. 1. A computer-implemented method comprising: identifying a first source and a second source wherein the first source and the second source include one or more records and each record includes one or more attributes; determining, based on a corpus, the one or more attributes; generating, based on the attributes, a set of feature vectors, wherein the set of feature vectors represent the one or more attributes; comparing each record in the first source against each record in the second source; generating, in response to the comparing, a link confidence; linking, in response to the link confidence being above a linking threshold, a first record from the first source and a second record from the second source; determining a first feature vector of the set of feature vectors used in the linking; and outputting a set of results. 2. The method of claim 1, wherein the link confidence represents a likelihood a first record from the first source and a second record from the second source represent a common entity. 3. The method of claim 1, wherein the corpus is an internal ...
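
The compare, score, and link loop in claim 1 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-attribute Jaccard similarity, the weighted-average link confidence, the weights, and the sample records are all hypothetical, and the patent's corpus-derived feature vectors are not reproduced here.

```python
def attribute_similarity(a, b):
    """Jaccard similarity over lower-cased tokens of two attribute values."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)

def feature_vector(rec_a, rec_b, attributes):
    """One similarity feature per attribute (an illustrative choice)."""
    return [attribute_similarity(rec_a.get(k, ""), rec_b.get(k, ""))
            for k in attributes]

def link_confidence(features, weights):
    """Weighted average of the features, in the range 0.0 to 1.0."""
    return sum(f * w for f, w in zip(features, weights)) / sum(weights)

def link_records(source_a, source_b, attributes, weights, threshold=0.8):
    """Compare every record pair and keep pairs above the linking threshold."""
    links = []
    for i, rec_a in enumerate(source_a):
        for j, rec_b in enumerate(source_b):
            conf = link_confidence(feature_vector(rec_a, rec_b, attributes), weights)
            if conf > threshold:
                links.append((i, j, round(conf, 3)))
    return links

source_a = [{"name": "Ann Lee", "city": "Austin"},
            {"name": "Bob Ray", "city": "Boston"}]
source_b = [{"name": "ann lee", "city": "austin"},
            {"name": "Rob Bay", "city": "Boston"}]

links = link_records(source_a, source_b, ["name", "city"], weights=[2, 1])
```

The all-pairs comparison is quadratic in the number of records, which is the cost the abstract says the method aims to reduce; real linkage systems usually add a blocking step before scoring.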

Publication date: 03-02-2022

SEMANTIC LINKAGE QUALIFICATION OF ONTOLOGICALLY RELATED ENTITIES

Number: US20220036009A1
Assignee:

Aspects of the present disclosure include determining, by a processor, an ontology, the ontology comprising a plurality of ontological relationships, receiving, by the processor, a plurality of passages, determining, by the processor, a target set of co-occurring entities comprising a first entity and a second entity, determining a first passage in the plurality of passages that includes the first entity and the second entity, determining, from the ontology, a first ontological relationship between the first entity and the second entity, analyzing the first passage to determine a congruency score for the first ontological relationship, and generating a relationship annotation between the first entity and the second entity in the first passage based on the congruency score being within a threshold. 1. A computer-implemented method comprising: determining, by a processor, an ontology, the ontology comprising a plurality of ontological relationships; receiving, by the processor, a plurality of passages; determining, by the processor, a target set of co-occurring entities comprising a first entity and a second entity; determining a first passage in the plurality of passages that includes the first entity and the second entity; determining, from the ontology, a first ontological relationship between the first entity and the second entity; analyzing the first passage to determine a congruency score for the first ontological relationship; and generating a relationship annotation between the first entity and the second entity in the first passage based on the congruency score being within a threshold. 2. The computer-implemented method of claim 1, wherein analyzing the first passage to determine the congruency score for the first ontological relationship comprises: generating a graph of the first passage, the graph comprising a plurality of nodes representing a plurality of words and phrases in the first passage; and analyzing a semantic linkage between the plurality of nodes and ...

Publication date: 03-02-2022

Systems and Methods for Explainable Fake News Detection

Number: US20220036011A1
Author: Shu Kai
Assignee:

A news article may include sentences and have associated comments. An embodiment determines semantic correlation between each sentence and each comment to generate correlation degrees between the sentences and the comments, determines sentence attention weights of the sentences and comment attention weights of the comments based on the correlation degrees, and detects whether the news article is fake based on latent representations of the sentences and the comments, the sentence attention weights and the comment attention weights. A list of sentences and a list of comments may be selected based on the sentence attention weights and the comment attention weights, respectively, to provide explanation for a detection result. 1. A computer-implemented method comprising: obtaining a piece of news comprising a plurality of sentences; obtaining a plurality of comments associated with the piece of news; determining semantic correlation between each sentence of the plurality of sentences and each comment of the plurality of comments based on latent representations of the plurality of sentences and latent representations of the plurality of comments, to generate respective correlation degrees between the plurality of sentences and the plurality of comments; determining a sentence attention weight of each sentence of the plurality of sentences and a comment attention weight of each comment of the plurality of comments, based on the respective correlation degrees, the latent representations of the plurality of sentences and the latent representations of the plurality of comments; and detecting whether the piece of news is fake based on the latent representations of the plurality of sentences weighted by respective sentence attention weights and based on the latent representations of the plurality of comments weighted by respective comment attention weights. 2. The method of claim 1, further comprising: generating a detection result indicating whether the piece of news is fake, the ...
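
A heavily simplified sketch of the correlation-then-attention idea follows. It assumes dot products as the correlation degree and a softmax over summed correlations as the attention weights; the patent's learned co-attention parameters and classifier are not shown in this excerpt and are not reproduced. The latent vectors and texts are invented for the example.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def co_attention(sentence_vecs, comment_vecs):
    """Return (sentence weights, comment weights) from pairwise correlations."""
    # correlation[i][j]: correlation degree between sentence i and comment j
    correlation = [[dot(s, c) for c in comment_vecs] for s in sentence_vecs]
    sentence_weights = softmax([sum(row) for row in correlation])
    comment_weights = softmax([sum(col) for col in zip(*correlation)])
    return sentence_weights, comment_weights

def top_k(items, weights, k=1):
    """Items with the largest attention weights, used as the explanation."""
    order = sorted(range(len(items)), key=lambda i: weights[i], reverse=True)
    return [items[i] for i in order[:k]]

# Hypothetical latent representations (2-D for readability).
sentences = ["Miracle pill cures all disease.", "The city council met on Tuesday."]
sentence_vecs = [[1.0, 0.0], [0.0, 0.2]]
comments = ["This cure claim is fake.", "Nice weather."]
comment_vecs = [[0.9, 0.1], [0.0, 0.1]]

sentence_weights, comment_weights = co_attention(sentence_vecs, comment_vecs)
explain_sentences = top_k(sentences, sentence_weights)
explain_comments = top_k(comments, comment_weights)
```

The same weights serve two purposes, as in the abstract: they scale the representations fed to the detector, and they rank which sentences and comments to show as the explanation.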

Publication date: 03-02-2022

DERIVING MULTIPLE MEANING REPRESENTATIONS FOR AN UTTERANCE IN A NATURAL LANGUAGE UNDERSTANDING (NLU) FRAMEWORK

Number: US20220036012A1
Author: Sapugay Edwin, Sarda Gopal
Assignee:

The present approaches are generally related to an agent automation framework that is capable of extracting meaning from user utterances, such as requests received by a virtual agent (e.g., a chat agent), and suitably responding to these user utterances. In certain aspects, the agent automation framework includes a NLU framework and an intent-entity model having defined intents and entities that are associated with sample utterances. The NLU framework may include a meaning extraction subsystem designed to generate meaning representations for the sample utterances of the intent-entity model to construct an understanding model, as well as generate meaning representations for a received user utterance to construct an utterance meaning model. The disclosed NLU framework may include a meaning search subsystem that is designed to search the meaning representations of the understanding model to locate matches for meaning representations of the utterance meaning model. 1. An agent automation system, comprising: a memory configured to store a natural language understanding (NLU) framework, wherein the NLU framework includes a part-of-speech (POS) component, a variability filter component, a parser component, and a final scoring and filtering component; and performing, via the POS component, part-of-speech (POS) tagging to generate a set of potential POS taggings for a set of utterances; performing, via the variability filter component, variability filtering of the set of potential POS taggings to generate a set of final nominee POS taggings, wherein each of the set of final nominee POS taggings is distinct from one another; parsing, via the parsing component, the set of final nominee POS taggings to generate a set of potential meaning representations for the set of final nominee POS taggings; and selecting, via the final scoring and filtering component, a final set of meaning representations for the set of utterances from the set of potential meaning representations ...

Publication date: 03-02-2022

AUTONOMOUS DETECTION OF COMPOUND ISSUE REQUESTS IN AN ISSUE TRACKING SYSTEM

Number: US20220036014A1
Author: Bar-on Noam, Chung Sukho

An issue tracking system configured to determine whether an issue request submitted by a user of the issue tracking system can, or should, be subdivided into two or more issue requests. In some implementations, the issue tracking system is configured to extract a content item of the issue request (e.g., title, description, and the like) in order to perform a semantic and/or syntactic analysis of that content item. Upon determining that the content item includes two or more clauses linked by a coordinating, subordinating, or correlative conjunction, the system can provide a recommendation to the user to submit two or more discrete issue requests, each one of which corresponds to a single linked clause of the content item.
1. An issue tracking system comprising: a client device executing a client application; and a host service operably coupled to the client application of the client device and comprising a processor configured to: receive an issue request from the client application; determine a divisibility score based on semantic content of a content item of the issue request; and in response to a determination that the divisibility score satisfies a divisibility threshold: generate two or more issue request templates based on the semantic content of the content item, the issue request templates at least partially populated with data extracted from the issue request; and transmit the two or more populated issue request templates to the client application.
2. The issue tracking system of claim 1, wherein: the content item is an issue request description; the semantic content of the issue request description is a set of lemmatized words extracted from the issue request description; the divisibility score is increased upon a first determination that the set of lemmatized words includes at least a threshold number of lemmatized words associated with compound issue requests; and the divisibility score is decreased upon a second determination that the set of words ...
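As a rough illustration of the clause-splitting idea described in this record (not the patented implementation), the following Python sketch splits an issue description on coordinating conjunctions and derives a simple score. The conjunction list, the scoring rule, the threshold, and all function names are assumptions made for this example.

```python
import re

# Assumed, non-exhaustive list of coordinating conjunctions used as split points.
CONJUNCTIONS = ("and", "but", "or")


def split_clauses(text):
    """Split text into clauses at whole-word conjunctions."""
    pattern = r"\b(?:" + "|".join(CONJUNCTIONS) + r")\b"
    parts = [p.strip(" ,.;") for p in re.split(pattern, text, flags=re.IGNORECASE)]
    return [p for p in parts if p]


def divisibility_score(text):
    """Hypothetical score: the number of clauses beyond the first."""
    return max(len(split_clauses(text)) - 1, 0)


def suggest_split(text, threshold=1):
    """Return the separate clauses if the score meets the threshold, else the text unchanged."""
    clauses = split_clauses(text)
    return clauses if len(clauses) - 1 >= threshold else [text]
```

A purely lexical split like this misfires on noun phrases such as "search and replace", which is presumably why the record speaks of semantic and/or syntactic analysis rather than keyword matching.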

Publication date: 21-01-2016

System and method to extract structured semantic model from document

Number: US20160019192A1
Assignee: General Electric Co

According to some embodiments, a document associated with an artifact may be received, the document being at least partially unstructured. In an unstructured portion of the document, an extraction platform may automatically detect a first characteristic. The extraction platform may also automatically detect a second characteristic in the unstructured portion of the document. Using the first and second characteristics, a structured semantic model representing the artifact may automatically be created.

Publication date: 03-02-2022

METHOD FOR TRAINING A LINGUISTIC MODEL AND ELECTRONIC DEVICE

Number: US20220036880A1

The present disclosure provides a method for training a linguistic model, related to the fields of speech, natural language processing, and deep learning technologies. A method includes: obtaining grammars corresponding to a plurality of sample texts and a slot value of a slot in each grammar by using semantic analysis; generating a grammar graph corresponding to each grammar based on the corresponding grammar and the slot value of the slot in the corresponding grammar; obtaining a weight of each grammar, a weight of each slot, and a weight of each slot value in each grammar graph based on the sample texts; determining at least one grammar frequency of each order based on the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph; and training the linguistic model based on the at least one grammar frequency of each order.
1. A method for training a linguistic model, comprising: obtaining grammars corresponding to a plurality of sample texts and a slot value of a slot in each grammar by using semantic analysis; generating a grammar graph corresponding to each grammar based on the corresponding grammar and the slot value of the slot in the corresponding grammar; obtaining a weight of each grammar, a weight of each slot, and a weight of each slot value in each grammar graph based on the sample texts; determining at least one grammar frequency of each order based on the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph; and training the linguistic model based on the at least one grammar frequency of each order.
2. The method according to claim 1, wherein the semantic analysis comprises natural language understanding semantic analysis.
3. The method according to claim 1, wherein obtaining the weight of each grammar, the weight of each slot, and the weight of each slot value in each grammar graph based on the sample texts comprises: for each sample text, passing ...

Publication date: 17-01-2019

Method and system for providing real time search preview personalization in data management systems

Number: US20190018899A1
Assignee: Intuit Inc

A method and system provide personalized search results to users of a data management system. The method and system receive a search query from a user and generate initial search results including a plurality of assistance documents relevant to the query data. The method and system utilize natural language analysis and machine learning processes to analyze the query data, user attributes data, and the assistance documents in order to generate personalized previews of the assistance documents for the user. The method and system output personalized search results to the user including the personalized previews of the assistance documents.

Publication date: 21-01-2021

APPARATUS, SYSTEM, AND METHOD FOR NATURAL LANGUAGE PROCESSING

Number: US20210019353A1

Various embodiments are described for searching and retrieving documents based on a natural language input. A computer-implemented natural language processor electronically receives a natural language input phrase from an interface device. The natural language processor attributes a concept to the phrase. The natural language processor searches a database for a set of documents to identify one or more documents associated with the attributed concept to be included in a response to the natural language input phrase. The natural language processor maintains the concepts during an interactive session. The natural language processor resolves ambiguous input patterns in the natural language input phrase. The natural language processor includes a processor, a memory and/or storage component, and an input/output device.
1. A method, comprising: electronically receiving, by a computer-implemented natural language processor, a natural language input phrase from a user via an interface device; associating, by the natural language processor, a primary goal with the natural language input phrase; initiating an interactive dialogue with the user via the interface device to acquire information associated with the primary goal; associating the natural language input to a tangential goal upon determining that information received from the interface device in the interactive dialogue is not related to the primary goal, the tangential goal based on the information that is not related to the primary goal; and redirecting the interactive dialogue to a tangential request based on the tangential goal, by a context switching module, to acquire information responsive to the tangential goal.
2. The method of claim 1, further comprising, upon completion of response to the tangential request, redirecting the interactive dialogue, by the context switching module ...

Publication date: 21-01-2021

Method and device for processing untagged data, and storage medium

Number: US20210019372A1
Author: Xiaotong PAN, Zuopeng LIU

A method for processing untagged data includes: similarity comparison is performed on a semantic vector of untagged data and a semantic vector of each piece of tagged data to obtain similarities corresponding to respective pieces of tagged data; a preset number of similarities are selected according to a preset selection rule; the untagged data is predicted with a tagging model obtained by training through the tagged data, to obtain a prediction result of the untagged data; and the untagged data is divided into untagged data that can be tagged by a device or untagged data that cannot be tagged by the device according to the preset number of similarities and the prediction result.
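The steps in this abstract (compare semantic vectors, keep a preset number of similarities, predict with the tagging model, then decide whether the device can tag the item) can be sketched in Python as below. This is only an illustrative sketch: the cosine measure, the top-k selection rule, the `min_sim` threshold, the agreement rule, and the `predict` callable standing in for the trained tagging model are all assumptions, not details taken from the patent.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def can_device_tag(untagged_vec, tagged, predict, k=3, min_sim=0.8):
    """Return (decision, predicted_tag).

    tagged  : list of (vector, tag) pairs for already-tagged data
    predict : callable standing in for the trained tagging model (assumption)
    k       : preset number of similarities to keep (assumed rule: the k largest)
    min_sim : assumed threshold each kept similarity must reach
    """
    sims = sorted(
        ((cosine(untagged_vec, vec), tag) for vec, tag in tagged),
        key=lambda pair: pair[0],
        reverse=True,
    )
    top = sims[:k]
    predicted = predict(untagged_vec)
    # Assumed decision rule: every kept neighbour is close enough and agrees with the prediction.
    ok = bool(top) and all(s >= min_sim and tag == predicted for s, tag in top)
    return ok, predicted
```

Items for which the function returns `False` would be the ones routed to manual tagging under this reading of the abstract.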

Publication date: 16-01-2020

Topic models with sentiment priors based on distributed representations

Number: US20200019611A1
Assignee: SAMSUNG ELECTRONICS CO LTD

There are provided a device and method for topic modeling with sentiment priors based on distributed representations. A method for topic modeling with sentiment priors based on distributed representations includes: inputting a review into a topic model; and by the topic model, determining a representation for each word in the review, wherein the representations are word vectors in a semantic space; and estimating the representations using the sentiment priors to determine a topic corresponding to the review, wherein the topic model includes the sentiment priors trained on the basis of the representations distributed by using a regularizer, the regularizer defining the same sentiment to words having similar word vectors, and wherein each sentiment prior is the same for words having similar word vectors.

Publication date: 21-01-2021

AUTOMATIC GENERATION OF STATEMENT-RESPONSE SETS FROM CONVERSATIONAL TEXT USING NATURAL LANGUAGE PROCESSING

Number: US20210019475A1
Author: Avedissian Narbeh

Systems and methods that access an online networked resource using a locator are disclosed. A first item of content on the networked resource is identified. A trigger rule comprising keywords and a sentiment classifier is accessed. A neural network, including input, hidden, and output layers, is used to assign a sentiment classification to the first item of content. The trigger rule, the sentiment classification, and identified keywords, are used to determine whether response content is to be posted to the online networked resource. In response to determining, using the trigger rule, the assigned sentiment classification, and keywords identified in the first item of content, that response content is to be posted to the online networked resource, the sentiment classification and identified keywords are used to select and/or generate a second item of content, and the second item of content is enabled to be posted to the online networked resource.
1. A content distribution system, the content distribution system comprising: a data repository configured to store uploads of a plurality of media files of media submitters, including one or more media files comprising performance data; and a computer system configured to: provide access to media files of media submitters stored on the media file data repository to a plurality of different types of user devices, including at least a phone, over a communication network; obtain feedback from users with respect to the media files of media submitters; determine that a selected media submitter meets a threshold value or condition; ... the determination that the selected media submitter meets the threshold value or condition, provide a first offer of services to the selected media submitter, deliver media files associated with the selected media submitter and one or more advertisements to user devices; monitor subsequent user interactions with the one or more advertisements associated with the media files associated with ...

Publication date: 21-01-2021

METHODS AND APPARATUS TO IMPROVE DISAMBIGUATION AND INTERPRETATION IN AUTOMATED TEXT ANALYSIS USING TRANSDUCERS APPLIED ON A STRUCTURED LANGUAGE SPACE

Number: US20210019476A1
Author: Roche Emmanuel
Assignee: CLRV Technologies, LLC

Methods and apparatus for automated processing of natural language text are described. The text can be preprocessed to produce language-space data that includes descriptive data elements for words. Source code that includes linguistic expressions, and that may be written in a programming language that is user-friendly to linguists, can be compiled to produce finite-state transducers and bi-machine transducers that may be applied directly to the language-space data by a language-processing virtual machine. The language-processing virtual machine can select and execute code segments identified in the finite-state and/or bi-machine transducers to disambiguate meanings of words in the text.
1. A method of automated processing of text, the method comprising: processing the text to generate a language space having one or more descriptive data elements associated with one or more words in the text and for which a word has multiple meanings; and executing an operation with a first finite-state transducer or first bi-machine transducer ...: identifying a match between a first input element in the first finite-state transducer or first bi-machine transducer and a first identifier of a first descriptive data element in the language space associated with a word in the sentence; identifying an expressive element in the first finite-state transducer or first bi-machine transducer following the first input element, wherein the expressive element indicates a relational aspect between the first descriptive data element and a second descriptive data element in the language space; in response to identifying the expressive element, updating transition data that tracks relational aspects of descriptive data elements in the language space; and executing a first code segment, based at least in part on the updated transition data, to produce a modified language space in which the meaning of the word in the sentence associated with the first descriptive data element is disambiguated.

Publication date: 16-01-2020

Generative Adversarial Network Based Modeling of Text for Natural Language Processing

Number: US20200019863A1
Assignee: International Business Machines Corp

Mechanisms are provided to implement a generative adversarial network (GAN) for natural language processing. With these mechanisms, a generator neural network of the GAN is configured to generate a bag-of-ngrams (BoN) output based on a noise vector input and a discriminator neural network of the GAN is configured to receive a BoN input, where the BoN input is either the BoN output from the generator neural network or a BoN input associated with an actual portion of natural language text. The mechanisms further configure the discriminator neural network of the GAN to output an indication of a probability as to whether the input BoN is from the actual portion of natural language text or is the BoN output of the generator neural network. Moreover, the mechanisms train the generator neural network and discriminator neural network based on a feedback mechanism that compares the output indication from the discriminator neural network to an indicator of whether the input BoN is from the actual portion of natural language text or the BoN output of the generator neural network.

Publication date: 21-01-2021

Generating a recommendation associated with an extraction rule for big-data analysis

Number: US20210019637A1
Author: Dong Hun Lee, Jose Peter
Assignee: Hcl Australia Services Pty Ltd

Disclosed is a system for generating a recommendation associated with an extraction rule for big-data analysis. The system may receive a set of data blocks and an extraction rule. The extraction rule comprises an identifier and a first range associated with the identifier in the set of data blocks. The system identifies a set of words associated with the identifier in one or more data blocks from the set of data blocks based on a provenance knowledge. The provenance knowledge is indicative of a relationship between the word and the identifier. The system identifies a set of locations associated with the set of words in the one or more data blocks. The system generates a second range associated with the identifier based on the set of locations. The system recommends the second range for modifying the extraction rule for big-data analysis based on an accuracy change.

Publication date: 25-01-2018

Methods and systems for communicating information to a user

Number: US20180024272A1
Assignee: Toshiba Corp

A computer implemented method for communicating information to a user; the method including: receiving, from one or more sensors, sensor data containing values of one or more parameters monitored by the sensors; receiving, from one or more users, semantic data for use in interpreting the values of the one or more parameters contained in the sensor data; storing the received semantic data in association with the values of the parameters; receiving a request from a user for information relating to one or more of the parameters at a specified location; determining a value of the one or more parameters at the specified location based on the received sensor data; identifying semantic data that reflects the determined value(s) of the one or more parameters, based on the stored semantic data and stored values of the parameters; and sending the identified semantic data to the user that issued the request.

Publication date: 26-01-2017

Personal knowledge graph population from declarative user utterances

Number: US20170024375A1
Assignee: Microsoft Technology Licensing LLC

An “Utterance-Based Knowledge Tool” monitors user utterances (e.g., user speech or text inputs) to identify relevant statements of facts in declarative utterances of a user. A semantic parser is applied to each statement of facts to parse assertions comprising instances of two or more entities and relations between those entities. As such, each assertion explicitly delimits a relation between two particular entities (one of which may be the user) that are relevant to the particular user. The Utterance-Based Knowledge Tool places or categorizes the identified assertions (which each include entities and relations) into one or more of a plurality of predefined classes. These classified assertions are then applied to construct and/or update a personal knowledge graph for the user. This personal knowledge graph is then applied to respond to user queries, thereby improving personal relevancy of query responses provided to the user.

Publication date: 26-01-2017

Context sensitive query expansion

Number: US20170024460A1
Assignee: International Business Machines Corp

A processor expands a search expression. The processor determines nodes representing query terms of a search expression. The nodes have associated text for search term expansion, and represent at least one concept in a semantic graph of nodes that represents a domain of semantically related concepts. The processor determines i) a center of focus within the semantic graph for the two or more nodes based, at least in part, on a spreading activation in the graph and ii) a contextual relevance for the two or more nodes with respect to the center of focus. The processor selects, for a query term, a node based on contextual relevance between that node and the query term and expands the search expression using an associated text of that node.

Publication date: 26-01-2017

Identifying errors in medical data

Number: US20170024517A1
Assignee: International Business Machines Corp

A computer processor may receive medical data including a report and an image. The computer processor may analyze the report using natural language processing to identify a condition and a corresponding criterion. The computer processor may also analyze the image using an image processing model to generate an image analysis. The computer processor may determine whether the report has a potential problem by comparing the image analysis to the criterion.

Publication date: 10-02-2022

Translation of verbal directions into a list of maneuvers

Number: US20220042815A1
Assignee: Microsoft Technology Licensing LLC

Natural language directions are received and a set of maneuver/context pairs are generated based upon the natural language directions. The set of maneuver/context pairs are provided to a routing engine to obtain route information based upon the set of maneuver/context pairs. The route information is provided to an output system for surfacing to a user.

Publication date: 10-02-2022

DOCUMENT PROCESSING PROGRAM AND INFORMATION PROCESSING APPARATUS

Number: US20220043849A1

A document processing program and an information processing apparatus that present a contract status of an organization based on the contents of contract documents. The document processing program includes instructions that cause the information processing apparatus to: accept a condition for analyzing a contract document by an acceptance unit; extract, by an analysis target extraction unit, a contract document containing extraction information matching the condition accepted by the acceptance unit from a contract document database which includes a plurality of contract documents and in which information indicating a contract status of the plurality of contract documents is extracted as extraction information; analyze the contract document extracted by the analysis target extraction unit based on the condition accepted by the acceptance unit, by an analysis unit; and display and output an analysis result of the analysis unit by the output unit.
1. A non-transitory computer-readable medium storing a program including instructions that, when executed by a processor, cause an information processing apparatus connected to a document processing apparatus through a communication interface to: accept a condition for analyzing a contract document by an acceptance unit; extract a contract document by an analysis target extraction unit, wherein the contract document containing information that matches the condition accepted by the acceptance unit is extracted from a contract document database which contains a plurality of contract documents and information indicating a contract status of the plurality of contract documents; analyze the contract document extracted by the analysis target extraction unit based on the condition accepted by the acceptance unit, by an analysis unit; and display and output an analysis result of the analysis unit by the output unit.
2. A non-transitory computer-readable medium storing a program including instructions ...

Publication date: 10-02-2022

Cluster analysis method, cluster analysis system, and cluster analysis program

Number: US20220043851A1
Assignee: Aixs Inc

A server 4 executes a similarity calculation step (S2) of calculating similarity between content of one document and content of another document, a cluster classification step (S3) of generating a network in which a document is set as a node based on calculated similarity and similar nodes are connected by an edge, and performing classification based on similar documents, a first index calculation step (S4) of calculating a first index indicating centrality of a document in the network, a second index calculation step (S5) of calculating a second index that is different from the first index in the network and indicates importance of a document, and a display data generation step (S6) of generating, regarding a document, first display data indicating the network by an expression of a size of an object of a node according to the first index, an expression of a gauge having a shape corresponding to a shape of the object according to the second index and a length of the gauge, an expression according to a type of the cluster, and an expression according to magnitude of similarity between documents.
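The network-building, classification, and first-index steps of this abstract can be illustrated with a small Python sketch. It is a simplification under stated assumptions: the similarity matrix is taken as given, the edge threshold of 0.5 is arbitrary, connected components stand in for the cluster classification, and degree centrality is used as one possible centrality index. The second (importance) index and the display step are not sketched, since the abstract does not say how they are computed.

```python
def build_network(sim, threshold=0.5):
    """Connect documents i and j by an edge when sim[i][j] >= threshold (assumed rule)."""
    n = len(sim)
    edges = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i][j] >= threshold:
                edges[i].add(j)
                edges[j].add(i)
    return edges


def clusters(edges):
    """Connected components, used here as a simple stand-in for cluster classification."""
    seen, out = set(), []
    for start in edges:
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in edges[node]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        out.append(sorted(comp))
    return out


def degree_centrality(edges):
    """Fraction of the other documents each document is directly connected to."""
    n = len(edges)
    return {i: (len(nbs) / (n - 1) if n > 1 else 0.0) for i, nbs in edges.items()}
```

In a display like the one described, the centrality value could drive the node size, with the cluster index selecting the node's colour or style.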

Publication date: 10-02-2022

SYSTEMS AND METHODS FOR INTELLIGENT CONTENT FILTERING AND PERSISTENCE

Number: US20220043874A1

A source content processor receives content from a crawler and calls a text mining engine. The text mining engine mines the content and provides metadata about the content. The source content processor applies a source content filtering rule to the content utilizing the metadata from the text mining engine. The source content filtering rule is previously built based on at least one of a named entity, a category, or a sentiment. The source content processor determines whether to persist the content according to a result from applying the source content filtering rule to the content and either stores the content in a data store or deletes the content from the data ingestion pipeline such that the content is not persisted anywhere. Embodiments disclosed herein can significantly reduce the amount of irrelevant content through the data ingestion pipeline, prior to data persistence.
1. A method, comprising: receiving, by a source content processor, content from disparate data sources, the source content processor working in conjunction with a data ingestion pipeline running on a server machine, the source content processor communicatively connected to a data store, the server machine operating in an enterprise computing environment; prior to persisting the received content, calling, by the source content processor, a text mining engine with the received content; receiving, by the source content processor from the text mining engine, metadata that describes the received content; applying, by the source content processor, a source content filtering rule to the received content utilizing the metadata that describes the received content, wherein the source content filtering rule is previously built based on at least one of a named entity, a category, and a sentiment; determining, by the source content processor, whether to persist the received content according to a result from the applying; and responsive to a determination by the source content processor to persist the received ...

Publication date: 10-02-2022

PROMISED NATURAL LANGUAGE PROCESSING ANNOTATIONS

Number: US20220043967A1

Aspects of the invention include a computer-implemented method for generating promise identifiers for documents. Aspects include processing a document including a reference, wherein processing includes performing natural language processing (NLP) on the document, and identifying the reference included in the document. Aspects also include generating a promise identifier for the reference in the document, and responsive to processing the document, resolving the promise identifier for the reference by providing data of the reference associated with the promise identifier. Aspects of the invention also include a computer program product and system for generating promise identifiers for documents.
1. A computer-implemented method using promise identifiers for documents, the computer-implemented method comprising: processing, by a processor, a document comprising a reference, wherein processing comprises natural language processing (NLP) the document; identifying, by the processor, the reference included in the document; generating, by the processor, a promise identifier for the reference in the document; and responsive to processing the document, resolving, by the processor, the promise identifier for the reference by providing data of the reference associated with the promise identifier.
2. The computer-implemented method of claim 1, wherein the reference is at least one of a footnote, appendix, or supplemental information within the document.
3. The computer-implemented method of claim 1, further comprising identifying a plurality of references in the document.
4. The computer-implemented method of claim 3, further comprising, responsive to identifying the plurality of references, generating a promise identifier for each of the plurality of references.
5. The computer-implemented method of claim 4, wherein the promise identifier for each of the plurality of references in the document is generated prior to resolving any of the promise identifiers ...

Publication date: 10-02-2022

Future potential natural language processing annotations

Number: US20220043968A1
Assignee: International Business Machines Corp

Aspects of the invention include resolving future reference identifiers for documents. Aspects of the invention include processing a document including a reference to a future event, wherein processing includes performing natural language processing (NLP) on the document, and identifying the reference to the future event included in the document. Aspects of the invention also include generating a future reference identifier for the reference to the future event, and responsive to processing an occurrence of the future event, resolving the future reference identifier by providing data from a subsequent document for the future event associated with the future reference identifier.

Publication date: 10-02-2022

DISENTANGLE SYNTAX AND SEMANTICS IN SENTENCE REPRESENTATION WITH DECOMPOSABLE VARIATIONAL AUTOENCODER

Number: US20220043975A1
Assignee: Baidu USA LLC

Described herein are embodiments of a framework named decomposable variational autoencoder (DecVAE) to disentangle syntax and semantics by using total correlation penalties of Kullback-Leibler (KL) divergences. The KL divergence term of the original VAE is decomposed such that the hidden variables generated may be separated in a clear-cut and interpretable way. Embodiments of DecVAE models are evaluated on various semantic similarity and syntactic similarity datasets. Experimental results show that embodiments of DecVAE models achieve state-of-the-art (SOTA) performance in disentanglement between syntactic and semantic representations.
1. A computer-implemented method for representation disentanglement comprising: receiving an input of a sequence of tokens; generating, using a first combination comprising a first embedding layer and a first attention layer, a first sequence of hidden variables based on the sequence of tokens; generating, using a second combination comprising a second embedding layer and a second attention layer, a second sequence of hidden variables based on the sequence of tokens; generating, using a semantic encoder, a sequence of semantic hidden variables based on the first sequence of hidden variables; generating, using a syntax encoder, a sequence of syntactic hidden variables based on the second sequence of hidden variables; generating, using a decoder, a sequence of reconstructed tokens and a corresponding sequence of reconstructed attention weights based on the sequence of semantic hidden variables and the sequence of syntactic hidden variables; and responsive to a training process, constructing one or more loss functions, using the sequence of reconstructed tokens and the corresponding sequence of reconstructed attention weights, to train at least one of the first embedding layer, the second embedding layer, the first attention layer, the second attention layer, the semantic encoder, the syntax encoder, and the decoder; responsive to an inference ...

Publication date: 10-02-2022

DETERMINING USER COMPLAINTS FROM UNSTRUCTURED TEXT

Number: US20220043977A1

A method, computer system, and a computer program product for complaint identification is provided. The present invention may include processing one or more sentences of a received communication. The present invention may include determining that at least one phrase of the processed sentence has a negative status. The present invention may include contextualizing the at least one phrase of the processed sentence. The present invention may include identifying a complaint within the at least one phrase of the processed sentence.

1. A method for complaint identification, the method comprising: processing one or more sentences of a received communication; determining that at least one phrase of the processed sentence has a negative status; contextualizing the at least one phrase of the processed sentence; and identifying a complaint within the at least one phrase of the processed sentence.
2. The method of claim 1, further comprising: aggregating the at least one contextualized phrase of the processed sentence.
3. The method of claim 1, wherein the received communication is an email communication containing unstructured text.
4. The method of claim 1, wherein processing the one or more sentences of the received communication further comprises: tokenizing the received communication to generate a list of tokens; and using a rule-based parts of speech (POS) tagging method to tag each token in the generated list of tokens with a part of speech.
5. The method of claim 1, wherein determining that at least one phrase of the processed sentence has the negative status further comprises: using a sentiment analysis application programming interface (API) to analyze the at least one processed phrase.
6. The method of claim 4, wherein contextualizing the at least one phrase of the processed sentence further comprises: determining a subject of the at least one processed phrase by using the generated list of tokens with the tagged part of speech.
7. The method of claim 1, wherein identifying ...
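
The claimed steps above (tokenize, tag parts of speech with rules, test for negative status, find the subject, aggregate) can be illustrated with a minimal Python sketch. This is not the patent's implementation: the word lists, the tagging rules, and every function name are hypothetical, and the small `NEGATIVE_WORDS` lexicon merely stands in for the sentiment analysis API mentioned in claim 5.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicons (assumptions, not from the patent).
# NEGATIVE_WORDS stands in for a real sentiment analysis API.
NEGATIVE_WORDS = {"broken", "late", "terrible", "damaged"}
DETERMINERS = {"the", "a", "an", "my", "your", "this"}
VERBS = {"is", "was", "are", "were", "arrived", "works", "love"}


def split_sentences(text):
    """Split unstructured text into sentences after '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and return its alphabetic tokens."""
    return re.findall(r"[a-z']+", sentence.lower())


def pos_tag(tokens):
    """Tag each token with a coarse part of speech using fixed rules."""
    tagged = []
    for i, tok in enumerate(tokens):
        if tok in DETERMINERS:
            tag = "DET"
        elif tok in VERBS:
            tag = "VERB"
        elif tok in NEGATIVE_WORDS:
            tag = "ADJ"
        elif i > 0 and tagged[i - 1][1] == "DET":
            tag = "NOUN"  # rule: a word right after a determiner is a noun
        else:
            tag = "OTHER"
        tagged.append((tok, tag))
    return tagged


def is_negative(tokens):
    """Return True if any token appears in the negative lexicon."""
    return any(tok in NEGATIVE_WORDS for tok in tokens)


def find_subject(tagged):
    """Return the first token tagged NOUN, or None if there is none."""
    for tok, tag in tagged:
        if tag == "NOUN":
            return tok
    return None


def identify_complaints(text):
    """Return one record per negative sentence, with its detected subject."""
    complaints = []
    for sentence in split_sentences(text):
        tokens = tokenize(sentence)
        if is_negative(tokens):
            subject = find_subject(pos_tag(tokens))
            complaints.append({"sentence": sentence, "subject": subject})
    return complaints


def aggregate_by_subject(complaints):
    """Group complaint sentences by subject (the aggregation of claim 2)."""
    grouped = defaultdict(list)
    for item in complaints:
        grouped[item["subject"]].append(item["sentence"])
    return dict(grouped)


email = "The package arrived damaged. I love the color. My invoice was late!"
found = identify_complaints(email)
grouped = aggregate_by_subject(found)
```

With the sample `email`, the second sentence is skipped because it contains no word from the negative lexicon, and the remaining two are grouped under the nouns that follow a determiner. A production system would replace the fixed rules with a full tagger and a real sentiment service.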

Publication date: 10-02-2022

NATURAL LANGUAGE PROCESSING BASED ON USER CONTEXT

Number: US20220043980A1
Assignee:

Techniques for natural language processing based on user context include identifying a context of a user and, responsive to receiving a request from the user intended for processing by a natural language processing (NLP) model, accounting for the context of the user in relation to the request. A result from the NLP model having accounted for the context of the user is provided.

1. A computer-implemented method comprising: identifying a context of a user; responsive to receiving a request from the user intended for processing by a natural language processing (NLP) model, accounting for the context of the user in relation to the request; and providing a result from the NLP model having accounted for the context of the user, wherein the result is supplied as an answer to the request to be output to the user, in which the answer is selected from one or more candidate results based on the context of the user.
2. The computer-implemented method of claim 1, wherein accounting for the context of the user in relation to the request comprises associating a weighted score to the one or more candidate results as potential answers to the request according to the context of the user.
3. The computer-implemented method of claim 1, wherein: the one or more candidate results are potential answers to the request, in which each of the one or more candidate results has a weighted score for the context of the user; and accounting for the context of the user in relation to the request comprises selecting the result from the one or more candidate results according to the weighted score.
4. The computer-implemented method of claim 1, further comprising continuously adjusting a user profile of the user as the context of the user changes, the user profile being used to account for the context of the user.
5. The computer-implemented method of claim 1, further comprising updating at least one annotator associated with the NLP model based on the context of the user.
6. The computer- ...
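
The weighted-score selection described in claims 2 to 4 can be sketched as follows. This is only an illustration under stated assumptions: the `Candidate` and `UserProfile` classes, the tag-to-weight representation of user context, and the multiplicative scoring formula are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    """A candidate answer with a context-free score and topic tags (assumed)."""
    answer: str
    base_score: float
    topic_tags: set = field(default_factory=set)


@dataclass
class UserProfile:
    """Hypothetical user context: a mapping from context tag to weight."""
    context_weights: dict = field(default_factory=dict)

    def update(self, tag, weight):
        """Adjust the profile as the user's context changes."""
        self.context_weights[tag] = weight


def weighted_score(candidate, profile):
    """Scale the base score by the context weights matching the candidate."""
    boost = sum(profile.context_weights.get(tag, 0.0)
                for tag in candidate.topic_tags)
    return candidate.base_score * (1.0 + boost)


def select_answer(candidates, profile):
    """Return the answer with the highest context-weighted score."""
    best = max(candidates, key=lambda c: weighted_score(c, profile))
    return best.answer


# Two candidate answers to the ambiguous request "What is Java?"
candidates = [
    Candidate("A programming language", 0.5, {"software"}),
    Candidate("An island in Indonesia", 0.6, {"travel"}),
]

developer = UserProfile({"software": 0.5})
answer_for_developer = select_answer(candidates, developer)

# The same user starts planning a trip, so the profile is adjusted.
developer.update("travel", 1.0)
answer_after_update = select_answer(candidates, developer)
```

Here the software context first lifts the programming-language answer above the island answer, and the later `update` call reverses the ranking. `select_answer` assumes a non-empty candidate list, since `max` raises `ValueError` on an empty one.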
