Total found: 84. Displayed: 84.
Publication date: 05-02-2015

Speech and noise models for speech recognition

Number: AU2011267982B2

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
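
The threshold-gated adaptation described in this abstract can be illustrated with a minimal sketch. It is not the patented method: it assumes the user speech model is a plain feature vector, that a scalar background-audio estimate is already available, and that adaptation is a simple interpolation. The names `adapt_user_model`, `threshold` and `rate` are hypothetical.

```python
def adapt_user_model(model, features, background_level, threshold=0.1, rate=0.2):
    """Return an adapted copy of `model` only when background audio is quiet.

    model, features: equal-length lists of floats (hypothetical feature vectors).
    background_level: scalar estimate of background audio (assumed to be given).
    """
    if background_level >= threshold:
        # Too noisy: leave the user speech model unchanged.
        return list(model)
    # Quiet enough: move the model toward the features seen in this signal.
    return [(1 - rate) * m + rate * f for m, f in zip(model, features)]
```

With these assumptions the model only drifts toward the user's voice on clean recordings, which is the gating behaviour the abstract describes.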

Publication date: 12-06-2014

ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION

Number: AU2014202785A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 08-11-2012

Disambiguation of contact information using historical data

Number: AU2011255614A1
Assignee: Google LLC

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar. Basic idea: Improved method of initiating a call by voice recognition taking the call history with the call recipient or call destination into account when performing the voice recognition process.
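
As a rough illustration of the affinity-score idea, the sketch below turns a contact's interaction history into grammar transition weights. The exponential recency decay, the half-life, the smoothing constant and all names (`ContactItem`, `affinity_score`, `transition_weights`) are assumptions made for this example; the abstract does not specify a formula.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ContactItem:
    label: str                          # e.g. "Bill mobile" (hypothetical)
    interaction_ages_days: List[float]  # age of each past interaction, in days


def affinity_score(item: ContactItem, half_life_days: float = 30.0) -> float:
    # Frequency: every past interaction adds to the score.
    # Recency: older interactions count less (assumed exponential decay).
    return sum(0.5 ** (age / half_life_days) for age in item.interaction_ages_days)


def transition_weights(items: List[ContactItem], smoothing: float = 0.1) -> Dict[str, float]:
    # Normalise smoothed scores into probabilities that could weight the
    # transitions of a communication initiation grammar.
    scores = {it.label: affinity_score(it) + smoothing for it in items}
    total = sum(scores.values())
    return {label: score / total for label, score in scores.items()}
```

A recogniser could then prefer "Bill mobile" over an acoustically similar but rarely called contact, because its transition carries the larger weight.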

Publication date: 13-03-2014

PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT

Number: AU2014201001A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.
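
The rule-based variant of this idea can be sketched as follows. The specific state and sensor fields, the rule itself and the ring-buffer size are illustrative assumptions; `DeviceInput`, `likely_voice_input` and `PreRecorder` are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DeviceInput:
    screen_on: bool       # state data
    voice_app_open: bool  # state data
    near_face: bool       # sensor data, e.g. from a proximity sensor


def likely_voice_input(d: DeviceInput) -> bool:
    # Assumed rule: the user is probably about to speak when all three hold.
    return d.screen_on and d.voice_app_open and d.near_face


class PreRecorder:
    def __init__(self, max_frames: int = 50):
        self.buffer = deque(maxlen=max_frames)  # keeps only the newest frames
        self.recording = False

    def update(self, d: DeviceInput, frame: bytes) -> None:
        if likely_voice_input(d):
            self.recording = True  # the functionality invoked on inference
        if self.recording:
            self.buffer.append(frame)
```

Buffering before the user explicitly starts voice input means the first syllables are not lost; a probabilistic model could replace the rule without changing `PreRecorder`.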

Publication date: 19-02-2019

Disambiguation of a spoken query term

Number: US0010210267B1
Assignee: Google LLC

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
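
A small sketch of the n-gram weighting step, using word bigrams. The abstract stops at deriving a weighting value per n-gram; multiplying that weight into the recognition confidence, as done below, is an assumption of this example, and `word_ngrams` and `rescore` are hypothetical names.

```python
from collections import Counter


def word_ngrams(text, n):
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def rescore(candidates, past_queries, n=2):
    """candidates: list of (transcription, confidence) pairs."""
    # Frequency of every n-gram in the user's past search queries.
    history = Counter(g for q in past_queries for g in word_ngrams(q, n))
    total = sum(history.values()) or 1
    rescored = []
    for text, confidence in candidates:
        # Weighting value: relative frequency of the candidate's n-grams.
        weight = sum(history[g] / total for g in word_ngrams(text, n))
        rescored.append((text, confidence * (1.0 + weight)))
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

For a user who has searched for "recognize speech" before, that candidate can overtake an acoustically stronger "wreck a nice beach".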

Publication date: 07-07-2015

Adjusting language models using context information

Number: US0009076445B1
Assignee: Google Inc.

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
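
To make the similarity-weighted adjustment concrete, here is a minimal sketch on a unigram language model. Representing a context as a set of tags, using Jaccard similarity, and the `boost` factor are all assumptions of this example, not details from the abstract.

```python
def jaccard(a, b):
    # Similarity between two contexts represented as sets of tags (assumed).
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def adjust_language_model(lm, term, term_context, audio_context, boost=2.0):
    """lm: dict mapping word -> probability. Returns (adjusted_lm, similarity)."""
    similarity = jaccard(term_context, audio_context)
    adjusted = dict(lm)
    # Raise the term's probability in proportion to how similar its context
    # is to the context of the audio, then renormalise.
    adjusted[term] = adjusted.get(term, 1e-6) * (1.0 + boost * similarity)
    total = sum(adjusted.values())
    return {w: p / total for w, p in adjusted.items()}, similarity
```

Speech recognition would then run with the adjusted model, so terms previously used in a similar context become more likely candidates.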

Publication date: 04-06-2015

GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY

Number: AU2014200999B2
Assignee: Google LLC

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.
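
The following sketch shows one way the location-specific noise model could be built and applied. It assumes each geotagged recording has already been reduced to a magnitude spectrum, selects the subset by great-circle distance, averages it, and uses plain spectral subtraction for compensation. These are stand-in techniques chosen for illustration; the abstract does not name them.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    # Great-circle distance between two points, in kilometres.
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


def noise_model_for(location, geotagged, radius_km=1.0):
    """geotagged: list of (lat, lon, spectrum). Returns the mean nearby spectrum."""
    nearby = [spec for lat, lon, spec in geotagged
              if haversine_km(location[0], location[1], lat, lon) <= radius_km]
    if not nearby:
        return None
    return [sum(s[i] for s in nearby) / len(nearby) for i in range(len(nearby[0]))]


def compensate(utterance_spectrum, noise_spectrum):
    # Spectral subtraction, clamped at zero.
    return [max(u - n, 0.0) for u, n in zip(utterance_spectrum, noise_spectrum)]
```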

Publication date: 10-01-2017

Adjusting language models based on topics identified using context

Number: US0009542945B2
Assignee: Google Inc.

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.

Publication date: 24-02-2015

Progressive encoding of audio

Number: US0008965545B2

The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.
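
The priority comparison in this abstract can be pictured with a very small sketch. The `SubSet` type, its integer `priority` field and the tie-breaking rule are hypothetical; how priorities are actually derived is not stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class SubSet:
    name: str
    priority: int   # higher means more important for fidelity (assumed)
    payload: bytes


def next_to_transmit(second: SubSet, third: SubSet) -> SubSet:
    # Compare the priorities and pick the sub-set to send after the first one.
    # Ties go to the second sub-set (an arbitrary choice for this sketch).
    return second if second.priority >= third.priority else third
```

The first sub-set is sent unconditionally so the receiver can reconstruct a coarse signal; the chosen refinement then improves its fidelity.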

Publication date: 24-11-2011

DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Number: US20110288868A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.

Publication date: 27-09-2012

Predictive pre-recording of audio for voice input

Number: AU2011229784A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 26-03-2015

DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Number: AU2014201000B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar. Basic idea: Improved method of initiating a call by voice recognition taking the call history with the call recipient or call destination into account when performing the voice recognition process.

Publication date: 26-01-2012

DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Number: US20120022874A1
Assignee: Google Inc.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.

1. A system comprising: one or more computers; and
receiving an audio signal,
generating, for each of two or more items of contact information, an affinity score based on a frequency with which a user has previously communicated with a contact associated with the item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information,
inferring, for each of the items of contact information, a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information,
generating a communication initiation grammar that includes one or more transitions associated with each of the items of contact information, wherein, for each of the items of contact information, the one or more transitions associated with the item of contact information are weighted according to the probability inferred for the item of contact information,
performing speech recognition on the audio signal using the communication initiation grammar, to select a particular item of contact information, and
initiating the communication using the particular item of contact ...

Publication date: 13-03-2014

DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Number: AU2014201000A1
Assignee: Google LLC

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar. Basic idea: Improved method of initiating a call by voice recognition taking the call history with the call recipient or call destination into account when performing the voice recognition process.

Publication date: 29-10-2015

ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION

Number: AU2014202785B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 06-09-2011

Registering an event

Number: US0008015571B2
Assignee: Google Inc.

A computer-implemented method for registering an event includes detecting occurrence of at least one event to be registered in a sequence. The sequence is to have entries for occurred events, each of the entries being a number indicating at least one of the occurred events and being associated with an aggregation number reflecting a number of times the entry has been aggregated within the sequence. The method includes identifying a new entry for extending the sequence, the new entry comprising a first number corresponding to the detected at least one event. The method includes revising the sequence by adding the numbers of at least two entries whose respective aggregation numbers satisfy a criterion for aggregation. The method includes storing the revised sequence.
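
One possible reading of this scheme is sketched below. Each entry is a pair `(number, aggregation_number)`, ordered oldest first. The aggregation criterion used here (merge the two oldest entries once more than `max_per_level` entries share an aggregation number) is an assumption borrowed from exponential-histogram counters; the abstract only says that a criterion exists. `register_event` is a hypothetical name.

```python
def register_event(sequence, count=1, max_per_level=2):
    """Return a revised sequence after registering `count` new events."""
    seq = [tuple(e) for e in sequence]
    seq.append((count, 0))  # new entry, never aggregated so far
    level = 0
    while True:
        idx = [i for i, (_, agg) in enumerate(seq) if agg == level]
        if len(idx) <= max_per_level:
            break
        i, j = idx[0], idx[1]  # the two oldest entries at this level
        # Revise the sequence by adding the numbers of the two entries.
        merged = (seq[i][0] + seq[j][0], level + 1)
        seq = [e for k, e in enumerate(seq) if k not in (i, j)]
        seq.insert(i, merged)
        level += 1  # the merge may overflow the next level as well
    return seq
```

Under this assumed criterion the sequence grows roughly logarithmically with the number of events, while the sum of all entries still equals the total event count.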

Publication date: 17-04-2014

Geotagged environmental audio for enhanced speech recognition accuracy

Number: AU2011241065B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 01-11-2012

Speech and noise models for speech recognition

Number: AU2011267982A1

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

Publication date: 13-03-2014

GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY

Number: AU2014200999A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 30-04-2015

PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT

Number: AU2014201001B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 27-03-2014

Disambiguation of contact information using historical data

Number: AU2011255614B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar. Basic idea: Improved method of initiating a call by voice recognition taking the call history with the call recipient or call destination into account when performing the voice recognition process.

Publication date: 06-03-2014

Acoustic model adaptation using geographic information

Number: AU2011258531B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 04-03-2014

Speech and noise models for speech recognition

Number: US0008666740B2

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

Publication date: 05-06-2012

Predictive pre-recording of audio for voice input

Number: US0008195319B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 04-10-2012

Geotagged environmental audio for enhanced speech recognition accuracy

Number: AU2011241065A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 27-03-2014

Predictive pre-recording of audio for voice input

Number: AU2011229784B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 04-10-2012

Acoustic model adaptation using geographic information

Number: AU2011258531A1
Assignee: Google LLC

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 08-05-2012

Geotagged and weighted environmental audio for enhanced speech recognition accuracy

Number: US0008175872B2

Enhancing noisy speech recognition accuracy by receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, selecting a subset of geotagged audio signals and weighting each geotagged audio signal of the subset based on whether the respective audio signal was manually uploaded or automatically updated, generating a noise model for the particular geographic location using the subset of weighted geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.
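
The weighting step can be sketched as a weighted average over the selected subset. The abstract says only that the weight depends on whether a signal was uploaded manually or automatically; giving manual uploads the larger weight, and the values 2.0 and 1.0, are assumptions of this example. `weighted_noise_model` is a hypothetical name.

```python
def weighted_noise_model(samples, manual_weight=2.0, automatic_weight=1.0):
    """samples: list of (spectrum, uploaded_manually) pairs for one location."""
    acc = None
    total = 0.0
    for spectrum, manual in samples:
        w = manual_weight if manual else automatic_weight
        if acc is None:
            acc = [0.0] * len(spectrum)
        # Accumulate each spectrum scaled by its weight.
        acc = [a + w * s for a, s in zip(acc, spectrum)]
        total += w
    return [a / total for a in acc] if acc else []
```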

Publication date: 29-09-2011

PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT

Number: US20110238191A1
Assignee: GOOGLE INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 23-10-2012

Speech recognition using device docking context

Number: US0008296142B2

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for performing speech recognition using dock context. In one aspect, a method includes accessing audio data that includes encoded speech. Information that indicates a docking context of a client device is accessed, the docking context being associated with the audio data. A plurality of language models is identified. At least one of the plurality of language models is selected based on the docking context. Speech recognition is performed on the audio data using the selected language model to identify a transcription for a portion of the audio data.
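
Selecting a language model from the docking context might look like the toy sketch below. The context names, vocabularies and probabilities are invented for illustration, and scoring a candidate as acoustic score times language-model probability is a simplifying assumption.

```python
# Hypothetical per-context unigram language models.
MODELS = {
    "car_dock": {"navigate": 0.6, "email": 0.1},
    "desk_dock": {"navigate": 0.1, "email": 0.6},
}
DEFAULT_MODEL = {"navigate": 0.3, "email": 0.3}


def select_language_model(docking_context):
    # Fall back to a general model for unknown or undocked contexts.
    return MODELS.get(docking_context, DEFAULT_MODEL)


def transcribe(candidates, docking_context):
    """candidates: dict mapping word -> acoustic score. Returns the best word."""
    lm = select_language_model(docking_context)
    return max(candidates, key=lambda w: candidates[w] * lm.get(w, 0.01))
```

With equal acoustic scores, the same audio resolves to "navigate" in a car dock and to "email" in a desk dock.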

Publication date: 26-01-2012

GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY

Number: US20120022870A1
Assignee: GOOGLE, INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 05-11-2013

Determining content to be displayed

Number: US0008577181B1

A computer-implemented method for determining content to be displayed includes determining a first size aspect of first content that is to be presented in a graphical user interface. The method includes obtaining second content and causing the second content to be presented in the graphical user interface with the first content, wherein the second content has a second size aspect with a predefined relationship to the first size aspect. A system includes a size determining module determining a first size aspect of first content that is to be presented in a graphical user interface. The system includes a content module obtaining second content based on the first size aspect, the second content having a second size aspect with a predefined relationship to the first size aspect. The system includes a page generating module causing the second content to be presented in the graphical user interface with the first content.

Publication date: 23-04-2013

Predictive pre-recording of audio for voice input

Number: US0008428759B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 10-03-2010

Registering an event

Number: CN0101669094A

A computer-implemented method for registering an event includes detecting occurrence of at least one event to be registered in a sequence. The sequence is to have entries for occurred events, each of the entries being a number indicating at least one of the occurred events and being associated with an aggregation number reflecting a number of times the entry has been aggregated within the sequence. The method includes identifying a new entry for extending the sequence, the new entry comprising a first number corresponding to the detected at least one event. The method includes revising the sequence by adding the numbers of at least two entries whose respective aggregation numbers satisfy a criterion for aggregation. The method includes storing the revised sequence.
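The aggregation step described above can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not state the aggregation criterion, so the sketch assumes a hypothetical one (merge the two oldest entries whenever more than `max_same_level` entries share the same aggregation number). The names `Entry`, `register`, and `max_same_level` are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    count: int         # number of events this entry stands for
    aggregations: int  # how many times this entry has been aggregated


def register(sequence, new_count, max_same_level=2):
    """Return a revised sequence: append a new entry, then aggregate.

    Assumed criterion (hypothetical): when more than `max_same_level`
    entries share an aggregation number, the two oldest of them are
    replaced by one entry whose count is their sum.
    """
    seq = sequence + [Entry(new_count, 0)]  # new list; the input is not mutated
    level = 0
    while True:
        same = [i for i, e in enumerate(seq) if e.aggregations == level]
        if len(same) <= max_same_level:
            break
        a, b = same[0], same[1]  # the two oldest entries at this level
        merged = Entry(seq[a].count + seq[b].count, level + 1)
        seq = seq[:a] + [merged] + seq[a + 1:b] + seq[b + 1:]
        level += 1  # the merge may overflow the next level, so check it too
    return seq


seq = []
for _ in range(7):
    seq = register(seq, 1)
```

Under this assumed criterion the total event count is preserved while the sequence stays short, because older events are held in progressively coarser entries.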

Publication date: 18-06-2013

Acoustic model adaptation using geographic information

Number: US0008468012B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 26-01-2012

PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT

Number: US20120022675A1
Assignee: GOOGLE, INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes establishing, as input data, state data that references a state of a mobile device and sensor data that is sensed by one or more sensors of the mobile device, applying a rule or a probabilistic model to the input data, inferring, based on applying the rule or the probabilistic model to the input data, that a user of the mobile device is likely to initiate voice input, and invoking one or more functionalities of the mobile device in response to inferring that the user is likely to initiate voice input.

Publication date: 27-08-2013

Disambiguation of a spoken query term

Number: US0008521526B1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.
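The n-gram step above can be illustrated with a short sketch. It assumes word-level n-grams (the abstract also allows phonemes, syllables, letters, and characters) and a hypothetical weighting of `1 + frequency`; the abstract says only that the weight is based on the frequency. The function names are invented for the example.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the word n-grams of `text` as tuples, in order."""
    words = text.lower().split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


def ngram_weights(candidate, search_history, n=2):
    """Map each n-gram of a candidate transcription to (frequency, weight).

    `frequency` counts occurrences of the n-gram in the user's past
    queries; `weight = 1 + frequency` is an assumed formula.
    """
    history_counts = Counter(
        gram for query in search_history for gram in word_ngrams(query, n)
    )
    return {
        gram: (history_counts[gram], 1 + history_counts[gram])
        for gram in word_ngrams(candidate, n)
    }


history = ["pizza near me", "best pizza near me"]
good = ngram_weights("pizza near me", history)
bad = ngram_weights("pisa near me", history)
```

A candidate whose n-grams recur in the user's history ("pizza near") collects higher weights than an acoustically similar one that never appeared ("pisa near").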

Publication date: 24-09-2013

Live experiment framework

Number: US0008543645B1
Assignee: Google Inc.

This disclosure generally relates to assigning and simultaneously running multiple client-side experiments on client devices. A file includes information regarding experiments that are available, including information regarding "layers," which are logical, imaginary containers in which each experiment "resides." Each experiment is associated with one layer. For each experiment, the file includes information regarding a location and size of the experiment within the layer. When the client device takes an action, a software module identifies a value of an identifier associated with the action. Each such identifier is associated with one or more of the layers. The software module can calculate, for each of the associated layers, a location within the layer based on the identifier value. The computer software module can identify, based on the information in the file, each experiment that overlaps with the calculated location within each layer and cause each identified experiment to be activated ...
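The layer lookup described above might look like the following sketch. The layer size, the dictionary fields, and the use of SHA-256 to turn an identifier value into a location are all assumptions made for illustration; the abstract does not say how the location is calculated.

```python
import hashlib

LAYER_SIZE = 1000  # assumed number of slots per layer


def location_in_layer(identifier_value, layer):
    """Map an identifier value to a stable slot in [0, LAYER_SIZE).

    Hashing the layer name together with the identifier (an assumption)
    keeps assignments in different layers independent of each other.
    """
    digest = hashlib.sha256(f"{layer}:{identifier_value}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % LAYER_SIZE


def active_experiments(identifier_value, layers, experiments):
    """Return names of experiments whose range covers the computed location."""
    active = []
    for layer in layers:
        loc = location_in_layer(identifier_value, layer)
        for exp in experiments:
            in_range = exp["start"] <= loc < exp["start"] + exp["size"]
            if exp["layer"] == layer and in_range:
                active.append(exp["name"])
    return active


# Hypothetical experiment file contents: location and size within a layer.
EXPERIMENTS = [
    {"name": "ui_a", "layer": "ui", "start": 0, "size": 500},
    {"name": "ui_b", "layer": "ui", "start": 500, "size": 500},
    {"name": "net_full", "layer": "net", "start": 0, "size": 1000},
]

active = active_experiments("user-42", ["ui", "net"], EXPERIMENTS)
```

Because experiments in the same layer occupy disjoint ranges here, a given identifier activates at most one experiment per layer while still taking part in experiments from several layers at once.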

Publication date: 11-10-2012

Speech and Noise Models for Speech Recognition

Number: US20120259631A1
Assignee: GOOGLE INC.

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
1. A system comprising: one or more processing devices; and one or more storage devices storing instructions that, when executed by the one or more processing devices, cause the system to: receive an audio signal generated by a device based on audio input from a user, the audio signal including at least background audio and one or more user utterances recorded by the device; determine a location of the user when the one or more utterances are recorded; select a noise model from a plurality of noise models; adapt the selected noise model based on the received audio signal to generate an adapted noise model that models characteristics of background audio surrounding the user at the location; and store the adapted noise model as a noise model for the user in association with the determined location such that the adapted noise model is used for speech recognition when utterances of the user are recorded at the determined location.
2. The system of claim 1, wherein the instructions further comprise instructions that, when executed, cause the system to generate a filtered audio signal with reduced background ...

Publication date: 24-11-2011

Registering an Event

Number: US20110289516A1
Assignee: Google Inc.

A computer-implemented method for registering an event includes detecting occurrence of at least one event to be registered in a sequence. The sequence is to have entries for occurred events, each of the entries being a number indicating at least one of the occurred events and being associated with an aggregation number reflecting a number of times the entry has been aggregated within the sequence. The method includes identifying a new entry for extending the sequence, the new entry comprising a first number corresponding to the detected at least one event. The method includes revising the sequence by adding the numbers of at least two entries whose respective aggregation numbers satisfy a criterion for aggregation. The method includes storing the revised sequence.

Publication date: 26-07-2012

SPEECH RECOGNITION USING DOCK CONTEXT

Number: US20120191449A1
Assignee: GOOGLE INC.

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for performing speech recognition using dock context. In one aspect, a method includes accessing audio data that includes encoded speech. Information that indicates a docking context of a client device is accessed, the docking context being associated with the audio data. A plurality of language models is identified. At least one of the plurality of language models is selected based on the docking context. Speech recognition is performed on the audio data using the selected language model to identify a transcription for a portion of the audio data.
1-5. (canceled)
6. A computer-implemented method, comprising: accessing first audio data that includes encoded speech; accessing information that indicates a first docking context of a client device, the first docking context being associated with the first audio data; identifying a plurality of language models; determining that the first docking context indicates docking of the client device with a first docking station of a first type; selecting at least a first language model of the plurality of language models based on determining that the first docking context indicates docking of the client device with the first docking station of the first type; performing speech recognition on the first audio data using the selected first language model to identify a transcription for a portion of the first audio data; accessing second audio data that includes encoded speech; accessing information that indicates a second docking context of the client device, the second docking context being associated with the second audio data; determining that the second docking context indicates docking of the client device with a second docking station of a second type, the second type being different from the first type; selecting at least a second language model of the plurality of language models based on determining that the second docking context ...

Publication date: 08-04-2014

Disambiguation of contact information using historical data

Number: US0008694313B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.
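One way to picture an affinity score that reflects both frequency and recency is sketched below. The exponential decay, the 30-day half-life, and the normalization into probabilities are assumptions chosen for illustration; the abstract does not give a formula.

```python
def affinity(interaction_ages_days, half_life_days=30.0):
    """Sum of decayed interactions: more interactions (frequency) and
    younger interactions (recency) both raise the score.
    The half-life decay is an assumed formula.
    """
    return sum(0.5 ** (age / half_life_days) for age in interaction_ages_days)


def contact_probabilities(history):
    """Turn per-item affinity scores into probabilities that sum to 1.

    `history` maps an item of contact information to the ages (in days)
    of past interactions that used it.
    """
    scores = {item: affinity(ages) for item, ages in history.items()}
    total = sum(scores.values())
    if total == 0:
        # No usable history: fall back to a uniform distribution.
        return {item: 1.0 / len(scores) for item in scores}
    return {item: score / total for item, score in scores.items()}


# Hypothetical history for three similar-sounding items.
probs = contact_probabilities({
    "Bob (mobile)": [1, 2, 3],
    "Bob (work)": [200],
    "Rob (mobile)": [],
})
```

The resulting probabilities could then be used to weight the corresponding transitions in a communication initiation grammar, so that "call Bob" favors the number the user actually dials.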

Publication date: 05-04-2012

PROGRESSIVE ENCODING OF AUDIO

Number: US20120084089A1
Assignee: GOOGLE INC.

The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.

Publication date: 21-08-2012

Speech and noise models for speech recognition

Number: US0008249868B2

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

Publication date: 14-08-2012

Performing an estimation on content to be presented

Number: US0008245130B1

A computer-implemented method for performing an estimation on content to be presented includes parsing content that is to be presented in a graphical user interface. The method includes estimating, based on the parsing, a size aspect that the content will have when presented in the graphical user interface. The method includes recording the estimated size aspect. A system includes a parser parsing content that is to be presented in a graphical user interface. The system includes an estimation module estimating, based on the parsing, a size aspect that the content will have when presented in the graphical user interface. The system records the estimated size aspect.

Publication date: 20-10-2011

GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY

Number: US20110257974A1
Assignee: Google Inc.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 01-04-2014

Disambiguation of contact information using historical and context data

Number: US0008688450B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.

Publication date: 19-03-2013

Registering an event

Number: US0008402476B2

A computer-implemented method for registering an event includes detecting occurrence of at least one event to be registered in a sequence. The sequence is to have entries for occurred events, each of the entries being a number indicating at least one of the occurred events and being associated with an aggregation number reflecting a number of times the entry has been aggregated within the sequence. The method includes identifying a new entry for extending the sequence, the new entry comprising a first number corresponding to the detected at least one event. The method includes revising the sequence by adding the numbers of at least two entries whose respective aggregation numbers satisfy a criterion for aggregation. The method includes storing the revised sequence.

Publication date: 08-01-2013

Adjusting language models

Number: US0008352246B1

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.
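The adjustment described above can be sketched with a toy unigram model. Everything concrete here is an assumption: contexts are represented as attribute dictionaries, similarity is the fraction of attributes that match, and the model is adjusted by adding `boost * similarity` to the term's probability and renormalizing. The abstract does not specify any of these choices.

```python
def context_similarity(first, second):
    """Fraction of context attributes on which the two contexts agree.

    This attribute-matching measure is an assumed stand-in for the
    similarity score mentioned in the abstract.
    """
    keys = set(first) | set(second)
    if not keys:
        return 0.0
    matches = sum(
        1 for k in keys if k in first and k in second and first[k] == second[k]
    )
    return matches / len(keys)


def adjust_unigram_model(model, term, similarity, boost=1.0):
    """Return a copy of `model` with `term` boosted by the similarity score."""
    adjusted = dict(model)
    adjusted[term] = adjusted.get(term, 0.0) + boost * similarity
    total = sum(adjusted.values())
    if total == 0:
        return adjusted  # nothing to normalize
    return {word: p / total for word, p in adjusted.items()}


# Hypothetical contexts: where and when the audio and the term were observed.
audio_context = {"city": "Boston", "hour": 20}
term_context = {"city": "Boston", "hour": 9}
score = context_similarity(audio_context, term_context)
model = adjust_unigram_model({"pizza": 0.5, "piazza": 0.5}, "pizza", score)
```

A term observed in a context close to the current one gains probability mass, which makes the recognizer more likely to choose it among similar-sounding candidates.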

Publication date: 08-01-2013

Adjusting language models

Number: US0008352245B1

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.

Publication date: 12-03-2013

Speech recognition using device docking context

Number: US0008396709B2

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for performing speech recognition using dock context. In one aspect, a method includes accessing audio data that includes encoded speech. Information that indicates a docking context of a client device is accessed, the docking context being associated with the audio data. A plurality of language models is identified. At least one of the plurality of language models is selected based on the docking context. Speech recognition is performed on the audio data using the selected language model to identify a transcription for a portion of the audio data.
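Selecting a language model from the docking context could be as simple as the lookup sketched below. The dock types, model names, weights, and the fallback to a general model are hypothetical values for illustration, not details taken from the patent.

```python
# Hypothetical stored weights: how well each language model is expected
# to perform for speech captured in a given docking context.
STORED_WEIGHTS = {
    "car_dock": {"navigation": 0.6, "media": 0.3, "general": 0.1},
    "desk_dock": {"media": 0.5, "general": 0.4, "navigation": 0.1},
}


def select_language_models(docking_context, top_k=1):
    """Return the names of the `top_k` highest-weighted language models.

    An unknown or absent docking context falls back to a general model
    (an assumption made for this sketch).
    """
    weights = STORED_WEIGHTS.get(docking_context, {"general": 1.0})
    return sorted(weights, key=weights.get, reverse=True)[:top_k]


in_car = select_language_models("car_dock")
undocked = select_language_models("undocked")
```

The selected model is then the one handed to the recognizer, so the same utterance can be transcribed differently in a car dock than at a desk.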

Publication date: 22-11-2012

PREDICTIVE PRE-RECORDING OF AUDIO FOR VOICE INPUT

Number: US20120296655A1
Assignee: GOOGLE, INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes obtaining sensor data from one or more sensors of a mobile device while the mobile device is operating in an inactive state, determining that a user of the mobile device is interacting with the mobile device based on the sensor data, invoking voice input functionality of the mobile device in response to determining that the user of the mobile device is interacting with the mobile device, detecting a voice input, and activating the mobile device in response to detecting the voice input.

Publication date: 11-02-2014

Position and orientation determination for a mobile computing device

Number: US0008648799B1

For multiple times in a time period, multiple data points can be received from an accelerometer and from a magnetometer that are included in a mobile computing device. For each of the data points, an orientation and a position of the mobile computing device can be determined based on an acceleration output and a magnetometer output that corresponds to the particular time. A trajectory is determined that represents movement of the mobile computing device during the time period based on the determined orientations and positions of the mobile computing device at the multiple times. Information that characterizes the trajectory is compared to stored information that characterizes a set of one or more base trajectories. Based on the comparison, an operation of the mobile computing device is identified that is associated with a trajectory included in the set of one or more base trajectories.
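The comparison against stored base trajectories can be sketched as a nearest-match search. The sketch starts from already-computed positions (it does not derive them from accelerometer and magnetometer output), and it assumes a mean point-to-point distance and a fixed acceptance threshold; the operation names are hypothetical.

```python
import math


def trajectory_distance(a, b):
    """Mean Euclidean distance between corresponding points.

    Assumes both trajectories are non-empty and sampled at the same times.
    """
    n = min(len(a), len(b))
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n


def identify_operation(trajectory, base_trajectories, max_distance=0.5):
    """Return the operation whose base trajectory is closest, or None
    if no base trajectory is within `max_distance` (assumed threshold).
    """
    best = min(
        base_trajectories,
        key=lambda name: trajectory_distance(trajectory, base_trajectories[name]),
    )
    if trajectory_distance(trajectory, base_trajectories[best]) > max_distance:
        return None
    return best


# Hypothetical base trajectories, as (x, y, z) positions in metres.
BASE = {
    "answer_call": [(0, 0, 0), (0, 0.1, 0.3), (0, 0.2, 0.6)],
    "take_photo": [(0, 0, 0), (0.3, 0, 0), (0.6, 0, 0)],
}

observed = [(0, 0, 0), (0, 0.1, 0.25), (0, 0.2, 0.55)]
operation = identify_operation(observed, BASE)
```

A production matcher would likely also compare orientations and tolerate differences in speed, which this sketch ignores.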

Publication date: 01-11-2012

DISAMBIGUATION OF CONTACT INFORMATION USING HISTORICAL DATA

Number: US20120278076A1
Assignee: GOOGLE INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information are described. A method includes determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.
1. A computer-implemented method comprising: determining, for each of multiple communications that were initiated by a user of a mobile device, a time when the communication was initiated or received; determining, for each of multiple contacts associated with the user, a probability associated with the contact based at least on the times when the communications were initiated or received; weighting a contact disambiguation grammar according to the probabilities; and processing audio data using the contact disambiguation grammar to select a particular contact.
2. The method of claim 1, comprising: determining, for each of the multiple communications that were initiated by the user of the mobile device, a date when the communication was initiated or received; and wherein the respective probability associated with each of the multiple contacts associated with the user is determined further based on the dates when the communications were initiated or received.
3. The method of claim 1, comprising: determining, for each of the multiple communications that were initiated by the user of the mobile device, a location of the mobile device when the communication was initiated or received; and wherein the respective probability associated with each of the multiple contacts associated with the user is ...

Publication date: 23-04-2013

Metadata-based weighting of geotagged environmental audio for enhanced speech recognition accuracy

Number: US0008428940B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance.
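A rough sketch of metadata-based weighting followed by noise compensation is given below. The metadata fields (distance and age), the weighting formula, and the use of plain spectral subtraction are all assumptions swapped in for illustration; the abstract does not name the metadata or the compensation technique.

```python
def metadata_weight(distance_km, age_days):
    """Assumed weighting: nearer and more recent recordings count more."""
    return 1.0 / ((1.0 + distance_km) * (1.0 + age_days))


def weighted_noise_spectrum(samples):
    """Weighted average of noise magnitude spectra.

    `samples` is a non-empty list of (spectrum, distance_km, age_days)
    tuples whose spectra all have the same number of bins.
    """
    weights = [metadata_weight(dist, age) for _, dist, age in samples]
    total = sum(weights)
    n_bins = len(samples[0][0])
    return [
        sum(w * s[0][k] for w, s in zip(weights, samples)) / total
        for k in range(n_bins)
    ]


def compensate(utterance_spectrum, noise_spectrum):
    """Spectral subtraction (a stand-in technique), floored at zero."""
    return [max(u - n, 0.0) for u, n in zip(utterance_spectrum, noise_spectrum)]


# Two hypothetical geotagged recordings with two frequency bins each.
noise = weighted_noise_spectrum([
    ([2.0, 4.0], 0.0, 0.0),    # recorded at the same spot, today
    ([10.0, 10.0], 1.0, 0.0),  # recorded 1 km away, today
])
cleaned = compensate([5.0, 5.0], noise)
```

The nearby recording dominates the noise estimate, so the compensation reflects the environment the utterance was most likely recorded in.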

Publication date: 26-07-2012

SPEECH RECOGNITION USING DOCK CONTEXT

Number: US20120191448A1
Assignee: GOOGLE INC.

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for performing speech recognition using dock context. In one aspect, a method includes accessing audio data that includes encoded speech. Information that indicates a docking context of a client device is accessed, the docking context being associated with the audio data. A plurality of language models is identified. At least one of the plurality of language models is selected based on the docking context. Speech recognition is performed on the audio data using the selected language model to identify a transcription for a portion of the audio data.
1. A computer-implemented method, comprising: receiving, at a server system, audio data that includes encoded speech, the encoded speech having been detected by a client device; receiving, at the server system, information that indicates a docking context of the client device while the speech encoded in the audio data was detected by the client device; identifying a plurality of language models, each of the plurality of language models indicating a probability of an occurrence of a term in a sequence of terms based on other terms in the sequence; for each of the plurality of language models, determining a weighting value to assign to the language model based on the docking context by accessing a stored weighting value associated with the docking context, the weighting value indicating a probability that using the language model will generate a correct transcription of the encoded speech; selecting at least one of the plurality of language models based on the assigned weighting values; and performing speech recognition on the audio data using the selected language model to identify a transcription for a portion of the audio data.
2. The computer-implemented method of claim 1, wherein the docking context indicates a type of docking station to which the client device was connected while the speech encoded in the audio data was ...

Publication date: 15-12-2011

Speech and Noise Models for Speech Recognition

Number: US20110307253A1
Assignee: GOOGLE INC.

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

Publication date: 16-08-2016

Disambiguation of a spoken query term

Number: US0009418177B1
Assignee: Google Inc.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for processing spoken query terms. In one aspect, a method includes performing speech recognition on an audio signal to select two or more textual, candidate transcriptions that match a spoken query term, and to establish a speech recognition confidence value for each candidate transcription, obtaining a search history for a user who spoke the spoken query term, where the search history references one or more past search queries that have been submitted by the user, generating one or more n-grams from each candidate transcription, where each n-gram is a subsequence of n phonemes, syllables, letters, characters, words or terms from a respective candidate transcription, and determining, for each n-gram, a frequency with which the n-gram occurs in the past search queries, and a weighting value that is based on the respective frequency.

Publication date: 13-08-2013

Progressive encoding of audio

Number: US0008509931B2

The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.

Publication date: 26-01-2012

ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION

Number: US20120022869A1
Assignee: GOOGLE, INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 26-02-2013

Disambiguation of contact information using historical data

Number: US0008386250B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disambiguating contact information. A method includes receiving an audio signal, generating an affinity score based on a frequency with which a user has previously communicated with a contact associated with an item of contact information, and further based on a recency of one or more past interactions between the user and the contact associated with the item of contact information, inferring a probability that the user intends to initiate a communication using the item of contact information based on the affinity score generated for the item of contact information, and generating a communication initiation grammar.

Publication date: 25-03-2014

Geotagged environmental audio for enhanced speech recognition accuracy

Number: US0008682659B2
Assignee: Google Inc.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance.

Publication date: 19-02-2014

Registering an event

Number: CN103593278A

A computer-implemented method for registering an event includes detecting occurrence of at least one event to be registered in a sequence. The sequence is to have entries for occurred events, each of the entries being a number indicating at least one of the occurred events and being associated with an aggregation number reflecting a number of times the entry has been aggregated within the sequence. The method includes identifying a new entry for extending the sequence, the new entry comprising a first number corresponding to the detected at least one event. The method includes revising the sequence by adding the numbers of at least two entries whose respective aggregation numbers satisfy a criterion for aggregation. The method includes storing the revised sequence.

Publication date: 24-07-2012

Content item arrangement

Number: US0008229915B1

A determination is made as to a first series of location rankings for one or more content item locations based on content item location data. Another determination is made as to a first series of content item rankings for one or more content items based on content item data. One of the content items having a first content item ranking is associated with one of the content item locations having a first location ranking.

Publication date: 31-07-2012

Speech and noise models for speech recognition

Number: US0008234111B2

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.
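The gating step in this abstract (adapt the user speech model only when background audio is below a threshold) can be sketched as follows. This is a minimal illustration, not the patented model: the speech model is reduced to a mean feature vector, and the threshold, the adaptation rate, and all names are assumptions.

```python
def maybe_adapt(model_mean, user_features, background_level,
                threshold=0.1, rate=0.2):
    """Adapt the user model only when the recording is clean enough.

    Returns (model, adapted_flag). The interpolation update is an assumption.
    """
    if background_level >= threshold:
        # Too noisy: keep the existing model untouched.
        return list(model_mean), False
    adapted = [(1.0 - rate) * m + rate * f
               for m, f in zip(model_mean, user_features)]
    return adapted, True


quiet_model, quiet_flag = maybe_adapt([1.0, 2.0], [2.0, 4.0], background_level=0.05)
noisy_model, noisy_flag = maybe_adapt([1.0, 2.0], [2.0, 4.0], background_level=0.5)
```

Skipping adaptation on noisy input keeps background audio from leaking into the model that is later used for noise compensation.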

Publication date: 26-01-2012

Speech and Noise Models for Speech Recognition

Number: US20120022860A1
Assignee: GOOGLE INC.

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

Publication date: 13-11-2012

Content item location arrangement

Number: US0008311875B1

One or more content items associated with a content property are identified, each of the one or more content items associated with one or more performance measures. A rank score is determined for each of the one or more content items. One or more locations are identified for display proximate to the one or more content items based on the rank score for each of the one or more content items, and one or more other content items are provided for display in each of the one or more content item locations.

Publication date: 01-12-2011

ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION

Number: US20110295590A1
Assignee: Google Inc.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 28-08-2012

Position and orientation determination for a mobile computing device

Number: US0008253684B1

For multiple times in a time period, multiple data points can be received from an accelerometer and from a magnetometer that are included in a mobile computing device. For each of the data points, an orientation and a position of the mobile computing device can be determined based on an acceleration output and a magnetometer output that corresponds to the particular time. A trajectory is determined that represents movement of the mobile computing device during the time period based on the determined orientations and positions of the mobile computing device at the multiple times. Information that characterizes the trajectory is compared to stored information that characterizes a set of one or more base trajectories. Based on the comparison, an operation of the mobile computing device is identified that is associated with a trajectory included in the set of one or more base trajectories.
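The comparison against stored base trajectories can be illustrated with a nearest-match lookup. The sketch below is an assumption-laden simplification: it compares positions only (the abstract also uses orientations), measures similarity as the mean point-to-point Euclidean distance, and rejects matches above a hypothetical threshold. The operation names and sample data are invented for illustration.

```python
import math


def trajectory_distance(a, b):
    """Mean Euclidean distance between corresponding points (assumed metric)."""
    n = min(len(a), len(b))
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n


def identify_operation(trajectory, base_trajectories, max_distance=0.5):
    """Return the operation of the closest base trajectory, or None."""
    best_op, best_d = None, float("inf")
    for operation, base in base_trajectories.items():
        d = trajectory_distance(trajectory, base)
        if d < best_d:
            best_op, best_d = operation, d
    return best_op if best_d <= max_distance else None


# Hypothetical base trajectories, as (x, y, z) positions in meters.
base_trajectories = {
    "answer_call": [(0.0, 0.0, 0.0), (0.0, 0.1, 0.2), (0.0, 0.2, 0.4)],
    "shake": [(0.0, 0.0, 0.0), (0.3, 0.0, 0.0), (-0.3, 0.0, 0.0)],
}
observed = [(0.0, 0.0, 0.0), (0.0, 0.1, 0.25), (0.0, 0.2, 0.35)]
matched = identify_operation(observed, base_trajectories)
unmatched = identify_operation([(5.0, 5.0, 5.0)] * 3, base_trajectories)
```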

Publication date: 10-07-2012

Acoustic model adaptation using geographic information

Number: US0008219384B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

Publication date: 05-04-2012

PROGRESSIVE ENCODING OF AUDIO

Number: US20120083910A1
Assignee: GOOGLE INC.

The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.
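To make the priority comparison concrete, here is a toy sketch under several assumptions that are not in the abstract: the first sub-set is a coarsely quantized copy of the signal, the second and third sub-sets are the residuals for the two halves of the signal, and priority is the total absolute residual. All names are hypothetical.

```python
def coarse(signal, step=0.5):
    """First sub-set: coarse quantization, giving a low-fidelity reconstruction."""
    return [round(x / step) * step for x in signal]


def refinement_subsets(signal, base):
    """Second and third sub-sets: residuals for each half, as (offset, values)."""
    residual = [x - b for x, b in zip(signal, base)]
    mid = len(residual) // 2
    return {"second": (0, residual[:mid]), "third": (mid, residual[mid:])}


def priority(subset):
    """Assumed priority: how much error the sub-set would remove."""
    _, values = subset
    return sum(abs(v) for v in values)


def refine(reconstruction, subset):
    """Apply a received refinement sub-set to the reconstruction."""
    offset, values = subset
    out = list(reconstruction)
    for i, v in enumerate(values):
        out[offset + i] += v
    return out


def error(signal, reconstruction):
    return sum(abs(x - r) for x, r in zip(signal, reconstruction))


signal = [0.1, 0.6, 1.3, 0.9]
base = coarse(signal)  # transmitted first
subsets = refinement_subsets(signal, base)
chosen = max(subsets, key=lambda name: priority(subsets[name]))
improved = refine(base, subsets[chosen])
```

Sending the higher-priority sub-set first gives the receiver the largest fidelity gain per transmission, which is the point of comparing the priorities before transmitting.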

Publication date: 22-11-2012

GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY

Number: US20120296643A1
Assignee: GOOGLE, INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance.

Publication date: 11-09-2012

Geotagged environmental audio for enhanced speech recognition accuracy

Number: US0008265928B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 06-08-2013

Predictive pre-recording of audio for voice input

Number: US0008504185B2

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for providing predictive pre-recording of audio for voice input. In one aspect, a method includes obtaining sensor data from one or more sensors of a mobile device while the mobile device is operating in an inactive state, determining that a user of the mobile device is interacting with the mobile device based on the sensor data, invoking voice input functionality of the mobile device in response to determining that the user of the mobile device is interacting with the mobile device, detecting a voice input, and activating the mobile device in response to detecting the voice input.

Publication date: 12-09-2013

GEOTAGGED ENVIRONMENTAL AUDIO FOR ENHANCED SPEECH RECOGNITION ACCURACY

Number: US20130238325A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, identifying a set of geotagged audio signals that correspond to environmental audio associated with the geographic location, weighting each geotagged audio signal of the set of geotagged audio signals based on metadata associated with the respective geotagged audio signal, and using the set of weighted geotagged audio signals to perform noise compensation on the audio signal that corresponds to the utterance.

1. A system comprising: one or more computers; and a computer-readable medium coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising: receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, and performing noise compensation on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.
2. The system of claim 1, wherein the operations further comprise performing speech recognition on the utterance using the noise-compensated audio signal.
3. The system of claim 1, wherein generating the noise model further comprises generating the noise model before receiving the audio signal that corresponds to the utterance.
4. The system of claim 1, ...

Publication date: 07-11-2013

ACOUSTIC MODEL ADAPTATION USING GEOGRAPHIC INFORMATION

Number: US20130297313A1

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.

1. A system comprising: one or more computers; and one or more computer-readable media coupled to the one or more computers having instructions stored thereon which, when executed by the one or more computers, cause the one or more computers to perform operations comprising: receiving an audio signal that corresponds to an utterance recorded by a mobile device, determining a geographic location associated with the mobile device, adapting one or more acoustic models for the geographic location, and performing speech recognition on the audio signal using the one or more acoustic models that are adapted for the geographic location.
2. The system of claim 1, wherein adapting one or more acoustic models further comprises adapting one or more acoustic models before receiving the audio signal that corresponds to the utterance.
3. The system of claim 1, wherein adapting one or more acoustic models further comprises adapting one or more acoustic models after receiving the audio signal that corresponds to the utterance.
4. The system of claim 1, wherein: the operations further comprise receiving geotagged audio signals that correspond to audio recorded by multiple mobile devices in multiple geographic locations; and adapting one or more acoustic models for the geographic location further comprises adapting one or more acoustic models for the geographic location using a subset of the geotagged audio signals.
5. The system of claim 4 ...

Publication date: 24-09-2015

ADJUSTING LANGUAGE MODELS

Number: US20150269938A1
Inventor: Lloyd Matthew I.

Methods, systems, and apparatuses, including computer programs encoded on a computer storage medium, for adjusting language models. In one aspect, a method includes accessing audio data. Information that indicates a first context is accessed, the first context being associated with the audio data. At least one term is accessed. Information that indicates a second context is accessed, the second context being associated with the term. A similarity score is determined that indicates a degree of similarity between the second context and the first context. A language model is adjusted based on the accessed term and the determined similarity score to generate an adjusted language model. Speech recognition is performed on the audio data using the adjusted language model to select one or more candidate transcriptions for a portion of the audio data.

1-20. (canceled)
21. A method performed by data processing apparatus, the method comprising: receiving audio data and context data associated with the audio data; identifying a topic based on a comparison of the received context data with second context data; adjusting a language model based on the identified topic to adjust a likelihood that the language model indicates for one or more terms associated with the topic; determining a transcription of the audio data using the adjusted language model; and outputting the transcription determined using the adjusted language model.
22. The method of claim 21, wherein identifying the topic based on the comparison of the received context data with second context data includes: identifying the topic based on a comparison of the received context data with context data associated with a particular term that is associated with the topic.
23. The method of claim 22, wherein identifying the topic based on the comparison of the received context data with context data associated with the particular term that is associated with the topic includes: identifying the topic based on a comparison of the ...
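The abstract's similarity-driven adjustment can be illustrated with a toy unigram model. Everything specific here is an assumption rather than the patented method: contexts are sets of feature strings, similarity is the Jaccard index, and the term's probability is scaled by `1 + boost * similarity` before renormalizing. The names and sample data are hypothetical.

```python
def jaccard(first_context, second_context):
    """Similarity between two contexts, each a set of feature strings."""
    a, b = set(first_context), set(second_context)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def adjust_language_model(unigram_probs, term, similarity, boost=4.0):
    """Raise the term's probability in proportion to context similarity."""
    adjusted = dict(unigram_probs)
    adjusted[term] = adjusted.get(term, 0.0) * (1.0 + boost * similarity)
    total = sum(adjusted.values())
    return {word: p / total for word, p in adjusted.items()}


# Hypothetical contexts: the audio was recorded in a car dock; the term
# "navigate" was previously observed in a similar context.
audio_context = {"docked:car", "moving"}
term_context = {"docked:car", "moving", "gps_on"}
similarity = jaccard(term_context, audio_context)

language_model = {"navigate": 0.2, "negotiate": 0.3, "other": 0.5}
adjusted = adjust_language_model(language_model, "navigate", similarity)
```

With the boost applied, the acoustically similar candidates "navigate" and "negotiate" swap rank, which is the kind of effect a context-adjusted model would have on candidate transcriptions.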

Publication date: 10-07-2019

Disambiguation of contact information using historical data

Number: EP3313055B1
Assignee: Google LLC

Publication date: 24-04-2019

Speech recognition using device docking context

Number: EP2666159B1
Assignee: Google LLC

Publication date: 20-10-2011

Geotagged environmental audio for enhanced speech recognition accuracy

Number: WO2011129954A1
Assignee: GOOGLE INC.

Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for enhancing speech recognition accuracy. In one aspect, a method includes receiving geotagged audio signals that correspond to environmental audio recorded by multiple mobile devices in multiple geographic locations, receiving an audio signal that corresponds to an utterance recorded by a particular mobile device, determining a particular geographic location associated with the particular mobile device, generating a noise model for the particular geographic location using a subset of the geotagged audio signals, where noise compensation is performed on the audio signal that corresponds to the utterance using the noise model that has been generated for the particular geographic location.

Publication date: 17-04-2013

Speech and noise models for speech recognition

Number: EP2580751A1
Assignee: Google LLC

An audio signal generated by a device based on audio input from a user may be received. The audio signal may include at least a user audio portion that corresponds to one or more user utterances recorded by the device. A user speech model associated with the user may be accessed and a determination may be made that background audio in the audio signal is below a defined threshold. In response to determining that the background audio in the audio signal is below the defined threshold, the accessed user speech model may be adapted based on the audio signal to generate an adapted user speech model that models speech characteristics of the user. Noise compensation may be performed on the received audio signal using the adapted user speech model to generate a filtered audio signal with reduced background audio compared to the received audio signal.

Publication date: 31-01-2024

Noise compensation using geotagged audio signals

Number: EP3923281B1
Assignee: Google LLC

Publication date: 14-06-2012

Progressive encoding of audio

Number: WO2012050784A3
Assignee: GOOGLE INC.

The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.

Publication date: 19-04-2012

Progressive encoding of audio

Number: WO2012050784A2
Assignee: GOOGLE INC.

The present disclosure includes processing a signal to generate a first sub-set of data, transmitting the first sub-set of data for generation of a reconstructed audio signal, the reconstructed audio signal having a fidelity relative to the signal, processing the signal to generate a second sub-set of data and a third sub-set of data, the second sub-set of data defining a second portion of the signal and comprising data that is different than data of the first sub-set of data, and the third sub-set of data defining a third portion of the signal and comprising data that is different than data of the first and second sub-sets of data, comparing a priority of the second sub-set of data to a priority of the third sub-set of data, and transmitting one of the second sub-set of data and the third sub-set of data over the network for improving the fidelity.
