Total found: 3. Displayed: 3.

Publication date: 16-06-2016

SYSTEMS AND METHODS FOR SPEECH TRANSCRIPTION

Number: US20160171974A1
Assignee: Baidu USA LLC

Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary nor even the concept of a “phoneme” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained. Embodiments of the system can also handle challenging noisy environments better than widely used, state-of-the-art commercial speech systems.

1. A computer-implemented method for training a transcription model, the method comprising:
   for each of a set of utterances:
      inputting an utterance that comprises a set of spectrogram frames into a first layer of the transcription model that evaluates each of the spectrogram frames from the set of spectrogram frames with a context of one or more spectrogram frames;
      outputting from the transcription model a predicted character or character probabilities for the utterance; and
      computing a loss to measure error in prediction for the utterance;
   evaluating a gradient of predicted outputs of the transcription model given the ground-truth characters; and
   updating the neural network model using back-propagation.
2. The computer-implemented method of further comprising: jittering at least some of the set of utterances prior to inputting into the transcription model.
3. The computer-implemented method of wherein the step of jittering at least some of the ...
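
The first-layer input described in claim 1, where each spectrogram frame is evaluated together with a context of neighbouring frames, can be illustrated with a minimal sketch. The function name `stack_context`, the symmetric window, and the zero padding at the utterance edges are assumptions made for illustration; the claim text does not specify them.

```python
from typing import List

Frame = List[float]


def stack_context(frames: List[Frame], context: int) -> List[Frame]:
    """Concatenate each frame with `context` frames on either side.

    Assumption: positions beyond the start or end of the utterance
    are filled with all-zero frames.
    """
    if context < 0:
        raise ValueError("context must be non-negative")
    if not frames:
        return []
    zero = [0.0] * len(frames[0])
    stacked: List[Frame] = []
    for t in range(len(frames)):
        window: Frame = []
        for offset in range(-context, context + 1):
            i = t + offset
            # Use the real frame when it exists, otherwise the zero pad.
            window.extend(frames[i] if 0 <= i < len(frames) else zero)
        stacked.append(window)
    return stacked


if __name__ == "__main__":
    utterance = [[1.0], [2.0], [3.0]]  # three one-bin "spectrogram" frames
    print(stack_context(utterance, context=1))
```

With a context of 1, every output row holds three frames' worth of features, so the first layer sees each time step alongside its immediate neighbours.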

Publication date: 05-12-2019

DEEP LEARNING MODELS FOR SPEECH RECOGNITION

Number: US20190371298A1
Assignee: Baidu USA LLC

Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary nor even the concept of a “phoneme” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained. Embodiments of the system can also handle challenging noisy environments better than widely used, state-of-the-art commercial speech systems.

1. A computer-implemented method for training a transcription neural network, the method comprising:
   inputting an utterance that comprises a set of spectrogram frames covering time steps of the utterance into a first layer of the transcription neural network that evaluates, for each time step of a set of time steps, a spectrogram frame from the set of spectrogram frames and an associated context of one or more spectrogram frames;
   obtaining predicted character probabilities for the utterance from the transcription neural network;
   using the predicted character probabilities for the utterance and a corresponding ground truth transcription for the utterance to determine a loss in predicting the corresponding ground truth transcription for the utterance; and
   updating one or more parameters of the transcription neural network using a gradient based upon the loss in predicting the utterance.
2. The computer-implemented method of further comprising: jittering ...
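
The loop in claim 1 (predict character probabilities, compute a loss against the ground truth, update parameters from the gradient) can be sketched with a deliberately tiny stand-in model. This is not the patented network: it assumes a single linear layer with a per-frame softmax and frame-aligned character labels, whereas the claim only requires a loss against the ground truth transcription. The names `train_step`, `weights`, and `lr` are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one time step."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_step(weights: List[List[float]], frames: List[List[float]],
               targets: List[int], lr: float = 0.1) -> float:
    """Run one gradient step in place; return the mean loss before the update.

    Assumption: targets[t] is the character index for frame t, so the loss
    is plain per-frame cross-entropy.
    """
    n_chars, n_feat = len(weights), len(weights[0])
    grad = [[0.0] * n_feat for _ in range(n_chars)]
    loss = 0.0
    for x, y in zip(frames, targets):
        # Predicted character probabilities for this time step.
        logits = [sum(w * v for w, v in zip(row, x)) for row in weights]
        probs = softmax(logits)
        loss -= math.log(probs[y])
        # Gradient of cross-entropy w.r.t. the logits is (probs - one_hot).
        for c in range(n_chars):
            err = probs[c] - (1.0 if c == y else 0.0)
            for j in range(n_feat):
                grad[c][j] += err * x[j]
    n = len(frames)
    # Update the parameters using the averaged gradient.
    for c in range(n_chars):
        for j in range(n_feat):
            weights[c][j] -= lr * grad[c][j] / n
    return loss / n


if __name__ == "__main__":
    w = [[0.0, 0.0], [0.0, 0.0]]          # 2 characters, 2 features
    x = [[1.0, 0.0], [0.0, 1.0]]          # two toy frames
    y = [0, 1]                            # assumed per-frame labels
    for step in range(3):
        print(step, train_step(w, x, y))
```

Repeated calls on the same toy utterance should show the loss decreasing, which is the behaviour the gradient-based update in the claim is meant to produce.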

Publication date: 21-01-2020

Systems and methods for speech transcription

Number: US10540957B2
Assignee: Baidu USA LLC

Presented herein are embodiments of state-of-the-art speech recognition systems developed using end-to-end deep learning. In embodiments, the model architecture is significantly simpler than traditional speech systems, which rely on laboriously engineered processing pipelines; these traditional systems also tend to perform poorly when used in noisy environments. In contrast, embodiments of the system do not need hand-designed components to model background noise, reverberation, or speaker variation, but instead directly learn a function that is robust to such effects. Neither a phoneme dictionary nor even the concept of a “phoneme” is needed. Embodiments include a well-optimized recurrent neural network (RNN) training system that can use multiple GPUs, as well as a set of novel data synthesis techniques that allows for a large amount of varied data for training to be efficiently obtained. Embodiments of the system can also handle challenging noisy environments better than widely used, state-of-the-art commercial speech systems.
