Patent search

Total found: 103. Displayed: 103.

Publication date: 31-05-2011

Shape and scale parameters for extended-band frequency coding

Number: US0007953604B2

Author: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida, Chao He

Assignee: Microsoft Corporation

An audio encoder performs frequency extension coding that comprises determining one or more shape parameters using a displacement vector that corresponds to a displacement of an even number (e.g., an even number of sub-bands between a sub-band in a baseband frequency range and a sub-band in an extended-band frequency range). The shape parameters can be determined on a per-audio-block basis. Restricting a displacement to an even number (in frequency extension coding or in other signal modulation schemes) can improve the quality of reconstructed audio. An audio encoder also can perform frequency extension coding that comprises determining one or more scale parameters at one or more audio blocks, and determining one or more anchor points for interpolating the one or more scale parameters.

Record number: 1

Publication date: 11-07-2019

CONTINUOUS MOTION CONTROLS OPERABLE USING NEUROLOGICAL DATA

Number: US20190212810A1

Author: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida

Assignee: Microsoft Technology Licensing LLC

Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.

Record number: 2

Publication date: 10-08-2010

Coding of sparse digital media spectral data

Number: US0007774205B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.
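The non-predicted case described in this abstract can be illustrated with a minimal sketch. It assumes quantized integer coefficients and treats each peak as two adjacent coefficients, so every peak becomes a trio of (zero-run length, first level, second level). The function names and the choice to carry the block length alongside the trios are hypothetical, and the temporal prediction of peak positions is not modeled here.

```python
from typing import List, Tuple

Trio = Tuple[int, int, int]  # (zero_run, level_a, level_b)


def encode_trios(coeffs: List[int]) -> Tuple[List[Trio], int]:
    """Encode a sparse coefficient block as trios plus the block length."""
    trios: List[Trio] = []
    run = 0
    i = 0
    n = len(coeffs)
    while i < n:
        if coeffs[i] == 0:
            run += 1
            i += 1
        else:
            # Simplification: pair the non-zero value with its right neighbour,
            # even if that neighbour happens to be zero.
            level_a = coeffs[i]
            level_b = coeffs[i + 1] if i + 1 < n else 0
            trios.append((run, level_a, level_b))
            run = 0
            i += 2
    return trios, n


def decode_trios(trios: List[Trio], length: int) -> List[int]:
    """Rebuild the coefficient block; trailing zeros are implied by length."""
    out: List[int] = []
    for run, level_a, level_b in trios:
        out.extend([0] * run)
        out.extend([level_a, level_b])
    out.extend([0] * (length - len(out)))
    return out[:length]


block = [0, 0, 0, 5, -3, 0, 0, 0, 0, 2, 1, 0]
trios, length = encode_trios(block)
restored = decode_trios(trios, length)
```

In this sketch two trios replace twelve coefficients, which is the kind of saving the abstract attributes to avoiding long explicit zero runs.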

Record number: 3

Publication date: 01-01-2008

LPC-harmonic vocoder with superframe structure

Number: US0007315815B1

Author: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida

Assignee: Microsoft Corporation

An enhanced low-bit rate parametric voice coder that groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be directly synthesized to speech or may be transcoded to a format so that an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by the communication of bit errors.

Record number: 4

Publication date: 23-02-2010

Audio encoding and decoding with intra frames and adaptive forward error correction

Number: US0007668712B2

Author: Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han

Assignee: Microsoft Corporation

Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [FEC], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.

Record number: 5

Publication date: 31-03-2015

Query and matching for content recognition

Number: US0008996557B2

Author: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher

Assignee: Microsoft Technology Licensing, LLC, Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher

Various embodiments enable audio data, such as music data, to be captured, by a device, from a background environment and processed to formulate a query that can then be transmitted to a content recognition service. In one or more embodiments, multiple queries are transmitted to the content recognition service. In at least some embodiments, subsequent queries can progressively incorporate previous queries plus additional data that is captured. In one or more embodiments, responsive to receiving the query, the content recognition service can employ a multi-stage matching technique to identify content items responding to the query. This matching technique can be employed as queries are progressively received.

Record number: 6

Publication date: 04-02-2014

Bitstream syntax for multi-process audio decoding

Number: US0008645146B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation, Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 7

Publication date: 20-05-2010

AUDIO ENCODING AND DECODING WITH INTRA FRAMES AND ADAPTIVE FORWARD ERROR CORRECTION

Number: US20100125455A1

Author: Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han

Assignee: Microsoft Corporation

Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [FEC], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.

Record number: 8

Publication date: 18-12-2008

CODING OF SPARSE DIGITAL MEDIA SPECTRAL DATA

Number: US20080312758A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.

Record number: 9

Publication date: 22-11-2012

Background Audio Listening for Content Recognition

Number: US20120296458A1

Author: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher

Assignee: Microsoft Corporation

Various embodiments enable audio data, such as music data, to be captured, by a device, from a background environment and processed to formulate a query that can then be transmitted to a content recognition service. In one or more embodiments, the audio data is captured prior to receiving user input associated with audio data capture, e.g., launch of an application associated with the content recognition service, provision of user input proactively indicating that audio data capture is desired, and the like. Responsive to transmitting the query, displayable information associated with the audio data is returned by the content recognition service and can be consumed by the device.

Record number: 10

Publication date: 22-11-2012

Query and Matching for Content Recognition

Number: US20120296938A1

Author: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher

Assignee: Microsoft Corporation

Various embodiments enable audio data, such as music data, to be captured, by a device, from a background environment and processed to formulate a query that can then be transmitted to a content recognition service. In one or more embodiments, multiple queries are transmitted to the content recognition service. In at least some embodiments, subsequent queries can progressively incorporate previous queries plus additional data that is captured. In one or more embodiments, responsive to receiving the query, the content recognition service can employ a multi-stage matching technique to identify content items responding to the query. This matching technique can be employed as queries are progressively received.

Record number: 11

Publication date: 14-07-2009

Modification of codewords in dictionary used for efficient coding of digital media spectral data

Number: US0007562021B2

Author: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida

Assignee: Microsoft Corporation

Coding of spectral data by representing certain portions of the spectral data as a scaled version of a code-vector, where the code-vector is chosen from either a fixed predetermined codebook or a codebook taken from a baseband. Various optional features are described for modifying the code-vectors in the codebook according to some rules which allow the code-vector to better represent the data they are modeling. The code-vector modification comprises a linear or non-linear transform of one or more code-vectors, such as, by exponentiation, negation, reversing, or combining elements from plural code-vectors.
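A minimal sketch of the idea in this abstract: a band of spectral data is approximated as a scaled code-vector, and a few modified versions of that code-vector (negation, reversal, exponentiation) are tried to see which fits best. The transform set, the sign-preserving exponent of 0.5, the non-negative scale constraint, and all function names are assumptions made for illustration and are not taken from the patent.

```python
import math
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def sign_pow(vec: Vector, power: float) -> Vector:
    """Exponentiate magnitudes while keeping each element's sign."""
    return [math.copysign(abs(x) ** power, x) for x in vec]


# Hypothetical set of code-vector modifications.
TRANSFORMS: Dict[str, Callable[[Vector], Vector]] = {
    "identity": lambda v: list(v),
    "negate": lambda v: [-x for x in v],
    "reverse": lambda v: list(reversed(v)),
    "exp_0.5": lambda v: sign_pow(v, 0.5),
}


def best_scale(target: Vector, vec: Vector) -> float:
    """Least-squares scale factor, clamped to be non-negative (assumption)."""
    energy = sum(x * x for x in vec)
    if energy == 0:
        return 0.0
    return max(0.0, sum(t * x for t, x in zip(target, vec)) / energy)


def fit_code_vector(target: Vector, code_vector: Vector) -> Tuple[str, float, float]:
    """Return (transform name, scale, squared error) of the best fit."""
    best = None
    for name, transform in TRANSFORMS.items():
        vec = transform(code_vector)
        scale = best_scale(target, vec)
        error = sum((t - scale * x) ** 2 for t, x in zip(target, vec))
        if best is None or error < best[2]:
            best = (name, scale, error)
    return best


name, scale, error = fit_code_vector([8.0, 6.0, 4.0, 2.0], [1.0, 2.0, 3.0, 4.0])
```

Here the reversed code-vector scaled by 2 reproduces the target exactly, so an encoder built this way would only need to signal the transform choice and the scale.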

Record number: 12

Publication date: 18-01-2007

Modification of codewords in dictionary used for efficient coding of digital media spectral data

Number: US20070016414A1

Author: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida

Assignee: Microsoft Corporation

Coding of spectral data by representing certain portions of the spectral data as a scaled version of a code-vector, where the code-vector is chosen from either a fixed predetermined codebook or a codebook taken from a baseband. Various optional features are described for modifying the code-vectors in the codebook according to some rules which allow the code-vector to better represent the data they are modeling. The code-vector modification comprises a linear or non-linear transform of one or more code-vectors, such as, by exponentiation, negation, reversing, or combining elements from plural code-vectors.

Record number: 13

Publication date: 13-10-2005

Robust real-time speech codec

Number: US20050228651A1

Author: Tian Wang, Hosam Khalil, Kazuhito Koishida, Wei-Ge Chen, Mu Han

Assignee: Microsoft Corporation

Various strategies for rate/quality control and loss resiliency in an audio codec are described. The various strategies can be used in combination or independently. For example, a real-time speech codec uses intra frame coding/decoding, adaptive multi-mode forward error correction [“FEC”], and rate/quality control techniques. Intra frames help a decoder recover quickly from packet losses, while compression efficiency is still emphasized with predicted frames. Various strategies for inserting intra frames and signaling intra/predicted frames are described. With the adaptive multi-mode FEC, an encoder adaptively selects between multiple modes to efficiently and quickly provide a level of FEC that takes into account the bandwidth currently available for FEC. The FEC information itself may be predictively encoded and decoded relative to primary encoded information. Various rate/quality and FEC control strategies allow additional adaptation to available bandwidth and network conditions.

Record number: 14

Publication date: 15-09-2009

Robust decoder

Number: US0007590531B2

Author: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.
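The last example in this abstract mentions a data structure whose nodes correspond to received packet delays. One possible reading, sketched below purely as an assumption, keeps a sliding window of recent delays in sorted order and picks the smallest playout delay that would leave at most a given percentage of packets arriving late. The class name, window size, and percentage rule are hypothetical and not drawn from the patent.

```python
import bisect
from collections import deque
from typing import Deque, List, Optional


class DelayTracker:
    """Sliding window of packet delays kept in sorted order (illustrative only)."""

    def __init__(self, window: int = 100, late_percent: int = 5) -> None:
        self.window = window
        self.late_percent = late_percent
        self.recent: Deque[int] = deque()   # arrival order, for eviction
        self.ordered: List[int] = []        # same delays, sorted

    def add(self, delay_ms: int) -> None:
        self.recent.append(delay_ms)
        bisect.insort(self.ordered, delay_ms)
        if len(self.recent) > self.window:
            oldest = self.recent.popleft()
            del self.ordered[bisect.bisect_left(self.ordered, oldest)]

    def desired_delay(self) -> Optional[int]:
        """Smallest delay that leaves at most late_percent of packets late."""
        if not self.ordered:
            return None
        allowed_late = len(self.ordered) * self.late_percent // 100
        return self.ordered[len(self.ordered) - 1 - allowed_late]


tracker = DelayTracker(window=100, late_percent=10)
for delay in range(10, 101, 10):  # 10 ms, 20 ms, ..., 100 ms
    tracker.add(delay)
target = tracker.desired_delay()
```

With ten observed delays and a 10 percent late budget, the sketch tolerates one late packet and therefore selects 90 ms.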

Record number: 15

Publication date: 09-10-2007

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US0007280960B2

Author: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 16

Publication date: 01-10-2009

LOSSLESS AND NEAR LOSSLESS SCALABLE AUDIO CODEC

Number: US20090248424A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Radhika Jandhyala

Assignee: Microsoft Corporation

A scalable audio codec encodes an input audio signal as a base layer at a high compression ratio and one or more residual signals as an enhancement layer of a compressed bitstream, which permits a lossless or near lossless reconstruction of the input audio signal at decoding. The scalable audio codec uses perceptual transform coding to encode the base layer. The residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio. For lossless reconstruction, the frequency and multi-channel transforms are reversible.
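The base-plus-residual layering described here can be sketched with integer arithmetic. The example assumes the coefficients already come from a reversible integer transform and uses a plain uniform quantizer as a stand-in for the perceptual base-layer coder; the step size and function names are hypothetical.

```python
from typing import List, Optional, Tuple


def encode_layers(coeffs: List[int], step: int) -> Tuple[List[int], List[int]]:
    """Split integer coefficients into a coarse base layer and a residual."""
    # Integer rounding to the nearest multiple of step (no floating point).
    base = [(c + step // 2) // step for c in coeffs]
    residual = [c - q * step for c, q in zip(coeffs, base)]
    return base, residual


def decode_layers(base: List[int], step: int,
                  residual: Optional[List[int]] = None) -> List[int]:
    """Decode the base layer alone (lossy) or with the residual (lossless)."""
    recon = [q * step for q in base]
    if residual is not None:
        recon = [r + e for r, e in zip(recon, residual)]
    return recon


coeffs = [103, -47, 12, 0, 250]
base, residual = encode_layers(coeffs, step=16)
lossy = decode_layers(base, 16)
lossless = decode_layers(base, 16, residual)
```

Decoding the base layer alone gives an approximation, while adding the residual restores the input exactly, which works only because every step is integer and reversible.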

Record number: 17

Publication date: 30-11-2006

Audio codec post-filter

Number: US20060271354A1

Author: Xiaoqin Sun, Tian Wang, Hosam Khalil, Kazuhito Koishida, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.

Record number: 18

Publication date: 01-01-2009

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Number: US20090006103A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 19

Publication date: 11-11-2003

Rate control strategies for speech and music coding

Number: US0006647366B2

Author: Tian Wang, Kazuhito Koishida, Vladimir Cuperman

Assignee: Microsoft Corporation

A method and a system are provided for controlling the coding rates of a multimode coding system with respect to a sequence of input audio signal frames. The method eliminates or minimizes the overflow and underflow of a bit-stream buffer maintained by the coding system for temporarily recording bit-stream data prior to transmission or storage.
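Buffer-aware rate control of this kind can be illustrated with a small sketch. It assumes a fixed number of bits drained to the channel per frame, a fixed set of per-mode frame sizes, and a controller that picks the feasible mode closest to what the coder would like to use. These rules and all names are hypothetical; the abstract does not describe the actual control strategy.

```python
from typing import List


def control_rates(desired_bits: List[int], mode_bits: List[int],
                  channel_bits: int, capacity: int) -> List[int]:
    """Pick a coding mode per frame so the bit-stream buffer stays in range."""
    fill = 0  # bits currently waiting in the buffer
    chosen: List[int] = []
    for want in desired_bits:
        # A mode is feasible if the buffer neither underflows nor overflows.
        feasible = [b for b in mode_bits
                    if 0 <= fill + b - channel_bits <= capacity]
        candidates = feasible or mode_bits  # fall back if nothing fits
        pick = min(candidates, key=lambda b: abs(b - want))
        fill = min(max(fill + pick - channel_bits, 0), capacity)
        chosen.append(pick)
    return chosen


modes = control_rates(
    desired_bits=[160, 160, 160, 40, 40, 40, 40, 40],
    mode_bits=[40, 80, 160],
    channel_bits=80,
    capacity=160,
)
```

In this run the third frame is forced down to 80 bits to avoid overflow, and the last frame is forced up to 80 bits because the buffer has drained to empty.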

Record number: 20

Publication date: 30-11-2006

Robust decoder

Number: US20060271359A1

Author: Hosam Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 21

Publication date: 21-07-2020

Location-based audio messaging

Number: US0010721594B2

Author: Raja Bose, Hiroshi Horii, Jonathan Lester, Ruchita Bhargava, Kazuhito Koishida, Michelle L. Holtmann, Christina Chen

Assignee: Microsoft Technology Licensing, LLC

Mobile devices provide a variety of techniques for presenting messages from sources to a user. However, when the message pertains to the presence of the user at a location, the available communications techniques may exhibit deficiencies, e.g., reliance on the memory of the source and/or user of the existence and content of a message between its initiation and the user's visit to the location, or reliance on the communication accessibility of the user, the device, and/or the source during the user's location visit. Presented herein are techniques for enabling a mobile device, at a first time, to receive a request to present an audio message during the presence of the user at a location; and, at a second time, detecting the presence of the user at the location, and presenting the audio message to the user, optionally without awaiting a request from the user to present the message.

Record number: 22

Publication date: 08-02-2011

Bitstream syntax for multi-process audio decoding

Number: US0007885819B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 23

Publication date: 30-11-2006

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US20060271357A1

Author: Tian Wang, Kazuhito Koishida, Hosam Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 24

Publication date: 09-11-2010

Robust decoder

Number: US0007831421B2

Author: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 25

Publication date: 07-04-2005

LPC-harmonic vocoder with superframe structure

Number: US20050075869A1

Author: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida

Assignee: Microsoft Corporation

An enhanced low-bit rate parametric voice coder that groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be directly synthesized to speech or may be transcoded to a format so that an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by the communication of bit errors.

Record number: 26

Publication date: 14-02-2008

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US20080040121A1

Author: Tian Wang, Kazuhito Koishida, Hosam Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 27

Publication date: 30-11-2006

SUB-BAND VOICE CODEC WITH MULTI-STAGE CODEBOOKS AND REDUNDANT CODING

Number: US20060271355A1

Author: Tian Wang, Kazuhito Koishida, Hosam Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 28

Publication date: 28-08-2012

Bitstream syntax for multi-process audio decoding

Number: US0008255229B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation, Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 29

Publication date: 15-12-2005

Gain constrained noise suppression

Number: US20050278172A1

Author: Kazuhito Koishida, Feng Zhuge, Hosam Khalil, Tian Wang, Wei-Ge Chen

Assignee: Microsoft Corporation

A gain-constrained noise suppression for speech more precisely estimates noise, including during speech, to reduce musical noise artifacts introduced from noise suppression. The noise suppression operates by applying a spectral gain G(m, k) to each short-time spectrum value S(m, k) of a speech signal, where m is the frame number and k is the spectrum index. The spectrum values are grouped into frequency bins, and a noise characteristic estimated for each bin classified as a “noise bin.” An energy parameter is smoothed in both the time domain and the frequency domain to improve noise estimation per bin. The gain factors G(m, k) are calculated based on the current signal spectrum and the noise estimation, then smoothed before being applied to the signal spectral values S(m, k). First, a noisy factor is computed based on a ratio of the number of noise bins to the total number of bins for the current frame, where a zero-valued noisy factor means only using constant gain for all the spectrum ...

Record number: 30

Publication date: 25-08-2016

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Number: US20160247515A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Technology Licensing, LLC

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 31

Publication date: 18-12-2008

FLEXIBLE FREQUENCY AND TIME PARTITIONING IN PERCEPTUAL TRANSFORM CODING OF AUDIO

Номер: US20080312759A1

Автор: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Принадлежит: Microsoft Corporation

An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.
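The hole-partitioning rule in this abstract can be illustrated with a short sketch. The half-open `(start, end)` coefficient ranges, the way leftover coefficients are spread over the first bands, and the truncation to the preset band count are assumptions for the example; the abstract only states the two thresholds and the cap.

```python
import math


def partition_hole(start, end, min_hole_size, max_band_size):
    """Split one spectral hole [start, end) into near-equal bands.

    Holes no wider than min_hole_size get no bands; wider holes are split
    evenly so that no band exceeds max_band_size.
    """
    width = end - start
    if width <= min_hole_size:
        return []
    n_bands = math.ceil(width / max_band_size)
    base, extra = divmod(width, n_bands)
    bands, pos = [], start
    for i in range(n_bands):
        size = base + (1 if i < extra else 0)  # spread the remainder (assumption)
        bands.append((pos, pos + size))
        pos += size
    return bands


def hole_filling_bands(holes, min_hole_size, max_band_size, max_bands):
    """Collect bands over all holes, capped at a preset number of bands."""
    bands = []
    for start, end in holes:
        bands.extend(partition_hole(start, end, min_hole_size, max_band_size))
    return bands[:max_bands]  # simple truncation, assumed for illustration
```

For example, a 10-coefficient hole with a maximum band size of 4 is split into three bands of sizes 4, 3 and 3 rather than 4, 4 and 2, which keeps the bands as even as possible.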

Record number: 32

Publication date: 26-07-2007

Shape and scale parameters for extended-band frequency coding

Number: US20070174063A1

Author: Sanjeev Mehrotra, Wei-Ge Chen, Kazuhito Koishida, Chao He

Assignee: Microsoft Corporation

An audio encoder performs frequency extension coding that comprises determining one or more shape parameters using a displacement vector that corresponds to a displacement of an even number (e.g., an even number of sub-bands between a sub-band in a baseband frequency range and a sub-band in an extended-band frequency range). The shape parameters can be determined on a per-audio-block basis. Restricting a displacement to an even number (in frequency extension coding or in other signal modulation schemes) can improve the quality of reconstructed audio. An audio encoder also can perform frequency extension coding that comprises determining one or more scale parameters at one or more audio blocks, and determining one or more anchor points for interpolating the one or more scale parameters.
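Two ideas from this abstract, the even-number restriction on displacements and the interpolation of scale parameters between anchor points, can be sketched as below. Both functions are hypothetical: stepping an odd displacement down to the next even value and using linear interpolation (holding the nearest anchor outside the anchored range) are assumptions, since the abstract does not say how either is done.

```python
def force_even(displacement):
    """Step an integer sub-band displacement down to an even value (assumed rule)."""
    return displacement - (displacement % 2)


def interpolate_scale(anchors, num_blocks):
    """Fill in one scale parameter per audio block from sparse anchor points.

    anchors maps a block index to the scale parameter sent for that block.
    Linear interpolation between anchors is an assumption for illustration.
    """
    points = sorted(anchors.items())
    scales = []
    for block in range(num_blocks):
        if block <= points[0][0]:
            scales.append(float(points[0][1]))
        elif block >= points[-1][0]:
            scales.append(float(points[-1][1]))
        else:
            for (b0, s0), (b1, s1) in zip(points, points[1:]):
                if b0 <= block <= b1:
                    t = (block - b0) / (b1 - b0)
                    scales.append(s0 + t * (s1 - s0))
                    break
    return scales
```

With anchors at blocks 0 and 4, only two scale values need to be stored while the three blocks in between receive interpolated values.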

Record number: 33

Publication date: 08-03-2011

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US0007904293B2

Author: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.
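The last example in this abstract, a per-unit field that signals whether main and/or redundant information is present, could look like the sketch below. The one-byte flag field, the one-byte length prefix, and every name here are hypothetical; the abstract does not describe the actual bit layout.

```python
from enum import IntFlag


class UnitContent(IntFlag):
    """Hypothetical flag field carried at the start of each coded unit."""
    MAIN = 0x1       # unit carries main encoded information
    REDUNDANT = 0x2  # unit carries redundant coded information


def pack_unit(main=b"", redundant=b""):
    """Build a unit: flag byte, main length byte (< 256 assumed), payloads."""
    flags = UnitContent(0)
    if main:
        flags |= UnitContent.MAIN
    if redundant:
        flags |= UnitContent.REDUNDANT
    return bytes([int(flags), len(main)]) + main + redundant


def unpack_unit(unit):
    """Read the flag field first, then split the payloads it announces."""
    flags = UnitContent(unit[0])
    main_len = unit[1]
    body = unit[2:]
    main = body[:main_len] if UnitContent.MAIN in flags else b""
    redundant = body[main_len:] if UnitContent.REDUNDANT in flags else b""
    return flags, main, redundant
```

A decoder following the abstract would normally ignore the redundant payload and fall back on it only when the previous coded unit is not available.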

Record number: 34

Publication date: 06-09-2016

Sound source localization using phase spectrum

Number: US0009435873B2

Author: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri

Assignee: Microsoft Technology Licensing, LLC; Shankar Regunathan; Kazuhito Koishida; Harshavardhana Narayana Kikkeri

An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
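A rough sketch of the per-frame flow in this abstract is given below: one candidate angle per active microphone pair, then a single angle selected from the list. The far-field phase-to-angle formula, the single-frequency input, the fixed speed of sound, and the median selection rule are all assumptions for illustration; the abstract does not specify how candidates are computed or selected.

```python
import math
import statistics

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def candidate_angle(phase_diff, freq_hz, spacing_m):
    """Angle in degrees from the phase difference of one microphone pair.

    Assumes a far-field source and no phase wrapping at this frequency.
    """
    delay = phase_diff / (2.0 * math.pi * freq_hz)
    ratio = SPEED_OF_SOUND * delay / spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against rounding past +/-1
    return math.degrees(math.asin(ratio))


def frame_candidates(pairs):
    """Candidate angles for one frame; pairs with an inactive microphone are skipped.

    Each pair is (active_a, active_b, phase_diff, freq_hz, spacing_m).
    """
    return [candidate_angle(phase, freq, spacing)
            for active_a, active_b, phase, freq, spacing in pairs
            if active_a and active_b]


def select_angle(candidates):
    """Pick a final angle for the frame; the median is an assumed stand-in."""
    if not candidates:
        return None
    return statistics.median(candidates)
```

Tracking the candidate lists over several frames, as the abstract describes for a whole audio segment, would sit on top of `select_angle` and is not shown here.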

Record number: 35

Publication date: 30-07-2015

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Number: US20150213804A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Technology Licensing, LLC

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

1. A method of decoding a compressed audio bitstream containing syntax elements conforming to a bitstream syntax, the bitstream syntax defining a base coding layer and a channel extension coding layer for coding a portion of audio content using a channel extension coding, the method comprising: reading the base coding layer and channel extension coding layer of the compressed audio bitstream; parsing a plurality of syntax elements from the channel extension coding layer specifying parameters used in the channel extension coding; and processing coded audio content of the channel extension coding layer to reconstruct the portion of audio content in an output audio signal.

2. The method of claim 1, wherein the parameters comprise a band configuration parameterization, which comprises a number of bands, a size relation among bands, and a starting band of the channel extension coding.

3. The method of claim 1, wherein the parameters comprise reverb control parameters, which comprise a scaling of an effect signal, and filter tap at which the effect signal is generated.

4. The method of claim 1, wherein the parameters comprise channel correlation parameters, which comprise choice of an LMRM parameterization or normalized correlation matrix parameterization from which a channel correlation matrix is derived.

5. The method wherein the parameters are associated with control of an automatic scale adjustment.

6. The method of claim 1, wherein the parameters comprise a prediction type from among no prediction, prediction across time ...

Record number: 36

Publication date: 05-05-2015

Bitstream syntax for multi-process audio decoding

Number: US0009026452B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Technology Licensing, LLC; Microsoft Corporation

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 37

Publication date: 20-12-2012

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Number: US20120323584A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

1. A method of decoding a compressed audio bitstream containing syntax elements conforming to a bitstream syntax, the bitstream syntax defining a channel extension coding layer for coding a portion of audio content using a channel extension coding, the method comprising: reading the channel extension coding layer of the compressed audio bitstream; parsing a plurality of syntax elements from the channel extension coding layer specifying parameters used in the channel extension coding; and processing coded audio content of the channel extension coding layer to reconstruct the portion of audio content in an output audio signal.

2. The method of claim 1, further comprising playing the output audio signal.

3. The method of claim 1, wherein the parameters comprise a band configuration parameterization including a number of bands, a size relation among bands, and a starting band of the channel extension coding.

4. The method of claim 1, wherein the parameters comprise reverb control parameters, which comprise a scaling of an effect signal and a filter tap at which the effect signal is generated.

5. The method of claim 1, wherein the parameters comprise channel correlation parameters, which comprise a choice of an LMRM parameterization or normalized correlation matrix parameterization from which a channel correlation matrix is derived.

6. The method of claim 1, wherein the parameters are associated with an automatic scale adjustment.

7. The method of claim 1, wherein the parameters comprise a prediction type from among no ...

Record number: 38

Publication date: 14-02-2008

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US20080040105A1

Author: Tian Wang, Kazuhito Koishida, Hosam Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 39

Publication date: 26-02-2013

Lossless and near lossless scalable audio codec

Number: US0008386271B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Radhika Jandhyala

Assignee: Microsoft Corporation, Kazuhito Koishida, Sanjeev Mehrotra, Radhika Jandhyala

A scalable audio codec encodes an input audio signal as a base layer at a high compression ratio and one or more residual signals as an enhancement layer of a compressed bitstream, which permits a lossless or near lossless reconstruction of the input audio signal at decoding. The scalable audio codec uses perceptual transform coding to encode the base layer. The residual is calculated in a transform domain, which includes a frequency and possibly also multi-channel transform of the input audio. For lossless reconstruction, the frequency and multi-channel transforms are reversible.
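The base-plus-residual layering in this abstract can be illustrated with a toy integer example. A uniform quantizer stands in for the perceptual transform coding of the base layer, and a lifting-style integer mid/side transform stands in for the reversible multi-channel transform; both substitutions, and all names, are assumptions made for the sketch.

```python
def forward_ms(left, right):
    """Reversible integer mid/side transform (assumed stand-in)."""
    return (left + right) >> 1, left - right


def inverse_ms(mid, side):
    """Exact inverse of forward_ms for all integers."""
    left = mid + ((side + 1) >> 1)
    return left, left - side


def encode(value, step):
    """Base layer = coarse quantization index; enhancement layer = residual."""
    base = value // step
    return base, value - base * step


def decode(base, residual, step):
    """Base plus residual is lossless; passing residual=0 is near lossless."""
    return base * step + residual


def roundtrip(left, right, step=8):
    """Encode and losslessly decode one stereo sample pair."""
    coded = [encode(v, step) for v in forward_ms(left, right)]
    mid, side = (decode(b, r, step) for b, r in coded)
    return inverse_ms(mid, side)
```

Dropping the residual in `decode` leaves a reconstruction error smaller than `step` in the transform domain, which is the sense in which this toy version is "near lossless".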

Record number: 40

Publication date: 13-02-2007

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US0007177804B2

Author: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 41

Publication date: 18-02-2020

Continuous motion controls operable using neurological data

Number: US0010564713B2

Author: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida

Assignee: Microsoft Technology Licensing, LLC

Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.

Record number: 42

Publication date: 17-01-2013

SOUND SOURCE LOCALIZATION USING PHASE SPECTRUM

Number: US20130016852A1

Author: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri

Assignee: Microsoft Corporation

An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.

Record number: 43

Publication date: 16-11-2017

CONTINUOUS MOTION CONTROLS OPERABLE USING NEUROLOGICAL DATA

Number: US20170329392A1

Author: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida

Assignee:

Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.

1. A computer system configured for using neurological data to modulate a continuous user interface control, the computer system comprising: one or more processors; and one or more hardware storage devices having stored thereon computer-executable instructions which are executable by the one or more processors to cause the computing system to perform at least the following: create a continuous motion control that maps neurological data obtained from a plurality of users to a set of physical movements within a continuous range of motion of the plurality of users; tune the continuous motion control to a particular user by at least mapping neurological data obtained from the particular user while the particular user is performing the set of physical movements within the continuous range of motion; associate the continuous motion control to a continuous user interface control; detect user input comprising neurological data associated with a physical movement within the continuous range of movement; and modulate the continuous user interface control in a manner corresponding to the physical movement of the user within the continuous range of motion.

2. The computing system of claim 1, wherein the set of physical movements within the continuous range of motion includes physical movements that are differentiated from one another based on relative position of a body ...

Record number: 44

Publication date: 20-07-2010

Flexible frequency and time partitioning in perceptual transform coding of audio

Number: US0007761290B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio encoder/decoder performs band partitioning for vector quantization encoding of spectral holes and missing high frequencies that result from quantization when encoding at low bit rates. The encoder/decoder determines a band structure for spectral holes based on two threshold parameters: a minimum hole size threshold and a maximum band size threshold. Spectral holes wider than the minimum hole size threshold are partitioned evenly into bands not exceeding the maximum band size threshold in size. Such hole filling bands are configured up to a preset number of hole filling bands. The bands for missing high frequencies are then configured by dividing the high frequency region into bands having binary-increasing, linearly-increasing or arbitrarily-configured band sizes up to a maximum overall number of bands.

Record number: 45

Publication date: 14-06-2011

Robust decoder

Number: US0007962335B2

Author: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 46

Publication date: 14-05-2009

TRANSCODER USING ENCODER GENERATED SIDE INFORMATION

Number: US20090125315A1

Author: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio encoder encodes side information into a compressed audio bitstream containing encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder generated side information to transcode the audio from the original compressed bitstream having an initial bit-rate into a second bitstream having a new bit-rate. Because the side information is derived from the original audio, the transcoder is able to better maintain audio quality of the transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage for faster and lower complexity transcoding.

Record number: 47

Publication date: 09-11-2017

Modifying the Modality of a Computing Device Based Upon a User's Brain Activity

Number: US20170323220A1

Author: John C. Gordon, Kazuhito Koishida

Assignee:

Technologies are described herein for modifying the modality of a computing device based upon a user's brain activity. A machine learning classifier is trained using data that identifies a modality for operating a computing device and data identifying brain activity of a user of the computing device. Once trained, the machine learning classifier can select a mode of operation for the computing device based upon a user's current brain activity and, potentially, other biological data. The computing device can then be operated in accordance with the selected modality. An application programming interface can also expose an interface through which an operating system and application programs executing on the computing device can obtain data identifying the modality selected by the machine learning classifier. Through the use of this data, the operating system and application programs can modify their mode of operation to be most suitable for the user's current mental state.

1. A computer-implemented method, comprising: training a machine learning model using data identifying a modality for operating a computing device and data identifying first brain activity of a user of the computing device while the computing device is operating in the modality; receiving data identifying second brain activity of the user while operating the computing device; utilizing the machine learning model and the data identifying the second brain activity of the user to select one of a plurality of modalities for operating the computing device; and causing the computing device to operate in accordance with the selected modality.

2. The computer-implemented method of claim 1, further comprising exposing data identifying the selected one of the plurality of modalities by way of an application programming interface (API).

3. The computer-implemented method of claim 1, wherein the plurality of modalities comprise: a first modality in which a first virtual machine is executed on the computing device; ...

Record number: 48

Publication date: 02-04-2013

Noise robust speech classifier ensemble

Number: US0008412525B2

Author: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan

Assignee: Microsoft Corporation, Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan

Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.

Record number: 49

Publication date: 02-12-2003

Method for coding speech and music signals

Number: US0006658383B2

Author: Kazuhito Koishida, Vladimir Cuperman, Amir H. Majidimehr, Allen Gersho

Assignee: Microsoft Corporation

The present invention provides a transform coding method efficient for music signals that is suitable for use in a hybrid codec, whereby a common Linear Predictive (LP) synthesis filter is employed for both speech and music signals. The LP synthesis filter switches between a speech excitation generator and a transform excitation generator, in accordance with the coding of a speech or music signal, respectively. For coding speech signals, the conventional CELP technique may be used, while a novel asymmetrical overlap-add transform technique is applied for coding music signals. In performing the common LP synthesis filtering, interpolation of the LP coefficients is conducted for signals in overlap-add operation regions. The invention enables smooth transitions when the decoder switches between speech and music decoding modes.
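The shared synthesis filter described in this abstract can be sketched as an all-pole filter that accepts excitation from either generator, with coefficient interpolation for the overlap-add regions. The sign convention A(z) = 1 + sum of a_k z^-k, the linear interpolation weight, and the function names are assumptions; the CELP and transform excitation generators themselves are not modeled.

```python
def lp_synthesis(excitation, coeffs, history=None):
    """All-pole LP synthesis: y[n] = e[n] - sum_k a_k * y[n - k].

    The same filter is used whether the excitation came from the speech
    generator or the transform (music) generator.
    """
    past = list(history) if history else [0.0] * len(coeffs)  # most recent first
    output = []
    for e in excitation:
        y = e - sum(a * p for a, p in zip(coeffs, past))
        output.append(y)
        past = [y] + past[:-1]
    return output


def interpolate_coeffs(old, new, weight):
    """Blend two LP coefficient sets (assumed linear) for overlap regions."""
    return [(1.0 - weight) * o + weight * n for o, n in zip(old, new)]
```

Because both modes share one filter and its state, switching excitation sources does not reset the filter memory, which is one plausible reading of how the smooth transitions mentioned in the abstract are obtained.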

Record number: 50

Publication date: 23-02-2017

SOUND SOURCE LOCALIZATION USING PHASE SPECTRUM

Number: US20170052245A1

Author: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri

Assignee:

An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.

1. A computer implemented process comprising: receiving signals from a plurality of pairs of microphones into a memory; processing the signals from the microphones to identify when the signals are active; computing frequency spectrum data for the signals; for each pair of active signals, determining a candidate angle for the pair using the frequency spectrum data; and selecting an angle from among the candidate angles for a plurality of pairs of microphones.

2. The computer-implemented process of claim 1, wherein receiving signals includes receiving each signal as a plurality of frames, and wherein processing, computing, determining and selecting are performed on a per frame basis.

3. The computer-implemented process of claim 1, wherein selecting the angle from among the candidate angles comprises selecting the angle using information about prior candidate angles.

4. The computer-implemented process of claim 3, wherein selecting further comprises: tracking a history of candidate angles over multiple frames; and updating the history with the candidate angles from the current frame.

5. The computer-implemented process of claim 4, further comprising selecting the angle from the history which has a phase distortion less than or equal to the minimum phase distortion of all entries.

6. The ...

Record number: 51

Publication date: 16-11-2017

CHANGING AN APPLICATION STATE USING NEUROLOGICAL DATA

Number: US20170329404A1

Author: Cem Keskin, David Kim, Bill Chau, Jaeyoun Kim, Kazuhito Koishida, Khuram Shahid

Assignee:

Computer systems, methods, and storage media for changing the state of an application by detecting neurological user intent data associated with a particular operation of a particular application state, and changing the application state so as to enable execution of the particular operation as intended by the user. The application state is automatically changed to align with the intended operation, as determined by received neurological user intent data, so that the intended operation is performed. Some embodiments relate to a computer system creating or updating a state machine, through a training process, to change the state of an application according to detected neurological data.

1. A computer-implemented method for changing a state of an application using neurological data, the method being implemented by a computing system that includes at least one processor and one or more hardware storage devices having stored computer-executable instructions that are executable by the at least one processor to cause the computing system to perform at least the following: operating an application that includes at least a first application state, a second application state, and an application state dependent operation, where invoking the application state dependent operation in response to detecting a particular user input causes a first action within the application while the application is operating in the first application state but causes a second action within the application while the application is operating in the second application state; while the application is operating in the first application state, detecting the particular user input invoking the application state dependent operation, the particular user input including neurological user intention data indicating that the user intends the particular input to cause the second action in accordance with the second application state; and based on detecting the particular user input, automatically changing the ...

Record number: 52

Publication date: 30-11-2006

Robust decoder

Number: US20060271373A1

Author: Hosam Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 53

Publication date: 04-06-2013

Audio transcoder using encoder-generated side information to transcode to target bit-rate

Number: US0008457958B2

Author: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

Assignee: Microsoft Corporation, Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen

An audio encoder encodes side information into a compressed audio bitstream containing encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder generated side information to transcode the audio from the original compressed bitstream having an initial bit-rate into a second bitstream having a new bit-rate. Because the side information is derived from the original audio, the transcoder is able to better maintain audio quality of the transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage for faster and lower complexity transcoding.

Record number: 54

Publication date: 02-01-2003

Method for coding speech and music signals

Number: US20030004711A1

Author: Kazuhito Koishida, Vladimir Cuperman, Amir Majidimehr, Allen Gersho

Assignee: Microsoft Corporation

The present invention provides a transform coding method efficient for music signals that is suitable for use in a hybrid codec, whereby a common Linear Predictive (LP) synthesis filter is employed for both speech and music signals. The LP synthesis filter switches between a speech excitation generator and a transform excitation generator, in accordance with the coding of a speech or music signal, respectively. For coding speech signals, the conventional CELP technique may be used, while a novel asymmetrical overlap-add transform technique is applied for coding music signals. In performing the common LP synthesis filtering, interpolation of the LP coefficients is conducted for signals in overlap-add operation regions. The invention enables smooth transitions when the decoder switches between speech and music decoding modes.

Record number: 55

Publication date: 04-11-2010

NOISE ROBUST SPEECH CLASSIFIER ENSEMBLE

Number: US20100280827A1

Authors: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan

Assignee: Microsoft Corporation

Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.

Record number: 56

Publication date: 23-10-2007

LPC-harmonic vocoder with superframe structure

Number: US0007286982B2

Authors: Allen Gersho, Vladimir Cuperman, Tian Wang, Kazuhito Koishida

Assignee: Microsoft Corporation

An enhanced low-bit rate parametric voice coder that groups a number of frames from an underlying frame-based vocoder, such as MELP, into a superframe structure. Parameters are extracted from the group of underlying frames and quantized into the superframe which allows the bit rate of the underlying coding to be reduced without increasing the distortion. The speech data coded in the superframe structure can then be directly synthesized to speech or may be transcoded to a format so that an underlying frame-based vocoder performs the synthesis. The superframe structure includes additional error detection and correction data to reduce the distortion caused by the communication of bit errors.

Record number: 57

Publication date: 05-06-2014

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Number: US20140156287A1

Authors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

1. A method of decoding a compressed audio bitstream containing syntax elements conforming to a bitstream syntax, the bitstream syntax defining a base coding layer and a frequency extension coding layer for coding a portion of audio content using a frequency extension coding, the method comprising: reading the base coding layer and frequency extension coding layer of the compressed audio bitstream; parsing a plurality of syntax elements from the frequency extension coding layer specifying parameters used in the frequency extension coding, wherein the parameters comprise parameters specifying frequency extension coding using a different transform window size than a base coding layer; processing coded audio content of the frequency extension coding layer to reconstruct the portion of audio content in an output audio signal; playing the output audio signal.

2. The method of wherein the parameters comprise parameters identifying tiles coded using frequency extension coding with a different transform window size than a base coding layer.

3. The method of wherein the parameters comprise dynamic band configuration parameters specifying spectral band locations where frequency extension coding is applied.

4. The method of wherein said dynamic band configuration parameters specify start and end positions of spectral bands coded using vector quantization techniques.

5. The method of claim 1, wherein the parameters comprise displacement vector search range, step size for displacement vector quantization, scale factor and codeword modifications. This ...

Record number: 58

Publication date: 31-12-2015

LOCATION-BASED AUDIO MESSAGING

Number: US20150382138A1

Authors: Raja Bose, Hiroshi Horii, Jonathan Lester, Ruchita Bhargava, Kazuhito Koishida, Michelle L. Holtmann, Christina Chen

Assignee:

Mobile devices provide a variety of techniques for presenting messages from sources to a user. However, when the message pertains to the presence of the user at a location, the available communications techniques may exhibit deficiencies, e.g., reliance on the memory of the source and/or user of the existence and content of a message between its initiation and the user's visit to the location, or reliance on the communication accessibility of the user, the device, and/or the source during the user's location visit. Presented herein are techniques for enabling a mobile device, at a first time, to receive a request to present an audio message during the presence of the user at a location; and, at a second time, detecting the presence of the user at the location, and presenting the audio message to the user, optionally without awaiting a request from the user to present the message.

1. A method of presenting an audio message to a user of a mobile device having a processor, the method comprising: executing on the processor instructions that cause the mobile device to: upon receiving a request to present an audio message from a source to the user during a presence of the user at a location, store the request; and upon detecting a presence of the user at the location, present the audio message to the user.

2. The method of claim 1, wherein: the request specifies the location as a location type; and detecting the presence of the user at the location further comprises: detecting the presence of the user at a location of the location type.

3. The method of claim 1, wherein: the request further specifies a condition; and presenting the audio message further comprises: upon detecting a presence of the user at the location: evaluating the condition specified in the request; and upon determining a condition fulfillment of the message, presenting the audio message to the user.

4. The method of claim 3, wherein: the condition comprises a time range of the presence of the user ...

Record number: 59

Publication date: 07-12-2021

Multi-user intelligent assistance

Number: US0011194998B2

Authors: Kazuhito Koishida, Alexander A Popov, Uros Batricevic, Steven Nabil Bathiche

Assignee: Microsoft Technology Licensing, LLC

An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.
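The arbitration described in this abstract can be illustrated with a minimal sketch. Everything named here (`AssistantArbiter`, `on_speech`, `on_disengagement`, the scalar scores) is hypothetical and not taken from the patent; the abstract does not say how scores or the disengagement metric are computed, nor what happens on a tie, so this sketch simply declines to respond on a tie.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AssistantArbiter:
    """Hypothetical arbitration state for one intelligent assistant."""

    blocking_threshold: float            # assumed scalar threshold
    engaged_user: Optional[str] = None   # user this assistant is currently serving

    def on_speech(self, user: str, self_score: float, remote_score: float) -> bool:
        """Return True if this assistant should respond to `user`."""
        if self.engaged_user is not None and user != self.engaged_user:
            # Responses to all other users are blocked while engaged.
            return False
        if self_score > remote_score:
            # This assistant wins the comparison and engages the user.
            self.engaged_user = user
            return True
        # Lower (or, by assumption, equal) score: do not respond.
        return False

    def on_disengagement(self, user: str, metric: float) -> None:
        """Lift the block once the engaged user's metric exceeds the threshold."""
        if user == self.engaged_user and metric > self.blocking_threshold:
            self.engaged_user = None
```

In this sketch the block on other users is held as a single `engaged_user` field, so lifting it is just a matter of clearing that field once the disengagement metric passes the threshold.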

Record number: 60

Publication date: 05-11-2009

ROBUST DECODER

Number: US20090276212A1

Authors: Hosam A. Khalil, Tian Wang, Kazuhito Koishida, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 61

Publication date: 08-06-2010

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: US0007734465B2

Authors: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 62

Publication date: 11-08-2011

BITSTREAM SYNTAX FOR MULTI-PROCESS AUDIO DECODING

Number: US20110196684A1

Authors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Corporation

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 63

Publication date: 22-08-2017

Bitstream syntax for multi-process audio decoding

Number: US0009741354B2

Authors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Technology Licensing, LLC

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 64

Publication date: 27-04-2010

Audio codec post-filter

Number: US0007707034B2

Authors: Xiaoqin Sun, Tian Wang, Hosam A. Khalil, Kazuhito Koishida, Wei-Ge Chen

Assignee: Microsoft Corporation

Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.

Record number: 65

Publication date: 12-02-2019

Continuous motion controls operable using neurological data

Number: US0010203751B2

Authors: Cem Keskin, Khuram Shahid, Bill Chau, Jaeyoun Kim, Kazuhito Koishida

Assignee: Microsoft Technology Licensing, LLC

Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.

Record number: 66

Publication date: 08-02-2022

Audio-visual speech enhancement

Number: US0011244696B2

Authors: Kazuhito Koishida, Michael Iuzzolino

Assignee: Microsoft Technology Licensing, LLC

Example speech enhancement systems include a spatio-temporal residual network configured to receive video data containing a target speaker and extract visual features from the video data, an autoencoder configured to receive input of an audio spectrogram and extract audio features from the audio spectrogram, and a squeeze-excitation fusion block configured to receive input of visual features from a layer of the spatio-temporal residual network and input of audio features from a layer of the autoencoder, and to provide an output to the decoder of the autoencoder. The decoder is configured to output a mask configured based upon the fusion of audio features and visual features by the squeeze-excitation fusion block, and the instructions are executable to apply the mask to the audio spectrogram to generate an enhanced magnitude spectrogram, and to reconstruct an enhanced waveform from the enhanced magnitude spectrogram.

Record number: 67

Publication date: 04-11-2014

Query and matching for content recognition

Number: US0008880545B2

Authors: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher

Assignee: Microsoft Corporation

Various embodiments enable audio data, such as music data, to be captured, by a device, from a background environment and processed to formulate a query that can then be transmitted to a content recognition service. In one or more embodiments, multiple queries are transmitted to the content recognition service. In at least some embodiments, subsequent queries can progressively incorporate previous queries plus additional data that is captured. In one or more embodiments, responsive to receiving the query, the content recognition service can employ a multi-stage matching technique to identify content items responding to the query. This matching technique can be employed as queries are progressively received.

Record number: 68

Publication date: 03-07-2003

Rate control strategies for speech and music coding

Number: US20030125932A1

Authors: Tian Wang, Kazuhito Koishida, Vladimir Cuperman

Assignee: Microsoft Corporation

A method and a system are provided for controlling the coding rates of a multimode coding system with respect to a sequence of input audio signal frames. The method eliminates or minimizes the overflow and underflow of a bit-stream buffer maintained by the coding system for temporarily recording bit-stream data prior to transmission or storage.

Record number: 69

Publication date: 14-11-2017

Sound source localization using phase spectrum

Number: US0009817100B2

Authors: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri

Assignee: Microsoft Technology Licensing, LLC

An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
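As a rough illustration of the per-frame flow in this abstract, the sketch below turns a phase difference between two microphones into a candidate angle and then picks one angle for the frame. It is not the patented method: the far-field, single-frequency formula, the assumption that all pairs share one broadside reference, the use of the median as the selection rule, and the names `candidate_angle` and `frame_angle` are all assumptions made here, and phase wrapping is ignored.

```python
import math
from statistics import median
from typing import List, Optional, Tuple

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

# One entry per microphone pair:
# (mic_a_active, mic_b_active, phase_diff_rad, freq_hz, spacing_m)
PairObservation = Tuple[bool, bool, float, float, float]


def candidate_angle(phase_diff: float, freq_hz: float, spacing_m: float) -> float:
    """Far-field arrival angle in degrees from broadside for one mic pair."""
    s = SPEED_OF_SOUND * phase_diff / (2.0 * math.pi * freq_hz * spacing_m)
    s = max(-1.0, min(1.0, s))  # clamp numerical overshoot before asin
    return math.degrees(math.asin(s))


def frame_angle(pairs: List[PairObservation]) -> Optional[float]:
    """Collect candidates from pairs where both mics are active, then pick one."""
    candidates = [
        candidate_angle(dphi, freq, spacing)
        for a_active, b_active, dphi, freq, spacing in pairs
        if a_active and b_active
    ]
    # Median is an assumed selection rule; the abstract does not name one.
    return median(candidates) if candidates else None
```

A tracker over successive frames, as the abstract mentions, would sit on top of `frame_angle`; it is left out here because the abstract gives no detail on it.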

Record number: 70

Publication date: 24-05-2016

Bitstream syntax for multi-process audio decoding

Number: US0009349376B2

Authors: Kazuhito Koishida, Sanjeev Mehrotra, Chao He, Wei-Ge Chen

Assignee: Microsoft Technology Licensing, LLC

An audio decoder provides a combination of decoding components including components implementing base band decoding, spectral peak decoding, frequency extension decoding and channel extension decoding techniques. The audio decoder decodes a compressed bitstream structured by a bitstream syntax scheme to permit the various decoding components to extract the appropriate parameters for their respective decoding technique.

Record number: 71

Publication date: 18-11-2008

Gain constrained noise suppression

Number: US0007454332B2

Authors: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-ge Chen

Assignee: Microsoft Corporation

A gain-constrained noise suppression for speech more precisely estimates noise, including during speech, to reduce musical noise artifacts introduced from noise suppression. The noise suppression operates by applying a spectral gain G(m, k) to each short-time spectrum value S(m, k) of a speech signal, where m is the frame number and k is the spectrum index. The spectrum values are grouped into frequency bins, and a noise characteristic estimated for each bin classified as a "noise bin." An energy parameter is smoothed in both the time domain and the frequency domain to improve noise estimation per bin. The gain factors G(m, k) are calculated based on the current signal spectrum and the noise estimation, then smoothed before being applied to the signal spectral values S(m, k). First, a noisy factor is computed based on a ratio of the number of noise bins to the total number of bins for the current frame, where a zero-valued noisy factor means only using constant gain for all the spectrum ...

Record number: 72

Publication date: 09-01-2018

Changing an application state using neurological data

Number: US0009864431B2

Authors: Cem Keskin, David Kim, Bill Chau, Jaeyoun Kim, Kazuhito Koishida, Khuram Shahid

Assignee: Microsoft Technology Licensing, LLC

Computer systems, methods, and storage media for changing the state of an application by detecting neurological user intent data associated with a particular operation of a particular application state, and changing the application state so as to enable execution of the particular operation as intended by the user. The application state is automatically changed to align with the intended operation, as determined by received neurological user intent data, so that the intended operation is performed. Some embodiments relate to a computer system creating or updating a state machine, through a training process, to change the state of an application according to detected neurological data.

Record number: 73

Publication date: 13-01-2022

MULTI-USER INTELLIGENT ASSISTANCE

Number: US20220012470A1

Authors: Steven Nabil Bathiche, Uros Batricevic, Kazuhito Koishida, Alexander A. Popov

Assignee: Microsoft Technology Licensing, LLC

An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.

1. An intelligent assistant computer, comprising: a logic machine; and a storage machine holding instructions executable by the logic machine to: recognize another intelligent assistant computer; record speech spoken by a first user; determine a self-selection score for the first user based on the speech spoken by the first user; receive a remote-selection score for the first user from the other intelligent assistant computer; if the self-selection score is greater than the remote-selection score, respond to the first user, determine a disengagement metric of the first user based on recorded speech spoken by the first user, and block subsequent responses to all other users until the disengagement metric of the first user exceeds a blocking threshold; if the self-selection score is less than the remote-selection score, do not respond to the first user; and stop blocking subsequent responses to another user responsive to a new self-selection score for the first user being less than a new remote-selection score for the first user.

2. The intelligent assistant computer of claim 1, wherein the self-selection score is determined based further on a signal-to-noise ratio of recorded speech spoken by the first user. ...

Record number: 74

Publication date: 12-06-2014

FACILITATING RECOGNITION OF REAL-TIME CONTENT

Number: US20140161263A1

Authors: Thomas C. Butcher, Kazuhito Koishida, Ian Stuart Simon

Assignee: Microsoft Corporation

Systems, methods, and computer-readable storage media for facilitating recognition of real-time content are provided. In embodiments, a new audio fingerprint associated with live audio being presented is received. In accordance with the received audio fingerprint, at least one previously received fingerprint associated with the live audio from a real-time index is removed. Thereafter, the real-time index is updated to include the new audio fingerprint associated with the live audio being presented. Such a real-time index having the new audio fingerprint can be used to recognize the live audio being presented and, thereafter, an indication of the recognized live audio can be provided to the user device.

1. One or more computer-readable storage media storing computer-useable instructions that, when used by one or more computing devices, cause the one or more computing devices to perform a method for facilitating recognition of real-time content, the method comprising: receiving a new audio fingerprint associated with live audio being presented; removing at least one previously received fingerprint associated with the live audio from a real-time index; and updating the real-time index to include the new audio fingerprint associated with the live audio being presented, wherein the real-time index having the new audio fingerprint is used to recognize the live audio being presented.

2. The one or more computer-readable storage media of claim 1, wherein the new audio fingerprint is received from a fingerprint extractor that generates the new audio fingerprint in real-time as the live audio is presented.

3. The one or more computer-readable storage media of claim 1, wherein the new audio fingerprint corresponds with a single audio sample corresponding with the live audio being presented.

4. The one or more computer-readable storage media of further comprising identifying the at least one previously received fingerprint associated with the live audio to remove from among a ...

Record number: 75

Publication date: 19-06-2014

LOCAL RECOGNITION OF CONTENT

Number: US20140172429A1

Authors: Thomas C. Butcher, Kazuhito Koishida, Ian Stuart Simon

Assignee: Microsoft Corporation

Systems, methods, and computer-readable storage media for facilitating local recognition of audio content at a user device. In some embodiments, the method includes capturing, using a user device, audio data, at least some of which is processable to recognize the audio data. Thereafter, an audio fingerprint that uniquely represents perceptual information associated with the audio data is generated, and a local data store within the user device is referenced. Such a local data store can include reference audio fingerprints. Upon referencing the local data store, a determination can be made as to whether the generated audio fingerprint matches a reference audio fingerprint at least to an extent.

1. A computer-implemented method for facilitating local recognition of audio content at a user device, the method comprising: capturing, using a user device, audio data, at least some of which is processable to recognize the audio data; generating an audio fingerprint that uniquely represents perceptual information associated with the audio data; referencing a local data store within the user device, the local data store including one or more reference audio fingerprints; and determining that the generated audio fingerprint matches a reference audio fingerprint at least to an extent.

2. The method of claim 1, wherein the capturing occurs prior to receiving user input associated with a request for information regarding the audio data.

3. The method of claim 1, wherein the capturing occurs prior to launch of a content recognition application.

4. The method of claim 1, wherein the capturing occurs when the user device is in a low-power mode.

5. The method of claim 4, wherein generating the audio fingerprint, referencing the local data store, and determining that the generated audio fingerprint matches a reference audio fingerprint at least to an extent are performed when the user device is in a low-power mode.

6. The method of claim 5, wherein generating the audio ...

Record number: 76

Publication date: 16-08-2018

Determining speaker changes in audio input

Number: US20180233140A1

Authors: Kazuhito Koishida, Uros Batricevic

Assignee: Microsoft Technology Licensing LLC

Intelligent assistant systems, methods and computing devices are disclosed for identifying a speaker change. A method comprises receiving audio input comprising a speech fragment. A first voice model is trained with a first sub-fragment from the speech fragment. A second voice model is trained with a second sub-fragment from the speech fragment. The first sub-fragment is analyzed with the second voice model to yield a first confidence value. The second sub-fragment is analyzed with the first voice model to yield a second confidence value. Based at least on the first and second confidence values, the method determines if a speaker of the first sub-fragment is the speaker of the second sub-fragment.
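The cross-scoring idea in this abstract (train a model on each sub-fragment, then score each sub-fragment with the other's model) can be sketched as follows. The abstract does not specify the voice model, the features, or the decision rule, so the one-dimensional Gaussian model, the scalar features (for example, per-frame pitch values), the average log-likelihood used as a confidence value, and the threshold are all illustrative assumptions, as are the function names.

```python
import math
from statistics import mean, pvariance
from typing import Sequence, Tuple

Model = Tuple[float, float]  # (mean, variance) of a 1-D Gaussian stand-in model


def train(features: Sequence[float]) -> Model:
    """Fit the stand-in voice model to one sub-fragment's features."""
    return mean(features), max(pvariance(features), 1e-6)  # floor avoids zero variance


def confidence(model: Model, features: Sequence[float]) -> float:
    """Average Gaussian log-likelihood of the features under the model."""
    mu, var = model
    return mean(
        -0.5 * (math.log(2.0 * math.pi * var) + (x - mu) ** 2 / var)
        for x in features
    )


def same_speaker(first: Sequence[float], second: Sequence[float],
                 threshold: float = -5.0) -> bool:
    """Cross-score the two sub-fragments; the threshold is an assumed value."""
    first_model = train(first)
    second_model = train(second)
    first_confidence = confidence(second_model, first)    # first scored by second model
    second_confidence = confidence(first_model, second)   # second scored by first model
    return first_confidence >= threshold and second_confidence >= threshold
```

Requiring both confidence values to clear the threshold is one possible reading of "based at least on the first and second confidence values"; averaging them would be an equally plausible choice.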

Record number: 77

Publication date: 16-08-2018

Multi-user intelligent assistance

Number: US20180233142A1

Authors: Alexander A. Popov, Kazuhito Koishida, Steven Nabil Bathiche, Uros Batricevic

Assignee: Microsoft Technology Licensing LLC

An intelligent assistant records speech spoken by a first user and determines a self-selection score for the first user. The intelligent assistant sends the self-selection score to another intelligent assistant, and receives a remote-selection score for the first user from the other intelligent assistant. The intelligent assistant compares the self-selection score to the remote-selection score. If the self-selection score is greater than the remote-selection score, the intelligent assistant responds to the first user and blocks subsequent responses to all other users until a disengagement metric of the first user exceeds a blocking threshold. If the self-selection score is less than the remote-selection score, the intelligent assistant does not respond to the first user.

Record number: 78

Publication date: 11-10-2018

SPEECH PARSING WITH INTELLIGENT ASSISTANT

Number: US20180293221A1

Authors: Haithem Albadawi, Steven Nabil Bathiche, Erich-Soren Finkelstein, Han Yee Mimi Fung, Keith Coleman Herold, Michelle Lynn Holtmann, Kazuhito Koishida, Zongyi Liu, Christopher Lance Nuesmeyer, Vivek Pradeep, Christopher Brian Quirk, Pablo Luis Sala, Oz Solomon, Aleksandar Uzelac

Assignee: Microsoft Technology Licensing, LLC

A method to execute computer-actionable directives conveyed in human speech comprises: receiving audio data recording speech from one or more speakers; converting the audio data into a linguistic representation of the recorded speech; detecting a target corresponding to the linguistic representation; committing to the data structure language data associated with the detected target and based on the linguistic representation; parsing the data structure to identify one or more of the computer-actionable directives; and submitting the one or more of the computer-actionable directives to the computer for processing.

1. A method to store speaker-resolved language data in a data structure in a computer system, the method comprising: receiving audio data recording speech from one or more speakers; converting the audio data into a linguistic representation of the recorded speech; detecting a speaker corresponding to the linguistic representation; and committing to the data structure language data associated with the detected speaker and based on the linguistic representation.

2. The method of wherein the speaker is detected via a sensor-fusion machine-learning system previously trained to process the linguistic representation and another form of input concurrently.

3. The method of wherein the speaker is detected based on directional microphony.

4. The method of wherein the speaker is detected based on a voiceprint.

5. The method of wherein detecting the speaker includes storing the voiceprint of the speaker during a calibration phase and matching the stored voiceprint to a post-calibration voiceprint acquired from the audio data.

6. The method of wherein the speaker is detected based on face recognition.

7. The method of wherein the speaker is detected based on posture analysis.

8. The method of wherein the speaker is detected based on semantic analysis of the linguistic representation of the recorded speech.

9. The method of wherein converting the audio data includes filtering ...

Record number: 79

Publication date: 15-12-2010

Redundant audio bit stream and audio bit stream processing methods

Number: EP1886306B1

Authors: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 80

Publication date: 07-12-2006

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: CA2611829A1

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Individual

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 81

Publication date: 07-12-2006

Audio codec post-filter

Number: CA2609539A1

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Individual

Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.
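
The peak-clipping step mentioned in the abstract above can be illustrated with a short sketch. The abstract does not say how the clipping threshold is chosen, so the rule used here (a ceiling at a fixed multiple of the mean of the set) and the names clip_peaks and max_ratio are assumptions for illustration only.

```python
def clip_peaks(values, max_ratio=2.0):
    """Limit each coefficient value to max_ratio times the mean of the set.

    The mean-based ceiling is an assumed rule, not the patented one.
    """
    if not values:
        return []
    ceiling = max_ratio * (sum(values) / len(values))
    return [min(v, ceiling) for v in values]
```

For the input [1.0, 1.0, 1.0, 9.0] the mean is 3.0, so the ceiling is 6.0 and only the last value is reduced.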

Record number: 82

Publication date: 28-02-2008

Audio codec post-filter

Number: NO20075773L

Author: Hosam A Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

Techniques and tools are described for processing reconstructed audio signals. For example, a reconstructed audio signal is filtered in the time domain using filter coefficients that are calculated, at least in part, in the frequency domain. As another example, producing a set of filter coefficients for filtering a reconstructed audio signal includes clipping one or more peaks of a set of coefficient values. As yet another example, for a sub-band codec, in a frequency region near an intersection between two sub-bands, a reconstructed composite signal is enhanced.

Record number: 83

Publication date: 18-04-2017

Audio codec post-filter

Number: NO340411B1

Author: Hosam A Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Technology Licensing LLC

Record number: 84

Publication date: 07-12-2006

Sub-band voice CODEC with multi-stage codebooks and redundant coding

Number: AU2006252965A1

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

Record number: 85

Publication date: 10-05-2017

Sound source localization using phase spectrum

Number: EP2732301B1

Author: Harshavardhana Narayana Kikkeri, Kazuhito Koishida, Shankar Regunathan

Assignee: Microsoft Technology Licensing LLC

Record number: 86

Publication date: 09-02-2011

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: EP2282309A2

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 87

Publication date: 07-12-2006

Robust decoder

Number: CA2607952A1

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Individual

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 88

Publication date: 07-12-2006

Sub-band voice codec with multi-stage codebooks and redundant coding

Number: WO2006130229A1

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: MICROSOFT CORPORATION

Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

Record number: 89

Publication date: 06-05-2011

Redundant audio bit stream and audio bit stream processing methods

Number: ES2358213T3

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

A bit stream representing an audio signal, comprising main encoded information for a current frame (740) that references a segment of a previous frame, and redundant coded information (760), characterized in that the segment of the previous frame is selected for use in decoding the current frame, and the redundant coded information is for decoding the current frame and comprises signal history information associated with the referenced segment of the previous frame.

Record number: 90

Publication date: 13-07-2021

Method of presenting an audio message to a user from a mobile device, method of configuring a server and mobile device

Number: BR112016027192A8

Author: Christina Chen, Hiroshi Horii, Jonathan Lester, Kazuhito Koishida, Michelle L Holtmann, Raja Bose, Ruchita Bhargava

Assignee: Microsoft Technology Licensing LLC

Location-based audio messaging. The invention relates to mobile devices that provide various techniques for presenting messages from sources to a user. However, when a message pertains to the user's presence at a location, the available communication techniques may exhibit deficiencies, for example, reliance on the memory of the source and/or the user regarding the existence and content of a message between its initiation and the user's visit to the location, or reliance on the communication accessibility of the user, the device, and/or the source during the user's visit to the location. Presented herein are techniques for enabling a mobile device, at a first time, to receive a request to present an audio message during the user's presence at a location; and, at a second time, to detect the user's presence at the location and present the audio message to the user, optionally without awaiting a request from the user to present the message.

Record number: 91

Publication date: 07-12-2006

Robust decoder

Number: AU2006252972A1

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

Record number: 92

Publication date: 19-06-2014

Local recognition of content

Number: WO2014093749A2

Author: Ian Stuart Simon, Kazuhito Koishida, Thomas C. Butcher

Assignee: MICROSOFT CORPORATION

Systems, methods, and computer-readable storage media for facilitating local recognition of audio content at a user device. In some embodiments, the method includes capturing, using a user device, audio data, at least some of which is processable to recognize the audio data. Thereafter, an audio fingerprint that uniquely represents perceptual information associated with the audio data is generated, and a local data store within the user device is referenced. Such a local data store can include reference audio fingerprints. Upon referencing the local data store, a determination can be made as to whether the generated audio fingerprint matches a reference audio fingerprint at least to an extent.
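
The local lookup described in the abstract above, matching a generated fingerprint against reference fingerprints "at least to an extent", might be sketched as below. The abstract does not specify the fingerprint format or the similarity measure, so this sketch assumes integer bit-string fingerprints compared by Hamming distance; hamming, find_local_match, and max_distance are hypothetical names.

```python
def hamming(a: int, b: int) -> int:
    """Count the bit positions in which two fingerprints differ."""
    return bin(a ^ b).count("1")


def find_local_match(fingerprint, store, max_distance=4):
    """Return the label of the closest reference fingerprint in the local
    store that differs by at most max_distance bits, or None if no
    reference is close enough."""
    best_label, best_dist = None, max_distance + 1
    for label, reference in store.items():
        distance = hamming(fingerprint, reference)
        if distance < best_dist:
            best_label, best_dist = label, distance
    return best_label
```

A return value of None corresponds to the case where the device finds no sufficiently close reference in its local data store.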

Record number: 93

Publication date: 22-03-2016

Robust decoder

Number: CA2607952C

Author: Hosam A. Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Technology Licensing LLC

Techniques and tools related to delayed or lost coded audio information are described. For example, a concealment technique for one or more missing frames is selected based on one or more factors that include a classification of each of one or more available frames near the one or more missing frames. As another example, information from a concealment signal is used to produce substitute information that is relied on in decoding a subsequent frame. As yet another example, a data structure having nodes corresponding to received packet delays is used to determine a desired decoder packet delay value.

Record number: 94

Publication date: 15-01-2011

Redundant audio bit stream and audio bit stream processing methods

Number: ATE492014T1

Author: Hosam A Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen, Xiaoqin Sun

Assignee: Microsoft Corp

Record number: 95

Publication date: 08-11-2023

Multi-user intelligent assistance

Number: EP3583497B1

Author: Alexander A. POPOV, Kazuhito Koishida, Steven Nabil Bathiche, Uros Batricevic

Assignee: Microsoft Technology Licensing LLC

Record number: 96

Publication date: 16-11-2017

Modifying the modality of a computing device based upon a user's brain activity

Number: WO2017196580A1

Author: John C. Gordon, Kazuhito Koishida

Assignee: Microsoft Technology Licensing, LLC

Technologies are described herein for modifying the modality of a computing device based upon a user's brain activity. A machine learning classifier is trained using data that identifies a modality for operating a computing device and data identifying brain activity of a user of the computing device. Once trained, the machine learning classifier can select a mode of operation for the computing device based upon a user's current brain activity and, potentially, other biological data. The computing device can then be operated in accordance with the selected modality. An application programming interface can also expose an interface through which an operating system and application programs executing on the computing device can obtain data identifying the modality selected by the machine learning classifier. Through the use of this data, the operating system and application programs can modify their mode of operation to be most suitable for the user's current mental state.

Record number: 97

Publication date: 22-12-2005

LPC-harmonic speech coder with superframe format

Number: DE60024123D1

Author: Allen Gersho, Kazuhito Koishida, Tian Wang, Vladimir Cuperman

Assignee: Microsoft Corp

Record number: 98

Publication date: 15-02-2007

Gain-controlled noise suppression

Number: ATE353466T1

Author: Feng Zhuge, Hosam A Khalil, Kazuhito Koishida, Tian Wang, Wei-ge Chen

Assignee: Microsoft Corp

Record number: 99

Publication date: 15-12-2005

LPC-harmonic speech coder with superframe format

Number: ATE310304T1

Author: Allen Gersho, Kazuhito Koishida, Tian Wang, Vladimir Cuperman

Assignee: Microsoft Corp

Record number: 100

Publication date: 14-09-2022

Audio-visual speech enhancement

Number: EP4055595A1

Author: Kazuhito Koishida, Michael Iuzzolino

Assignee: Microsoft Technology Licensing LLC

Example speech enhancement systems include a spatio-temporal residual network configured to receive video data containing a target speaker and extract visual features from the video data, an autoencoder configured to receive input of an audio spectrogram and extract audio features from the audio spectrogram, and a squeeze-excitation fusion block configured to receive input of visual features from a layer of the spatio-temporal residual network and input of audio features from a layer of the autoencoder, and to provide an output to the decoder of the autoencoder. The decoder is configured to output a mask configured based upon the fusion of audio features and visual features by the squeeze-excitation fusion block, and the instructions are executable to apply the mask to the audio spectrogram to generate an enhanced magnitude spectrogram, and to reconstruct an enhanced waveform from the enhanced magnitude spectrogram.
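
The final masking step in the abstract above, applying the decoder's mask to the audio spectrogram to obtain an enhanced magnitude spectrogram, can be illustrated in isolation. The sketch assumes the mask is applied as an element-wise product over a spectrogram stored as nested lists (rows as frequency bins, columns as frames); the networks that produce the mask are not modelled, and apply_mask is a hypothetical name.

```python
def apply_mask(spectrogram, mask):
    """Multiply a magnitude spectrogram by a mask of the same shape,
    element by element, and return the enhanced spectrogram."""
    if len(spectrogram) != len(mask):
        raise ValueError("spectrogram and mask must have the same shape")
    enhanced = []
    for spec_row, mask_row in zip(spectrogram, mask):
        if len(spec_row) != len(mask_row):
            raise ValueError("spectrogram and mask must have the same shape")
        enhanced.append([s * m for s, m in zip(spec_row, mask_row)])
    return enhanced
```

Mask values near 0 suppress a time-frequency cell and values near 1 keep it; reconstructing the waveform from the result is a separate step not shown here.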

Record number: 101

Publication date: 14-05-2021

Audio-visual speech enhancement

Number: WO2021091679A1

Author: Kazuhito Koishida, Michael Iuzzolino

Assignee: Microsoft Technology Licensing, LLC

Example speech enhancement systems include a spatio-temporal residual network configured to receive video data containing a target speaker and extract visual features from the video data, an autoencoder configured to receive input of an audio spectrogram and extract audio features from the audio spectrogram, and a squeeze-excitation fusion block configured to receive input of visual features from a layer of the spatio-temporal residual network and input of audio features from a layer of the autoencoder, and to provide an output to the decoder of the autoencoder. The decoder is configured to output a mask configured based upon the fusion of audio features and visual features by the squeeze-excitation fusion block, and the instructions are executable to apply the mask to the audio spectrogram to generate an enhanced magnitude spectrogram, and to reconstruct an enhanced waveform from the enhanced magnitude spectrogram.

Record number: 102

Publication date: 16-11-2017

Continuous motion controls operable using neurological data

Number: WO2017196641A1

Author: Bill Chau, Cem Keskin, Jaeyoun Kim, Kazuhito Koishida, Khuram Shahid

Assignee: Microsoft Technology Licensing, LLC

Computer systems, methods, and storage media for generating a continuous motion control using neurological data and for associating the continuous motion control with a continuous user interface control to enable analog control of the user interface control. The user interface control is modulated through a user's physical movements within a continuous range of motion associated with the continuous motion control. The continuous motion control enables fine-tuned and continuous control of the corresponding user interface control as opposed to control limited to a small number of discrete settings.

Record number: 103