Publication date: 27-08-2016

Number: RU2596033C2

The invention relates to speech transmission and can be used to obtain an improved frequency response and temporal phasing by extending the bandwidth of audio signals in a phase vocoder. An apparatus for generating a bandwidth-extended audio signal from an input signal comprises a patch generator for generating one or more patch signals from the input signal, wherein the patch generator is configured to time-stretch (1800, 1808) the bandpass signals supplied by an analysis filter bank, and wherein the patch generator comprises a phase adjuster (1806) for adjusting the phases of the subband signals using a phase correction that depends on the filter-bank channel. 3 independent and 17 dependent claims, 16 drawings.

Publication date: 21-04-2020

Number: RU2719690C2

The invention relates to means for generating an encoded audio bitstream. The technical result is improved coding efficiency. An input buffer stores at least one frame of the encoded audio bitstream. The encoded audio bitstream contains audio data and a metadata container. The metadata container contains one or more metadata payloads and protection data. An audio decoder is coupled to the buffer memory and decodes the audio data to produce decoded audio data. A parser is coupled to the audio decoder or integrated into it. The parser is configured to parse the audio data. An output buffer stores the decoded audio data. The metadata container begins with a sync word that identifies the start of the metadata container. The protection data is located after the one or more metadata payloads. Each of the one or more metadata payloads begins ...

Publication date: 30-06-2017

Number: RU2624099C1

The invention relates to means for generating an encoded audio bitstream. The technical result is more efficient detection and correction of metadata identification errors in the audio stream, achieved by including metadata about the audio content in the audio stream. A buffer memory is a non-volatile medium configured to store at least one frame of the encoded audio bitstream. The encoded audio bitstream contains audio data and a metadata container. The metadata container contains a header and one or more metadata payloads after the header. The one or more metadata payloads contain dynamic range compression (DRC) metadata, and the DRC metadata is or contains profile metadata indicating whether the DRC metadata contains dynamic range compression control values for use in performing dynamic range compression in accordance with a compression profile for the audio content ...

Publication date: 16-09-2021

Number: RU2018118173A3

Publication date: 11-11-2021

Number: RU2018126300A3

Publication date: 29-01-2019

Number: RU2678487C2

The invention relates to means for audio encoding and decoding. The technical result is improved coding efficiency. An encoded audio bitstream (ABS) is produced from an audio signal (AS) comprising consecutive audio frames (AFP, AFR, AFS). An encoded dynamic range control bitstream (DBS) is produced from a dynamic range control sequence (DS) that corresponds to the audio signal (AS) and comprises consecutive dynamic range control frames (DFP, DFR, DFS). Each dynamic range control frame (DFP, DFR, DFS) comprises one or more nodes (A...A; B...B; C), each node comprising gain information for the audio signal and time information indicating the point in time to which the gain information corresponds. The encoded dynamic range control bitstream comprises a corresponding bitstream portion for each dynamic range control frame. A shift procedure is executed, wherein ...

Publication date: 16-05-2017

Number: RU2619536C1

The invention relates to means for generating an encoded audio bitstream. The technical result is improved quality of the encoded audio data. An encoded audio bitstream containing metadata and audio data is received. The metadata and audio data are extracted from the encoded audio bitstream. The metadata is or contains program information metadata and substream structure metadata. The encoded audio bitstream comprises a sequence of frames indicative of at least one audio program, and the program information metadata and substream structure metadata are indicative of that program. Each frame contains at least one audio data segment. Each audio data segment contains at least some of the audio data, each frame of at least a subset of the frames contains a metadata segment, and each metadata segment contains at least ...

Publication date: 27-06-2011

Number: RU2423009C1

The invention relates to wireless communication systems. The technical result is more accurate synchronization of speech stream playback. The disclosed methods and de-jitter buffer apparatus adapt packet playback depending on whether silence periods are detected between sentences or within a sentence, in order to optimize voice quality in a communication system. In one example, the de-jitter buffer determines the duration of at least one silence period associated with a plurality of received packets and determines a time at which to transmit a portion of the packets based on the determined duration of that silence period. In another example, this function is performed by a silence characterization module. 3 independent and 10 dependent claims, 15 drawings.

Publication date: 23-01-2019

Number: RU2678136C1

The invention relates to means for processing an encoded audio signal that includes a plurality of downmix signals associated with a plurality of input audio objects and with object parameters. The technical result is more efficient audio signal processing. The plurality of downmix signals is grouped into a plurality of groups of downmix signals based on information in the encoded audio signal. Each group of downmix signals is associated with a set of input audio objects from the plurality of input audio objects. At least one processing step is performed individually on the object parameters of each set of input audio objects to provide group results. The group results are combined to provide a decoded audio signal. The grouping of the plurality of downmix signals into the plurality of groups of downmix signals is configured such that each input ...

Publication date: 23-07-2019

Number: RU2695504C1

The invention relates to media processing systems and, in particular, to adaptive processing of media data based on media processing states of the media data. The technical result is decoding of audio data normalized to consistent loudness values. The disclosed methods of adaptive media processing rely on separate data indicating the state of the media data. A media processing device can determine whether a type of media processing has already been performed on an input version of the media data. If so, the device can adapt its processing of the media data so as to disable performing that type of media processing. If not, the device performs that type of media processing. The device can create a state of the media data indicating the type of media processing. The device can communicate the state of the media data and an output version of the media data to a recipient device in the media processing chain to support adaptive ...

Publication date: 29-11-2018

Number: RU2673691C1
Belongs to: NTT DOCOMO, INC. (JP)

The invention relates to the field of audio and speech coding. The technical result is an effective reduction in the amount of computation required to convert linear prediction coefficients. A linear prediction coefficient conversion device converts first linear prediction coefficients, calculated at a first sampling frequency F1, into second linear prediction coefficients at a second sampling frequency F2 different from the first sampling frequency, and comprises: means for calculating, on the real axis of the unit circle, a power spectrum corresponding to the second linear prediction coefficients at the second sampling frequency, based on the first linear prediction coefficients or an equivalent parameter; means for calculating, on the real axis of the unit circle, autocorrelation coefficients from the power spectrum; and means for converting the autocorrelation coefficients into the second linear prediction coefficients at the second sampling frequency ...

Publication date: 20-04-2014

1. An apparatus for generating a bandwidth-extended audio signal from an input signal, comprising a patch generator (82, 102a, 102b) for generating one or more patch signals from the input signal, wherein a patch signal has a patch center frequency that differs from the patch center frequency of another patch or from the center frequency of the input audio signal, wherein the patch generator (82, 102a, 102b) is configured to time-stretch (90a, 90b, 90c; 1808; 130) subband signals from an analysis filter bank (101), and the patch generator (82, 102a, 102b) comprises a phase adjuster (1806, 124a, 124b, 124c) for adjusting the phases of the subband signals using a phase correction (151, 152, 153) that depends on the filter-bank channel.

Publication date: 27-11-2016

... 1. A post filter (440; 550; 740; 1040; 1140) for attenuating interharmonic noise, having a variable gain that controls the interharmonic attenuation, the filter being adapted to receive an input signal that includes a preliminary audio signal decoded in accordance with one of several decoding modes and to supply an output audio signal, characterized by a control section with a gain controller operable to set the absolute value of the gain to zero, whereby the post filter supplies the audio signals as the output audio signal depending only on the value of a post-filtering signal.
2. The post filter of claim 1, characterized in that post filter activity is conventionally associated with particular decoding modes, wherein the gain controller is operable to set the absolute value of the gain to zero in at least one conventional decoding mode with filtering, which is thus applied ...

Publication date: 19-12-2024

Number: RU2832121C1

The invention relates to the field of computing for audio data processing. The technical result is improved quality of spectral band replication of an audio signal. The technical result is achieved by the steps of: demultiplexing a block of an encoded audio bitstream; and decoding a portion of the block of the encoded audio bitstream, wherein the block of the encoded audio bitstream includes: a fill element with an identifier indicating the start of the fill element, and fill data after the identifier, wherein the fill data includes: a flag identifying whether enhanced spectral band replication processing is to be performed on audio content of at least one block of the encoded audio bitstream; and enhanced spectral band replication metadata that does not include one or more parameters used both for spectral patching and for harmonic transposition, wherein the enhanced spectral band replication metadata ...

Publication date: 17-04-2000

Number: AU0005921199A

Publication date: 19-12-2013

Number: AU2011226206B2

An apparatus for generating a bandwidth extended audio signal from an input signal, comprises a patch generator for generating one or more patch signals from the input signal, wherein the patch generator is configured for performing a time stretching (1800, 1808) of subband signals from an analysis filterbank, and wherein the patch generator further comprises a phase adjuster (1806) for adjusting phases of the subband signals using a filterbank-channel dependent phase correction.

Publication date: 28-04-1992

Number: AU0008856791A

Publication date: 10-11-2016

Audio encoder device and an audio decoder device having efficient gain coding in dynamic range control

Belongs to: FB Rice

The invention provides an audio encoder device comprising: an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the dynamic range control encoder is configured in such a way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the dynamic range control encoder is configured for executing ...

Publication date: 17-05-2018

Number: AU2015251609B2
Belongs to: Spruson & Ferguson

The purpose of the present invention is to estimate, with a small amount of computation, a linear prediction synthesis filter for which an internal sampling frequency has been converted. A linear prediction coefficient device of the present invention converts a first linear prediction coefficient calculated on the basis of a first sampling frequency to a second linear prediction coefficient of a second sampling frequency different from the first sampling frequency. The linear prediction coefficient device is provided with: a means for calculating, on a real axis of the unit circle, a power spectrum corresponding to the second linear prediction coefficient for the second sampling frequency, on the basis of the first linear prediction coefficient or a parameter equivalent thereto; a means for calculating, on a real axis of the unit circle, an auto-correlation coefficient from the power spectrum; and a means for converting the auto-correlation coefficient to the second linear prediction coefficient ...
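
The three means listed above (power spectrum, autocorrelation, conversion back to prediction coefficients) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the patented method: it assumes F2 is not higher than F1, it assumes the sign convention A(z) = 1 + a1·z⁻¹ + ... + ap·z⁻ᵖ, and it samples the spectrum on a uniform midpoint grid rather than on the real-axis points the abstract mentions. The function names and the `grid` parameter are hypothetical.

```python
import math


def power_spectrum(a, omegas):
    """Return 1/|A(e^{jw})|^2 for A(z) = 1 + a[0] z^-1 + a[1] z^-2 + ..."""
    spectrum = []
    for w in omegas:
        re = 1.0 + sum(c * math.cos((k + 1) * w) for k, c in enumerate(a))
        im = -sum(c * math.sin((k + 1) * w) for k, c in enumerate(a))
        spectrum.append(1.0 / (re * re + im * im))
    return spectrum


def autocorrelation(spectrum, omegas, order):
    """Approximate r[k] = (1/pi) * integral_0^pi P(w) cos(k w) dw (midpoint rule)."""
    n = len(omegas)
    return [
        sum(p * math.cos(k * w) for p, w in zip(spectrum, omegas)) / n
        for k in range(order + 1)
    ]


def levinson_durbin(r):
    """Solve the normal equations for the prediction coefficients a[0..p-1]."""
    a = []
    err = r[0]
    for i in range(len(r) - 1):
        acc = r[i + 1] + sum(a[j] * r[i - j] for j in range(i))
        k = -acc / err  # reflection coefficient of stage i + 1
        a = [a[j] + k * a[i - 1 - j] for j in range(i)] + [k]
        err *= 1.0 - k * k
    return a


def convert_lpc(a1, f1, f2, grid=512):
    """Convert coefficients computed at rate f1 to rate f2 (sketch, f2 <= f1 only)."""
    if f2 > f1:
        raise ValueError("this sketch does not extrapolate above f1 / 2")
    omegas2 = [math.pi * (n + 0.5) / grid for n in range(grid)]
    # A normalized frequency w at rate f2 is the frequency w * f2 / f1 at rate f1.
    spectrum = power_spectrum(a1, [w * f2 / f1 for w in omegas2])
    r = autocorrelation(spectrum, omegas2, len(a1))
    return levinson_durbin(r)
```

For example, `convert_lpc([-0.9, 0.2], 16000, 12800)` returns a second-order coefficient set for the lower rate, and converting with equal rates reproduces the input coefficients up to numerical error, which is a convenient sanity check.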

Publication date: 23-05-2013

Number: AU2009291547B2

Improvements are disclosed for in-band signaling, i.e., transmission of data in a voice channel of a digital wireless network during a voice call session. A family of narrow-band signaling methods is disclosed to successfully pass data-carrying signals through the low-bit rate modes of the EVRC-B vocoder commonly used in CDMA wireless channels. Some embodiments generate a tapered signaling waveform in tandem with another waveform using FSK-modulation. These features can be used in cell phones or other wireless communication devices, including automotive applications.

Publication date: 11-02-2016

Number: AU2014203424B2

The application relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals. In particular, the application relates to a method and system for performing HFR of audio signals having large variations in energy level across the low frequency range which is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals is described. The system comprises means for receiving the plurality of low frequency subband signals; means for receiving a set of target energies, each target energy covering a different target interval within the high frequency interval and being indicative of the desired energy of one or more high frequency subband signals lying within the target interval; means for generating the plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality ...

Publication date: 14-04-2016

Number: AU2013301864B2

An apparatus for adapting input audio information, encoding one or more audio objects, to obtain adapted audio information is provided. The input audio information comprises two or more input audio downmix channels and further comprises input parametric side information. The adapted audio information comprises one or more adapted audio downmix channels and further comprises adapted parametric side information. The apparatus comprises a downmix signal modifier (110) for adapting, depending on adaptation information, the two or more input audio downmix channels to obtain the one or more adapted audio downmix channels. Moreover, the apparatus comprises a parametric side information adapter (120) for adapting, depending on the adaptation information, the input parametric side information to obtain the adapted parametric side information.

Publication date: 13-08-2015

Number: AU2014207590B2

Apparatus and methods for generating an encoded audio bitstream, including by including program loudness metadata and audio data in the bitstream, and optionally also program boundary metadata in at least one segment (e.g., frame) of the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, e.g., including by performing adaptive loudness processing of the audio data of an audio program indicated by the bitstream, or authentication and/or validation of metadata and/or audio data of such an audio program. Another aspect is an audio processing unit (e.g., an encoder, decoder, or post-processor) configured (e.g., programmed) to perform any embodiment of the method or which includes a buffer memory which stores at least one frame of an audio bitstream generated in accordance with any embodiment of the method.

Publication date: 12-07-2000

Number: AU0003088200A

Publication date: 04-06-1985

Number: CA0001188424A1

Publication date: 26-10-2021

Number: CA2998689C

An encoder is shown for encoding an audio signal with reduced background noise using linear predictive coding. The encoder comprises a background noise estimator configured to estimate background noise of the audio signal, a background noise reducer configured to generate a background-noise-reduced audio signal by subtracting the estimated background noise from the audio signal, and a predictor configured to subject the audio signal to linear prediction analysis to obtain a first set of linear prediction filter (LPC) coefficients and to subject the background-noise-reduced audio signal to linear prediction analysis to obtain a second set of LPC coefficients. Furthermore, the encoder comprises an analysis filter composed of a cascade of time-domain filters controlled by the obtained first set of LPC coefficients and the obtained second set of LPC coefficients.

Publication date: 22-09-2016

Number: CA0002989595A1

Embodiments relate to an audio processing unit that includes a buffer, bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided.

Publication date: 05-01-2012

Number: CA0002976490A1

An audio decoder includes first and second decoding modules and a pitch filter. The pitch filter is selectively enabled or disabled based on a value of a first parameter encoded in an audio bitstream. Switching of the pitch filter may be reduced. This may result in reduction of perceived noise caused by switching artefacts.

Publication date: 19-04-2016

Number: CA0002898891C

Apparatus and methods for generating an encoded audio bitstream, including by including substream structure metadata (SSM) and/or program information metadata (PIM) and audio data in the bitstream. Other aspects are apparatus and methods for decoding such a bitstream, and an audio processing unit (e.g., an encoder, decoder, or post-processor) configured (e.g., programmed) to perform any embodiment of the method or which includes a buffer memory which stores at least one frame of an audio bitstream generated in accordance with any embodiment of the method.

Publication date: 05-01-2012

Number: CA0002929090A1

In one aspect, the invention provides an audio encoding method characterized by a decision being made as to whether the device which will decode the resulting bit stream should apply post filtering including attenuation of interharmonic noise. Hence, the decision whether to use the post filter, which is encoded in the bit stream, is taken separately from the decision as to the most suitable coding mode. In another aspect, there is provided an audio decoding method with a decoding step followed by a post-filtering step, including interharmonic noise attenuation, and being characterized in a step of disabling the post filter in accordance with post filtering information encoded in the bit stream signal. Such a method is well suited for mixed-origin audio signals by virtue of its capability to deactivate the post filter in dependence of the post filtering information only, hence independently of factors such as the current coding mode.

Publication date: 13-11-2018

Number: CA0002942743C

The invention provides an audio encoder device comprising: an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from a dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds; wherein the dynamic range control encoder is configured in such a way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion; wherein the dynamic range control encoder is configured for executing ...

Publication date: 01-05-2018

Number: CA0002816889C

Techniques for adaptive processing of media data based on separate data specifying a state of the media data are provided. A device in a media processing chain may determine whether a type of media processing has already been performed on an input version of media data. If so, the device may adapt its processing of the media data to disable performing the type of media processing. If not, the device performs the type of media processing. The device may create a state of the media data specifying the type of media processing. The device may communicate the state of the media data and an output version of the media data to a recipient device in the media processing chain, for the purpose of supporting the recipient device's adaptive processing of the media data.
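
The decision a device makes in such a chain can be illustrated with a short Python sketch. It is only an assumption-laden illustration of the idea in the abstract: the `MediaPacket` type, the `performed` set used as the media state, and the `process_adaptively` helper are hypothetical names, and the "processing" is a stand-in gain change.

```python
from dataclasses import dataclass, field


@dataclass
class MediaPacket:
    """Media data plus separate state listing processing types already applied."""
    samples: list
    performed: frozenset = field(default_factory=frozenset)


def process_adaptively(packet, kind, transform):
    """Apply `transform` unless the state says `kind` was already performed."""
    if kind in packet.performed:
        # An upstream device already did this type of processing: skip it.
        return packet
    processed = [transform(s) for s in packet.samples]
    # Record the processing type in the state passed on to the recipient device.
    return MediaPacket(processed, packet.performed | {kind})


def halve(sample):
    """Stand-in for a real processing step such as loudness adjustment."""
    return sample * 0.5


source = MediaPacket([2.0, 4.0])
after_first_device = process_adaptively(source, "volume_leveling", halve)
after_second_device = process_adaptively(after_first_device, "volume_leveling", halve)
```

Here the second device receives the state created by the first one and leaves the samples untouched, so the leveling step is applied exactly once along the chain.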

Publication date: 18-04-1991

Number: CA0002027705A1

Publication date: 24-08-1995

Number: CA0002157024A1

Publication date: 29-01-2002

Number: CA0002091754C

In a Code Excited Linear Prediction (CELP) analogue signal coding system, sequences from a master codebook (40), which may be a one-dimensional codebook, are filtered (42) and then stored in slave codebooks (70, 72). Input analogue signals (20) are filtered (34, 36) and compared orthogonally (66, 78, 80) with sequences from the slave codebooks, and an optimum pair of sequences is selected. Since the comparisons are orthogonal, sequences can be selected from the codebooks (70, 72) and compared (78, 80) with the filtered incoming analogue signals entirely independently. Reduced length sequences from the master codebook may be compared with orthogonalised analogue signals since orthogonalised signals contain some redundancy. The master codebook may not need to be orthogonalised in some circumstances. Various means of orthogonalisation of the sequences are possible including separation into odd and even sequences. Further orthogonalisations are possible, for example to give four comparisons ...

Publication date: 22-10-2002

Number: CA0002232446C

A coding and decoding system includes first filter means for representing an input signal with first linear prediction coefficients indicative of a coarse spectral distribution of the input signal, second filter means for representing the input signal with second linear prediction coefficients indicative of a fine spectral distribution of the input signal and third filter means connected in series with or parallel to the second filter means for representing the input signal with third linear prediction coefficients indicative of a periodic component of the input signal. A coding and decoding of the input signal is performed on the basis of parameters of the input signal which are produced on the basis of a residual signal between the input signal and a reproduced signal obtained through the first, second and third filter means.

Publication date: 26-09-1998

Number: CA0002232446A1

A coding and decoding system includes first filter means for representing an input signal with first linear prediction coefficients indicative of a coarse spectral distribution of the input signal, second filter means for representing the input signal with second linear prediction coefficients indicative of a fine spectral distribution of the input signal and third filter means connected in series with or parallel to the second filter means for representing the input signal with third linear prediction coefficients indicative of a periodic component of the input signal. A coding and decoding of the input signal is performed on the basis of parameters of the input signal which are produced on the basis of a residual signal between the input signal and a reproduced signal obtained through the first, second and third filter means.

Publication date: 24-01-2006

Number: CA0002158660C

A method and apparatus for implementing a vocoder in an application specific integrated circuit (ASIC) is disclosed. The apparatus contains a DSP core (4) that performs computations in accordance with a reduced instruction set (RISC) architecture. The circuit further comprises a specifically designed slave processor to the DSP core (4) referred to as the minimization processor (6). The apparatus further comprises specifically designed block normalization circuitry.

Publication date: 12-03-2019

Number: CN0109461452A

Publication date: 02-12-2015

Number: CN0105122357A

Publication date: 02-01-2013

Number: CN0101802906B

Publication date: 19-11-2019

Number: CN0108461088B

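Publication date: 28-11-2017

Number: KR1020170130627A

... The present invention relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals. In particular, the invention relates to a method and system for performing HFR of audio signals having large variations in energy level across the low frequency range that is used to reconstruct the high frequencies of the audio signal. A system configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals is described. The system comprises means for receiving the plurality of low frequency subband signals; means for receiving target energies, each target energy covering a different target interval within the high frequency interval and being indicative of the desired energy of one or more high frequency subband signals lying within the target interval; means for generating the plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality of spectral gain coefficients associated with the respective low frequency subband signals; and means for adjusting the energy of the plurality of high frequency subband signals using the set of target energies.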
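
The last two means of this system (generating the high frequency subbands from gain-weighted low frequency subbands, then adjusting their energy to the targets) can be sketched as follows. This is a simplified illustration, assuming real-valued subband samples, a one-to-one copy of each low band to a high band, and target intervals given as lists of band indices; the function names and the data layout are hypothetical.

```python
import math


def generate_high_band(low_subbands, spectral_gains):
    """Copy each low frequency subband upward, weighted by its spectral gain."""
    return [[g * s for s in band] for band, g in zip(low_subbands, spectral_gains)]


def adjust_energy(high_subbands, target_intervals):
    """Scale the subbands of each target interval so their energy hits the target.

    `target_intervals` is a list of (band_indices, target_energy) pairs.
    """
    adjusted = [list(band) for band in high_subbands]
    for indices, target in target_intervals:
        current = sum(s * s for i in indices for s in high_subbands[i])
        if current == 0.0:
            continue  # nothing to scale in a silent interval
        scale = math.sqrt(target / current)
        for i in indices:
            adjusted[i] = [scale * s for s in high_subbands[i]]
    return adjusted


low = [[1.0, -1.0], [2.0, 0.0]]
high = generate_high_band(low, [0.5, 0.25])
adjusted = adjust_energy(high, [([0, 1], 3.0)])
```

In this example the gain coefficients first even out the energy difference between the two low bands, and the single target interval then rescales both copied bands together so that their combined energy equals the target.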

Publication date: 18-07-2017

Number: BR112015019435A2

Publication date: 27-09-2007

Number: US20070223885A1

According to one embodiment, a playback apparatus includes first to third digital signal processors. The first digital signal processor includes decode functions corresponding to a plurality of kinds of compression-decoding schemes and decodes first audio data, which is compression-encoded by using an arbitrary one of the plurality of kinds of compression-encoding schemes, thereby generating a first digital audio signal. The second digital signal processor includes decode functions corresponding to the plurality of kinds of compression-decoding schemes and decodes second audio data, which is compression-encoded by using an arbitrary one of the plurality of kinds of compression-encoding schemes, thereby generating a second digital audio signal. The third digital signal processor executes a mixing process of mixing the first digital audio signal and the second digital audio signal, thereby generating a digital audio output signal.

Publication date: 08-08-2000

Number: US6101464A

A coding and decoding system includes a first filter for representing an input signal with first linear prediction coefficients indicative of a coarse spectral distribution of the input signal, a second filter for representing the input signal with second linear prediction coefficients indicative of a fine spectral distribution of the input signal, and a third filter connected in series with or parallel to the second filter for representing the input signal with third linear prediction coefficients indicative of a periodic component of the input signal. A coding and decoding of the input signal is performed on the basis of parameters of the input signal which are produced on the basis of a residual signal between the input signal and a reproduced signal obtained through the first, second and third filters.

Publication date: 22-08-2017

Number: US0009741352B2

The present invention relates to a method for processing an audio signal, comprising: a step of performing a frequency conversion process on an audio signal to obtain a plurality of frequency transform coefficients; a step of selecting either a general mode or a non-general mode, on the basis of a pulse ratio, for the frequency transform coefficients having a high frequency band from among the plurality of frequency transform coefficients; and a step of performing, if the non-general mode is selected, the following steps: extracting a predetermined number of pulses from the frequency transform coefficients having the high frequency band, and generating pulse information; generating an original noise signal from the frequency transform coefficients having the high frequency band, excluding the pulses; generating a reference noise signal using the frequency transform coefficient having a low frequency band from among the plurality of frequency transform coefficients; and generating noise ...

Publication date: 11-12-2018

Number: US0010152983B2

A method and apparatus for performing coding and decoding for high-frequency bandwidth extension. The coding apparatus may down-sample an input signal, perform core coding on the down-sampled input signal, perform frequency transformation on the input signal, and perform bandwidth extension coding by using a base signal of the input signal in a frequency domain.

Publication date: 15-06-2017

Number: US20170171683A1

A method includes extracting a difference value through extraction of features of a front audio channel signal and a surround channel of multichannel sound content by setting the front audio channel signal and the surround channel as input and output channel signals, respectively, training a deep neural network (DNN) model by setting the input channel signal and the difference value as an input and an output of the DNN model, respectively, normalizing a frequency-domain signal of the input channel signal by converting the input channel signal into the frequency-domain signal, and extracting estimated difference values by decoding the normalized frequency-domain signal through the DNN model, deriving an estimated spectral amplitude of the surround channel based on the front audio channel signal and the difference value, and deriving an audio signal of a final surround channel by converting the estimated spectral amplitude of the surround channel into the time domain.

20-08-2020 publication date

Number: US20200265852A1

There are disclosed apparatus and methods for encoding and/or decoding information signals (e.g., audio signals). An encoder apparatus includes a plurality of frequency domain (FD) encoder tools for encoding an information signal, and an encoder bandwidth detector and controller configured to select a bandwidth for at least a subgroup of the FD encoder tools. The subgroup includes fewer FD encoder tools than the plurality of FD encoder tools. The selection is based on information signal characteristics, so that one of the FD encoder tools of the subgroup has a different bandwidth with respect to at least one of the FD encoder tools which are not in the subgroup.

03-06-2021 publication date

Number: US20210166710A1

An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.

04-01-1984 publication date

Number: EP0000097167A1
Author: WILLIAMS, Tim A.

An n-ordered digital filter (64) in the form of a Partial Autocorrelation (PARCOR) lattice structure having two multipliers (88 and 102) and two adders (90 and 110). Time multiplexing eliminates the use of individual multipliers and adders for each order of the filter. Speech synthesis of a time varying digital input signal is provided by performing n stages of Linear Predictive Coding (LPC) difference equation operation, where n is an integer. Delay means (86, 94, 96 and 98) and storage registers (70, 72, 74 and 76) minimize the control circuitry and circuit die size to calculate the difference equations to an nth order.
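The per-stage arithmetic described in this abstract (two multiplications and two additions per filter order) can be illustrated in software. The following is a minimal sketch, assuming the common all-pole lattice recursion f_{i-1}[t] = f_i[t] - k_i * b_{i-1}[t-1] and b_i[t] = b_{i-1}[t-1] + k_i * f_{i-1}[t]; the sign convention, the function name `lattice_synthesis`, and the floating-point arithmetic are assumptions, and the time-multiplexed hardware scheduling of the patent is not reproduced.

```python
def lattice_synthesis(excitation, k):
    """All-pole lattice synthesis filter (illustrative sketch).

    excitation: iterable of input samples.
    k: reflection (PARCOR) coefficients k_1..k_n, assumed |k_i| < 1.
    """
    n = len(k)
    # b[i] holds the backward signal b_i from the previous sample.
    b = [0.0] * (n + 1)
    out = []
    for e in excitation:
        f = e  # forward signal entering the highest-order stage
        # The same two multiplies and two adds are reused for every stage,
        # which is what time multiplexing exploits in hardware.
        for i in range(n, 0, -1):
            f = f - k[i - 1] * b[i - 1]      # multiply + add
            b[i] = b[i - 1] + k[i - 1] * f   # multiply + add
        b[0] = f  # output sample also feeds the delay chain
        out.append(f)
    return out
```

With a single coefficient the recursion reduces to y[t] = e[t] - k_1 * y[t-1], which gives a quick sanity check of the sign convention assumed here.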

20-12-2023 publication date

Number: EP4293660A1

A method for controlling an electronic device includes obtaining a text, obtaining, by inputting the text into a first neural network model, acoustic feature information corresponding to the text and alignment information in which each frame of the acoustic feature information is matched with each phoneme included in the text, identifying an utterance speed of the acoustic feature information based on the alignment information, identifying a reference utterance speed for each phoneme included in the acoustic feature information based on the text and the acoustic feature information, obtaining utterance speed adjustment information based on the utterance speed of the acoustic feature information and the reference utterance speed for each phoneme, and obtaining, based on the utterance speed adjustment information, speech data corresponding to the text by inputting the acoustic feature information into a second neural network model.

27-05-2002 publication date

Number: RU2183034C2

The invention relates to speech signal processing. Its use yields the technical result of reducing the number of processing cycles. The technical result is achieved in that the vocoder comprises a digital signal processor for performing a recursive convolution computation and outputting the result of this recursive convolution, and a minimization processor, separate from the digital signal processor, for receiving said recursive convolution result and performing a minimization search in accordance with the recursive convolution result, wherein the minimization processor terminates the search when a minimum criterion is satisfied. 4 independent and 11 dependent claims, 12 figures, 4 tables.

03-10-2024 publication date

Number: RU2827903C2

The invention relates to the field of computing for audio data processing. The technical result is improved quality of spectral band replication. The technical result is achieved in that a block of an encoded audio bitstream includes: a fill element with an identifier indicating the start of the fill element, and fill data after the identifier, wherein the fill data includes: a flag identifying whether enhanced spectral band replication processing is to be performed on audio content of at least one block of the encoded audio bitstream, and enhanced spectral band replication metadata that does not include one or more parameters used for both spectral patching and harmonic transposition, wherein the enhanced spectral band replication metadata is metadata configured to enable at least one eSBR tool which ...

27-05-2001 publication date

Number: RU2168202C2
Author: ЛИ Уг-Ги (KR)

The invention relates to communications technology and is intended for adding an enhanced variable rate coding (EVRC) function to a vocoder in a base station controller (BSC) of a cellular system. The technical result is the addition of the coding function to the vocoder without increasing memory capacity. The BSC includes a plurality of voice digital signal processors, a call control digital signal processor, a selector and vocoder processor (SVP), and a selector interface processor (SIP). The BSC links the base station transceiver subsystem (BTS) and the mobile switching center (MSC). Each of the voice digital signal processors respectively controls one of the vocoders to compress the speech signal. According to the claimed method, a particular memory area is divided into a code part and a data part, the SVP loader requests the SIP to download the executable program and the code stored in the code part, the processor loading program of the SIP loads into the SVP the executable ...

06-09-2018 publication date

Number: RU2666282C2

The invention relates to means for generating one or more audio channels. The technical result is improved efficiency of metadata coding. One or more compressed metadata signals are received. Each of the compressed metadata signals comprises a plurality of first metadata samples. The first metadata samples of each compressed metadata signal indicate information associated with an audio object signal of one or more audio object signals. One or more reconstructed metadata signals are generated such that each reconstructed metadata signal comprises the first metadata samples of a compressed metadata signal, said reconstructed metadata signal being associated with said compressed metadata signal, and further comprises a plurality of second metadata samples. Generating the reconstructed metadata signals comprises generating the second metadata samples of each reconstructed metadata signal by generating ...

10-04-2015 publication date

Number: RU2547238C2

The invention relates to means for coding with a combined innovation codebook. The technical result is enabling fast search even with very large codebooks. The device comprises a pre-quantizer of a first, adaptive-codebook excitation residual, and a CELP innovation-codebook search module responsive to a second excitation residual produced from the first, adaptive-codebook excitation residual. In a CELP decoder, the combined innovation codebook comprises a de-quantizer of pre-quantized coding parameters into a first excitation contribution, and a CELP innovation-codebook structure responsive to CELP innovation-codebook parameters to produce a second excitation contribution. 16 independent and 22 dependent claims, 4 figures.

12-05-2017 publication date

Number: RU2618848C2

The invention relates to audio coding, in particular to switched audio coding, where the encoded signal is generated using different coding algorithms for different portions of the audio signal. The technical result is selection of a coding algorithm with good quality. An apparatus for selecting one of a first coding algorithm having a first characteristic and a second coding algorithm having a second characteristic, for encoding a portion of an audio signal to obtain an encoded version of the portion of the audio signal, includes a first estimator for estimating, for the portion of the audio signal, a first quality measure associated with the first coding algorithm without actually encoding and decoding the portion of the audio signal using the first coding algorithm; a second estimator is provided for estimating, for the portion of the audio signal, a second quality measure associated with the second coding algorithm without actually encoding and decoding the portion ...

25-01-2018 publication date

Number: RU2642553C2

The invention relates to digital audio coding, and more precisely to methods of coding audio signals containing components of different character. An inter-harmonic noise attenuation post filter (440; 550; 740; 1040; 1140) is adapted to receive an input signal that includes a preliminary audio signal decoded in accordance with one of several decoding modes, where post filter activity is conventionally associated with particular coding modes, and to supply an output audio signal. It comprises a control section for selectively operating the post filter in one of the following modes: i) a filtering mode, where it filters the preliminary audio signal to obtain a filtered signal and supplies it as the output audio signal; and ii) a pass-through mode, where it supplies the preliminary audio signal as the output audio signal. Said control section is configured to switch to the pass-through mode in ...

10-05-2015 publication date

Number: RU2550550C2

The present invention relates to means for signal processing. The technical result is improved sound quality when the frequency band is extended. An encoder sets an interval including 16 frames as a section to be processed and, for each section to be processed, outputs high band encoded data for obtaining the high band component of an input signal and low band encoded data obtained by encoding the low band signal of the input signal. In this case, a coefficient used in estimating the high band component is selected for each frame, and the section to be processed is divided into consecutive frame segments including consecutive frames in which the same coefficient is selected, within the same section to be processed. In addition, high band encoded data is generated that includes data including information ...

10-10-2014 publication date

Number: RU2530254C2

The invention relates to HFR (high frequency reconstruction/regeneration) of audio signals and is intended for performing HFR of audio signals that have large variations in energy level across the low frequency range used to reconstruct the high frequencies of the audio signal. The technical result is prevention of unwanted noise caused by discontinuities in the spectral envelope of the high frequency audio signal. A system is configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals. The system includes means for receiving the plurality of low frequency subband signals; means for receiving a set of target energies, where each target energy covers a different target interval within the high frequency interval and is indicative of the desired energy of one or more high frequency subband signals lying within the target interval; means for generating ...

22-06-2018 publication date

Number: RU2658535C1

The invention relates to means for decoding audio bitstreams with enhanced spectral band replication metadata. The technical result is improved efficiency of spectral band replication. A block of an encoded audio bitstream is received. A portion of the block of the encoded audio bitstream is demultiplexed. A portion of the block of the encoded audio bitstream is decoded. At least one block of the encoded audio bitstream includes: a fill element with an identifier indicating the start of the fill element, and fill data after the identifier, wherein the fill data includes at least one flag identifying whether a base form of spectral band replication processing or an enhanced form of spectral band replication is to be performed on audio content of the at least one block of the encoded audio bitstream. 2 independent and 6 dependent claims, 7 figures, 3 tables.

21-12-2017 publication date

Number: RU2639694C1

The invention relates to encoding and decoding of audio signals and is intended for encoding and decoding a signal corresponding to the high frequency range of an audio signal. The technical result is extension of the high frequency range by extracting a base signal of an input signal and adjusting the energy of the input signal using the tonality of the high frequency range of the input signal and the tonality of the base signal. The coding apparatus may down-sample the input signal, perform core coding on the down-sampled input signal, perform frequency transformation on the input signal, and perform bandwidth extension coding using the base signal of the input signal in the frequency domain. 6 dependent claims, 38 figures.

10-06-2016 publication date

Number: RU2568372C9

The invention relates to adaptive processing of media data. The technical result is improved presentation and processing of media data content. A method of processing media data includes the steps of: determining, by a first device in a media processing chain, whether a media processing operation has been performed on an output version of the media data; in response to the first device determining that the media processing operation has been performed on the output version of the media data, performing: creating or modifying, by the first device, a state of the media data, where the state indicates the type of media processing performed on the output version of the media data; and transmitting the output version of the media data and the state of the media data from the first device to a second device downstream in the media processing chain. 6 independent and 24 dependent claims, 26 figures.

20-05-2014 publication date

Number: RU2012148280A

1. A combined innovation codebook coding device, comprising: a pre-quantizer of a first, adaptive-codebook excitation residual, the pre-quantizer operating in a transform domain; and a CELP innovation-codebook module responsive to a second excitation residual produced from the first, adaptive-codebook excitation residual.

10-02-2016 publication date

Number: RU2014127177A

... 1. An encoder (901) configured to generate control data (905) from an audio signal (903), the audio signal encoder (901) comprising: means for analyzing the spectral shape of the audio signal (903) and for determining a degree of spectral envelope discontinuities introduced when regenerating a high frequency component of the audio signal (903) from a plurality of low frequency subband signals (602) of the audio signal (903); and means for generating the control data (905) for controlling the regeneration of the high frequency component based on the degree of discontinuities. 2. The encoder (901) of claim 1, wherein the encoder (901) comprises a high frequency reconstruction (HFR) system (601, 703) configured to perform an HFR process for generating the high frequency component from the plurality of low frequency subband signals (602); the control data (905) is indicative of whether to apply a plurality of spectral gain coefficients during the HFR process; and the plurality of spectral gain coefficients is associated ...

10-04-2014 publication date

Number: RU2012142677A

... 1. A signal processing apparatus, comprising: a demultiplexing unit configured to demultiplex input encoded data into data including information on a segment including frames in which the same coefficient is selected as the coefficient used in generating a high band signal, in a section to be processed that includes a plurality of frames, and coefficient information for obtaining the coefficient selected in said frames of the segment, and into low band encoded data; a low band decoding unit configured to decode the low band encoded data to generate a low band signal; a selection unit configured to select the coefficient of a frame to be processed from a plurality of coefficients based on said data; a high band sub-band power calculation unit configured to calculate the power ...

02-01-1991 publication date

Number: GB0002233195A

In a vocoder speech transmission process a synthesiser component (ST') at the receiving end has no spectral channel filter, but simulates its pulse signal responses and noise signal responses (hk(i), rk(J)), which are respectively stored in separate read-only memories (ROM H, ROM R), and read-out from each of these read-only memories is effected during a synthesis interval in dependence upon the tuning character criterion (Sc) for the weighting of the signal responses with the associated transmitted envelope curve values (ak), and the signal responses, together with the envelope curve values, are fed to a multiplier-adder. ...

06-10-2004 publication date

Number: GB0002400286A

The present invention relates to a circuit for encoding a speech signal to a digital signal for use in a mobile telephone. Hitherto such encoding has involved a large number of multiply accumulate operations in the algebraic codebook search. The present invention comprises replacing one or more multiply accumulate units with a shift operation. Thus, the present invention is directed to a shift accumulate unit to perform on said speech signal an algebraic codebook search. Significant power efficiencies can be obtained by reducing the reliance upon so many multiply accumulate units.
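The substitution of a shift for a multiplication can be illustrated on integer data. The following is a minimal sketch, assuming that each weight in the accumulation is a signed power of two; the abstract does not state which operands have this form, and the function names `multiply_accumulate` and `shift_accumulate` are hypothetical.

```python
def multiply_accumulate(samples, weights):
    """Reference accumulation using one multiplication per term."""
    acc = 0
    for x, w in zip(samples, weights):
        acc += x * w
    return acc


def shift_accumulate(samples, shifts, signs):
    """Same accumulation when each weight equals sign * 2**shift.

    samples: integer samples.
    shifts: non-negative shift amounts (log2 of the weight magnitude).
    signs: +1 or -1 per term.
    """
    acc = 0
    for x, s, sign in zip(samples, shifts, signs):
        term = x << s  # left shift replaces the multiplication
        acc += term if sign > 0 else -term
    return acc
```

For example, the weights 1, -2 and 4 correspond to shifts 0, 1 and 2 with signs +1, -1 and +1, and both functions then return the same accumulated value.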

10-05-2012 publication date

Number: US20120116757A1

Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient by linear predicting a high frequency signal, and encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear predicting a high frequency signal, and a low frequency signal, and generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal.

21-06-2012 publication date

Number: US20120158408A1
Author: James W. McGowan
Assignee: Alcatel Lucent SAS

A method and apparatus for reducing rendering latency in a terminal device which receives audio data from a communication network such as, for example, Voice over Internet Protocol (VoIP) communications networks. Received packets are advantageously decoded “immediately” upon receipt, and the decoded data is placed directly in the rendering buffer at a location corresponding to the time appropriate for rendering, without using any intermediate buffer. Then, in accordance with the principles of the present invention and more particularly in accordance with certain illustrative embodiments thereof, packet loss concealment (PLC) routines are advantageously applied preemptively, without first determining whether or not any subsequent packets have or have not been received by any particular time.

12-07-2012 publication date

Number: US20120179700A1
Assignee: Apple Inc

An apparatus and method are described for reading a file into a universal representation and translating from that universal representation into various file formats. For example, a method according to one embodiment comprises: reading compressed audio data from a first audio file, the first audio file comprising audio data compressed using a first compression algorithm and bookkeeping data having a first format, the bookkeeping data specifying a location of the compressed audio data within the first audio file; and generating a universal representation of the first audio file without decompressing and recompressing the audio data, the universal representation having bookkeeping data of a second format specifying the location of compressed audio data within the universal representation.

23-08-2012 publication date

Number: US20120215329A1
Assignee: Dolby Laboratories Licensing Corp

Techniques for re-associating dynamic metadata with media data are provided. A media processing system creates, with a first media processing stage, binding information comprising dynamic metadata and a time relationship between the dynamic metadata and media data. The binding information may be derived from the media data. While the first media processing stage delivers the media data to a second media processing stage in a first data path, the first media processing stage passes the binding information to the second media processing stage in a second data path. The media processing system re-associates, with the second media processing stage, the dynamic metadata and the media data using the binding information.

20-09-2012 publication date

Number: US20120239408A1

A method of processing an audio signal is disclosed. The method includes receiving, by an audio processing apparatus, coding identification information indicating whether to apply a first coding scheme or a second coding scheme to a current frame; when the coding identification information indicates that the second coding scheme is applied to the current frame, receiving window type information indicating a particular window for the current frame from among a plurality of windows; identifying that a current window is a long stop window based on the window type information, wherein the long stop window is followed only by a long window of a following frame, and wherein the long stop window includes a gentle long stop window and a steep long stop window; and, when the first coding scheme is applied to a previous frame, applying the gentle long stop window to the current frame, wherein the gentle long stop window comprises an ascending line with a first slope, the steep long stop window comprises an ascending line with a second slope, and the first slope is gentler than the second slope.

15-11-2012 publication date

Number: US20120288117A1
Author: Eun-mi Oh, Mi-young Kim

A noise filling method is provided that includes detecting a frequency band including a part encoded to 0 from a spectrum obtained by decoding a bitstream; generating a noise component for the detected frequency band; and adjusting energy of the frequency band in which the noise component is generated and filled by using energy of the noise component and energy of the frequency band including the part encoded to 0.
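The three steps of this abstract (detect a band containing zeroed coefficients, generate noise for it, adjust the energy) can be sketched as follows. This is a minimal illustration under stated assumptions: the band layout, the uniform noise source, and the rule that the filled bins receive `noise_ratio` times the mean energy of the coded bins in the same band are all hypothetical choices, not the rule claimed in the patent. Bands that are entirely zero are left untouched in this sketch.

```python
import random


def fill_noise(spectrum, band_edges, noise_ratio=0.5, seed=0):
    """Fill zero-quantized bins with scaled noise (illustrative sketch).

    spectrum: decoded spectral coefficients.
    band_edges: list of (start, stop) index pairs, stop exclusive.
    noise_ratio: assumed target ratio of filled-bin to coded-bin mean energy.
    """
    rng = random.Random(seed)
    out = list(spectrum)
    for start, stop in band_edges:
        # Step 1: detect bins in this band that were encoded to 0.
        zeros = [i for i in range(start, stop) if out[i] == 0]
        coded = [out[i] for i in range(start, stop) if out[i] != 0]
        if not zeros or not coded:
            continue
        # Step 2: generate a noise component for the zeroed bins.
        noise = [rng.uniform(-1.0, 1.0) for _ in zeros]
        noise_energy = sum(v * v for v in noise) / len(noise)
        if noise_energy == 0:
            continue
        # Step 3: scale the noise using its own energy and the band energy.
        band_energy = sum(c * c for c in coded) / len(coded)
        gain = (noise_ratio * band_energy / noise_energy) ** 0.5
        for i, v in zip(zeros, noise):
            out[i] = gain * v
    return out
```

Coded coefficients are passed through unchanged, so the sketch alters only the bins that the decoder received as zero.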

29-11-2012 publication date

Number: US20120303375A1
Author: Yuli You
Assignee: Digital Rise Technology Co Ltd

Provided are, among other things, systems, methods and techniques for decoding an audio signal from a frame-based bit stream. At least one frame includes processing information pertaining to the frame and entropy-encoded quantization indexes representing audio data within the frame. The processing information includes: (i) code book indexes, and (ii) code book application information specifying ranges of entropy-encoded quantization indexes to which the code books are to be applied. The entropy-encoded quantization indexes are decoded by applying the identified code books to the corresponding ranges of entropy-encoded quantization indexes.

31-01-2013 publication date

Number: US20130028426A1

The invention provides methods and devices for stereo encoding and decoding using complex prediction in the frequency domain. In one embodiment, a decoding method, for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels, comprises the upmixing steps of: (i) computing a second frequency-domain representation of a first input channel; and (ii) computing an output channel on the basis of the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The upmixing can be suspended responsive to control data.

18-04-2013 publication date

Number: US20130094653A1

1. A voice over Internet protocol (VoIP) device, comprising: a spatial renderer associated with a second endpoint, the spatial renderer configured to receive audio data from a first endpoint, and render the audio data among a plurality of speakers based, at least in part, on a first set of spatial information for a plurality of microphones associated with the first endpoint, and a second set of spatial information for the plurality of speakers associated with the second endpoint.

11-07-2013 publication date

Number: US20130179161A1
Author: Kelly Hale, Robert W. Zopf
Assignee: Broadcom Corp

A communications network is used to transfer user attribute information about participants in a communication session to their respective communication terminals for storage and use thereon to configure a speech codec to operate in a speaker-dependent manner, thereby improving speech coding efficiency. In a network-assisted model, the user attribute information is stored on the communications network and selectively transmitted to the communication terminals while in a peer-assisted model, the user attribute information is derived by and transferred between communication terminals.

08-08-2013 publication date

Number: US20130202118A1
Assignee: Sony Corp

The present invention relates to a signal processing apparatus and a signal processing method, an encoder and an encoding method, a decoder and a decoding method, and a program capable of reproducing music signal having a better sound quality by expansion of frequency band. A sampling frequency conversion unit converts a sampling frequency of an input signal, and a sub-band division circuit divides the input signal after the sampling conversion into sub-band signals of sub-bands having the number corresponding to the sampling frequency. A pseudo high band sub-band power calculation circuit calculates pseudo high band sub-band powers based on low band signals of the input signal and coefficient tables having coefficients for the respective high band sub-bands. A pseudo high band sub-band power difference calculation circuit compares high band sub-band powers and the pseudo high band sub-band powers to each other and selects a coefficient table from plural coefficient tables. In addition, a coefficient index which specifies the coefficient table is encoded and set as high band encoded data. The present invention can be applied to an encoder.
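The table selection step in this abstract can be sketched in a few lines. The sketch assumes that each pseudo high band sub-band power is a linear combination of the low band sub-band powers plus an offset, and that the table with the smallest sum of squared differences is chosen; both the estimator form and the error measure are assumptions made for illustration, and the names `pseudo_high_band_powers` and `select_coefficient_index` are hypothetical.

```python
def pseudo_high_band_powers(low_powers, table):
    """Estimate high band sub-band powers from low band sub-band powers.

    table: one (coefficients, offset) pair per high band sub-band
           (an assumed linear estimator, not taken from the patent text).
    """
    return [
        sum(a * p for a, p in zip(coeffs, low_powers)) + offset
        for coeffs, offset in table
    ]


def select_coefficient_index(low_powers, high_powers, tables):
    """Return the index of the table whose estimate is closest to the
    actual high band sub-band powers (sum of squared differences)."""
    def error(table):
        estimate = pseudo_high_band_powers(low_powers, table)
        return sum((h - e) ** 2 for h, e in zip(high_powers, estimate))

    return min(range(len(tables)), key=lambda i: error(tables[i]))
```

Only the returned index would need to be transmitted as high band encoded data, provided the decoder holds the same coefficient tables.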

19-12-2013 publication date

Number: US20130339034A1
Assignee: Apple Inc

An apparatus and method are described for reading a file into a universal representation and translating from that universal representation into various file formats. For example, a method according to one embodiment comprises: reading compressed audio data from a first audio file, the first audio file comprising audio data compressed using a first compression algorithm and bookkeeping data having a first format, the bookkeeping data specifying a location of the compressed audio data within the first audio file; and generating a universal representation of the first audio file without decompressing and recompressing the audio data, the universal representation having bookkeeping data of a second format specifying the location of compressed audio data within the universal representation.

09-01-2014 publication date

Number: US20140012570A1

33. An apparatus for decoding wireless signals, the apparatus comprising: a memory; a dynamic memory management module; a demodulation module coupled to the dynamic memory management module; a physical layer segment planner module coupled to the dynamic memory management module; a deinterleaver module coupled to the physical layer segment planner module; an address filter module coupled to the physical layer segment planner module and the deinterleaver module; a convolutional decoding module coupled to the physical layer segment planner module; and a service boundary predictor module coupled to the dynamic memory management module and the convolutional decoding module.

06-01-2022 publication date

Number: US20220005488A1

1. An encoding method comprising: computing the first feature information of an input signal using a recurrent encoding model; quantizing the first feature information and producing the first feature bitstream; computing the first output signal from the quantized first feature information using a recurrent decoding model; computing the second feature information of the input signal using a nonrecurrent encoding model; quantizing the second feature information and producing the second feature bitstream; computing the second output signal from the quantized second feature information using a nonrecurrent decoding model; determining an encoding mode based on the input signal, the first output signal, the second output signal, the first feature bitstream, and the second feature bitstream; and outputting an overall bitstream by multiplexing an encoding mode bit and one of the first feature bitstream and the second feature bitstream depending on the encoding mode.
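The last two steps of this claim (mode decision and multiplexing) can be sketched independently of the neural models. The sketch assumes a rate-distortion style cost, mean squared error plus `lam` times the bitstream length, as the decision rule; the claim only lists the inputs to the decision, so the cost function, the weight `lam`, and the string representation of bitstreams are illustrative assumptions.

```python
def choose_and_multiplex(x, y1, bits1, y2, bits2, lam=0.01):
    """Pick an encoding mode and build the overall bitstream.

    x: input signal samples.
    y1, y2: reconstructions from the recurrent and nonrecurrent paths.
    bits1, bits2: feature bitstreams as strings of '0' and '1'.
    lam: assumed trade-off weight between distortion and rate.
    """
    def cost(y, bits):
        mse = sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)
        return mse + lam * len(bits)

    mode = 0 if cost(y1, bits1) <= cost(y2, bits2) else 1
    # The mode bit comes first, followed by the selected feature bitstream.
    return str(mode) + (bits1 if mode == 0 else bits2)
```

A decoder reading this stream would inspect the first bit and route the remaining bits to the matching decoding model.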

01-01-2015 publication date

Number: US20150003632A1

The present document relates to the technical field of audio coding, decoding and processing. It specifically relates to methods of recovering high frequency content of an audio signal from low frequency content of the same audio signal in an efficient manner. A method for determining a first banded tonality value (311, 312) for a first frequency subband (205) of an audio signal is described. The first banded tonality value (311, 312) is used for approximating a high frequency component of the audio signal based on a low frequency component of the audio signal. The method comprises determining a set of transform coefficients in a corresponding set of frequency bins based on a block of samples of the audio signal; determining a set of bin tonality values (341) for the set of frequency bins using the set of transform coefficients, respectively; and combining a first subset of two or more of the set of bin tonality values (341) for two or more corresponding adjacent frequency bins of the set of frequency bins lying within the first frequency subband, thereby yielding the first banded tonality value (311, 312) for the first frequency subband.

Publication date: 07-01-2016

Versatile music distribution

Number: US20160005411A1
Assignee: Meridian Audio Limited

Methods and devices are described whereby a representation of an original PCM signal may be reversibly degraded in a controlled manner and information losslessly embedded to produce a streamable PCM signal, which provides a controlled audio quality when played on standard players and conditional access to a lossless presentation of the original PCM signal. Using such techniques allows control over the level of degradation of the signal and also flexibility in the type of information embedded. Some methods require a song key, which is employed in one or both of the degrading and embedding steps and for creating a token. These methods may further require a user key, which is used to encrypt the song key before creating the token.

Publication date: 07-01-2021

Number: US20210005210A1

An audio encoder for encoding an audio signal having a lower frequency band and an upper frequency band includes: a detector for detecting a peak spectral region in the upper frequency band of the audio signal; a shaper for shaping the lower frequency band using shaping information for the lower band and for shaping the upper frequency band using at least a portion of the shaping information for the lower band, wherein the shaper is configured to additionally attenuate spectral values in the detected peak spectral region in the upper frequency band; and a quantizer and coder stage for quantizing a shaped lower frequency band and a shaped upper frequency band and for entropy coding quantized spectral values from the shaped lower frequency band and the shaped upper frequency band. 1. Audio encoder for encoding an audio signal comprising a lower frequency band and an upper frequency band, comprising: a detector for detecting a peak spectral region in the upper frequency band of the audio signal; a shaper for shaping the lower frequency band using shaping information for the lower band and for shaping the upper frequency band using at least a portion of the shaping information for the lower frequency band, wherein the shaper is configured to additionally attenuate spectral values in the detected peak spectral region in the upper frequency band; and a quantizer and coder stage for quantizing a shaped lower frequency band and a shaped upper frequency band and for entropy coding quantized spectral values from the shaped lower frequency band and the shaped upper frequency band. This application is a continuation of U.S. patent application Ser. No. 16/143,716, filed Sep. 25, 2018, which is a continuation of copending International Application No. PCT/EP2017/058238, filed Apr. 6, 2017, which is incorporated herein by reference in its entirety, which additionally claimed priority from European Application No. EP 16 164 951.2, filed Apr. 12, 2016, which is incorporated herein by ...

Publication date: 07-01-2021

Number: US20210005211A1

A technique including receiving and decoding a coded bitstream encoded with audio content including first audio objects corresponding to a first media content type of two consecutive media content types and second audio objects corresponding to a second media content type of the two consecutive media content types, and audio metadata corresponding to the audio content. The audio metadata including first and second audio object gains, for the first and second audio objects, generated in part based on a first fading curve of the first media content type and a second fading curve of the second media content type, respectively. The technique further includes applying the first and second audio object gains to the first and second audio objects, and rendering a sound field represented by the first audio object with the applied first audio object gain and the second audio object with the applied second audio object gain. 1. A method , performed by a downstream audio rendering stage in an end-to-end audio processing chain , comprising:receiving and decoding a coded bitstream generated by an upstream audio processor, wherein the coded bitstream is encoded with audio content and audio metadata corresponding to the audio content;wherein the audio content includes first audio objects corresponding to a first media content type of two consecutive media content types and second audio objects corresponding to a second media content type of the two consecutive media content types;wherein the audio metadata includes first and second audio object gains, respectively for the first and second audio objects, generated at least in part based on a first fading curve of the first media content type and a second fading curve of the second media content type, respectively;applying the first and second audio object gains generated at least in part based on the first and second fading curves to the first and second audio objects, respectively;rendering a sound field represented by the first 
...

Publication date: 04-01-2018

Number: US20180005640A1
Author: Tsukagoshi Ikuo

It is attempted to reduce the processing load of a receiver at the time of integrating plural audio streams. 1. A transmission device comprising:an encoding unit configured to generate a predetermined number of audio streams; anda transmission unit configured to transmit a container of a predetermined format including the predetermined number of audio streams,wherein the audio streams are constituted by an audio frame including a first packet that includes encoded data as payload information and a second packet that includes configuration information representing a configuration of the payload information of the first packet as payload information, andcommon index information is inserted in payloads of the first packet and the second packet that are related.2. The transmission device according to claim 1 , wherein the encoded data that the first packet include as payload information is encoded channel data or encoded object data.3. A transmission method comprising:an encoding step of generating a predetermined number of audio streams; anda transmission step of using a transmission unit to transmit a container of a predetermined format including the predetermined number of audio streams,wherein the audio streams are constituted by an audio frame including a first packet that includes encoded data as payload information and a second packet that includes configuration information representing a configuration of the payload information of the first packet as payload information,and common index information is inserted in payloads of the first packet and the second packet that are related.4. 
A receiving device comprising:a receiving unit configured to receive a container of a predetermined format including a predetermined number of audio streams,wherein the audio streams are constituted by an audio frame including a first packet that includes encoded data as payload information and a second packet that includes configuration information representing a configuration of ...

Publication date: 02-01-2020

Audio processing device and audio playback system thereof

Number: US20200005800A1
Author: Lichao Shi
Assignee: Lichao Shi

The present disclosure provides an audio processing device and an audio playback system thereof. The audio processing device includes a receiving module configured to receive an audio signal and identify a transmission mode of the audio signal, with the transmission mode at least including a Bluetooth transmission mode and a WIFI transmission mode; a processing module configured to decode the audio signal into an analog audio signal and a digital audio signal; an output module configured to receive the analog audio signal and the digital audio signal and then output the analog audio signal to a conventional audio via an AUX analog output port and output the digital audio signal to an HiFi audio via an optical fiber output port. The present disclosure can receive audio signals with different transmission types and output the audio signals of different types, which enriches audio selectivity and is of high interest.

Publication date: 02-01-2020

Number: US20200005804A1

Embodiments relate to an audio processing unit that includes a buffer, a bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided. 1. An audio processing device comprising: a bitstream payload deformatter configured to demultiplex a block of an encoded audio bitstream; and a decoding subsystem coupled to the bitstream payload deformatter and configured to decode at least a portion of the block of the encoded audio bitstream, wherein the block of the encoded audio bitstream includes a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes at least one flag identifying whether enhanced spectral band replication processing is to be performed on audio content of the block of the encoded audio bitstream, and enhanced spectral band replication metadata which does not include one or more parameters used for both spectral patching and harmonic transposition, wherein the enhanced spectral band replication metadata is metadata configured to enable at least one eSBR tool which is described or mentioned in the MPEG USAC standard and which is not described or mentioned in the MPEG-4 AAC standard, wherein the enhanced spectral band replication metadata includes a parameter indicating whether to perform signal adaptive frequency domain oversampling, and the decoding subsystem is further configured to perform signal adaptive frequency domain oversampling if the parameter indicates that signal adaptive frequency domain oversampling is to be performed. 2. The audio processing device of claim 1, wherein the ...

Publication date: 03-01-2019

Number: US20190005974A1

Systems, methods and apparatus are described that relate to aligning the timing of bi-directional, multi-stream I2S audio transmitted between IC devices, and to supporting audio streams that are digitized using multiple sampling rates. A method includes time-division multiplexing a first stream of digitized audio data with a second stream of digitized audio data at a primary device to obtain a first multiplexed signal, transmitting the first multiplexed signal over a serial bus to a secondary device that is configured to extract the first stream of digitized audio data from the first multiplexed signal and provide the first stream of digitized audio data to a first audio peripheral coupled to the secondary device, extracting the second stream of digitized audio data from the first multiplexed signal at the primary device, and providing the extracted second stream of digitized audio data to a second audio peripheral coupled to the primary device. 1. A method, comprising: at a primary device, time-division multiplexing a first stream of digitized audio data with a second stream of digitized audio data to obtain a first multiplexed signal; transmitting the first multiplexed signal over a serial bus to a secondary device that is configured to extract the first stream of digitized audio data from the first multiplexed signal and provide the first stream of digitized audio data to a first audio peripheral coupled to the secondary device; at the primary device, extracting the second stream of digitized audio data from the first multiplexed signal to provide an extracted second stream of digitized audio data; and providing the extracted second stream of digitized audio data to a second audio peripheral coupled to the primary device, wherein the first audio peripheral and the second audio peripheral include digital-to-analog converters configured to produce analog signals from respective digitized audio data. 2. The method of claim 1, wherein time-division multiplexing the first stream of ...
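
The multiplex and extract steps of this method can be shown with a small Python sketch. It assumes the simplest possible slot layout, one sample from each stream per frame at a shared sampling rate; the entry's support for multiple sampling rates and the serial-bus transport are not modelled, and the function names are hypothetical.

```python
def tdm_multiplex(first_stream, second_stream):
    """Interleave two equal-length sample streams: slot 0 carries the first, slot 1 the second."""
    if len(first_stream) != len(second_stream):
        raise ValueError("this sketch assumes both streams share one sampling rate")
    multiplexed = []
    for a, b in zip(first_stream, second_stream):
        multiplexed.extend((a, b))
    return multiplexed


def tdm_extract(multiplexed, slot, slots_per_frame=2):
    """Recover the stream carried in one time slot of every frame."""
    return multiplexed[slot::slots_per_frame]


signal = tdm_multiplex([1, 2, 3], [10, 20, 30])
at_secondary = tdm_extract(signal, 0)   # first stream, extracted at the secondary device
at_primary = tdm_extract(signal, 1)     # second stream, extracted back at the primary device
```

Extracting both streams from the same multiplexed signal is what keeps the two audio peripherals on a common timing reference.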

Publication date: 03-01-2019

Number: US20190005975A1
Author: Bruhn Stefan
Assignee: Telefonaktiebolaget LM Ericsson (publ)

In accordance with an example embodiment of the present invention, disclosed is a method and an apparatus thereof for formatting a payload for transmission of multi-mode speech/audio codec data. The method comprises deciding whether a header-less or a header-full payload format is used for transmission of a coded frame. The decision is based on a codec mode and a required functionality. The payload data is packetized with or without the payload header depending on the decision. 1. A method for decoding coded audio frames, the method comprising: receiving a Real-time Transport Protocol (RTP) packet, the RTP packet comprising an RTP header and an RTP payload, wherein the RTP payload comprises a coded audio frame encoded with a bit-rate identified by the size of the RTP payload, and further wherein the RTP payload does not comprise any RTP payload header; determining the size of the RTP payload of the RTP packet, wherein the determined size uniquely identifies the bit-rate; determining the bit-rate based on the determined size of the RTP payload of the RTP packet; and decoding the coded audio frame based on the determined bit-rate. 2. The method of claim 1, further comprising: prior to determining the bit-rate, determining that the determined size of the RTP payload is one of a plurality of protected payload sizes. 3. The method of claim 2, wherein determining the bit-rate based on the determined size of the RTP payload of the RTP packet is performed as a result of determining that the determined size of the RTP payload is one of the plurality of protected payload sizes. 4. The method of claim 3, wherein the coded audio frame represents a speech frame having a predetermined length. 5. The method of claim 4, wherein the predetermined length is 20 milliseconds (ms). 6. The method of claim 4, wherein the plurality of protected payload sizes comprises at least the following seven payload sizes expressed in units of bits: 144, 160, 192, 264, 328, 488 ...
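
The header-less decoding path of these claims can be sketched in Python as follows. The sketch uses only the six payload sizes that can be read from the claim text (144, 160, 192, 264, 328 and 488 bits; the seventh is cut off in this listing) together with the 20 ms frame length from claim 5. It assumes the bit-rate is simply payload bits divided by frame duration, and `bit_rate_from_payload` is a hypothetical name.

```python
from typing import Optional

FRAME_MS = 20  # predetermined frame length from claim 5

# Protected payload sizes in bits, as far as they are legible in the claim text.
PROTECTED_SIZES_BITS = frozenset({144, 160, 192, 264, 328, 488})


def bit_rate_from_payload(payload: bytes) -> Optional[int]:
    """Return the bit-rate in bits per second implied by a header-less RTP payload.

    Returns None when the size is not a protected size, in which case a
    header-full payload format would have to be assumed instead.
    """
    size_bits = len(payload) * 8
    if size_bits not in PROTECTED_SIZES_BITS:
        return None
    return size_bits * 1000 // FRAME_MS


rate = bit_rate_from_payload(bytes(33))      # 264-bit payload
unknown = bit_rate_from_payload(bytes(10))   # 80 bits, not a protected size
```

Since each protected size maps to exactly one bit-rate, no payload header is needed to signal the codec mode.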

Publication date: 07-01-2021

Number: US20210006865A1

Disclosed are a 5G TV, a video playback method based on 5G TV and a computer readable storage medium. The 5G TV includes a 5G communication module, a decode module, and an audio and video playback module; the 5G communication module is connected to the decode module, and is configured to receive 5G audio and video signals and send the 5G audio and video signals to the decode module; and the decode module is connected to the audio and video playback module, and is configured to decode the received 5G audio and video signals to obtain audio data and video data, and is configured to send the audio data and the video data to the audio and video playback module, to make the audio and video playback module synchronously play the audio data and the video data. 1. A 5G TV , comprising: a 5G communication module , a decode module , and an audio and video playback module , wherein:the 5G communication module is connected to the decode module, and is configured to receive 5G audio and video signals and send the 5G audio and video signals to the decode module; andthe decode module is connected to the audio and video playback module, and is configured to decode the received 5G audio and video signals to obtain audio data and video data, and is configured to send the audio data and the video data to the audio and video playback module, to make the audio and video playback module synchronously play the audio data and the video data.2. The 5G TV of claim 1 , wherein:the 5G communication module comprises a 5G communication component, a 5G communication antenna, and a SIM card interface for connecting with a SIM card;a first terminal of the 5G communication component is connected to the decode module as an output terminal of the 5G communication module, a second terminal of the 5G communication component is connected to the SIM card through the SIM card interface, the 5G communication component is configured to send an online request comprising identification information after ...

Publication date: 04-01-2018

Number: US20180007398A1

An encoder for encoding secondary media data including metadata and control data for primary media data is shown, wherein the encoder is configured to encode the secondary media data using adding redundancy or bandlimiting and wherein the encoder is configured to output the encoded secondary media data as a stream of digital words. Therefore, the stream of digital words may be formed such that it is capable to resist a typical processing of a digital audio stream. Furthermore, processors for processing a digital audio stream are able to process the stream of digital words, since the stream of digital words may be designed as an audio-like or analog-like digital stream. 1. An encoder for encoding secondary media data comprising metadata or control data for primary media data , the encoder comprising:a grouper configured for grouping a bitstream of secondary media data to form grouped secondary media data, representing data words;a reference signal generator for generating a reference pattern indicating a reference amplitude or a predetermined timing instant in the primary media data;a stream builder comprising a filter to low-pass filter the data words and the reference pattern to acquire data pulses comprising a length of more than one sample of a predetermined sample rate, wherein an amplitude of a data pulse is weighted according to the grouped secondary media data or the data words, and wherein the filter is configured to add up consecutive weighted data pulses at instants of the predetermined sample rate to acquire the stream of digital words;wherein the encoder is configured to output the stream of digital words.2. The encoder according to claim 1 , wherein the encoding comprises adding redundancy by the digital modulation.3. The encoder according to claim 1 , wherein the digital modulation is so that two or more bits of the secondary media data are transmitted per digital word of the stream of digital words.4. The encoder according to claim 1 , wherein the ...

Publication date: 07-01-2021

Number: US20210006918A1

In general, techniques are described for adapting audio streams for rendering. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store a plurality of audio streams that include one or more sub-streams. The one or more processors may determine, based on the plurality of audio streams, a total number of the one or more sub-streams for all of the plurality of audio streams, and adapt, when the total number of the sub-streams is greater than a render threshold, the plurality of audio streams to decrease the number of the one or more sub-streams and obtain an adapted plurality of audio streams. The one or more processors may also apply the renderer to the adapted plurality of audio streams to obtain the one or more speaker feeds, and output the one or more speaker feeds to one or more speakers. 1. A device configured to play one or more of a plurality of audio streams , the device comprising:a memory configured to store a plurality of audio streams, each of the plurality of audio streams representative of a soundfield and include one or more sub-streams; andone or more processors coupled to the memory, and configured to:determine, based on the plurality of audio streams, a total number of the one or more sub-streams for all of the plurality of audio streams;adapt, when the total number of the one or more sub-streams is greater than a render threshold indicative of a total number of sub-streams a renderer supports when rendering the plurality of audio streams to one or more speaker feeds, the plurality of audio streams to decrease the number of the one or more sub-streams and obtain an adapted plurality of audio streams including a reduced total number of the one or more sub-streams that is equal to or less than the render threshold;apply the renderer to the adapted plurality of audio streams to obtain the one or more speaker feeds; andoutput the one or more speaker feeds to one or more speakers.2. The ...
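
The threshold check described in this entry can be sketched in Python. The entry does not say which sub-streams are removed when the total exceeds the render threshold, so the policy below (repeatedly dropping the last sub-stream of whichever audio stream currently has the most) is an assumption, and `adapt_streams` is a hypothetical name.

```python
def adapt_streams(streams, render_threshold):
    """Reduce the total number of sub-streams to at most render_threshold.

    streams: list of audio streams, each given as a list of sub-streams.
    Assumed policy (not from the source): trim the largest stream first.
    """
    if render_threshold < 0:
        raise ValueError("render_threshold must be non-negative")
    adapted = [list(stream) for stream in streams]   # leave the input untouched
    total = sum(len(stream) for stream in adapted)
    while total > render_threshold:
        largest = max(adapted, key=len)
        largest.pop()
        total -= 1
    return adapted


streams = [["a0", "a1", "a2", "a3"], ["b0", "b1"], ["c0"]]
adapted = adapt_streams(streams, render_threshold=5)
```

The adapted list can then be handed to the renderer, which by construction never sees more sub-streams than it supports.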

Publication date: 03-01-2019

Switching Binaural Sound

Number: US20190007776A1

A method provides binaural sound to a person through electronic earphones. The binaural sound localizes to a sound localization point (SLP) in empty space that is away from but proximate to the person. When an event occurs, the binaural sound switches or changes to stereo sound, to mono sound, or to altered binaural sound. 1-20. (canceled) 21. A method executed by one or more electronic devices in a computer system to switch binaural sound to one of stereo sound and mono sound during an electronic communication between a person and a user, the method comprising: executing, by the one or more electronic devices in the computer system, the electronic communication that provides a voice of the user in binaural sound to the person such that the voice of the user in the binaural sound externally localizes to the person to a sound localization point (SLP) that is at least three feet away from a head of the person; determining, by the one or more electronic devices in the computer system during the electronic communication, when an object enters an area of the SLP; switching, by the one or more electronic devices in the computer system during the electronic communication, the binaural sound to the one of stereo sound and mono sound when the object enters the area of the SLP; and providing, by the one or more electronic devices in the computer system during the electronic communication and in response to the switching, the voice of the user to the person in the one of stereo sound and mono sound. 22. The method of claim 21, further comprising: determining, by the one or more electronic devices in the computer system during the electronic communication, when a packet loss is above a threshold; and switching, by the one or more electronic devices in the computer system during the electronic communication and in response to the determining that the packet loss is above the threshold, the binaural sound to the one of stereo sound and mono sound. 23. 
The method of claim 21 , wherein ...

Publication date: 12-01-2017

Number: US20170011749A1

An audio encoder device includes an audio encoder configured for producing an encoded audio bitstream from an audio signal having consecutive audio frames; a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from an dynamic range control sequence corresponding to the audio signal and having consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames has one or more nodes, wherein each node of the one or more nodes has gain information for the audio signal and time information indicating to which point in time the gain information corresponds. 1. An audio encoder device comprising:an audio encoder configured for producing an encoded audio bitstream from an audio signal comprising consecutive audio frames;a dynamic range control encoder configured for producing an encoded dynamic range control bitstream from an dynamic range control sequence corresponding to the audio signal and comprising consecutive dynamic range control frames, wherein each dynamic range control frame of the dynamic range control frames comprises one or more nodes, wherein each node of the one or more nodes comprises gain information for the audio signal and time information indicating to which point in time the gain information corresponds;wherein the dynamic range control encoder is configured in such way that the encoded dynamic range control bitstream comprises for each dynamic range control frame of the dynamic range control frames a corresponding bitstream portion;wherein the dynamic range control encoder is configured for executing a shift procedure, wherein one or more nodes of the nodes of one reference dynamic range control frame of the dynamic range control frames are selected as shifted nodes, wherein a bit representation of each of the one or more shifted nodes of the one reference dynamic range control frame is embedded in the bitstream portion corresponding to the dynamic range 
...

Publication date: 12-01-2017

Number: US20170011750A1
Author: Liu Zongxian, Tanaka Naoya

An apparatus for decoding surround audio signal, includes a Bitstream De-multiplexer for unpacking a bitstream into spatial parameters and core parameters, a set of Core Decoder for decoding the core parameters into a set of core signal, a matrix derivation unit for deriving the rendering matrix from the spatial parameters and playback speaker layout information, a renderer for rendering of the decoded core signal to playback signals using the rendering matrix. 1. An apparatus for decoding a surround audio signal , comprising:a Bitstream De-multiplexer for unpacking a bitstream into predominant sound parameters, ambiance parameters, channel assignment parameters and core parameters;a set of Core Decoders for decoding the core parameters into a set of core signals;a predominant sound ambiance switch for assigning the decoded core signal to predominant sound and ambiance according to the channel assignment parameters;a matrix derivation unit for deriving a predominant sound rendering matrix from the predominant sound parameters and playback speaker layout information;a matrix derivation unit for deriving an ambiance rendering matrix from the ambiance parameters and playback speaker layout information;a predominant sound renderer for rendering of the predominant sound to playback signals using the predominant sound rendering matrix;an ambiance renderer for rendering of ambient sound to the playback signals using the ambiance rendering matrix; andan output signal composition unit for composing the playback signals using the rendered predominant sound and the rendered ambient sound.2. An apparatus according to claim 1 , wherein said core decoder corresponds to MPEG-1 Audio Layer III or AAC or HE-AAC or Dolby AC-3 or MPEG USAC standard.3. An apparatus according to claim 1 , wherein said surround audio signal is High Order Ambisonics signal.4. 
An apparatus according to claim 1 , wherein said spatial parameters comprising of Principal Component Analysis (PCA) or Singular ...

Publication date: 11-01-2018

Number: US20180012609A1

This disclosure falls into the field of audio coding, in particular it is related to the field of providing a framework for providing loudness consistency among differing audio output signals. In particular, the disclosure relates to methods, computer program products and apparatus for encoding and decoding of audio data bitstreams in order to attain a desired loudness level of an output audio signal. 1. A method comprising: obtaining, by a decoding device, an encoded bitstream; extracting, by the decoding device, an audio signal and metadata from the encoded bitstream, the metadata including compression curve data and loudness data; generating, by the decoding device, loudness values using the loudness data; mapping, by the decoding device, the loudness values to dynamic range compression (DRC) gains using the compression curve data; and applying, by the decoding device, the DRC gains to the audio signal. 2. The method of claim 1, wherein the audio signal includes at least a dialog content stream and a non-dialog content stream, and applying the DRC gains to the audio signal comprises: applying the DRC gains to a time segment of the non-dialog content stream of the audio signal to increase a loudness of the dialog content stream. 3. The method of claim 1, wherein the DRC data applies to groups of channels. 4. The method of claim 3, wherein at least some of the loudness data is associated with a specific channel in the groups of channels. 5. The method of claim 1, wherein the DRC data comprises multiple DRC profiles corresponding to DRC modes, each DRC profile tailored to a particular audio signal to which the DRC gains can be applied. 6. The method of claim 1, wherein mapping the loudness values to DRC gains comprises a smoothing operation of the DRC gains. 7. The method of claim 6, wherein the metadata includes time-constants for use in the smoothing operation. 8. 
The method of claim 7 , wherein the time-constants are different depending on properties ...
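
The mapping, smoothing and gain application steps of these claims can be sketched in Python. The claims do not specify how the compression curve data is represented or how the smoothing is performed, so the sketch assumes a piecewise-linear curve given as (loudness in dB, gain in dB) points and a one-pole smoother whose coefficient `alpha` stands in for the time-constants. All function names are hypothetical.

```python
def gain_from_curve(loudness_db, curve):
    """Map one loudness value to a DRC gain by linear interpolation.

    curve: list of (loudness_db, gain_db) points sorted by loudness (assumed format).
    """
    if loudness_db <= curve[0][0]:
        return curve[0][1]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if loudness_db <= x1:
            return y0 + (loudness_db - x0) * (y1 - y0) / (x1 - x0)
    return curve[-1][1]


def smooth_gains(gains_db, alpha):
    """One-pole smoothing of the gain sequence; alpha near 1 means a longer time-constant."""
    smoothed = []
    previous = gains_db[0] if gains_db else 0.0
    for gain in gains_db:
        previous = alpha * previous + (1.0 - alpha) * gain
        smoothed.append(previous)
    return smoothed


def apply_gains(samples, gains_db):
    """Scale each sample by its gain, converted from dB to a linear factor."""
    return [s * 10.0 ** (g / 20.0) for s, g in zip(samples, gains_db)]


# Boost quiet passages, leave -31 dB untouched, attenuate loud passages.
curve = [(-60.0, 12.0), (-31.0, 0.0), (-10.0, -10.0)]
loudness_values = [-31.0, -45.5, -20.5]
raw_gains = [gain_from_curve(value, curve) for value in loudness_values]
```

In a decoder the smoothed gains, not the raw ones, would be passed to `apply_gains`, so that gain changes do not produce audible steps.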

Publication date: 11-01-2018

Number: US20180012610A1

An audio processing unit (APU) is disclosed. The APU includes a buffer memory configured to store at least one frame of an encoded audio bitstream, where the encoded audio bitstream includes audio data and a metadata container. The metadata container includes a header and one or more metadata payloads after the header. The one or more metadata payloads include dynamic range compression (DRC) metadata, and the DRC metadata is or includes profile metadata indicative of whether the DRC metadata includes dynamic range compression (DRC) control values for use in performing dynamic range compression in accordance with at least one compression profile on audio content indicated by at least one block of the audio data. 1. An audio processing method, comprising: receiving a block of encoded audio data; receiving encoded audio metadata associated with the block of audio data; performing a cryptographic hash on the audio metadata and on at least a portion of the audio data, to produce a currently-computed hash; retrieving, from a data field associated with the encoded audio metadata, a previously-computed hash; and performing an authentication process that involves comparing the currently-computed hash with the previously-computed hash. 2. The audio processing method of claim 1, further comprising disabling or altering at least one operation to be performed on the audio data if the authentication process succeeds. 3. The audio processing method of claim 1, wherein the metadata comprises dynamic range compression metadata. 4. The audio processing method of claim 1, wherein the metadata comprises loudness processing state metadata. 5. The audio processing method of claim 1, further comprising decoding at least a portion of the audio metadata or the audio data. 6. A non-transitory medium having software stored thereon, the software including instructions for performing an audio processing method, the audio processing method comprising: receiving a block of encoded ...
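
The authentication step of claim 1 can be sketched in Python with the standard `hashlib` and `hmac` modules. The claim does not name a hash function or say how the metadata and audio bytes are combined, so SHA-256 and plain concatenation are assumptions here, and the function names are hypothetical.

```python
import hashlib
import hmac


def compute_hash(metadata: bytes, audio_portion: bytes) -> bytes:
    """Hash the metadata followed by a portion of the audio data (SHA-256 assumed)."""
    digest = hashlib.sha256()
    digest.update(metadata)
    digest.update(audio_portion)
    return digest.digest()


def authenticate(metadata: bytes, audio_portion: bytes, stored_hash: bytes) -> bool:
    """Compare the currently-computed hash with the previously-computed one."""
    current = compute_hash(metadata, audio_portion)
    return hmac.compare_digest(current, stored_hash)   # constant-time comparison


stored = compute_hash(b"drc-metadata", b"audio-block")           # written by the encoder
intact = authenticate(b"drc-metadata", b"audio-block", stored)    # unchanged content
tampered = authenticate(b"drc-metadatA", b"audio-block", stored)  # metadata was altered
```

A downstream processor could use the result in the way claim 2 describes, disabling or altering an operation on the audio data when authentication succeeds.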

Publication date: 11-01-2018

Number: US20180012611A1

Systems and methods are presented for cross-fading (or other multiple clip processing) of information streams on a user or client device, such as a telephone, tablet, computer or MP3 player, or any consumer device with audio playback. Multiple clip processing can be accomplished at a client end according to directions sent from a service provider that specify a combination of (i) the clips involved; (ii) the device on which the cross-fade or other processing is to occur and its parameters; and (iii) the service provider system. For example, a consumer device with only one decoder can utilize that decoder (typically hardware) to decompress one or more elements that are involved in a cross-fade at faster than real time, thus pre-fetching the next element(s) to be played in the cross-fade at the end of the element currently being played. The next element(s) can, for example, be stored in an input buffer, then decoded and stored in a decoded sample buffer, all prior to the required presentation time of the multiple element effect. At the requisite time, a client device component can access the respective samples of the decoded audio clips as it performs the cross-fade, mix or other effect. Such exemplary embodiments use a single decoder and thus do not require synchronized simultaneous decodes. 1-31. (canceled) 32. An audio playback device, comprising: an input buffer for storing audio clips; a decoder connected to the input buffer and configured to decode the audio clips stored in the input buffer; a decoded audio buffer connected to the decoder and configured to store the audio clips decoded by the decoder; download the audio clips to be stored by the input buffer; and play back the decoded audio clips stored in the decoded audio buffer with at least one fade or transition effect at a boundary between successive audio clips; and a playout controller configured to determine network conditions and performance of hardware of the audio playback device; and adjust ...
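The last stage described in the abstract, mixing two clips that have already been decoded into a sample buffer, can be sketched as follows. The linear gain ramp and the function name `crossfade` are assumptions for illustration; the source does not specify a fade curve.

```python
from typing import List, Sequence


def crossfade(outgoing: Sequence[float], incoming: Sequence[float],
              overlap: int) -> List[float]:
    """Join two already-decoded clips with a linear cross-fade.

    `overlap` is the number of samples during which both clips sound.
    The linear ramp is an illustrative choice, not taken from the source.
    """
    if overlap < 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must fit inside both clips")

    split = len(outgoing) - overlap
    mixed = []
    for i in range(overlap):
        # Gain of the incoming clip rises from just above 0 to just below 1.
        gain_in = (i + 1) / (overlap + 1)
        mixed.append(outgoing[split + i] * (1.0 - gain_in) + incoming[i] * gain_in)

    # Untouched head of the outgoing clip, the mixed region, then the rest
    # of the incoming clip.
    return list(outgoing[:split]) + mixed + list(incoming[overlap:])
```

Because both inputs are plain decoded sample buffers, only one decoder is needed: the next clip is decoded ahead of time and the mix happens at presentation time.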

Publication date: 09-01-2020


Number: US20200013414A1

In general, techniques are described by which to embed enhanced audio transports in backward compatible bitstreams. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store the backward compatible bitstream, which conforms to a legacy transport format. The processor(s) may obtain, from the backward compatible bitstream, legacy audio data that conforms to a legacy audio format, and obtain, from the backward compatible bitstream, extended audio data that enhances the legacy audio data. The processor(s) may also obtain, based on the legacy audio data and the extended audio data, enhanced audio data that conforms to an enhanced audio format, and output the enhanced audio data to one or more speakers. 1. A device configured to process a backward compatible bitstream, the device comprising: one or more memories configured to store at least a portion of the backward compatible bitstream, the backward compatible bitstream conforming to a legacy transport format; and one or more processors configured to: obtain, from the backward compatible bitstream, legacy audio data that conforms to a legacy audio format; obtain, from the backward compatible bitstream, extended audio data that enhances the legacy audio data; obtain, based on the legacy audio data and the extended audio data, enhanced audio data that conforms to an enhanced audio format; and output the enhanced audio data to one or more speakers. 2. The device of claim 1, wherein the legacy transport format comprises a psychoacoustic codec transport format. 3. The device of claim 2, wherein the psychoacoustic codec transport format comprises an Advanced Audio Coding (AAC) transport format or an AptX transport format. 4. The device of claim 1, wherein the legacy transport format comprises an Advanced Audio Coding transport format or an AptX transport format, wherein the one or more processors are configured to obtain the enhanced audio data from one or more fill ...

Publication date: 09-01-2020

Synchronizing enhanced audio transports with backward compatible audio transports

Number: US20200013426A1
Assignee: Qualcomm Inc

In general, techniques are described by which to synchronize enhanced audio transports with backward compatible audio transports. A device comprising a memory and one or more processors may be configured to perform the techniques. The memory may store a backward compatible bitstream conforming to a legacy transport format. The processor(s) may obtain, from the backward compatible bitstream, a first audio transport stream, and obtain, from the backward compatible bitstream, a second audio transport stream. The processor(s) may also obtain, from the backward compatible bitstream, indications representative of synchronization information for the first audio transport stream and the second audio transport stream. The processor(s) may synchronize, based on the indications, the first audio transport stream and the second audio transport stream to obtain a synchronized audio data stream. The processor(s) may obtain, based on the synchronized audio data stream, enhanced audio data, and output the enhanced audio data to one or more speakers.

Publication date: 14-01-2021


Number: US20210014546A1

A transmitting method according to one aspect of the present disclosure includes: encoding a video signal and generating encoded data including a plurality of access units; storing the plurality of access units in a packet in a unit that defines one access unit as one unit or in a unit defined by dividing one access unit, and generating a packet group; transmitting the generated packet group as data; generating first information and second information, the first information indicating a presentation time of a first access unit that is presented first among the plurality of access units, and the second information being used to calculate a decoding time of the plurality of access units; and transmitting the first information and the second information as control information. 1. A transmitting method comprising: dividing an access unit into slice segments or tiles, encoding the slice segments or the tiles, and storing at least one of the encoded slice segments or at least one of the encoded tiles in a network abstraction layer (NAL) unit, the access unit being an image included in a video signal; storing the NAL unit in a data unit; storing, in a NAL unit different from the NAL unit in which the video signal is stored, a parameter set for encoding the video signal; storing the data unit in a packet, in units of one data unit, in units of a plurality of data units including the data unit, or in units of portions into which the data unit is divided, and generating a packet group, the packet in which the data unit is stored being different from a packet in which a data unit including the parameter set is stored; transmitting the generated packet group as data; generating control information, the control information including presentation time information of a first access unit, and information used to calculate a decoding time of the plurality of access units; and transmitting the control information, wherein the control information is stored and transmitted in a payload of a ...

Publication date: 21-01-2016


Number: US20160019902A1

The invention relates to a method for combining a plurality of audio streams encoded by frequency sub-band encoding, comprising the following steps: decoding (E301) a portion of the encoded streams over at least one frequency sub-band; combining (E302) the streams thus decoded to form a mixed stream; selecting (E303), from among the plurality of encoded audio streams, at least one encoded replication stream, over at least one frequency sub-band that is different from that of the decoding step. The method is such that the selection of the at least one encoded replication stream is carried out according to a criterion which takes into consideration the presence of a predetermined frequency band in the encoded stream (E304). The invention also relates to a device which implements the described method and can be integrated into a conference bridge, a communication terminal or a communication gateway. 1. A method for combining a plurality of audio streams coded according to a frequency sub-band coding, comprising the following steps: decoding (E301) of a part of the streams coded on at least one frequency sub-band; addition (E302) of the streams thus decoded to form a mixed stream; selection (E303), from among the plurality of coded audio streams, of at least one replication coded stream, on at least one frequency sub-band different from that of the decoding step; the method being characterized in that the selection of the at least one replication coded stream is effected according to a criterion taking into account the presence of a predetermined frequency band in the coded stream (E304). 2. The method as claimed in claim 1, characterized in that it furthermore comprises a step of preselecting the coded audio streams according to a predetermined criterion. 3. The method as claimed in claim 1, characterized in that, in the case where several coded streams are selected in the selection step, an ...

Publication date: 21-01-2016


Number: US20160019903A1

The invention relates to a method for mixing a plurality of audio streams coded according to a frequency sub-band coding, comprising the steps for decoding (E201) a part of the coded streams over at least a first frequency sub-band, and for summing (E202) the streams thus decoded so as to form at least a first mixed stream. The method is such that it comprises the steps for detection (E203), over at least a second frequency sub-band different from the at least first sub-band, of the presence of a predetermined frequency band in the plurality of coded audio streams and for summing (E205) the decoded audio streams (E204) for which the presence of the predetermined frequency band has been detected, over said at least a second sub-band, so as to form at least a second mixed stream. 1. A method for mixing a plurality of coded audio streams according to a coding by frequency sub-bands, comprising the following steps: decoding (E201) of a part of the coded streams over at least a first frequency sub-band; summing (E202) of the streams thus decoded so as to form at least a first mixed stream; the method being characterized in that it comprises the steps for: detection (E203), over at least a second frequency sub-band different from the at least first sub-band, of the presence of a predetermined frequency band within the plurality of coded audio streams; summing (E205) of the decoded audio streams (E204) for which the presence of the predetermined frequency band has been detected, over said at least second sub-band, so as to form at least a second mixed stream. 2. The method as claimed in claim 1, characterized in that it furthermore comprises a step for pre-selection of the coded audio streams according to a predetermined criterion, prior to the detection step. 3. The method as claimed in claim 1, characterized in that it furthermore comprises a step of re-coding the mixed streams. 4. The method as claimed in claim 1, characterized in ...

Publication date: 19-01-2017

Method and apparatus for encoding/decoding an audio signal

Number: US20170018280A1
Author: Hyun-Wook Kim, Nam-Suk Lee

Provided are a method and apparatus for encoding an audio signal and a method and apparatus for decoding an audio signal, in which errors generated during encoding and decoding of the audio signal are reduced to enhance the audio quality of a reconstructed audio signal. The method of encoding the audio signal includes detecting a pitch of the audio signal, determining a filter coefficient based on the detected pitch, performing second filtering on the audio signal based on the determined filter coefficient, and encoding an audio signal resulting from the second filtering.

Publication date: 03-02-2022

Optimized Audio Forwarding

Number: US20220038818A1

Methods and systems for optimizing a routing of audio data to audio transmitting devices using a Bluetooth network are disclosed. One method includes receiving an encoded audio bitstream at a first speaker of the audio rendering system comprising a first and a second audio channels, separating a first set of spectral components of the first audio channel and a second set of spectral components of the second audio channel from the encoded audio bitstream, without decoding the audio bitstream, generating a first encoded bitstream from the first set of spectral components, and forwarding the first encoded bitstream to a second speaker of the audio rendering system over the wireless link. 1. An audio system comprising: a first device with a first processor; and a second device; wherein the first device is configured to transmit data to the second device over a wireless link; and wherein the first processor is configured to: receive an encoded audio bitstream comprising a first and a second audio channels; separate a first set of spectral components of the first audio channel and a second set of spectral components of the second audio channel from the encoded audio bitstream, without decoding the audio bitstream; generate a first encoded bitstream from the first set of spectral components; and forward the first encoded bitstream to the second device over the wireless link. 2. The system according to claim 1, wherein the separating of the first and second sets of spectral components comprises unpacking the encoded audio bitstream. 3. The system according to claim 2, wherein the first channel and the second channel are joint encoded, and wherein the separating of the first and second sets of spectral components further comprises an inverse quantization of the unpacked audio bitstream. 4. The system according to wherein the generating of the first encoded bitstream is further based on a quantization of the first set of spectral components. 5. The system according to claim 1 ...

Publication date: 16-01-2020


Number: US20200020344A1

Conventional audio compression technologies perform a standardized signal transformation, independent of the type of the content. Multi-channel signals are decomposed into their signal components, subsequently quantized and encoded. This is disadvantageous due to lack of knowledge on the characteristics of scene composition, especially for e.g. multi-channel audio or Higher-Order Ambisonics (HOA) content. A method for decoding an encoded bitstream of multi-channel audio data and associated metadata is provided, including transforming the first Ambisonics format of the multi-channel audio data to a second Ambisonics format representation of the multi-channel audio data, wherein the transforming maps the first Ambisonics format of the multi-channel audio data into the second Ambisonics format representation of the multi-channel audio data. A method for encoding multi-channel audio data that includes audio data in an Ambisonics format, wherein the encoding includes transforming the audio data in an Ambisonics format into encoded multi-channel audio data, is also provided. 1. A method for decoding an encoded bitstream of multi-channel audio data and associated metadata, the method comprising: decoding the encoded bitstream of multi-channel audio data into multi-channel audio data; detecting that the multi-channel audio data includes a first Ambisonics format; and transforming the first Ambisonics format of the multi-channel audio data to a second Ambisonics format representation of the multi-channel audio data, wherein the transforming maps the first Ambisonics format of the multi-channel audio data into the second Ambisonics format representation of the multi-channel audio data, wherein the associated metadata further describes re-mixing information and wherein the transforming the first Ambisonics format is based on the re-mixing information indicated by the associated metadata. 2. A non-transitory computer program product storing a computer program, the computer ...

Publication date: 21-01-2021


Number: US20210020187A1

Disclosed are a method and an apparatus for high frequency decoding for bandwidth extension. The method for high frequency decoding for bandwidth extension comprises the steps of: decoding an excitation class; transforming a decoded low frequency spectrum on the basis of the excitation class; and generating a high frequency excitation spectrum on the basis of the transformed low frequency spectrum. The method and apparatus for high frequency decoding for bandwidth extension according to an embodiment can transform a restored low frequency spectrum and generate a high frequency excitation spectrum, thereby improving the restored sound quality without an excessive increase in complexity. 1. A high frequency decoding method comprising: decoding a low frequency spectrum and an excitation class for a current frame; modifying the low frequency spectrum by applying a random sign or an original sign to the low frequency spectrum based on the excitation class; and generating a high frequency excitation spectrum based on the modified low frequency spectrum. 2. The high frequency decoding method of claim 1, wherein the excitation class indicates one among a plurality of classes including a speech excitation class, a first non-speech excitation class, and a second non-speech excitation class. 3. The high frequency decoding method of claim 2, wherein the first non-speech excitation class is related to noisy characteristic and the second non-speech excitation class is related to tonal characteristic. 4. The high frequency decoding method of claim 1, wherein the modifying of the low frequency spectrum further comprises: normalizing the low frequency spectrum; identifying a control parameter based on the decoded excitation class; and modifying the normalized low frequency spectrum by reducing an amplitude of the normalized low frequency spectrum based on the control parameter. 5. The high frequency decoding method of claim 4, wherein an amount of the reduced amplitude ...
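The sign-modification step of claim 1 can be sketched as below. The claim does not say which excitation classes receive a random sign and which keep the original sign, so that mapping is passed in by the caller as `random_sign_classes`; this parameter, the class labels, and the function name `modify_low_band` are hypothetical and chosen only for illustration.

```python
import random
from typing import Collection, List, Sequence


def modify_low_band(spectrum: Sequence[float], excitation_class: str,
                    random_sign_classes: Collection[str],
                    rng: random.Random) -> List[float]:
    """Apply a random sign or keep the original sign, per excitation class.

    Which classes get random signs is an assumption supplied by the caller;
    the source only states that the choice depends on the excitation class.
    """
    if excitation_class in random_sign_classes:
        # Keep each magnitude, draw each sign independently.
        return [abs(x) * rng.choice((-1.0, 1.0)) for x in spectrum]
    # Original sign: the coefficients pass through unchanged.
    return list(spectrum)
```

For example, `modify_low_band(coeffs, "noisy", {"noisy"}, random.Random(0))` randomizes signs for a hypothetical "noisy" class while leaving every other class untouched; the magnitudes are the same in both cases.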

Publication date: 17-01-2019


Number: US20190021054A1

Methods, systems, and devices for wireless communication are described. A user equipment (UE) may be enabled for voice over long term evolution (VoLTE). The UE may include an audio layer to encode and decode voice information and a packet layer to transmit voice packets. The packet layer may store parameters related to a discontinuous reception (DRX) in a shared memory. The audio layer may obtain the DRX parameters and encode voice information based on the parameters. For example, the audio layer coding may be synchronized with the wake period of the DRX cycle. The audio layer may encode voice information during a wake up period of the packet layer DRX cycle, and the packet layer may transmit the voice packets while awake. The audio layer may perform back to back encodings at the beginning of the DRX cycle. The packet layer may extend the wake period to transmit the voice packets. 1. A method for wireless communication, comprising: identifying, by an audio layer of a wireless device, a set of samples of a voice transmission; obtaining, from a memory accessible by the audio layer and a packet layer of the wireless device, a set of discontinuous reception (DRX) parameters corresponding to the packet layer of the wireless device; synchronizing an audio timeline of the audio layer with a DRX cycle of the packet layer based at least in part on the set of DRX parameters; and encoding, by the audio layer, the set of samples of the voice transmission based at least in part on the synchronized audio timeline. 2. The method of claim 1, further comprising: determining, by the audio layer, a DRX wake period of the packet layer based at least in part on the set of DRX parameters, wherein encoding the set of samples is performed during at least a portion of the DRX wake period. 3. The method of claim 1, further comprising: sending the encoded set of samples to the packet layer of the wireless device. 4. The method of claim 3, further comprising: transmitting, to a second wireless ...

Publication date: 16-01-2020

Audio decoder for audio channel reconstruction

Number: US20200021915A1

A method and apparatus for reconstructing N audio channels from M audio channels is disclosed. The method includes receiving a bitstream containing an encoded audio signal representing the M audio channels and decoding the encoded audio signal to obtain a frequency domain representation of the M audio channels. The method further includes extracting a parameter from the bitstream and reconstructing at least one of the N audio channels using the parameter. The parameter represents an angle between two signals, at least one of which is included in the M audio channels.

Publication date: 22-01-2015


Number: US20150025879A1

This invention introduces an audio/speech encoding apparatus, an audio/speech decoding apparatus, an audio/speech encoding method and an audio/speech decoding method to efficiently encode the quantization parameters of split multi-rate lattice vector quantization. In this invention, the position of the sub-vector whose codebook indication consumes the most bits is first located, and then the value of the codebook is estimated based on the total number of bits available and the bit usage information for other sub-vectors. The difference value is calculated between the actual value and the estimated value. Finally, instead of transmitting the codebook indication which consumes the most bits, the position of the sub-vector whose codebook indication consumes the most bits and the difference value between the actual value and the estimated value are transmitted. By applying the invented method, bits can be saved on the codebook indications. 1. An audio/speech encoding apparatus comprising: a time to frequency domain transformation section that transforms a time domain input signal to a frequency spectrum; a vector quantization section that splits an input signal of the frequency spectrum into sub-bands and that quantizes the input signal split into the sub-bands to generate a codebook indication; and a codebook indication transformation section that transforms the codebook indication, wherein: the codebook indication transformation section identifies a position of a sub-vector whose codebook indication consumes the most bits and encodes the identified position of the sub-vector and codebook indications for all the sub-bands except the sub-band whose codebook indication consumes the most bits; the codebook indication transformation section estimates a codebook whose codebook indication consumes the most bits; and the codebook indication transformation section encodes a difference between an actual codebook indication and the estimated codebook indication. 2. The audio/speech encoding ...

Publication date: 22-01-2015

Audio Encoder with Parallel Architecture

Number: US20150025895A1
Author: Schildbach Wolfgang

The present document relates to methods and systems for audio encoding. In particular, the present document relates to methods and systems for fast audio encoding using a parallel system architecture. A frame-based audio encoder comprising K parallel transform units is described; wherein each of the K parallel transform units is configured to transform a respective one of a group of K frames of an audio signal into a respective one of K sets of frequency coefficients; wherein K>1; wherein each of the K frames (305) comprises a plurality of samples of the audio signal. 1-37. (canceled) 38. A frame-based audio encoder comprising: K parallel transform units; wherein each of the K parallel transform units is configured to transform a respective one of a current group of K frames of an audio signal into a respective one of K current sets of frequency coefficients; wherein K>1; wherein each of the K frames comprises a plurality of samples of the audio signal; K parallel quantization and encoding units; wherein each of the K parallel quantization and encoding units is configured to quantize and entropy encode the respective one of the K current sets of frequency coefficients, under consideration of a respective number of allocated bits; a bit allocation unit configured to allocate the respective number of bits to each of the K parallel quantization and encoding units under consideration of a number of previously consumed bits; and a bit reservoir tracking unit configured to update the number of previously consumed bits with a number of bits used by the K parallel quantization and encoding units for encoding the K sets of frequency coefficients of the audio signal for a group of K frames preceding the current group of K frames. 39. The audio encoder of claim 38, wherein each of the K parallel transform units is configured to transform the respective one of the K frames into a frame-type dependent set of frequency coefficients; and further comprising: K parallel ...

Publication date: 26-01-2017


Number: US20170024441A1

A system and method for detecting songs in a continuous audio stream are disclosed. A detection server segments the continuous audio stream, and analyzes the audio stream to determine song candidates according to various processes disclosed herein. In one embodiment, the candidates are determined to be accurate when temporally ordered fingerprints exceed a predetermined threshold, adjacent audio stream segments are determined to have the same best song candidate, and the determined song detection has not been previously detected in the data stream within a predetermined period of time. 1. A system for detecting and identifying songs in a continuous audio stream, said system comprising: a fingerprint database of acoustic fingerprints, wherein each fingerprint is associated with a temporal value and a song identifier such that the acoustic fingerprints corresponding to a song identifier can be temporally ordered by the temporal value associated with each acoustic fingerprint; and receive, via a communications network, a data stream comprising an audio stream; decode the audio stream into a pulse code modulated (PCM) stream; and accumulating a predetermined length of the PCM stream into a buffer; generating a plurality of temporally ordered acoustic fingerprints from the predetermined length of the PCM stream accumulated in the buffer; comparing each acoustic fingerprint of the plurality of temporally ordered acoustic fingerprints generated from the predetermined length of the PCM stream accumulated in the buffer to the acoustic fingerprints in the fingerprint database to produce matches; grouping matches by song identifier; adding the song identifier to a list of song candidates if the quantity of matches in a group exceeds a predetermined threshold and the matched acoustic fingerprints are in the same temporal order in the fingerprint database as in the plurality of temporally ordered acoustic fingerprints generated from the predetermined length of ...
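The grouping and candidate-selection steps recited in claim 1 can be sketched as follows. The tuple layout `(stream_index, song_id, db_time)` and the function name `song_candidates` are assumptions for illustration; fingerprint generation and database lookup are outside the sketch.

```python
from collections import defaultdict
from typing import Hashable, Iterable, List, Tuple

# One match: position of the fingerprint in the buffered stream, the song it
# matched, and the temporal value stored for that fingerprint in the database.
Match = Tuple[int, Hashable, float]


def song_candidates(matches: Iterable[Match], threshold: int) -> List[Hashable]:
    """Return song identifiers whose matches are numerous and in order."""
    groups = defaultdict(list)
    for stream_index, song_id, db_time in matches:
        groups[song_id].append((stream_index, db_time))

    candidates = []
    for song_id, hits in groups.items():
        # The quantity of matches must exceed the predetermined threshold.
        if len(hits) <= threshold:
            continue
        # Walk the hits in stream order and require the database temporal
        # values to increase as well (same temporal order on both sides).
        hits.sort(key=lambda hit: hit[0])
        db_times = [db_time for _, db_time in hits]
        if all(a < b for a, b in zip(db_times, db_times[1:])):
            candidates.append(song_id)
    return candidates
```

Requiring strictly increasing database times is one reading of "the same temporal order"; a looser implementation might tolerate a few out-of-order matches.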

Publication date: 26-01-2017


Number: US20170025130A1

An audio signal decoding apparatus is provided that includes a receiver that receives encoded information, a memory, and a processor that demultiplexes low-band encoding parameters, index information, and scale factor information from the encoded information. The processor also decodes the low-band encoding parameters to obtain a synthesized low frequency spectrum, replicates a high frequency subband spectrum based on the index information using the synthesized low frequency spectrum, and adjusts an amplitude of the replicated high frequency subband spectrum using the scale factor information. The processor further estimates a frequency of a harmonic component in the synthesized low frequency spectrum, adjusts a frequency of a harmonic component in the high frequency subband spectrum using the estimated harmonic frequency spectrum, and generates an output signal using the synthesized low frequency spectrum and the high frequency subband spectrum. 1. An audio signal decoding apparatus, comprising: a receiver that receives an encoded information; a memory; and a processor that: demultiplexes low-band encoding parameters, index information, and scale factor information from the encoded information; decodes the low-band encoding parameters to obtain a synthesized low frequency spectrum; replicates a high frequency subband spectrum based on the index information using the synthesized low frequency spectrum; adjusts an amplitude of the replicated high frequency subband spectrum using the scale factor information; estimates a frequency of a harmonic component in the synthesized low frequency spectrum; adjusts a frequency of a harmonic component in the high frequency subband spectrum using the estimated harmonic frequency spectrum; and generates an output signal using the synthesized low frequency spectrum and the high frequency subband spectrum; splits a preselected portion of the synthesized low frequency spectrum into a number of blocks; identifies a spectral peak ...

Publication date: 25-01-2018


Number: US20180025737A1

Embodiments relate to an audio processing unit that includes a buffer, bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided. 1. An audio processing unit comprising: a buffer configured to store at least one block of an encoded audio bitstream; a bitstream payload deformatter coupled to the buffer and configured to demultiplex at least a portion of the at least one block of the encoded audio bitstream; and a decoding subsystem coupled to the bitstream payload deformatter and configured to decode at least a portion of the at least one block of the encoded audio bitstream, wherein the at least one block of the encoded audio bitstream includes: a fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes: at least one flag identifying whether a base form of spectral band replication or an enhanced form of spectral band replication is to be performed on audio content of the at least one block of the encoded audio bitstream, wherein the base form of spectral band replication includes spectral patching, the enhanced form of spectral band replication includes harmonic transposition, one value of the flag indicates that said enhanced form of spectral band replication should be performed on the audio content, and another value of the flag indicates that said base form of spectral band replication but not said harmonic transposition should be performed on the audio content. 2. The audio processing unit of wherein the fill data further includes enhanced spectral band replication metadata. 3. The audio ...

Publication date: 25-01-2018


Number: US20180025738A1

Embodiments relate to an audio processing unit that includes a buffer, bitstream payload deformatter, and a decoding subsystem. The buffer stores at least one block of an encoded audio bitstream. The block includes a fill element that begins with an identifier followed by fill data. The fill data includes at least one flag identifying whether enhanced spectral band replication (eSBR) processing is to be performed on audio content of the block. A corresponding method for decoding an encoded audio bitstream is also provided. 1210. An audio processing unit () comprising:{'b': '201', 'a buffer () configured to store at least one block of an encoded audio bitstream;'}{'b': '215', 'a bitstream payload deformatter () coupled to the buffer and configured to demultiplex at least a portion of the at least one block of the encoded audio bitstream; and'}{'b': 202', '215, 'a decoding subsystem () coupled to the bitstream payload deformatter () and configured to decode at least a portion of the at least one block of the encoded audio bitstream, wherein the at least one block of the encoded audio bitstream includesa fill element with an identifier indicating a start of the fill element and fill data after the identifier, wherein the fill data includes:at least one flag identifying whether enhanced spectral band replication processing is to be performed on audio content of the at least one block of the encoded audio bitstream.2. The audio processing unit of claim 1 , wherein the fill data further includes enhanced spectral band replication metadata.3. The audio processing unit of claim 2 , wherein the enhanced spectral band replication metadata does not include one or more parameters used for both spectral patching and harmonic transposition.4. The audio processing unit of or claim 2 , wherein the enhanced spectral band replication metadata does not include a parameter for selecting between harmonic transposition and spectral patching.5. 
The audio processing unit of any one of to ...
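
The abstract above describes a fill element that starts with an identifier and carries, in its fill data, a flag saying whether eSBR processing should be applied. A minimal sketch of reading such a flag follows. The bit layout (a 3-bit identifier immediately followed by a 1-bit flag), the identifier value and the function name are assumptions made for illustration only; the listing does not give the actual bitstream syntax.

```python
# Hypothetical layout: 3-bit identifier, then a 1-bit eSBR flag.
# Neither the widths nor the value 0x6 are taken from the listing above.
HYPOTHETICAL_FILL_ID = 0x6


def parse_fill_element(data: bytes):
    """Return {'esbr_enabled': bool} if data starts with the assumed
    fill-element identifier, otherwise None."""
    if not data:
        return None
    first = data[0]
    identifier = (first >> 5) & 0x7   # top three bits of the first byte
    if identifier != HYPOTHETICAL_FILL_ID:
        return None
    esbr_flag = bool((first >> 4) & 0x1)  # the bit right after the identifier
    return {"esbr_enabled": esbr_flag}


print(parse_fill_element(bytes([0b11010000])))
```

A decoder built this way can skip eSBR processing for blocks whose flag is cleared without parsing any further eSBR metadata.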

Publication date: 25-01-2018

Time-Alignment of QMF Based Processing Data

Number: US20180025739A1

The present document relates to time-alignment of encoded data of an audio encoder with associated metadata, such as spectral band replication (SBR) metadata. An audio decoder configured to determine a reconstructed frame of an audio signal from an access unit of a received data stream is described. The access unit comprises waveform data and metadata, wherein the waveform data and the metadata are associated with the same reconstructed frame of the audio signal. The audio decoder comprises a waveform processing path configured to generate a plurality of waveform subband signals from the waveform data, and a metadata processing path configured to generate decoded metadata from the metadata.
1. An audio decoder configured to determine a reconstructed frame of an audio signal from an access unit of a received data stream; wherein the access unit comprises waveform data and metadata; wherein the waveform data and the metadata are associated with the same reconstructed frame of the audio signal; wherein the audio decoder comprises a waveform processing path configured to generate a plurality of waveform subband signals from the waveform data; a metadata processing path configured to generate decoded metadata from the metadata; and a metadata application and synthesis unit configured to generate the reconstructed frame of the audio signal from the plurality of waveform subband signals and from the decoded metadata; wherein the waveform processing path comprises a waveform delay unit, which is configured to apply a waveform delay on a waveform signal which is represented in the time domain, and/or the metadata processing path comprises a metadata delay unit, the waveform delay unit and/or the metadata delay unit being configured to time-align the plurality of waveform subband signals and the decoded metadata, and wherein the analysis unit introduces a fixed delay which is independent of the frame length N of the reconstructed frame of the audio signal.
2.
The audio decoder of ...

Publication date: 28-01-2016

Methods and Systems for Interactive Rendering of Object Based Audio

Number: US20160029138A1

Methods for generating an object based audio program which is renderable in a personalizable manner, e.g., to provide an immersive perception of audio content of the program. Other embodiments include steps of delivering (e.g., broadcasting), decoding, and/or rendering such a program. Rendering of audio objects indicated by the program may provide an immersive experience. The audio content of the program may be indicative of multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects, and typically also a default set of objects which will be rendered in the absence of a selection by a user) and a bed of speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method.

Publication date: 10-02-2022


Number: US20220044693A1
Author: LIANG Junbin

This disclosure provides a network call method and apparatus, a computer device, and a storage medium, and belongs to the field of audio data processing. The method includes: performing time-frequency transformation on an acquired audio signal, to obtain a plurality of pieces of frequency domain information of the audio signal; determining a target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information; and encoding the audio signal based on the target bit rate, and performing a network call based on the encoded audio signal.
1. A network call method, applicable to a computer device, the method comprising: performing time-frequency transformation on an acquired audio signal to obtain a plurality of pieces of frequency domain information of the audio signal; determining a target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information; encoding the audio signal based on the target bit rate; and performing a network call based on the encoded audio signal.
2. The method according to claim 1, wherein determining the target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information comprises: inputting the plurality of pieces of frequency domain information into a bit rate prediction model to obtain a plurality of first candidate bit rates, the plurality of first candidate bit rates being sufficient to meet a target speech quality condition; and determining a first bit rate that meets a target condition from the plurality of first candidate bit rates as the target bit rate.
3. The method according to claim 2, wherein determining the first bit rate that meets the target condition from the plurality of first candidate bit rates as the target bit rate comprises: classifying the plurality of first candidate bit rates having same bit rate values into bit rate groups; obtaining a quantity of each bit rate group; ...
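
The grouping step in claim 3 can be sketched as follows. The claim text is cut off after "obtaining a quantity of each bit rate group", so the selection rule used here (pick the value of the largest group, break ties toward the lower bit rate) is an assumption, as is the function name `pick_target_bit_rate`.

```python
from collections import Counter


def pick_target_bit_rate(candidates):
    """Group equal candidate bit rates and return the value of the
    largest group. Tie-breaking toward the lower rate is an assumption."""
    if not candidates:
        raise ValueError("at least one candidate bit rate is required")
    groups = Counter(candidates)          # bit rate -> size of its group
    largest = max(groups.values())
    return min(rate for rate, size in groups.items() if size == largest)


# One candidate per piece of frequency domain information (made-up values).
candidates = [24000, 16000, 24000, 32000, 24000, 16000]
print(pick_target_bit_rate(candidates))
```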

Publication date: 25-01-2018

Content Delivery Apparatus, Content Delivery System and Content Delivery Method

Number: US20180027328A1
Author: Osamu KOHARA
Assignee: Yamaha Corp

A content delivery apparatus includes an external input portion, a content reception portion and a delivery portion. To the external input portion, first content having at least a fundamental component is inputted. The content reception portion receives second content having a fundamental component and an extended component. When delivering the first content, the delivery portion delivers the fundamental component of the first content to a client apparatus more than once.

Publication date: 23-01-2020


Number: US20200027469A1

Embodiments of the disclosure include an improved content streaming system that is configured to simplify and streamline the process of streaming media content from one or more content providers to one or more electronic devices. In some embodiments, the interaction of a user with one or more components in a content distribution system is used to initiate the streaming of media content to one or more content players from either a first content server or a second content server.
1.-20. (canceled)
21. A method of streaming media content, comprising: transmitting a first user input signal from a first content player, wherein the first user input signal comprises streaming command information, the first user input signal was generated from a first signal generated by an electromechanical device of the first content player based on a user's interaction with the electromechanical device, and the first user input signal consists of information other than information configured to identify a media-containing file; transmitting a first delivery command from a connected electronic device to a first content server after a control software application running on the connected electronic device determines that the first user input signal comprises streaming command information, wherein the transmission of the first delivery command is provided automatically based on receipt of the first user input signal, and the first delivery command comprises media identification information relating to a first playlist of a plurality of playlists; and streaming information comprising at least one media-containing file that is provided from the first content server to the first content player based on information contained within the first delivery command, wherein the at least one media-containing file includes content from the first playlist.
22. The method of claim 21, wherein streaming the information from the first content server to the first content player further ...

Publication date: 23-01-2020


Number: US20200027470A1

Embodiments of the disclosure include an improved content streaming system that is configured to simplify and streamline the process of streaming media content from one or more content providers to one or more electronic devices. In some embodiments, the interaction of a user with one or more components in a content distribution system is used to initiate the streaming of media content to one or more content players from either a first content server or a second content server.
1. A method of streaming media content, comprising: receiving, by an electronic device, a first user input signal from a first content player; launching a control application on the electronic device, wherein the control application is launched based on the receipt of the first user input signal; determining, by the control application, that a streaming command was provided in the first user input signal provided from the first content player, wherein determining that the streaming command was provided in the first user input signal comprises transmitting a request from the electronic device to the first content player, and the transmitted request causes the first content player to transmit a response to the electronic device that contains information regarding whether the streaming command was transmitted; transmitting a first delivery command to a first content server after the control application running on the electronic device determines that the streaming command was transmitted from the first content player, wherein the transmission of the first delivery command is provided based at least partially on the receipt of the first user input signal from the first content player, and the first delivery command comprises media identification information; and receiving, by the first content player, at least one media-containing file provided from the first content server based on information contained within the first delivery command.
2. (canceled)
3. (canceled)
4. The method of ...

Publication date: 24-01-2019

Dynamic latency control

Number: US20190028528A1
Author: Mark Barry Dolson
Assignee: NXP BV

As may be implemented in a manner consistent with one or more embodiments, aspects of the disclosure are directed to latency control with signals, such as audio signals. For instance, a quality characteristic of an audio signal having time-sequenced frames exhibiting a signal quality can be assessed, and an output indicative of the signal quality is provided based on the assessment. An amount of latency in the audio signal is dynamically adjusted based on the output, and the latency can be used in processing the time-sequenced frames, such as to use future frames in assessing or correcting a current frame during a time period facilitated via the latency.

Publication date: 28-01-2021


Number: US20210027792A1

Methods, apparatus and articles of manufacture to identify sources of network streaming services are disclosed. An example apparatus includes a coding format identifier to identify, from a received first audio signal representing a decompressed second audio signal, an audio compression configuration used to compress a third audio signal to form the second audio signal, and a source identifier to identify a source of the second audio signal based on the identified audio compression configuration.
1. An apparatus, comprising: a coding format identifier to identify, from a received first audio signal representing a decompressed second audio signal, an audio compression configuration used to compress a third audio signal to form the second audio signal; and a source identifier to identify a source of the second audio signal based on the identified audio compression configuration.
This patent claims the priority benefit of U.S. patent application Ser. No. 15/793,543, which was filed on Oct. 25, 2017. U.S. patent application Ser. No. 15/793,543 is hereby incorporated herein by reference in its entirety.
This disclosure relates generally to network streaming services, and, more particularly, to methods, apparatus and articles of manufacture to identify sources of network streaming services.
Audience measurement entities (AMEs) perform, for example, audience measurement, audience categorization, measurement of advertisement impressions, measurement of media exposure, etc., and link such measurement information with demographic information. AMEs can determine audience engagement levels for media based on registered panel members. That is, an AME enrolls people who consent to being monitored into a panel. The AME then monitors those panel members to determine media (e.g., television programs or radio programs, movies, DVDs, advertisements (ads), websites, etc.) exposed to those panel members.
Wherever possible, the same reference numbers will be used throughout the drawing(s) and ...

Publication date: 24-04-2014


Number: US20140114668A1

An audio data stream from a processing system may be buffered to allow low power states in the processing system during audio playback. An audio buffer may be provided external to the processing system and between the processing system and an audio codec. The audio buffer may also shift to an alternate audio data interface mode when the processing system is in the low power state. Of course, many alternatives, variations, and modifications are possible without departing from this embodiment.
1. An apparatus comprising: an audio processing logic, coupled to a general purpose processor, the audio processing logic to process audio information, wherein in a first operating state the audio processing logic to process the audio information while the general purpose processor at least has a capability to process data in the first operating state, and wherein in a second operating state the audio processing logic to process the audio information while the general purpose processor resides at least in a low power state, the general purpose processor to not process data in the second operating state.
2. The apparatus of claim 1, wherein the apparatus to consume less power in the second operating state than in the first operating state.
3. The apparatus of claim 1, wherein the general purpose processor to have the capability to process the audio information in the first operating state.
4. The apparatus of claim 1, wherein the general purpose processor to not have the capability to process the audio information in the first operating state.
5. The apparatus of claim 1, wherein the apparatus to enter into the second operating state in response to audio playback selected as a sole use for the apparatus.
6. The apparatus of claim 1, wherein the apparatus to enter into the second operating state in response to one or more components in the system other than the audio processing logic being inactive.
7. The apparatus of claim 1, wherein the audio processing logic to read the audio ...

Publication date: 02-02-2017


Number: US20170032801A1
Author: Baumgarte Frank

A system for producing an encoded digital audio recording has an audio encoder that encodes a digital audio recording having a number of audio channels or audio objects. An equalization (EQ) value generator produces a sequence of EQ values which define EQ filtering that is to be applied when decoding the encoded digital audio recording, wherein the EQ filtering is to be applied to a group of one or more of the audio channels or audio objects of the recording independent of any downmix. A bitstream multiplexer combines the encoded digital audio recording with the sequence of EQ values, the latter as metadata associated with the encoded digital audio recording. Other embodiments are also described including a system for decoding the encoded audio recording.
1. A system for producing an encoded digital audio recording having a plurality of audio channels or audio objects, comprising: an encoder to encode a digital audio recording having an original plurality of audio channels or audio objects, to produce an encoded digital audio recording; an equalization (EQ) value generator to produce a sequence of EQ values which define EQ filtering that is to be applied to a specified EQ group of one or more of the original audio channels or audio objects, independent of downmix and upon decoding the encoded digital audio recording; and a bitstream multiplexer to combine a) the encoded digital audio recording with b) the sequence of EQ values including an indication of said EQ group, the latter as metadata associated with the encoded digital audio recording.
2. The system of claim 1, wherein the sequence of EQ values defines the EQ filtering that is to be applied upon decoding of the EQ group, as reducing gain below 500 Hz whether or not downmix is applied to the decoded EQ group.
3. The system of claim 1, wherein the sequence of EQ values defines the EQ filtering that is to be applied upon decoding of the EQ group, as a late night mode that can be enabled during playback, of ...

Publication date: 01-02-2018


Number: US20180033445A1

The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a first acoustic device, a representation of an audio signal, and amplifying the representation of the audio signal by a first gain factor to generate an amplified input signal. The method also includes processing the amplified input signal by an audio codec that includes one or more processors to generate a processed signal that represents a portion of the audio signal to be output by a second acoustic device. The processed signal includes noise originating at the audio codec. The method further includes transmitting the processed signal to the second acoustic device.
1. A computer-implemented method comprising: receiving, at a first acoustic device, a representation of an audio signal; amplifying the representation of the audio signal by a first gain factor to generate an amplified input signal; processing the amplified input signal by an audio codec comprising one or more processors to generate a processed signal that represents a portion of the audio signal to be output by a second acoustic device remote with respect to the first acoustic device, wherein the processed signal includes noise originating at the audio codec; and transmitting the processed signal to the second acoustic device over a wireless link, wherein the processing comprises bit rate compression in accordance with a data transfer capacity of the wireless link.
2. The method of claim 1, further comprising: receiving, at the second acoustic device, a representation of the processed signal; compensating for the first gain factor by amplifying the representation of the processed signal by a second gain factor to generate a compensated signal; and outputting the compensated signal at the second acoustic device.
3. The method of claim 2, wherein the second gain factor is substantially an inverse of the first gain factor.
4. The method of claim 1, wherein the audio codec is an adaptive ...
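
Claims 1 to 3 describe amplifying before the codec and applying roughly the inverse gain at the receiver. The sketch below shows why this can reduce the audible codec noise. It assumes, purely for illustration, that the codec adds a fixed noise sequence independent of signal level; the sample values and helper names are made up.

```python
def amplify(samples, gain):
    """Scale every sample by the given gain factor."""
    return [s * gain for s in samples]


def codec_with_noise(samples, noise):
    """Stand-in for the audio codec: adds level-independent noise (assumption)."""
    return [s + n for s, n in zip(samples, noise)]


def max_error(a, b):
    return max(abs(x - y) for x, y in zip(a, b))


signal = [0.01, -0.02, 0.015]
noise = [0.004, -0.004, 0.004]
first_gain = 8.0
second_gain = 1.0 / first_gain      # claim 3: substantially the inverse

# Path with pre-amplification at the first device and compensation at the second.
compensated = amplify(codec_with_noise(amplify(signal, first_gain), noise), second_gain)
# Path without any gain staging, for comparison.
baseline = codec_with_noise(signal, noise)

print(max_error(compensated, signal), max_error(baseline, signal))
```

Under this noise model the compensating gain scales the codec noise down by the first gain factor while restoring the signal to its original level.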

Publication date: 05-02-2015


Number: US20150036679A1
Author: Deng Huiqun, Sun Xuejing

Methods and corresponding apparatuses for transmitting and receiving audio signals are described. A transformation is performed on the audio signals in units of frame in order to obtain transformed audio data of each frame, said transformed audio data consisting of multiple signal components in the frequency domain. These signal components of each frame are distributed into multiple adjacent packets in order to generate packets in which signal components distributed from multiple frames are interleaved. Subsequently, the generated packets are transmitted. Accordingly, in case that packet loss occurs during transmission, the audio signals can be recovered based on the received signal components without consuming additional bandwidth. Therefore, robustness against packet loss can be achieved with little overhead.
1.-30. (canceled)
31. A method of transmitting audio signals, comprising: performing transformation on the audio signals in units of frame to obtain transformed audio data of each frame, which includes multiple signal components in frequency domain; distributing the signal components of each frame into multiple adjacent packets to generate each packet in which partial signal components distributed from multiple frames are interleaved; and transmitting the generated packet.
32. The method according to claim 31, wherein the distributing comprises: distributing the signal components of each frame into the multiple adjacent packets uniformly.
33. The method according to claim 32, wherein the distributing comprises: for the signal components of each frame, distributing a half of the signal components indexed by ones of odd numbers and even numbers into a first packet, and distributing the other half of the signal components indexed by the other ones of odd numbers and even numbers into a subsequent packet that follows the first packet in transmission order.
34. The method according to claim 31, wherein the distributing comprises: for the signal components of each frame ...
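
The odd/even distribution of claim 33 can be sketched as below. The concrete packet layout (packet k carries the even-indexed components of frame k and the odd-indexed components of frame k-1) and the dictionary representation are assumptions chosen for illustration; the listing only states that the two halves go into adjacent packets.

```python
def packetize(frames):
    """Spread each frame over two adjacent packets (assumed layout):
    packet k holds evens of frame k and odds of frame k-1."""
    packets = []
    for k in range(len(frames) + 1):
        packet = {}
        if k < len(frames):
            packet["even"] = (k, frames[k][0::2])
        if k >= 1:
            packet["odd"] = (k - 1, frames[k - 1][1::2])
        packets.append(packet)
    return packets


def depacketize(packets, num_frames, frame_len):
    """Rebuild frames; components carried by lost packets (None) stay None."""
    frames = [[None] * frame_len for _ in range(num_frames)]
    for packet in packets:
        if packet is None:          # packet lost in transmission
            continue
        if "even" in packet:
            index, components = packet["even"]
            frames[index][0::2] = components
        if "odd" in packet:
            index, components = packet["odd"]
            frames[index][1::2] = components
    return frames


frames = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]
packets = packetize(frames)
packets[1] = None                   # simulate the loss of one packet
recovered = depacketize(packets, num_frames=3, frame_len=4)
print(recovered)
```

With this layout a single lost packet removes half of the components of two neighbouring frames instead of all components of one frame, so every frame still has data from which the missing components can be estimated.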

Publication date: 17-02-2022


Number: US20220051688A1

Disclosed are a device and method for wirelessly communicating. The device according to one example embodiment of the present disclosure may comprise a transceiver and a controller connected to the transceiver, wherein the controller is configured to identify at least one additional sample on the basis of a digital signal by using a neural network model and upscale the digital signal by adding the at least one identified additional sample to a plurality of samples of the digital signal.
1. A device for wireless communication, the device comprising: a transceiver; and a controller connected to the transceiver, wherein the controller is configured to identify at least one additional sample by using a neural network model, based on a digital signal, and upscale the digital signal by adding the identified at least one additional sample to a plurality of samples of the digital signal.
2. The device of claim 1, wherein to identify the at least one additional sample using the neural network model based on the digital signal, the controller is further configured to: determine a weight in response to the digital signal; and identify the at least one additional sample based on the digital signal and the weight.
3. The device of claim 1, wherein the controller is further configured to generate the neural network model by: obtaining a first output digital signal upscaled from a first input digital signal in response to the first input digital signal; obtaining a difference between the first output digital signal and one reference digital signal of a set of at least one reference digital signal; and obtaining a second output digital signal upscaled from a second input digital signal based on the difference and the second input digital signal.
4. The device of claim 3, wherein the difference is related to at least one sample not corresponding to a plurality of samples of the first output digital signal among a plurality of samples of the one reference digital signal.
5. ...

Publication date: 01-02-2018


Number: US20180035246A1
Author: Orescanin Marko

The technology described in this document can be embodied in a computer-implemented method that includes receiving, at a first acoustic device, a first data stream representing audio signals encoded using a first encoding scheme, and processing, using one or more processing devices, the first data stream to generate a second data stream representing a portion of the audio signals to be decoded at a second acoustic device. The method also includes transmitting the second data stream over a near-field magnetic induction (NFMI) link to the second acoustic device.
1. A computer-implemented method comprising: receiving, at a first acoustic device, a first data stream representing audio signals encoded using a first encoding scheme, the first acoustic device being selected over a second acoustic device to receive the first data stream, the selection of the first acoustic device being made based on a comparison of received signal strengths from the first acoustic device and the second acoustic device; processing, using one or more processing devices, the first data stream to generate a second data stream representing a portion of the audio signals to be decoded at a second acoustic device; and transmitting the second data stream over a wireless link to the second acoustic device.
2. The method of claim 1, wherein the wireless link comprises a near-field magnetic induction (NFMI) link.
3. The method of claim 1, wherein the first data stream comprises data representing audio signals for two or more audio channels.
4. The method of claim 3, wherein processing the first data stream comprises: extracting the portion of the first data stream that corresponds to audio signals to be decoded at an acoustic device different from the first acoustic device; and generating the second data stream using the extracted portion.
5. The method of claim 1, wherein processing the first data stream comprises: decoding the first data stream to generate decoded audio data; and encoding a portion of the ...

Publication date: 31-01-2019


Number: US20190035411A1

A method for generating a bitstream indicative of an object based audio program is described. The bitstream comprises a sequence of containers. A first container of the sequence of containers comprises a plurality of substream entities for a plurality of substreams of the object based audio program and a presentation section. The method comprises determining a set of object channels. The method further comprises providing a set of object related metadata for the set of object channels. In addition, the method comprises inserting a first set of object channel frames and a first set of object related metadata frames into a respective set of substream entities of the first container. Furthermore, the method comprises inserting presentation data into the presentation section.
1. A method for rendering an audio program from an encoded bitstream, the method comprising: extracting presentation data from the encoded bitstream, wherein the presentation data is indicative of a presentation of the audio program, and wherein the presentation data corresponds to a set of substream entities; and rendering, based on the presentation data, a first set of object channel frames and metadata corresponding to the set of substream entities, wherein the audio program comprises a plurality of substreams that includes the set of substream entities, and wherein the audio program comprises a first audio program frame, wherein the first audio program frame is part of a set of object channels and a corresponding first set of object related metadata frames; wherein an object channel frame is to be presented by a combination of speakers of a presentation environment, wherein an object related metadata frame of an object channel frame is indicative of a position within the presentation environment from which the object channel frame is to be rendered; and wherein the set of object channels is indicative of audio content of at least some of a set of audio signals.
2. The method of claim 1, wherein ...

Publication date: 31-01-2019


Number: US20190035413A1

There is provided an audio encoding apparatus including a memory, and a processor coupled to the memory and the processor configured to determine whether a tone is included in a boundary between a low-frequency that is a frequency bandwidth below a predetermined frequency of an input signal and a high-frequency that is a frequency bandwidth above the predetermined frequency of the input signal, suppress a tone in one of the low-frequency and the high-frequency, encode the input signal having the low-frequency to generate a low-frequency code, encode the input signal having the high-frequency to generate a high-frequency code, and generate an encoded stream by multiplexing the low-frequency code and the high-frequency code.
1. An audio encoding apparatus comprising: a memory; and a processor coupled to the memory and the processor configured to: determine whether a tone is included in a boundary between a low-frequency that is a frequency bandwidth below a predetermined frequency of an input signal and a high-frequency that is a frequency bandwidth above the predetermined frequency of the input signal; suppress a tone in one of the low-frequency and the high-frequency; encode the input signal having the low-frequency to generate a low-frequency code; encode the input signal having the high-frequency to generate a high-frequency code; and generate an encoded stream by multiplexing the low-frequency code and the high-frequency code.
2. The audio encoding apparatus according to claim 1, wherein the processor is further configured to: extract envelope information from a frequency spectrum of the input signal having the high-frequency; encode high-frequency information including the envelope information to encode the input signal having the high-frequency; and when the tone in the high-frequency is suppressed, suppress a value of the envelope information in a neighborhood of the boundary.
3. The audio encoding apparatus according to claim 1, wherein the processor is configured to: ...

Publication date: 30-01-2020


Number: US20200035251A1

The invention provides methods and devices for outputting a stereo audio signal having a left channel and a right channel. The apparatus includes a demultiplexer, decoder, and upmixer. The upmixer is configured to operate either in a prediction mode or a non-prediction mode based on a parameter encoded in the audio bitstream.
1. An apparatus for outputting a stereo audio signal having a left channel and a right channel, the apparatus comprising: a demultiplexer configured to receive an audio bitstream and decode therefrom at least one prediction coefficient, wherein the audio bitstream is segmented into frames and values of the at least one prediction coefficient may change for each of the frames; a decoder configured to generate a downmix signal and a residual signal from the audio bitstream; and an upmixer configured to operate in either a prediction mode or a non-prediction mode based on at least one parameter encoded in the audio bitstream, and to output the left channel and the right channel as the stereo audio signal, wherein, when the upmixer operates in the prediction mode, the residual signal represents a difference between a side signal and a predicted version of the side signal, and the upmixer generates the left channel and the right channel from a combination of the downmix signal, the residual signal, and the at least one prediction coefficient, wherein the at least one prediction coefficient is used as one or more weights in a weighted summing operation that generates the left channel and the right channel, and wherein, when the upmixer operates in the non-prediction mode, the residual signal represents the side signal, the upmixer generates the left channel based on a sum of the downmix signal and the residual signal, and the upmixer generates the right channel based on a difference of the downmix signal and the residual signal.
2. The apparatus of wherein the at least one parameter is the at least one prediction coefficient.
3. The apparatus of ...
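
The two upmixer modes in claim 1 can be sketched as follows. The sketch assumes a single real-valued prediction coefficient `alpha` and a predicted side signal equal to `alpha` times the downmix; any scaling factors and any complex-valued prediction used in a real decoder are left out, and the function name is hypothetical.

```python
def upmix(downmix, residual, prediction_mode, alpha=0.0):
    """Return (left, right) sample lists.

    Non-prediction mode: the residual is the side signal itself.
    Prediction mode (assumed form): side = alpha * downmix + residual.
    """
    left, right = [], []
    for m, r in zip(downmix, residual):
        side = alpha * m + r if prediction_mode else r
        left.append(m + side)    # left  = downmix + side
        right.append(m - side)   # right = downmix - side
    return left, right


downmix = [1.0, 2.0]
residual = [0.5, -0.5]
print(upmix(downmix, residual, prediction_mode=False))
print(upmix(downmix, residual, prediction_mode=True, alpha=0.5))
```

In prediction mode the left channel works out to (1 + alpha) * downmix + residual and the right channel to (1 - alpha) * downmix - residual, which is one way to read the "weighted summing operation" in the claim.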

Publication date: 30-01-2020


Number: US20200035253A1

Methods, an encoder and a decoder are configured for transition between frames with different internal sampling rates. Linear predictive (LP) filter parameters are converted from a sampling rate S1 to a sampling rate S2. A power spectrum of a LP synthesis filter is computed, at the sampling rate S1, using the LP filter parameters. The power spectrum of the LP synthesis filter is modified to convert it from the sampling rate S1 to the sampling rate S2. The modified power spectrum of the LP synthesis filter is inverse transformed to determine autocorrelations of the LP synthesis filter at the sampling rate S2. The autocorrelations are used to compute the LP filter parameters at the sampling rate S2.

1-36. (canceled)
37. A method for encoding a sound signal, comprising: sampling the sound signal during successive sound signal processing frames; computing, at the internal sampling rate S1, a power spectrum of a LP synthesis filter using the LP filter parameters; extending the power spectrum of the LP synthesis filter to convert it from the internal sampling rate S1 to the internal sampling rate S2 if the internal sampling rate S1 is smaller than the internal sampling rate S2; truncating the power spectrum of the LP synthesis filter to convert it from the internal sampling rate S1 to the internal sampling rate S2 if the internal sampling rate S1 is larger than the internal sampling rate S2; applying an inverse Fourier transform to the extended or truncated power spectrum of the LP synthesis filter to determine autocorrelations of the LP synthesis filter at the internal sampling rate S2; and computing the LP filter parameters at the internal sampling rate S2 by applying the Levinson-Durbin algorithm to the autocorrelations; and encoding the sound signal encoding parameters into a bitstream; producing, in response to the sampled sound signal, parameters ...
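The conversion chain described in this entry (power spectrum, extension or truncation, inverse transform, Levinson-Durbin) can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the number of spectrum bins, and the choice to extend the spectrum by holding its last value are all assumptions made for the example.

```python
import cmath
import math


def power_spectrum(a, n_bins):
    """Power spectrum |1/A(e^jw)|^2 of the LP synthesis filter on n_bins+1
    points covering 0..pi (that is, 0..S1/2)."""
    spec = []
    for k in range(n_bins + 1):
        w = math.pi * k / n_bins
        resp = sum(c * cmath.exp(-1j * w * i) for i, c in enumerate(a))
        spec.append(1.0 / abs(resp) ** 2)
    return spec


def resize_spectrum(spec, s1, s2):
    """Truncate (S1 > S2) or extend (S1 < S2) the spectrum.
    Assumption: extension repeats the last spectral value."""
    n1 = len(spec) - 1
    n2 = round(n1 * s2 / s1)
    if n2 <= n1:
        return spec[: n2 + 1]
    return spec + [spec[-1]] * (n2 - n1)


def autocorrelation(spec, order):
    """Inverse DFT of a real, even spectrum, evaluated for lags 0..order."""
    n = len(spec) - 1
    r = []
    for lag in range(order + 1):
        acc = spec[0] + spec[n] * math.cos(math.pi * lag)
        acc += 2.0 * sum(spec[k] * math.cos(math.pi * k * lag / n)
                         for k in range(1, n))
        r.append(acc / (2 * n))
    return r


def levinson_durbin(r, order):
    """Solve the normal equations for A(z) = 1 + a1 z^-1 + ... + ap z^-p."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a


def convert_lp(a, s1, s2, n_bins=128):
    """Convert LP coefficients from sampling rate s1 to sampling rate s2."""
    order = len(a) - 1
    spec = resize_spectrum(power_spectrum(a, n_bins), s1, s2)
    return levinson_durbin(autocorrelation(spec, order), order)
```

When S1 equals S2 the spectrum is left unchanged, so `convert_lp` should return the input coefficients up to numerical error, which gives a simple sanity check for the chain.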

Publication date: 04-02-2021


Number: US20210035588A1

The present document relates to a method of layered encoding of a frame of a compressed higher-order Ambisonics, HOA, representation of a sound or sound field. The compressed HOA representation comprises a plurality of transport signals. The method comprises assigning the plurality of transport signals to a plurality of hierarchical layers, the plurality of layers including a base layer and one or more hierarchical enhancement layers, generating, for each layer, a respective HOA extension payload including side information for parametrically enhancing a reconstructed HOA representation obtainable from the transport signals assigned to the respective layer and any layers lower than the respective layer, assigning the generated HOA extension payloads to their respective layers, and signaling the generated HOA extension payloads in an output bitstream. The present document further relates to a method of decoding a frame of a compressed HOA representation of a sound or sound field, an encoder and a decoder for layered coding of a compressed HOA representation, and a data structure representing a frame of a compressed HOA representation of a sound or sound field.

1. A method of decoding a compressed Higher Order Ambisonics (HOA) representation of a sound or sound field, the method comprising: receiving a bit stream containing the compressed HOA representation corresponding to a plurality of hierarchical layers that include a base layer and one or more hierarchical enhancement layers, wherein the plurality of layers have assigned thereto components of a basic compressed sound representation of the sound or sound field, the components being assigned to respective layers in respective groups of components; determining a highest usable layer among the plurality of layers for decoding; extracting a HOA extension payload assigned to the highest usable layer, wherein the HOA extension payload includes side information for parametrically enhancing a reconstructed HOA representation ...

Publication date: 09-02-2017


Number: US20170040025A1

Provided are a method and apparatus for encoding and decoding a high frequency signal by using a low frequency signal. The high frequency signal can be encoded by extracting a coefficient by linear predicting a high frequency signal, and encoding the coefficient, generating a signal by using the extracted coefficient and a low frequency signal, and encoding the high frequency signal by calculating a ratio between the high frequency signal and an energy value of the generated signal. Also, the high frequency signal can be decoded by decoding a coefficient, which is extracted by linear predicting a high frequency signal, and a low frequency signal, and generating a signal by using the decoded coefficient and the decoded low frequency signal, and adjusting the generated signal by decoding a ratio between the generated signal and an energy value of the high frequency signal.

1. A method of encoding a high band signal, the method comprising: extracting a coefficient from linear prediction of a high band signal; encoding the extracted coefficient; generating a signal based on a low band signal and the extracted coefficient; obtaining a gain from an energy value of the high band signal and an energy value of the generated signal; encoding the gain; and transmitting the encoded coefficient and the encoded gain to a decoder, wherein the encoding the extracted coefficient comprises performing an interpolation process on the extracted coefficient.
2. The method of claim 1, wherein the generating a signal comprises: generating a first signal by using the extracted coefficient; generating a second signal in a high band by using the low band signal; and generating a third signal by calculating the first and second signals in a predetermined method.
3. The method of claim 1, wherein the generating a signal comprises: generating a first signal by using the extracted coefficient; extracting a residual signal by linear predicting the low band signal; generating a second signal in a high band by ...
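The gain step in claim 1 (a gain obtained from the energy of the high band signal and the energy of the generated signal) can be illustrated with a short sketch. The entry does not give the exact formula, so the square root of the energy ratio used here is an assumption, as are the function names.

```python
import math


def energy(samples):
    """Sum of squared sample values."""
    return sum(v * v for v in samples)


def highband_gain(high_band, generated):
    """Gain that matches the generated signal's energy to the high band's.
    Assumption: gain = sqrt(E_high / E_generated)."""
    e_gen = energy(generated)
    if e_gen == 0.0:
        return 0.0  # nothing to scale; avoid division by zero
    return math.sqrt(energy(high_band) / e_gen)


def apply_gain(generated, gain):
    """Decoder-side adjustment of the generated signal."""
    return [gain * v for v in generated]
```

With this definition, scaling the generated signal by the gain gives it the same energy as the original high band signal, which is the property the decoder relies on when it adjusts its own generated signal.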

Publication date: 04-02-2021


Number: US20210037080A1
Author: HUMEIDA Yousif, KEGEL Ian

Methods and apparatus are disclosed for selecting encoding specifications for encoding audio and/or video data to be streamed from a sender to a receiver via a network. Methods and apparatus for encoding data using a selected encoding specification, and for streaming data which has been encoded using a selected encoding specification from a sender to a receiver via a network are also disclosed. The selecting method comprises selecting an encoding specification (s) in dependence on performance measures previously obtained using a plurality of different encoding specifications in respect of a monitored network when in each of a plurality of different network conditions, respective network conditions being characterised by different combinations of data-transmission characteristics.

1. A method of selecting an encoding specification for encoding audio and/or video data to be streamed from a sender to a receiver via a network, the method comprising: in respect of a monitored network in each of a plurality of monitored network conditions, respective monitored network conditions being characterised by different combinations of data-transmission characteristics of at least two different types, obtaining a performance measure in respect of each of a plurality of different encoding specifications, the performance measure in respect of the monitored network when in a particular monitored network condition and in respect of a particular encoding specification being obtained by applying a predetermined performance measuring process in respect of audio and/or video data encoded using said encoding specification and streamed via the monitored network when in said monitored network condition; in respect of a network over which audio and/or video data is subsequently to be streamed from a sender to a receiver, obtaining data-transmission characteristics of said at least two different types in respect of said network and selecting a corresponding one of said monitored network ...

Publication date: 24-02-2022


Number: US20220059100A1
Author: LIANG Junbin

The present disclosure discloses a data transmission method performed by a computer device and a non-transitory computer-readable storage medium. According to the present disclosure, voice criticality analysis is performed on a to-be-transmitted audio to obtain a criticality level of each to-be-transmitted audio frame in the to-be-transmitted audio, and a corrected redundancy multiple of each to-be-transmitted audio frame is obtained according to a current redundancy multiple and a redundant transmission factor corresponding to the criticality level of each to-be-transmitted audio frame. Therefore, each to-be-transmitted audio frame is duplicated according to a corrected redundancy multiple of each to-be-transmitted audio frame, to obtain at least one redundancy data packet, and the at least one redundancy data packet is transmitted to a target terminal, which can improve the network anti-packet loss effect without causing network congestion.

1. A data transmission method performed by a computer device, the method comprising: performing voice criticality analysis on a to-be-transmitted audio to obtain a criticality level of at least one to-be-transmitted audio frame in the to-be-transmitted audio, the criticality level being used for measuring an amount of information carried in the audio frame; obtaining a quantity of redundant transmissions of the at least one to-be-transmitted audio frame according to a current redundancy multiple and a redundant transmission factor corresponding to the criticality level, the criticality level being positively correlated with a size of the redundant transmission factor, and the current redundancy multiple being determined based on a current packet loss condition of a target terminal; and duplicating the at least one to-be-transmitted audio frame according to the quantity of redundant transmissions to obtain at least one redundancy data packet, and transmitting the at least one redundancy data packet to the target terminal.
2. The ...
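As a rough illustration of claim 1, the quantity of redundant transmissions can be derived from the current redundancy multiple and a per-level factor. The entry does not state how the two are combined, so the multiplication and rounding up below are assumptions, and the factor table holds made-up values that only respect the stated positive correlation with the criticality level.

```python
import math

# Hypothetical factors: a higher criticality level gets a larger factor.
FACTOR_BY_LEVEL = {0: 0.5, 1: 1.0, 2: 2.0}


def redundant_transmissions(current_multiple, level):
    """Assumption: quantity = ceil(current multiple * level factor)."""
    return math.ceil(current_multiple * FACTOR_BY_LEVEL[level])


def redundancy_packets(frame, current_multiple, level):
    """Duplicate one audio frame into its redundancy data packets."""
    count = redundant_transmissions(current_multiple, level)
    return [bytes(frame)] * count
```

Under these assumed factors, a frame at level 2 is duplicated four times when the current multiple is 2, while a level 0 frame is sent only once, so the extra bandwidth goes to the frames that carry the most information.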

Publication date: 24-02-2022

Methods, Apparatus and Systems for Dual-Ended Media Intelligence

Number: US20220059102A1

A method of encoding audio content comprises performing a content analysis of the audio content, generating classification information indicative of a content type of the audio content based on the content analysis, encoding the audio content and the classification information in a bitstream, and outputting the bitstream. A method of decoding audio content from a bitstream including audio content and classification information for the audio content, wherein the classification information is indicative of a content classification of the audio content, comprises receiving the bitstream, decoding the audio content and the classification information, and selecting, based on the classification information, a post processing mode for performing post processing of the decoded audio content. Selecting the post processing mode can involve calculating one or more control weights for post processing of the decoded audio content based on the classification information.

1-41. (canceled)
42. A method of encoding audio content, the method comprising: performing a content analysis of the audio content; generating classification information indicative of a content type of the audio content based on the content analysis, wherein the content analysis is trained using predetermined audio content, wherein the classification information comprises one or more confidence values, each confidence value being associated with a respective content type and giving an indication of a likelihood that the audio content is of the respective content type; encoding the audio content and the classification information in a bitstream; and outputting the bitstream.
43. The method according to claim 42, wherein the content analysis is based at least in part on metadata for the audio content.
44. The method of claim 42, further comprising: receiving a user input relating to a content type of the audio content, wherein the generating information is based on the user input.
45. The method according to claim 44, ...

Publication date: 24-02-2022


Number: US20220059103A1

Methods for generating an object based audio program which is renderable in a personalizable manner, e.g., to provide an immersive perception of audio content of the program. Other embodiments include steps of delivering (e.g., broadcasting), decoding, and/or rendering such a program. Rendering of audio objects indicated by the program may provide an immersive experience. The audio content of the program may be indicative of multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects, and typically also a default set of objects which will be rendered in the absence of a selection by a user) and a bed of speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method.

1-3. (canceled)
4. A method of decoding audio content of an audio program, said method comprising: receiving the audio program, wherein the audio program comprises a first set of object channels, and wherein the audio program further comprises a second set of one or more speaker channels that does not include an object channel; determining a subset of object channels, wherein the subset of object channels is indicative of a subset of the first set of object channels; and decoding and rendering the audio content of the audio program wherein the rendering includes rendering the subset of object channels and rendering the second set of one or more speaker channels.
5. The method of claim 4, further comprising receiving object related metadata related to the first set of object channels.
6. The method of claim 5, wherein the object related metadata is part of a received bitstream.
7. The method of claim 5, wherein the subset of object channels is determined based on the object related metadata.
8. The method of claim 5, ...

Publication date: 24-02-2022


Number: US20220059109A1
Author: Ichimura Gen
Assignee: Sony Group Corporation

A compressed audio signal and a linear PCM signal are successfully simultaneously transmitted and reproduced. An audio signal continuous for each specified unit is transmitted to a reception side through a specified transmission line. The audio signal continuous for each specified unit is obtained by alternately arranging the audio signal in the specified unit, which includes a compressed audio signal, and the audio signal in the specified unit, which includes a linear PCM signal. For example, the specified unit is a subframe.

1. A transmission apparatus, comprising a transmission section that transmits an audio signal continuous for each specified unit to a reception side through a specified transmission line, the audio signal continuous for each specified unit being obtained by alternately arranging the audio signal in the specified unit, which includes a compressed audio signal, and the audio signal in the specified unit, which includes a linear PCM signal.
2. The transmission apparatus according to claim 1, wherein the specified unit is a subframe.
3. The transmission apparatus according to claim 1, wherein the linear PCM signal is an audio signal by which a real-time performance is to be ensured.
4. The transmission apparatus according to claim 1, further comprising an information adder that adds, to the audio signal transmitted by the transmission section, identification information indicating that the audio signal transmitted by the transmission section is obtained by alternately arranging the audio signal in the specified unit, which includes a compressed audio signal, and the audio signal in the specified unit, which includes a linear PCM signal.
5. The transmission apparatus according to claim 4, wherein the information adder adds the identification information using a specified bit region in a channel status for each block, the specified bit region being assigned a specified number, the channel status being made up for each specified unit.
6. The transmission ...
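The alternating arrangement in claim 1 amounts to interleaving two sequences of equal-sized units. The sketch below treats each unit (for example a subframe) as an opaque bytes object; the function names and the choice to place the compressed unit first in each pair are assumptions made for illustration.

```python
from itertools import chain


def interleave_units(compressed_units, pcm_units):
    """Alternate compressed-audio units and linear PCM units.
    Assumption: the compressed unit comes first in each pair."""
    if len(compressed_units) != len(pcm_units):
        raise ValueError("both streams must supply the same number of units")
    return list(chain.from_iterable(zip(compressed_units, pcm_units)))


def deinterleave_units(stream):
    """Reception side: split the alternating stream back into two streams."""
    return stream[0::2], stream[1::2]
```

A receiver that knows the arrangement (claim 4 describes identification information for this purpose) can recover both streams by taking every other unit.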

Publication date: 24-02-2022


Number: US20220059110A1

A method for decoding an encoded audio bitstream in an audio processing system is disclosed. The method includes extracting from the encoded audio bitstream a first waveform-coded signal comprising spectral coefficients corresponding to frequencies up to a first cross-over frequency for a time frame and performing parametric decoding at a second cross-over frequency for the time frame to generate a reconstructed signal. The second cross-over frequency is above the first cross-over frequency and the parametric decoding uses reconstruction parameters derived from the encoded audio bitstream to generate the reconstructed signal. The method also includes extracting from the encoded audio bitstream a second waveform-coded signal comprising spectral coefficients corresponding to a subset of frequencies above the first cross-over frequency for the time frame and interleaving the second waveform-coded signal with the reconstructed signal to produce an interleaved signal for the time frame.

1-3. (canceled)
4. A decoding method in a multi-channel audio processing system, the decoding method comprising: receiving at least a waveform-coded downmix signal comprising spectral coefficients corresponding to frequencies above a first cross-over frequency; performing frequency reconstruction to determine a reconstructed signal based on the waveform-coded downmix signal, wherein the reconstructed signal is above a second cross-over frequency, wherein the second cross-over frequency is different than the first cross-over frequency, and wherein the frequency reconstruction is based on the waveform-coded downmix signal; performing a parametric upmix of the reconstructed signal into M upmix signals.
5. The method of claim 4, wherein the M upmix signals are interleaved with M waveform coded signals.
6. The method of claim 4, wherein M>1.
7. The method of claim 4, wherein the waveform-coded downmix signal is determined based on downmixing M waveform coded signals.
8. The method of ...

Publication date: 18-02-2021


Number: US20210050023A1
Assignee: Sony Interactive Entertainment Inc.

A system for determining prioritisation values for two or more sounds within an audio clip includes: a feature extraction unit operable to extract characteristic features from the two or more sounds, a feature combination unit operable to generate a combined mix comprising extracted features from the two or more sounds, an audio assessment unit operable to identify the contribution of one or more of the features to the combined mix, a feature classification unit operable to assign a saliency score to each of the features in the combined mix, and an audio prioritisation unit operable to determine relative priority values for the two or more sounds in dependence upon the assigned saliency scores for each of one or more features of the sounds.

1. A system for determining prioritisation values for two or more sounds within an audio clip, the system comprising: a feature extraction unit operable to extract characteristic features from the two or more sounds; a feature combination unit operable to generate a combined mix comprising extracted features from the two or more sounds; an audio assessment unit operable to identify the contribution of one or more of the features to the combined mix; a feature classification unit operable to assign a saliency score to each of the features in the combined mix; and an audio prioritisation unit operable to determine relative priority values for the two or more sounds in dependence upon the assigned saliency scores for each of one or more features of the sounds.
2. The system of claim 1, wherein the characteristic features comprise one or more audio frequencies.
3. The system of claim 1, comprising an audio mix generating unit operable to generate output audio comprising a subset of the two or more sounds in dependence upon the determined relative priority values.
4. The system of claim 1, wherein the audio assessment unit is operable to identify the sound source associated with each of one or more of the sounds.
5. The system of claim 4, ...

Publication date: 18-02-2021


Number: US20210050028A1
Author: LEE Tungchin, Oh Sejin

A method for transmitting audio data performed by an audio data transmission apparatus in accordance with the present invention comprises the steps of: generating playback environment information of three-dimensional audio content; encoding a three-dimensional audio signal of the three-dimensional audio content; and transmitting, to an audio data reception apparatus, the encoded three-dimensional audio signal of the three-dimensional audio content and the generated playback environment information, wherein the playback environment information includes environment information of a room in which the three-dimensional audio content is played.

1. A method for transmitting audio data by an audio data transmission apparatus, the method comprising: generating playback environment information for three-dimensional (3D) audio content; encoding a 3D audio signal of the 3D audio content; and transmitting the encoded 3D audio signal of the 3D audio content and the generated playback environment information to an audio data reception apparatus, wherein the playback environment information comprises environment information about a room for playing the 3D audio content.
2. The method of claim 1, wherein the environment information about the room for playing the 3D audio content comprises at least one of size information about the room or property information about the room.
3. The method of claim 2, wherein the size information about the room comprises at least one of information about a width of the room, information about a length of the room, or information about a height of the room.
4. The method of claim 2, wherein the property information about the room comprises at least one of a material property of a floor constituting the room, a material property of a ceiling constituting the room, a material property of a left wall constituting the room, and a material property of a right wall constituting the room, a material ...

Publication date: 16-02-2017

Audio Segmentation Based on Spatial Metadata

Number: US20170047071A1
Assignee: Dolby Laboratories Licensing Corp

A method of encoding adaptive audio, comprising receiving N objects and associated spatial metadata that describes the continuing motion of these objects, and partitioning the audio into segments based on the spatial metadata. The method encodes adaptive audio having objects and channel beds by capturing a continuing motion of a number N objects in a time-varying matrix trajectory comprising a sequence of matrices, coding coefficients of the time-varying matrix trajectory in spatial metadata to be transmitted via a high-definition audio format for rendering the adaptive audio through a number M output channels, and segmenting the sequence of matrices into a plurality of sub-segments based on the spatial metadata, wherein the plurality of sub segments are configured to facilitate coding of one or more characteristics of the adaptive audio.

Publication date: 16-02-2017


Number: US20170047076A1
Author: Du Hui, Shi Runyu, Yen Chiafu
Assignee: Xiaomi Inc.

The present disclosure relates to a method and a device for achieving object audio recording and an electronic apparatus. The method may include: performing a sound collection operation via a plurality of microphones simultaneously so as to obtain a mixed sound signal; identifying the number of sound sources and position information of each sound source and separating out an object sound signal corresponding to each sound source from the mixed sound signal according to the mixed sound signal and set position information of each microphone; and combining the position information and the object sound signal of individual sound sources to obtain audio data in an object audio format.

1. A method for achieving object audio recording, comprising: collecting, by an electronic device, a mixed sound signal from a plurality of sound sources simultaneously via a plurality of microphones; identifying, by the electronic device from the mixed sound signal, each of the plurality of sound sources and position information of each sound source; for each of the plurality of sound sources, separating out, by the electronic device, an object sound signal from the mixed sound signal according to the position information of the sound source; and combining the position information and the object sound signals of each of the plurality of sound sources to obtain audio data of the mixed sound signal in an object audio format.
2. The method of claim 1, wherein the identifying of a sound source from the plurality of sound sources and the position information of the sound source comprises: identifying an identity of the sound source and position information of the sound source according to an amplitude difference and a phase difference of a sound from the sound source and detected by the plurality of microphones.
3. The method of claim 1, wherein the identifying of a sound source from the plurality of sound sources and the position information of the sound source comprises: identifying an identity ...

Publication date: 18-02-2021


Number: US20210051325A1

A spectrum coding method includes quantizing spectral data of a current band based on a first quantization scheme, generating a lower bit of the current band using the spectral data and the quantized spectral data, quantizing a sequence of lower bits including the lower bit of the current band based on a second quantization scheme, and generating a bitstream based on an upper bit excluding N bits, where N is 1 or greater, from the quantized spectral data and the quantized sequence of lower bits.

1. A spectrum coding method comprising: quantizing spectral data of a non-zero band by using uniform scalar quantization (USQ); extracting a lower bit from the quantized spectral data; generating a sequence of lower bits by collecting lower bits of the quantized spectral data for all non-zero bands; quantizing the sequence of lower bits by using Trellis coded quantization (TCQ); generating a bitstream based on upper bits except for the lower bit in the quantized spectral data and the quantized sequence of lower bits, wherein a number of bits used for TCQ is extracted evenly from a number of bits allocated for quantization of each non-zero band, and wherein a remaining number of bits in the non-zero band is used for USQ.
2. The method of claim 1, wherein the quantizing the sequence of lower bits is performed based on a difference between the spectral data and the quantized spectral data.
3. The method of claim 1, wherein a bandwidth of the spectral data is either a super wide band or a full band.
4. The method of claim 1, wherein the generating of the bitstream comprises: performing first lossless coding on a number, a position and a sign of an important spectral component of the non-zero band; performing second lossless coding on magnitude information which is constructed by excluding the lower bit from the quantized spectral data; performing third lossless coding on the quantized sequence of the lower bits; and generating the bitstream by using data provided from the first lossless ...
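The split between upper bits and the lower bit in claim 1 can be sketched for the case N = 1. This illustration covers only the uniform scalar quantization and the bit split; the Trellis coded quantization of the lower-bit sequence and the lossless coding stages are not shown, and the function names and step size are assumptions.

```python
def usq(value, step):
    """Uniform scalar quantization: returns (sign, quantized magnitude)."""
    sign = -1 if value < 0 else 1
    return sign, int(abs(value) / step + 0.5)


def split_lower_bit(magnitude):
    """Separate the least significant bit (N = 1) from the upper bits."""
    return magnitude >> 1, magnitude & 1


def merge_lower_bit(upper, lower):
    """Decoder side: rebuild the quantized magnitude from both parts."""
    return (upper << 1) | lower


def lower_bit_sequence(spectrum, step):
    """Collect upper parts and lower bits across all coefficients of a band."""
    uppers, lowers = [], []
    for value in spectrum:
        _, mag = usq(value, step)
        upper, lower = split_lower_bit(mag)
        uppers.append(upper)
        lowers.append(lower)
    return uppers, lowers
```

In this sketch the upper parts would go to the magnitude coder and the collected lower bits would form the sequence that the claim quantizes with TCQ.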

Publication date: 15-02-2018


Number: US20180047403A1

An encoder for encoding an audio signal includes an analyzer for analyzing the audio signal and for determining analysis prediction coefficients from the audio signal. The encoder includes a converter for deriving converted prediction coefficients from the analysis prediction coefficients, a memory for storing a multitude of correction values and a calculator. The calculator includes a processor for processing the converted prediction coefficients to obtain spectral weighting factors. The calculator includes a combiner for combining the spectral weighting factors and the multitude of correction values to obtain corrected weighting factors. A quantizer of the calculator is configured for quantizing the converted prediction coefficients using the corrected weighting factors to obtain a quantized representation of the converted prediction coefficients. The encoder includes a bitstream former for forming an output signal based on the quantized representation of the converted prediction coefficients and based on the audio signal.

1. Encoder for encoding an audio signal, the encoder comprising: an analyzer configured for analyzing the audio signal and for determining analysis prediction coefficients from the audio signal; a converter configured for deriving converted prediction coefficients from the analysis prediction coefficients; a memory configured for storing a multitude of correction values; a calculator comprising: a processor configured for processing the converted prediction coefficients to obtain spectral weighting factors; a combiner configured for combining the spectral weighting factors and the multitude of correction values to obtain corrected weighting factors; and a quantizer configured for quantizing the converted prediction coefficients using the corrected weighting factors to obtain a quantized representation of the converted prediction coefficients; and a bitstream former configured for forming an output signal based on the quantized representation ...

Publication date: 16-02-2017


Number: US20170048615A1

An audio processing method and an electronic device for supporting the same are provided. The audio signal processing method includes checking property information of an audio signal in response to a request for playing the audio signal; processing, when property information fulfils a first condition, the audio signal using the first method and, when the property information fulfils a second condition, the audio signal using the second method; and outputting the audio signal processed using one of the first and second methods through a speaker.

1. An electronic device, comprising: a memory for storing first and second methods for processing an audio signal; and a processor functionally connected to the memory, wherein the processor checks property information of an audio signal in response to a request to play the audio signal, processes, when the checked property information fulfils a first condition, the audio signal using a first process, processes, when the property information fulfils a second condition, the audio signal using a second process, and outputs the signal processed using one of the first and second processes through a speaker.
2. The electronic device of claim 1, wherein the property information of the audio signal comprises volume level information, audio channel information, audio quality information, and audio type information.
3. The electronic device of claim 2, wherein the first and second processes process the audio signal using audio parameters and volume gain corresponding to the first and second conditions.
4. The electronic device of claim 3, wherein: the memory stores an audio parameter table that maps the property information of the audio signal to the audio parameters, and the processor determines the audio parameters corresponding to the conditions of the property information by referencing the audio parameter table and applies the audio parameters to the audio signal.
5. The electronic device of claim 4, ...

Publication date: 03-03-2022


Number: US20220068286A1
Author: Liao Chun-Ku, WU Chia-Che
Assignee: Realtek Semiconductor Corp.

The present invention provides a signal transmission method for data transmission through an audio transmission interface. The method includes: receiving audio data and first control data that is generated based on at least one first human-machine interaction; packing the audio data into at least one first data unit; packing the first control data into at least one second data unit; and transmitting a bit stream including the first data unit and the second data unit at a transmission rate that is higher than a sampling rate of the audio data.

Publication date: 25-02-2016


Number: US20160055864A1

An audio processing system accepts an audio bitstream having one of a plurality of predefined audio frame rates. The system comprises a front-end component, which receives a variable number of quantized spectral components, corresponding to one audio frame in any of the predefined audio frame rates, and performs an inverse quantization according to predetermined, frequency-dependent quantization levels. The front-end component may be agnostic of the audio frame rate. The audio processing system further comprises a frequency-domain processing stage and a sample rate converter, which provide a reconstructed audio signal sampled at a target sampling frequency independent of the audio frame rate. By its frame-rate adaptability, the system can be configured to operate frame-synchronously in parallel with a video processing system that accepts plural video frame rates.

1-14. (canceled)
15. An audio processing system configured to accept an audio bitstream segmented into audio frames carrying audio data, the audio processing system comprising: a front-end component, which includes: a dequantization stage adapted to repeatedly receive quantized spectral coefficients corresponding to one audio frame in the audio bitstream, and to output a first frequency-domain representation of an intermediate signal; and an inverse transform stage for receiving the first frequency-domain representation of the intermediate signal and synthesizing, based thereon, a time-domain representation of the intermediate signal; a processing stage, which includes: an analysis filterbank for receiving the time-domain representation of the intermediate signal and outputting a second frequency-domain representation of the intermediate signal; at least one processing component for receiving said second frequency-domain representation of the intermediate signal and outputting a frequency-domain representation of a processed audio signal; and a synthesis filterbank for receiving the frequency-domain ...

Publication date: 14-02-2019

Number: US20190051310A1

Disclosed are a packet loss concealment method and apparatus using a generative adversarial network. A method for packet loss concealment in voice communication may include training a classification model based on a generative adversarial network (GAN) with respect to a voice signal including a plurality of frames, training a generative model having a contention relation with the classification model based on the GAN, estimating lost packet information based on the trained generative model with respect to the voice signal encoded by a codec, and restoring a lost packet based on the estimated packet information.
1. A method for packet loss concealment in voice communication, the method comprising: training a classification model based on a generative adversarial network (GAN) with respect to a voice signal comprising a plurality of frames; training a generative model having a contention relation with the classification model based on the GAN; estimating lost packet information based on the trained generative model with respect to the voice signal encoded by a codec; and restoring a lost packet based on the estimated packet information.
2. The method of claim 1, wherein the classification model determines whether data corresponding to an input signal is data corresponding to a real voice signal or data corresponding to a voice signal generated through the generative model.
3. The method of claim 1, wherein training the generative model comprises training the generative model to classify data generated through the generative model as data corresponding to a real voice signal in the classification model.
4. The method of claim 3, wherein training the classification model comprises training the classification model to classify data generated through the generative model as fake data, not data corresponding to a real voice signal, through the training.
5. The method of claim 1, wherein training the classification model comprises training the classification model to classify ...

Publication date: 25-02-2021

Methods and systems for generating and rendering object based audio with conditional rendering metadata

Number: US20210056978A1

Methods and audio processing units for generating an object based audio program including conditional rendering metadata corresponding to at least one object channel of the program, where the conditional rendering metadata is indicative of at least one rendering constraint, based on playback speaker array configuration, which applies to each corresponding object channel, and methods for rendering audio content determined by such a program, including by rendering content of at least one audio channel of the program in a manner compliant with each applicable rendering constraint in response to at least some of the conditional rendering metadata. Rendering of a selected mix of content of the program may provide an immersive experience.

Publication date: 22-02-2018

Methods and systems for generating and interactively rendering object based audio

Number: US20180053515A1

Methods for generating an object based audio program, renderable in a personalizable manner, and including a bed of speaker channels renderable in the absence of selection of other program content (e.g., to provide a default full range audio experience). Other embodiments include steps of delivering, decoding, and/or rendering such a program. Rendering of content of the bed, or of a selected mix of other content of the program, may provide an immersive experience. The program may include multiple object channels (e.g., object channels indicative of user-selectable and user-configurable objects), the bed of speaker channels, and other speaker channels. Another aspect is an audio processing unit (e.g., encoder or decoder) configured to perform, or which includes a buffer memory which stores at least one frame (or other segment) of an object based audio program (or bitstream thereof) generated in accordance with, any embodiment of the method.

Publication date: 23-02-2017

Number: US20170053658A1

A method for generating a high-band target signal includes receiving, at an encoder, an input signal having a low-band portion and a high-band portion. The method also includes comparing a first autocorrelation value of the input signal to a second autocorrelation value of the input signal. The method further includes scaling the input signal by a scaling factor to generate a scaled input signal. The scaling factor is determined based on a result of the comparison. The method also includes generating a low-band signal based on the input signal and generating the high-band target signal based on the scaled input signal.
1. A method for generating a high-band target signal, the method comprising: receiving, at an encoder, an input signal having a low-band portion and a high-band portion; comparing a first autocorrelation value of the input signal to a second autocorrelation value of the input signal; scaling the input signal by a scaling factor to generate a scaled input signal, the scaling factor determined based on a result of the comparison; generating a low-band signal based on the input signal, wherein the low-band signal is generated independently of the scaled input signal; and generating the high-band target signal based on the scaled input signal.
2. The method of claim 1, wherein comparing the first autocorrelation value to the second autocorrelation value comprises comparing the second autocorrelation value to a product of the first autocorrelation value and a threshold, and wherein scaling the input signal by the scaling factor comprises: scaling the input signal by a first scaling factor if the comparison generates a first result; or scaling the input signal by a second scaling factor if the comparison generates a second result.
3. The method of claim 2, wherein the scaled input signal has a first amount of headroom in response to scaling the input signal by the first scaling factor, wherein the scaled input signal has a second amount of ...
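
As a rough illustration of the comparison in claim 2 of this entry, the Python sketch below computes two autocorrelation values of the input, compares the second to the product of the first and a threshold, and picks one of two scaling factors. The lags (0 and 1), the threshold of 0.95, both factor values, the function names, and the mapping of each comparison result to a factor are assumptions made for the sketch; the listing does not specify any of them.

```python
def autocorr(x, lag):
    """Autocorrelation of x at the given lag (plain sum of products)."""
    return sum(x[n] * x[n - lag] for n in range(lag, len(x)))


def choose_scaling_factor(x, threshold=0.95, first_factor=0.25, second_factor=0.5):
    """Pick a scaling factor from the autocorrelation comparison.

    All numeric defaults are hypothetical values chosen for illustration.
    """
    r_first = autocorr(x, 0)   # assumed "first" autocorrelation value (lag 0)
    r_second = autocorr(x, 1)  # assumed "second" autocorrelation value (lag 1)
    # Compare the second value to the product of the first value and a threshold.
    if r_second > threshold * r_first:
        return first_factor
    return second_factor


def scale_signal(x, factor):
    """Scale every sample of the input signal by the chosen factor."""
    return [factor * v for v in x]
```

In this sketch only the scaled copy would feed the high-band target path; the low-band signal would be derived from the unscaled input, as claim 1 requires.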

Publication date: 23-02-2017

Number: US20170053659A1

A method includes determining an error condition during a bandwidth transition period of an encoded audio signal. The error condition corresponds to a second frame of the encoded audio signal, where the second frame sequentially follows a first frame in the encoded audio signal. The method also includes generating audio data corresponding to a first frequency band of the second frame based on audio data corresponding to the first frequency band of the first frame. The method further includes re-using a signal corresponding to a second frequency band of the first frame to synthesize audio data corresponding to the second frequency band of the second frame.
1. A method comprising: determining, at an electronic device during a bandwidth transition period of an encoded audio signal, an error condition corresponding to a second frame of the encoded audio signal, wherein the second frame sequentially follows a first frame in the encoded audio signal; generating audio data corresponding to a first frequency band of the second frame based on audio data corresponding to the first frequency band of the first frame; and re-using a signal corresponding to a second frequency band of the first frame to synthesize audio data corresponding to the second frequency band of the second frame.
2. The method of claim 1, wherein the bandwidth transition period corresponds to a bandwidth reduction.
3. The method of claim 2, wherein the bandwidth reduction is from: full band (FB) to super wideband (SWB); FB to wideband (WB); FB to narrowband (NB); SWB to WB; SWB to NB; or WB to NB.
4. The method of claim 2, wherein the bandwidth reduction corresponds to at least one of a reduction in encoding bitrate or a reduction in bandwidth of a signal that is encoded to generate the encoded audio signal.
5. The method of claim 1, wherein the bandwidth transition period corresponds to a bandwidth increase.
6. The method of claim 1, wherein the first frequency band includes a low-band frequency band.
7. The method ...

Publication date: 23-02-2017

Number: US20170053660A1

Systems and methods are presented for cross-fading (or other multiple clip processing) of information streams on a user or client device, such as a telephone, tablet, computer or MP3 player, or any consumer device with audio playback. Multiple clip processing can be accomplished at a client end according to directions sent from a service provider that specify a combination of (i) the clips involved; (ii) the device on which the cross-fade or other processing is to occur and its parameters; and (iii) the service provider system. For example, a consumer device with only one decoder can utilize that decoder (typically hardware) to decompress one or more elements that are involved in a cross-fade at faster than real time, thus pre-fetching the next element(s) to be played in the cross-fade at the end of the currently playing element. The next element(s) can, for example, be stored in an input buffer, then decoded and stored in a decoded sample buffer, all prior to the required presentation time of the multiple element effect. At the requisite time, a client device component can access the respective samples of the decoded audio clips as it performs the cross-fade, mix or other effect. Such exemplary embodiments use a single decoder and thus do not require synchronized simultaneous decodes.
1. A method for implementing multiple element effects on audio packets on a client device having a single hardware decoder, comprising: downloading on a client device compressed audio clips to be used in a multiple element effect; storing the compressed audio in an input buffer; sequentially decoding at least one of said clips at a rate that is faster than real time; storing the decoded clips in separate portions of a decoded sample buffer; and accessing respective samples of the decoded clips from the decoded sample buffer while performing the effect.
2. The method of claim 1, wherein said multiple element effect is one of a linear cross-fade, nonlinear cross-fade, ...
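
To make the final step of this entry concrete, here is a minimal Python sketch of a linear cross-fade performed on two clips that have already been decoded into sample buffers. The function name, the list-of-floats buffer representation, and the particular ramp shape are assumptions for illustration; the listing only says that samples of the decoded clips are accessed while the effect is performed.

```python
def linear_crossfade(outgoing, incoming, overlap):
    """Cross-fade the tail of `outgoing` into the head of `incoming`.

    Both arguments are lists of already-decoded samples; `overlap` is the
    number of samples during which both clips sound together.
    """
    if overlap <= 0 or overlap > min(len(outgoing), len(incoming)):
        raise ValueError("overlap must be between 1 and the shorter clip length")

    head = outgoing[:-overlap]   # part of the outgoing clip played untouched
    tail = outgoing[-overlap:]   # part of the outgoing clip that fades out

    mixed = []
    for i in range(overlap):
        # Weight of the incoming clip rises linearly, never reaching 0 or 1
        # exactly, so both clips contribute to every overlapped sample.
        w = (i + 1) / (overlap + 1)
        mixed.append((1.0 - w) * tail[i] + w * incoming[i])

    return head + mixed + incoming[overlap:]
```

Because both inputs are plain sample buffers, the fade itself needs no decoder at all, which is consistent with the single-decoder arrangement the abstract describes.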

Publication date: 13-02-2020

Number: US20200051574A1
Author: CHON Sang-bae, KIM Sun-min

A method of processing an audio signal includes receiving an audio bitstream encoded via MPEG Surround 212 (MPS212); generating an internal channel (IC) signal for a single channel pair element (CPE), based on the received audio bitstream, equalization (EQ) values for MPS212 output channels defined in a format converter, and gain values for the MPS212 output channels; and generating stereo output channels, based on the generated IC signal.
1. A method of decoding a QCE (Quadruple Channel Element) including a first CPE (Channel Pair Element) and a second CPE, the method comprising: obtaining first SBR (spectral band replication) payload and first MPS212 payload for stereo output layout by decoding the first CPE; obtaining second SBR payload and second MPS212 payload for stereo output layout by decoding the second CPE; generating a pair of band-limited internal channel (IC) signals based on the first MPS212 payload, the second MPS212 payload, and a pair of internal channel gains (ICGs); downmixing the first SBR payload and the second SBR payload into downmixed SBR parameters using rendering parameters of a format converter; and generating a pair of full-band internal channel signals based on the generated pair of band-limited internal channel signals and the downmixed SBR parameters.
2. The method of claim 1, wherein the generating of the band-limited IC signal comprises determining whether IC processing for the first CPE and the second CPE is possible.
3. The method of claim 1, further comprising generating stereo output channel signals based on the generated pair of the full-band internal channel signals.
4. The method of claim 1, wherein the generating of the pair of the band-limited IC signals comprises: calculating the pair of the ICGs with respect to the QCE based on the rendering parameters of the format converter.
6. An apparatus for processing an audio signal, the apparatus comprising: a receiver configured to receive a QCE (Quadruple Channel Element) ...

Publication date: 14-02-2019

Number: US20190052555A1

A system and method receives one or more captured signals through a captured audio path and produces one or more playback signals through a playback audio path. The system and method executes one or more signal processing functions and measures the delays within the playback audio path and captured audio path during operation of the one or more signal processing functions. The system and method stores the measured delays in a memory and compensates the one or more signal processing functions for the playback delay and the capture delay.
1. A system for real-time audio signal processing that receives one or more captured signals through a captured audio path, and produces one or more playback signals through a playback audio path, and executes one or more signal processing functions comprising: a processor configured to measure a playback delay corresponding to delays of the playback audio path and a capture delay corresponding to delays of the captured audio path during operation of the one or more signal processing functions, wherein measuring the playback delay and the capture delay includes measuring delay of a test signal played and received over a feedback channel, and wherein the feedback channel passes through components of the playback audio path and the captured audio path; and a memory for storing the measured delays; where the processor is further configured to compensate the one or more signal processing functions for the playback delay and the capture delay.
2. The system of claim 1, wherein measuring the playback delay and the capture delay occurs when the processor first boots up.
3. The system of claim 1, wherein measuring the playback delay and the capture delay occurs continuously during the operation of the one or more signal processing functions.
4. The system of claim 1, where measurement of the playback delay and the capture delay determines a synchronization time, when boot-up of a playback event and capture event is completed.
5. The system of ...
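
One common way to measure the round-trip delay of a test signal sent over a feedback channel is to cross-correlate what was played with what was captured. The Python sketch below uses that approach purely as an illustration; the listing does not say how the delay is measured, and the function names, the brute-force search, and the zero-padding form of compensation are all assumptions.

```python
def estimate_delay(test_signal, captured, max_lag):
    """Return the lag (in samples) at which `captured` best matches `test_signal`.

    Brute-force cross-correlation: slide the test signal over the captured
    buffer and keep the lag with the largest sum of products.
    """
    best_lag, best_score = 0, float("-inf")
    last_lag = min(max_lag, len(captured) - len(test_signal))
    for lag in range(last_lag + 1):
        score = sum(t * captured[lag + i] for i, t in enumerate(test_signal))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def compensate(reference, delay):
    """Delay a reference signal by `delay` samples so it lines up with the
    captured path (hypothetical form of compensation, for illustration)."""
    return [0.0] * delay + list(reference)
```

A measured value could then be stored and reused by any processing function that needs the playback reference aligned with the captured signal, for example an echo canceller.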

Publication date: 22-02-2018

Video Content Assisted Audio Object Extraction

Number: US20180054689A1
Assignee: Dolby Laboratories Licensing Corp

Embodiments of the present invention relate to video content assisted audio object extraction. A method of audio object extraction from channel-based audio content is disclosed. The method comprises extracting at least one video object from video content associated with the channel-based audio content, and determining information about the at least one video object. The method further comprises extracting from the channel-based audio content an audio object to be rendered as an upmixed audio signal based on the determined information. Corresponding system and computer program product are also disclosed.

Publication date: 13-02-2020

Number: US20200053488A1
Assignee: GN Hearing A/S

A system includes a first device and a second device configured to exchange data packages over a bi-directional wireless communication channel, wherein the first device and the second device are configured for use by a user; wherein the first device is configured to provide a first data package, wherein the first data package belongs to a first packet category and comprises first audio data; wherein the first device is also configured to transmit the first data package to the second device, and receive a second data package belonging to the first packet category; and wherein the second data package comprises second audio data, a code indicating whether data in the second data package is corrupted or invalid, and an indicator indicating whether the first data package was successfully received by the second device or not.
1. A system comprising a first device and a second device configured to exchange data packages over a bi-directional wireless communication channel, wherein the first device and the second device are configured for use by a user; wherein the first device is configured to provide a first data package, wherein the first data package belongs to a first packet category and comprises first audio data; wherein the first device is also configured to transmit the first data package to the second device, and receive a second data package belonging to the first packet category; and wherein the second data package comprises second audio data, a code indicating whether data in the second data package is corrupted or invalid, and an indicator indicating whether the first data package was successfully received by the second device or not.
2. The system of claim 1, wherein the first device comprises a first hearing aid.
3. The system of claim 2, wherein the second device comprises a second hearing aid.
4. The system of claim 3, wherein the first hearing aid and the second hearing aid are configured to communicate with each other through a communication device ...
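
The data package described in this entry carries audio data, an integrity code, and an acknowledgement indicator for the package received in the other direction. The Python sketch below shows one possible byte layout. The header format, the use of CRC-32 as the code, and the function names are assumptions for illustration and are not taken from the listing.

```python
import struct
import zlib

# Hypothetical header: packet category (1 byte), ack flag (1 byte),
# CRC-32 of the audio payload (4 bytes), little-endian, no padding.
HEADER = struct.Struct("<BBI")


def build_package(category, audio, ack_previous):
    """Pack audio bytes with a category, an ack indicator and a CRC-32 code."""
    crc = zlib.crc32(audio)
    return HEADER.pack(category, 1 if ack_previous else 0, crc) + audio


def parse_package(raw):
    """Unpack a package and report whether its audio payload is intact."""
    category, ack, crc = HEADER.unpack_from(raw)
    audio = raw[HEADER.size:]
    return {
        "category": category,
        "ack": bool(ack),                    # was the peer's last package received?
        "audio": audio,
        "valid": zlib.crc32(audio) == crc,   # False if the payload was corrupted
    }
```

Carrying the acknowledgement inside the returning audio package means no separate acknowledgement packet is needed in this layout, which is the kind of saving a bi-directional link between two hearing devices would benefit from.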

Publication date: 01-03-2018

Number: US20180060022A1
Author: Kozlov Sergey

Methods, systems, and computer readable media can be operable to facilitate the processing and output of multiple audio streams associated with a piece of content. A multimedia device may identify and notify a user of multiple audio stream languages available for a received multimedia stream. Based upon user input, the multimedia device may identify a plurality of audio streams that are associated with user-selected languages, decode the identified audio streams, and output the decoded audio streams to one or more audio devices. The multimedia device may determine, for each audio stream, an audio device to which the audio stream is to be delivered. The multimedia device may output a video stream associated with the audio streams to a display device while outputting each respective audio stream of the multiple audio streams to the audio device designated for the respective audio stream.
1. A method comprising: receiving a piece of content that comprises a plurality of audio streams; identifying at least two audio streams of the plurality of audio streams for output, wherein each respective one of the identified at least two audio streams is designated for output to one or more audio devices; and outputting the identified at least two audio streams, wherein each respective one of the identified at least two audio streams is output to the one or more designated audio devices.
2. The method of claim 1, further comprising: identifying the plurality of audio streams and a language associated with each respective one of the plurality of audio streams from a program map table associated with the piece of content; outputting a user interface that displays an identification of each identified language; and wherein the at least two audio streams are identified for output based upon user input that identifies the languages associated with the at least two audio streams.
3. The method of claim 2, further comprising: receiving user input that identifies an audio device to which each ...

Publication date: 02-03-2017

Number: US20170060520A1

The present disclosure describes systems and methods for providing streaming, dynamically editable social media content, such as songs, music videos, or other such content. Audio may be delivered to a computing device of a user in a multi-track format, or as separate audio files for each track. The computing device may instantiate a plurality of synchronized audio players and simultaneously playback the separate audio files. The user may individually adjust parameters for each audio player, allowing dynamic control over the media content during use.
1. A method for multi-track media playback comprising: transmitting, by a client device to a server, a request for an item of media; receiving, by the client device from the server, an identification of locations of each of a plurality of tracks of the item of media; instantiating, by the client device, a plurality of playback engines corresponding to the plurality of tracks; retrieving, by the client device, a first portion of each of the plurality of tracks of the item of media based on the received identifications; directing, by the client device, each of the retrieved first portions of each of the plurality of tracks to a corresponding one of the plurality of playback engines; decoding, by each playback engine, the first portion of the corresponding track of the plurality of tracks; and iteratively combining, by a mixer of the client device, outputs of each of the plurality of playback engines to generate a combined multi-track output.
2. The method of claim 1, further comprising: retrieving a second portion of each of the plurality of tracks of the item of media, during decoding of the first portion of the plurality of tracks by the plurality of playback engines.
3. The method of claim 1, wherein instantiating the plurality of playback engines further comprises establishing separate input and output buffers for each of the plurality of playback engines.
4. The method of claim 1, wherein each of the plurality of tracks ...
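
The mixing step at the end of claim 1 in this entry can be pictured as a per-sample weighted sum of the decoded track outputs, with one user-adjustable gain per track. The Python sketch below is a simplified illustration under that assumption; the function name, the gain parameter, and the truncation to the shortest track are not taken from the listing.

```python
def mix_tracks(tracks, gains):
    """Combine decoded tracks into one output, applying one gain per track.

    `tracks` is a list of equal-rate sample lists (one per playback engine);
    `gains` holds the user-adjustable level for each track.
    """
    if len(tracks) != len(gains):
        raise ValueError("need exactly one gain per track")
    if not tracks:
        return []
    length = min(len(t) for t in tracks)  # mix only where every track has data
    return [
        sum(g * t[i] for t, g in zip(tracks, gains))
        for i in range(length)
    ]
```

Calling such a function once per retrieved portion, with the current gain settings, would let a user change a track's level between portions without restarting playback.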

Publication date: 03-03-2016

Methods and Systems for Generating and Rendering Object Based Audio with Conditional Rendering Metadata

Number: US20160064003A1

Methods and audio processing units for generating an object based audio program including conditional rendering metadata corresponding to at least one object channel of the program, where the conditional rendering metadata is indicative of at least one rendering constraint, based on playback speaker array configuration, which applies to each corresponding object channel, and methods for rendering audio content determined by such a program, including by rendering content of at least one audio channel of the program in a manner compliant with each applicable rendering constraint in response to at least some of the conditional rendering metadata. Rendering of a selected mix of content of the program may provide an immersive experience.

Publication date: 03-03-2016

Apparatus and method for encoding and decoding signal for high frequency bandwidth extension

Number: US20160064013A1

An apparatus and method for encoding and decoding a signal for high frequency bandwidth extension are provided. An encoding apparatus may down-sample a time domain input signal, may core-encode the down-sampled time domain input signal, may transform the core-encoded time domain input signal to a frequency domain input signal, and may perform bandwidth extension encoding using a basic signal of the frequency domain input signal.

Publication date: 01-03-2018

Method for decoding audio sequences

Number: US20180061426A1
Author: Anton Bilan
Assignee: Individual

An application may combine encoded input audio data into overlapping blocks for decoding and then remove the overlap, eliminating the audible defects that would otherwise appear at the borders between separately decoded, non-overlapping consecutive audio data blocks.
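
The idea in this entry can be sketched in Python with a stand-in decoder that has one frame of memory and is reset at the start of every block, so that the first decoded frame of each separately decoded block is wrong. The decoder, the function names, and the choice to discard the leading overlap (rather than, say, blending it) are assumptions for illustration only.

```python
def decode_block(frames):
    """Stand-in decoder with one frame of memory, reset at each block start.

    Each output depends on the current and the previous input frame, so the
    first output of a block lacks its true predecessor.
    """
    out, prev = [], 0
    for frame in frames:
        out.append(frame + prev)
        prev = frame
    return out


def decode_with_overlap(frames, block_size, overlap):
    """Decode in blocks that start `overlap` frames early, then drop the
    decoded lead-in so block borders match a single continuous decode."""
    decoded = []
    for start in range(0, len(frames), block_size):
        lead = min(overlap, start)                      # first block has no lead-in
        block = frames[start - lead:start + block_size]
        decoded.extend(decode_block(block)[lead:])      # discard the overlapped part
    return decoded
```

With `overlap=0` the blocks are decoded independently and the border frames differ from a continuous decode; with an overlap at least as long as the decoder's memory, the border defects disappear in this toy model.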

Publication date: 04-03-2021

Number: US20210065727A1

A device for signal processing includes a memory and a processor. The memory is configured to store a parameter associated with a bandwidth-extended audio stream. The processor is configured to select a plurality of non-linear processing functions based at least in part on a value of the parameter. The processor is also configured to generate a high-band excitation signal based on the plurality of non-linear processing functions.
1. A device for signal processing comprising: a receiver configured to receive an encoded audio signal, wherein the encoded audio signal comprises a parameter; a memory configured to store the parameter associated with a bandwidth-extended audio stream; and a processor configured to: select a plurality of non-linear processing functions based at least in part on a value of the parameter, wherein the plurality of non-linear processing functions comprise a first non-linear processing function and a second non-linear processing function, wherein the first non-linear processing function is different from the second non-linear processing function; generate a first excitation signal based on the first non-linear processing function; generate a second excitation signal based on the second non-linear processing function; and generate a high-band excitation signal based on the first excitation signal and the second excitation signal, wherein the first excitation signal corresponds to a first high-band frequency sub-range, and wherein the second excitation signal corresponds to a second high-band frequency sub-range.
2. The device of claim 1, wherein the processor is further configured to generate a resampled signal based on a low-band excitation signal, wherein the high-band excitation signal is based at least in part on the resampled signal.
3. The device of claim 1, wherein the processor is further configured to: generate a first filtered signal by applying a low-pass filter to the first excitation signal; and generate a second filtered signal by ...
