Всего найдено 2820. Отображено 199.

10-09-2011 дата публикации

SYSTEMS, METHODS, AND APPARATUS FOR WIDEBAND ENCODING AND DECODING OF INACTIVE FRAMES

Номер: RU2428747C2

The invention relates to speech signal processing. Disclosed are speech encoders and speech encoding methods for encoding inactive frames at different rates, as well as apparatus and methods for processing an encoded speech signal that calculate a decoded frame based on a description of the spectral envelope over a first frequency band and a description of the spectral envelope over a second frequency band, where the description for the first frequency band is based on information from the corresponding encoded frame and the description for the second frequency band is based on information from at least one preceding encoded frame. Calculation of the decoded frame may also be based on a description of temporal information for the second frequency band that is based on information from at least one preceding encoded frame. The technical result is improved speech intelligibility. 9 independent and 65 dependent claims, 66 figures.
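As an illustration of the decoding rule just described (current-frame information for the first band, previous-frame information for the second band), the sketch below reconstructs the two spectral-envelope descriptions for an inactive frame. It is not the patented codec: the frame fields, the envelope representation, and the decay weight alpha are assumptions made for the example.

```python
import numpy as np

def decode_inactive_frame(frame, prev_frames, alpha=0.5):
    """Return (low_band_env, high_band_env) for one decoded inactive frame.

    frame       -- current encoded frame; only its low-band envelope is used
    prev_frames -- non-empty list of previously received frames (most recent last)
    alpha       -- assumed decay weight for older frames
    """
    # First frequency band: description taken from the corresponding (current) frame.
    low_env = np.asarray(frame["low_band_env"], dtype=float)

    # Second frequency band: derived only from preceding encoded frames,
    # here as an exponentially weighted average of their high-band envelopes.
    acc = np.zeros_like(np.asarray(prev_frames[-1]["high_band_env"], dtype=float))
    weight, total = 1.0, 0.0
    for past in reversed(prev_frames):
        acc += weight * np.asarray(past["high_band_env"], dtype=float)
        total += weight
        weight *= alpha
    high_env = acc / total
    return low_env, high_env
```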

Подробнее
27-02-2009 дата публикации

CLASS QUANTIZATION FOR DISTRIBUTED SPEECH RECOGNITION

Номер: RU2348019C2

The invention concerns distributed speech recognition systems and discloses a system, a method, and a computer-readable medium for quantizing class information and pitch information of a sound. The method, in an information processing system, includes receiving sound and capturing a frame of the sound. The method further includes determining the pitch of the frame and computing a codeword representing the pitch of the frame, where a first codeword value indicates an undefined pitch. The method further includes determining the class of the frame, where the class is any one of at least two classes indicating an undefined pitch and at least one class indicating a defined pitch. The method further includes computing a codeword representing the class of the frame, where the length of the codeword is at most the minimum number of bits required to represent the at least two classes, and the minimum ...

Подробнее
27-11-2017 дата публикации

UNVOICED/VOICED DECISION FOR SPEECH PROCESSING

Номер: RU2636685C2
Автор: GAO Yang (US)

The invention relates to an unvoiced/voiced decision for speech processing. The technical result is improved and more reliable detection of unvoiced and voiced speech. The speech processing method includes the steps of: determining an unvoiced/voiced parameter in the current frame of a speech signal, the parameter being a combined parameter reflecting the product of a periodicity parameter and a spectral tilt parameter; determining a smoothed unvoiced/voiced parameter that incorporates the unvoiced/voiced parameter information of a preceding frame; computing the difference between the unvoiced/voiced parameter of the current frame and the smoothed parameter; and deciding whether the current frame contains unvoiced or voiced speech, using the computed difference as the decision parameter. 2 independent and 18 dependent claims, 15 figures.
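A minimal sketch of this decision rule is shown below, assuming the combined parameter is simply the product of the periodicity and spectral-tilt parameters, and that the smoothing weight and decision threshold (beta, threshold) are free tuning values not taken from the patent.

```python
def voicing_decision(periodicity, spectral_tilt, prev_smoothed, beta=0.9, threshold=0.1):
    """Decide voiced/unvoiced for the current frame.

    periodicity, spectral_tilt -- per-frame parameters in [0, 1] (assumed scaling)
    prev_smoothed              -- smoothed combined parameter carried from past frames
    """
    combined = periodicity * spectral_tilt                    # combined voicing parameter
    smoothed = beta * prev_smoothed + (1.0 - beta) * combined  # includes past-frame info
    diff = combined - smoothed                                 # decision parameter
    is_voiced = diff > threshold
    return is_voiced, smoothed
```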

Подробнее
20-02-2012 дата публикации

APPARATUS AND METHOD FOR CALCULATING BANDWIDTH EXTENSION PARAMETERS BY MEANS OF SPECTRAL-TILT-CONTROLLED FRAMING

Номер: RU2443028C2

The invention relates to the field of audio encoding and decoding, in particular with bandwidth extension (BWE). The technical result is improved quality of the bandwidth-extended audio signal. This result is achieved in that a first spectral band is encoded with a first set of bits, while a second spectral band, different from the first, is encoded with a second set of bits that is smaller than the first set. An apparatus for calculating bandwidth extension parameters of an audio signal in a bandwidth extension system is provided with a controlled bandwidth extension parameter calculator (10) for calculating the bandwidth extension parameters for the second frequency band as a sequence of audio signal frames. Each frame has a controllable start instant. The apparatus is additionally provided with a spectral tilt detector (12) operating over a time interval of the audio signal, which signals the instant ...

Подробнее
01-10-1992 дата публикации

METHOD AND DEVICE FOR COMPRESSING SPEECH SIGNAL DATA.

Номер: DE0003781393D1
Автор: JIBBE KHALED
Принадлежит: NCR INT INC

Подробнее
02-12-2021 дата публикации

SPEECH PRIVACY SYSTEM, ASSOCIATED METHOD, AND ACOUSTIC WALL

Номер: DE112018001393B4
Принадлежит: Guardian Glass, LLC

A method for disrupting speech intelligibility, the method comprising: receiving, via a microphone (606, 606a, 606b), an original speech signal corresponding to original speech; subjecting the original speech signal to a filter, the output of the filter indicating whether consonants are present in the original speech signal such that the original speech is likely to cause a disturbance to people in an area of interest; and, in dependence on the filter output indicating that consonants are present in the original speech such that the original speech is likely to cause a disturbance to people in the area of interest: generating an intelligibility-disrupting masking signal from the original speech signal, the intelligibility-disrupting masking signal differing from the original speech signal in that it is generated so as to have (a) a time delay with respect to the original ...

Подробнее
10-04-2008 дата публикации

Method for adaptive filtering

Номер: DE602004007593T2
Принадлежит: BROADCOM CORP.

Подробнее
23-04-1997 дата публикации

Noise output for a decoded speech signal

Номер: GB0009704316D0
Автор:
Принадлежит:

Подробнее
28-11-1990 дата публикации

METHOD AND APPARATUS FOR PROCESSING SPEECH

Номер: GB0009022674D0
Автор:
Принадлежит:

Подробнее
02-09-1998 дата публикации

Noise output for a decoded speech signal

Номер: GB0002322778A
Принадлежит:

A speech decoder (201) is arranged and constructed to receive a plurality of speech parameters and decode the plurality of speech parameters into at least one fragment of decoded speech. A noise generator (205) is arranged and constructed to output a noise signal. Determining means (203) provide a decision as to whether the plurality of speech parameters represents unvoiced speech. A switch (209), operating together with the speech decoder, the noise generator, and the means for determining, is arranged and constructed to output the noise signal when the plurality of speech parameters represents unvoiced speech. A noise source (2051) e.g. a pseudo-random sequence generator applies a Gaussian noise signal to an LPC filter (2053) which spectrally shapes the noise signal with an estimate of the original speech spectrum envelope as provided by the LPC parameters provided by speech decoder (201). A filter (2055) adjusts the output energy of the noise signal to match the original decoded speech ...
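The decoder path described here (pseudo-random noise source, LPC spectral shaping, output energy matching) can be sketched as follows. The frame length, the seed handling, and the direct-form synthesis loop are assumptions made for illustration, not the GB patent's implementation.

```python
import numpy as np

def unvoiced_noise_frame(lpc, target_energy, n=160, seed=0):
    """Generate spectrally shaped noise for an unvoiced frame.

    lpc           -- LPC coefficients a[1..p] of the synthesis filter 1/A(z)
                     with A(z) = 1 + sum_k a[k] z^-k (sign convention assumed)
    target_energy -- energy of the original decoded speech frame to match
    """
    rng = np.random.default_rng(seed)
    excitation = rng.standard_normal(n)          # Gaussian noise source

    # All-pole synthesis: out[i] = x[i] - sum_k a[k] * out[i-k]
    p = len(lpc)
    out = np.zeros(n)
    for i in range(n):
        acc = excitation[i]
        for k in range(p):
            if i - k - 1 >= 0:
                acc -= lpc[k] * out[i - k - 1]
        out[i] = acc

    # Adjust the output energy to match the original decoded speech energy.
    energy = np.dot(out, out)
    if energy > 0.0:
        out *= np.sqrt(target_energy / energy)
    return out
```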

Подробнее
12-10-1988 дата публикации

COMPUTER COMMUNICATION SYSTEM

Номер: GB0002165974B
Автор: KIT-FUN HO
Принадлежит: KIT-FUN HO

Подробнее
20-02-1963 дата публикации

Apparatus for deriving pitch signals from a speech wave

Номер: GB0000918941A
Автор:
Принадлежит:

... 918,941. Vocoders. WESTERN ELECTRIC CO. Inc. Dec. 7, 1961 [Dec. 27, 1960], No. 43803/61. Class 40 (4). In a pitch signal generator and voiced-unvoiced detector the peaks of a signal representing a portion of the speech band are sampled and a unidirectional signal is derived from the largest of the sampled peaks; this signal is compared with the average value of the speech signal to determine whether the speech is voiced or unvoiced, and the signal is also used to generate a further signal indicative of the pitch of the speech. Fig. 2 shows a pitch detector in which speech from microphone 10 has its polarity adjusted, via switching circuit 2, so that its maximum peaks are negative-going; this signal is limited to the band 0-300 c.p.s. in filter 201 and applied to the sampling pulse generator 202 and sampler 203. Sampling pulses to apply to sampling circuit 203 are obtained by differentiating and infinitely clipping the band-limited speech wave to obtain a rectangular wave the axis crossings ...

Подробнее
15-02-2008 дата публикации

SPEECH ENHANCEMENT

Номер: AT0000385027T
Принадлежит:

Подробнее
15-09-2010 дата публикации

METHOD AND DEVICE FOR SPEECH CODING IN A MOBILE COMMUNICATION TERMINAL BY MEANS OF PLP

Номер: AT0000480852T
Принадлежит:

Подробнее
15-06-2010 дата публикации

METHOD AND DEVICE FOR SPEECH CODING WITH REDUCED, VARIABLE BIT RATE

Номер: AT0000470932T
Принадлежит:

Подробнее
15-01-2012 дата публикации

SPEECH ACTIVITY DETECTION APPARATUS AND METHOD

Номер: AT0000540398T
Автор: WANG ZHE
Принадлежит:

Подробнее
15-10-2007 дата публикации

INTEROPERABLE SPEECH CODING

Номер: AT0000373857T
Принадлежит:

Подробнее
15-11-2008 дата публикации

METHOD AND DEVICE FOR DETECTING SPEECH SEGMENTS IN SPEECH SIGNAL PROCESSING

Номер: AT0000412235T
Принадлежит:

Подробнее
15-08-2010 дата публикации

METHOD FOR SEPARATING SIGNAL PATHS AND APPLICATION TO THE ENHANCEMENT OF ELECTRO-LARYNX SPEECH

Номер: AT0000507844A1
Принадлежит:

In order to improve the speech quality of an electric larynx (EL) speaker, the speech signal of which is digitized by suitable means, the following steps are carried out: a) dividing a single-channel speech signal into a series of frequency channels by transferring it from a time domain into a discrete frequency domain; b) filtering out the modulation frequency of the EL by way of a high-pass or notch filter, in each frequency channel; and c) back-transforming the filtered speech signal from the frequency domain into the time domain and combining it into a single-channel output signal.

Подробнее
15-11-2010 дата публикации

METHOD FOR SEPARATING SIGNAL PATHS AND APPLICATION TO THE ENHANCEMENT OF ELECTRO-LARYNX SPEECH

Номер: AT0000507844B1
Принадлежит:

In order to improve the speech quality of an electric larynx (EL) speaker, the speech signal of which is digitized by suitable means, the following steps are carried out: a) dividing a single-channel speech signal into a series of frequency channels by transferring it from a time domain into a discrete frequency domain; b) filtering out the modulation frequency of the EL by way of a high-pass or notch filter, in each frequency channel; and c) back-transforming the filtered speech signal from the frequency domain into the time domain and combining it into a single-channel output signal.

Подробнее
15-01-1988 дата публикации

METHOD FOR EXCITATION ANALYSIS FOR AUTOMATIC SPEECH RECOGNITION.

Номер: AT0000031989T
Принадлежит:

Подробнее
15-11-1992 дата публикации

ADAPTIVE MULTIVARIATE ESTIMATING APPARATUS.

Номер: AT0000082426T
Принадлежит:

Подробнее
15-12-1998 дата публикации

C.E.L.P. - VOCODER

Номер: AT0000174146T
Принадлежит:

Подробнее
15-10-2004 дата публикации

METHOD AND DEVICE FOR CODING AND DECODING MESSAGES

Номер: AT0000276607T
Принадлежит:

Подробнее
15-10-2001 дата публикации

METHOD AND ARRANGEMENT FOR THE CLASSIFICATION OF SPEECH SIGNALS

Номер: AT0000206841T
Принадлежит:

Подробнее
15-04-2005 дата публикации

METHOD AND DEVICE FOR THE VOICED/UNVOICED DECISION

Номер: AT0000291268T
Принадлежит:

Подробнее
15-01-2005 дата публикации

LOW BIT RATE CODING OF UNVOICED SPEECH SEGMENTS

Номер: AT0000286617T
Принадлежит:

Подробнее
15-05-2005 дата публикации

VARIABLE BIT RATE VOCODER

Номер: AT294441T
Автор:
Принадлежит:

Подробнее
11-03-2004 дата публикации

Identification and exclusion of pause frames for speech storage, transmission and playback

Номер: AU2003265602A8
Принадлежит:

Подробнее
04-03-2002 дата публикации

Method for noise robust classification in speech coding

Номер: AU0007764701A
Автор: JENS THYSSEN
Принадлежит:

Подробнее
02-10-2014 дата публикации

Hearing aid and a method of enhancing speech reproduction

Номер: AU2010365366B2
Принадлежит:

A hearing aid (60A) configured to be worn by a hearing-impaired user has a speech detector (10A) and a speech enhancer (40A) for enhancing speech being present in an input signal of the hearing aid (60A). The speech detector (10A) has means (11, 12) for independently detecting the presence of voiced and unvoiced speech in order to allow for the speech enhancer (40A) to increase the gain of speech signals suitably fast to incorporate the speech signals themselves. The hearing aid (60A) has means (49A, 50A) for communicating information regarding the detected speech signals wirelessly to a similar hearing aid (60B) worn contralaterally by the user for the purpose of mutually enhancing speech signals in the two hearing aids (60A, 60B) when speech is detected to be originating from the front of the user, and means (52B) for suppressing speech enhancement in the contralateral hearing aid (60B) when speech is detected to be originating from the ipse-lateral side of the user. The invention further ...

Подробнее
22-03-1984 дата публикации

SPEECH PROCESSOR

Номер: AU0000535489B2
Принадлежит:

Подробнее
17-09-1985 дата публикации

SPEECH ANALYSIS SYSTEM

Номер: CA1193730A

Speech analysis system in which segments of digitized speech are transformed into amplitude spectra. For the voiced/unvoiced decision, use is made of the peak value or spectral intensity in each amplitude spectrum. Basically, a voiced decision is made when the spectral intensity increases monotonically over several segments by more than a given factor. An unvoiced decision is made if the spectral intensity drops below a given fraction of the maximum spectral intensity in the current voiced period. Refinements in the decisions are made by the use of fixed and adaptive thresholds. This system is intended to be used in vocoders.
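A toy version of that decision logic, assuming a single-step rise test instead of the multi-segment monotonicity test and with made-up values for the rise factor and drop fraction, might look like this:

```python
def classify_segments(intensities, rise_factor=2.0, drop_fraction=0.25):
    """Label each segment 'voiced' or 'unvoiced' from its spectral intensity."""
    labels, voiced, peak, prev = [], False, 0.0, None
    for s in intensities:
        if prev is not None and prev > 0.0 and s / prev >= rise_factor:
            voiced, peak = True, s               # intensity rose by more than the factor
        if voiced:
            peak = max(peak, s)                  # track the maximum of the voiced period
            if s < drop_fraction * peak:         # dropped below a fraction of the maximum
                voiced = False
        labels.append("voiced" if voiced else "unvoiced")
        prev = s
    return labels
```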

Подробнее
16-04-1996 дата публикации

ADAPTIVE MULTIVARIATE ESTIMATING APPARATUS

Номер: CA0001338251C

The present invention relates to an apparatus for determining the voicing decision for non-training set speech signals. The apparatus is comprised of a unit which is responsive to the non-training set speech signals for sampling the speech signals to produce digital speech signals, to form frames of the digital non-training set speech signals, and to process each frame to generate a set of classifiers defining speech attributes. A unit is also provided for estimating statistical distributions for voiced and unvoiced frames without prior knowledge of the voicing decisions for past ones of the frames of digital non-training set speech. A unit is provided which is responsive to these statistical distributions for determining decision regions representing voiced and unvoiced digital non-training set speech. A unit is then provided which is responsive to the decision regions and a present one of the frames for making the voicing decisions. Finally, a unit is provided which is responsive to the ...

Подробнее
31-07-1979 дата публикации

METHOD OF JUDGING VOICED AND UNVOICED CONDITIONS OF SPEECH SIGNAL

Номер: CA0001059631A1
Принадлежит:

Подробнее
06-03-2018 дата публикации

TREATMENT PROCESSING OF A PLURALITY OF STREAMING VOICE SIGNALS FOR DETERMINATION OF RESPONSIVE ACTION THERETO

Номер: CA0002665055C

Streaming voice signals, such as might be received at a contact center or similar operation, are analyzed to detect the occurrence of one or more unprompted, predetermined utterances. The predetermined utterances preferably constitute a vocabulary of words and/or phrases having particular meaning within the context in which they are uttered. Detection of one or more of the predetermined utterances during a call causes a determination of response-determinative significance of the detected utterance(s). Based on the response-determinative significance of the detected utterance(s), a responsive action may be further determined. Additionally, long term storage of the call corresponding to the detected utterance may also be initiated. Conversely, calls in which no predetermined utterances are detected may be deleted from short term storage. In this manner, the present invention simplifies the storage requirements for contact centers and provides the opportunity to improve caller experiences ...

Подробнее
10-06-2014 дата публикации

SYSTEMS, METHODS, AND APPARATUS FOR WIDEBAND ENCODING AND DECODING OF INACTIVE FRAMES

Номер: CA0002657412C
Принадлежит: QUALCOMM INCORPORATED

Speech encoders and methods of speech encoding are disclosed that encode inactive frames at different rates. Apparatus and methods for processing an encoded speech signal are disclosed that calculate a decoded frame based on a description of a spectral envelope over a first frequency band and the description of a spectral envelope over a second frequency band, in which the description for the first frequency band is based on information from a corresponding encoded frame and the description for the second frequency band is based on information from at least one preceding encoded frame. Calculation of the decoded frame may also be based on a description of temporal information for the second frequency band that is based on information from at least one preceding encoded frame.

Подробнее
16-06-2016 дата публикации

A SIGNAL PROCESSING APPARATUS FOR ENHANCING A VOICE COMPONENT WITHIN A MULTI-CHANNEL AUDIO SIGNAL

Номер: CA0002959090A1
Принадлежит:

The invention relates to a signal processing apparatus (100) for enhancing a voice component within a multi-channel audio signal, the multi-channel audio signal comprising a left channel audio signal (L), a center channel audio signal (C), and a right channel audio signal (R), the signal processing apparatus (100) comprising a filter (101) and a combiner (103); wherein the filter (101) is configured to determine a measure representing an overall magnitude of the multi-channel audio signal over frequency upon the basis of the left channel audio signal (L), the center channel audio signal (C), and the right channel audio signal (R), to obtain a gain function (G) based on a ratio between a measure of magnitude of the center channel audio signal (C) and the measure representing the overall magnitude of the multi- channel audio signal, and to weight the left channel audio signal (L) by the gain function (G) to obtain a weighted left channel audio signal (LE), to weight the center channel audio ...

Подробнее
28-12-1990 дата публикации

SYSTEM FOR SPEECH CODING AND AN APPARATUS FOR THE SAME

Номер: CA0002019801A1
Принадлежит:

Подробнее
26-07-2005 дата публикации

METHOD AND APPARATUS FOR REDUCING NOISE IN SPEECH SIGNAL

Номер: CA0002169422C
Принадлежит: SONY CORPORATION

A method and an apparatus for reducing the noise in a speech signal capable of suppressing the noise in the input signal and simplifying the processing. The apparatus includes a fast Fourier transform unit 3 for transforming the input speech signal into a frequency-domain signal, and an Hn value calculation unit 7 for controlling filter characteristics for filtering employed for removing the noise from the input speech signal. The apparatus also includes a spectrum correction unit 10 for reducing the input speech signal by the filtering conforming to the filter characteristics produced by the Hn value calculation unit 7. The Hn value calculation unit 7 calculates the Hn value responsive to a value derived from the frame-based maximum SN ratio of the input signal spectrum obtained by the fast Fourier transform unit 3 and an estimated noise level and controls the processing for removing the noise in the spectrum correction unit 10 responsive to the Hn value.
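The gain-controlled spectrum correction can be approximated with a generic Wiener-style rule, sketched below. The analysis window, the SNR estimate, and the gain floor are stand-ins chosen for the example and are not the patented Hn computation.

```python
import numpy as np

def suppress_noise(frame, noise_psd, floor=0.05):
    """Attenuate one time-domain frame with a frequency-dependent gain.

    frame     -- 1-D array of samples
    noise_psd -- estimated noise power per bin, length len(frame)//2 + 1
    floor     -- minimum gain, keeps residual noise natural (assumed value)
    """
    spec = np.fft.rfft(frame * np.hanning(len(frame)))
    psd = np.abs(spec) ** 2
    snr = np.maximum(psd / np.maximum(noise_psd, 1e-12) - 1.0, 0.0)  # rough SNR per bin
    gain = np.maximum(snr / (1.0 + snr), floor)                      # Wiener-like gain
    return np.fft.irfft(gain * spec, n=len(frame))
```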

Подробнее
23-08-1996 дата публикации

SYNTHESIS OF SPEECH USING REGENERATED PHASE INFORMATION

Номер: CA0002169822A1
Принадлежит:

The spectral magnitude and phase representation used in Multi-Band Excitation (MBE) based speech coding systems is improved. At the encoder the digital speech signal is divided into frames, and a fundamental frequency, voicing information, and a set of spectral magnitudes are estimated for each frame. A spectral magnitude is computed at each harmonic frequency (i.e. multiples of the estimated fundamental frequency) using a new estimation method which is independent of voicing state and which corrects for any offset between the harmonic and the frequency sampling grid. The result is a fast, FFT compatible method which produces a smooth set of spectral magnitudes without the sharp discontinuities introduced by voicing transitions as found in prior MBE based speech coders. Quantization efficiency is thereby improved, producing higher speech quality at lower bit rates. In addition, smoothing methods, typically used to reduce the effect of bit errors or to enhance formants, are more effective ...
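A simplified way to obtain one magnitude per harmonic from an FFT grid, with plain linear interpolation standing in for the patent's offset-correcting estimator, is sketched below (window and FFT size are arbitrary choices):

```python
import numpy as np

def harmonic_magnitudes(frame, f0, fs, n_fft=1024):
    """Return (harmonic_freqs_hz, magnitudes) for multiples of the fundamental f0.

    Interpolating the FFT magnitude at each harmonic frequency compensates for
    the offset between the harmonic and the FFT frequency sampling grid.
    """
    spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame)), n=n_fft))
    grid = np.arange(len(spec)) * fs / n_fft                 # FFT bin frequencies
    harmonics = np.arange(1, int((fs / 2) // f0) + 1) * f0   # harmonic frequencies
    return harmonics, np.interp(harmonics, grid, spec)
```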

Подробнее
11-07-1996 дата публикации

SPEECH CODING METHOD USING SYNTHESIS ANALYSIS

Номер: CA0002209623A1
Принадлежит:

A speech signal linear prediction analysis is performed for each frame of a speech signal to determine the coefficients of a short-term synthesis filter, and an open-loop analysis is performed to determine a degree of frame voicing. At least one closed-loop analysis is performed for each sub-frame to determine an excitation sequence which, when applied to the short-term synthesis filter, generates a synthetic signal representative of the speech signal. Each closedloop analysis uses the impulse response of a filter consisting of the shortterm synthesis filter and a perceptual weighting filter, by truncating said impulse response to a truncation length that is no greater than the number of samples per sub-frame, and dependent on the energy distribution of said response and the degree of voicing of the frame.

Подробнее
25-03-1999 дата публикации

METHOD FOR CONDITIONING A DIGITAL SPEECH SIGNAL

Номер: CA0002304013A1
Принадлежит:

To condition a digital speech signal processed in successive frames, a harmonic analysis is performed to estimate a pitch frequency on each frame exhibiting voice activity, and the signal is oversampled at an oversampling frequency (fe) that is a multiple of the estimated pitch frequency.

Подробнее
06-08-2019 дата публикации

Method and device for voiced sound determination in speech processing

Номер: CN0110097896A
Автор:
Принадлежит:

Подробнее
24-07-2020 дата публикации

Pathological voice assessment system based on Chinese speech

Номер: CN0109727608B
Автор:
Принадлежит:

Подробнее
19-11-2019 дата публикации

Method for reconstructing sub-band voicing parameters at the speech decoding end based on a support vector machine

Номер: CN0108461088B
Автор:
Принадлежит:

Подробнее
14-12-1962 дата публикации

Device for deriving pitch information from a speech wave

Номер: FR0001312196A
Автор:
Принадлежит:

Подробнее
13-08-1976 дата публикации

METHOD OF IDENTIFYING BASIC PITCH PERIOD OF SPEECH SIGNAL

Номер: FR0002145501B1
Автор:
Принадлежит:

Подробнее
24-07-2015 дата публикации

APPARATUS AND METHOD FOR NOISE SUPPRESSION IN A RECEIVER

Номер: KR0101539268B1
Автор: 손백권, 강상기, 이동원
Принадлежит: Samsung Electronics Co., Ltd.

The receiver noise removal apparatus provided by the present invention includes a noise analyzer that uses bit-rate information of the current frame interval of a received signal to determine whether the current frame interval is a non-speech interval and, if it is, analyzes the noise characteristics of the received signal; and a noise remover that determines a removal strength for the noise contained in the received signal according to the analyzed noise characteristics and removes the noise contained in the received signal according to the determined removal strength.

Подробнее
02-07-2014 дата публикации

Apparatus and method for improving speech intelligibility

Номер: KR0101414233B1
Автор:
Принадлежит:

Подробнее
23-04-2019 дата публикации

Номер: KR1020190041769A
Автор:
Принадлежит:

Подробнее
23-09-2020 дата публикации

A Computer Program for Reducing Waiting Time in Automatic Speech Recognition

Номер: KR1020200109834A
Автор:
Принадлежит:

Подробнее
09-07-2020 дата публикации

METHOD FOR REAL-TIME SPEAKER DETERMINATION

Номер: KR1020200083685A
Автор:
Принадлежит:

Подробнее
01-07-2013 дата публикации

HEARING AID AND A METHOD OF IMPROVED AUDIO REPRODUCTION

Номер: KR1020130072258A
Автор:
Принадлежит:

Подробнее
24-06-2015 дата публикации

AUDIO SIGNAL ENCODING/DECODING METHOD AND AUDIO SIGNAL ENCODING/DECODING APPARATUS

Номер: KR1020150070398A
Принадлежит:

Embodiments of the present invention provide an audio signal encoding and decoding method, an audio signal encoding and decoding apparatus, a transmitter, a receiver, and a communication system, which can improve encoding and/or decoding performance. The audio signal encoding method includes: splitting a time-domain signal to be encoded into a low-band signal and a high-band signal; encoding the low-band signal to obtain low-frequency encoding parameters; calculating a voicing factor according to the low-frequency encoding parameters and predicting a high-band excitation signal according to the low-frequency encoding parameters, where the voicing factor is used to indicate the degree of voiced characteristic exhibited by the high-band signal; weighting the high-band excitation signal and random noise using the voicing factor to obtain a synthesized excitation signal; and obtaining high-frequency encoding parameters based on the synthesized excitation signal and the high-band signal. The technical solution of the embodiments of the present invention can improve the encoding or decoding effect.
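The weighting step can be illustrated as below, assuming the voicing factor lies in [0, 1] and that the noise is first scaled to the energy of the predicted excitation; the square-root weighting is an assumption, not the claimed formula.

```python
import numpy as np

def synth_high_band_excitation(predicted_exc, voicing_factor, seed=0):
    """Mix the predicted high-band excitation with random noise by the voicing factor
    (0 = fully unvoiced, 1 = fully voiced)."""
    rng = np.random.default_rng(seed)
    noise = rng.standard_normal(len(predicted_exc))
    # Scale the noise to the energy of the predicted excitation before mixing.
    noise *= np.sqrt(np.dot(predicted_exc, predicted_exc) /
                     max(np.dot(noise, noise), 1e-12))
    v = float(np.clip(voicing_factor, 0.0, 1.0))
    return np.sqrt(v) * predicted_exc + np.sqrt(1.0 - v) * noise
```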

Подробнее
28-05-2013 дата публикации

SYSTEMS, METHODS, AND APPARATUS FOR WIDEBAND ENCODING AND DECODING OF INACTIVE FRAMES

Номер: BRPI0715064A2
Автор:
Принадлежит:

Подробнее
16-06-2001 дата публикации

An adaptive criterion for speech coding

Номер: TW0000440812B
Автор:
Принадлежит:

In producing from an original speech signal a plurality of parameters from which an approximation of the original speech signal can be reconstructed, a further signal is generated in response to the original speech signal, which further signal is intended to represent the original speech signal. At least one of the parameters is determined using first and second differences between the original speech signal and the further signal. The first difference is a difference between a waveform associated with the original speech signal and a waveform associated with the further signal, and the second difference is a difference between an energy parameter derived from the original speech signal and a corresponding energy parameter associated with the further signal.

Подробнее
05-06-2007 дата публикации

Telephone apparatus

Номер: US0007228271B2

The telephone apparatus of the present invention comprises a first voice band expander for generating a voiced signal frequency component by shifting the frequency of the voice signal received, a second voice band expander for generating a voiceless signal frequency component by shifting the frequency of the voice signal received, and a voice composer for composing the voice signal received, the output of the first voice band expander, and the output of the second voice band expander, which is able to output clear voices in aural communication.

Подробнее
10-02-1998 дата публикации

Voice encoding and voice decoding apparatus

Номер: US0005717724A
Автор:
Принадлежит:

To improve the voice quality of a digital mobile communication system such as a car telephone or portable telephone when outdoor background noises are superimposed on voices. To achieve the above object, a voice decoding apparatus comprises noise superimposed part detecting means for discriminating between a noise part containing only noises and a voice part containing voices from signal encoded at a transmission side, voice decoding means for decoding an encoded signal in the voice part into a waveform signal, noise decoding means for decoding an encoded signal in the noise part into a waveform signal, and noise control means for controlling the frequency characteristic of said noise part by controlling said noise decoding means when said noise superimposed part detecting means judges the noise part.

Подробнее
24-03-1998 дата публикации

Voiced/unvoiced classification of speech for excitation codebook selection in celp speech decoding during frame erasures

Номер: US5732389A
Автор:
Принадлежит:

A CELP speech decoder includes a first portion comprising an adaptive codebook and a second portion comprising a fixed codebook. The CS-ACELP decoder generates a speech excitation signal selectively based on output signals from said first and second portions when said decoder fails to receive reliably at least a portion of a current frame of compressed speech information. The decoder does this by classifying the speech signal to be generated as periodic (voiced) or non-periodic (unvoiced) and then generating an excitation signal based on this classification. If the speech signal is classified as periodic, the excitation signal is generated based on the output signal from the first portion and not on the output signal from the second portion. If the speech signal is classified as non-periodic, the excitation signal is generated based on the output signal from said second portion and not on the output signal from said first portion.

Подробнее
16-02-1999 дата публикации

Method and apparatus for decoding and changing the pitch of an encoded speech signal

Номер: US5873059A
Автор:
Принадлежит:

A method and apparatus for reproducing speech signals at a controlled speed and for synthesizing speech includes a dividing unit that divides the input speech into time segments and an encoding unit that discriminates whether each of the speech segments is voiced or unvoiced. Based on the results of the discrimination, the encoding unit performs sinusoidal synthesis and encoding for voiced segments and vector quantization by closed-loop search for an optimum vector using an analysis-by-synthesis method for unvoiced segments in order to find encoded parameters. A period modification unit modifies the length of time associated with each signal segment and calculates a set of modified encoded parameters. In the speech synthesizing unit, encoded speech signal data is output from the encoding unit and pitch data and amplitude data specifying the spectral envelope are sent via a data conversion unit to a waveform synthesis unit, where the number of amplitude data points of the spectral envelope ...

Подробнее
05-04-2011 дата публикации

Methods and apparatus for voice activity detection

Номер: US0007921008B2

A method for detecting voice activity comprises pre-processing a first frame in an audio frame sequence, receiving a subsequent frame as the current frame, calculating the weighted linear prediction energy of the current frame based on Nth-order linear prediction coefficients, and determining whether the current frame contains noise or speech. If speech is indicated, linear prediction analysis is performed on the current frame to derive new Nth-order linear prediction coefficients and the stored coefficients are updated with the derived ones; if noise is indicated and the frame is not the last frame, the calculating and determining process is repeated. The corresponding device comprises a component for storing Nth-order linear prediction coefficients, a component for performing linear prediction analysis, a component for computing weighted linear prediction energy, and a component for determining whether the current frame contains speech or noise based on the calculated weighted linear prediction energy.
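A rough sketch of such a detector is given below, using the energy of the Nth-order prediction residual as a stand-in for the "weighted linear prediction energy" and an assumed ratio test against a tracked noise energy; none of the constants come from the patent.

```python
import numpy as np

def lpc_residual_energy(frame, lpc):
    """Energy of the prediction residual for coefficients a[1..N]."""
    pred = np.zeros_like(frame)
    for k, a in enumerate(lpc, start=1):
        pred[k:] += a * frame[:-k]
    residual = frame - pred
    return float(np.dot(residual, residual))

def vad_step(frame, lpc, noise_energy, ratio=3.0):
    """Flag speech when the residual energy exceeds the tracked noise energy."""
    e = lpc_residual_energy(np.asarray(frame, dtype=float), lpc)
    is_speech = e > ratio * noise_energy
    if not is_speech:                       # update the noise tracker on noise frames only
        noise_energy = 0.9 * noise_energy + 0.1 * e
    return is_speech, noise_energy
```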

Подробнее
28-09-2004 дата публикации

Noise canceller

Номер: US0006799160B2

A random code vector reading section and a random codebook of a conventional CELP type speech coder/decoder are respectively replaced with an oscillator for outputting different vector streams in accordance with values of input seeds, and a seed storage section for storing a plurality of seeds . This makes it unnecessary to store fixed vectors as they are in a fixed codebook (ROM), thereby considerably reducing the memory capacity.
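The idea of regenerating codebook vectors on the fly from stored seeds instead of reading them from a vector ROM can be sketched as follows; the generator, vector length, and normalization are illustrative assumptions (encoder and decoder must, of course, use the same deterministic generator).

```python
import numpy as np

def excitation_from_seed(seed, length=40):
    """Rebuild one 'random codebook' vector from its seed."""
    rng = np.random.default_rng(seed)        # deterministic for a given seed
    vec = rng.standard_normal(length)
    return vec / np.linalg.norm(vec)          # unit-energy codevector

# Only the small seed table needs to be stored, not the vectors themselves.
seed_table = [7, 42, 1234, 99991]
codebook_entry_2 = excitation_from_seed(seed_table[2])
```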

Подробнее
19-11-1996 дата публикации

Time-frequency interpolation with application to low rate speech coding

Номер: US0005577159A
Автор:
Принадлежит:

A new method for high quality speech coding, Time-Frequency Interpolation (TFI), which offers advantages over conventional CELP (code-excited linear predictive) algorithms for low rate coding. The method provides a perceptually advantageous framework for voiced speech processing. The general formulation of the TFI technique is described.

Подробнее
15-11-2011 дата публикации

Noise detection for audio encoding by mean and variance energy ratio

Номер: US0008060362B2

The techniques described are utilized for detection of noise and noise-like segments in audio coding. The techniques can include performing a prediction gain calculation, an energy compaction calculation, and a mean and variation energy calculation. Signal adaptive noise decisions can be made both in time and frequency dimensions. The techniques can be embodied as part of an AAC (advanced audio coding) encoder to detect noise and noise-like spectral bands. This detected information is transmitted in a bitstream using a signaling method defined for a perceptual noise substitution (PNS) encoding tool of the AAC encoder.
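The mean-and-variance energy test can be illustrated with the small helper below, where the normalized variance of a band's energy trajectory is compared against an assumed threshold; the threshold value and window of frames are placeholders, not the encoder's actual tuning.

```python
import numpy as np

def is_noise_like(band_energies, threshold=0.1):
    """Flag a spectral band as noise-like when its frame-to-frame energy variation
    is small relative to its mean energy."""
    e = np.asarray(band_energies, dtype=float)   # energies of one band over recent frames
    mean = e.mean()
    if mean <= 0.0:
        return True
    variation = e.var() / (mean ** 2)            # variance-to-squared-mean ratio
    return variation < threshold
```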

Подробнее
30-06-2011 дата публикации

PITCH MODEL FOR NOISE ESTIMATION

Номер: US20110161078A1
Принадлежит: Microsoft Corporation

Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying speech signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Подробнее
12-04-2011 дата публикации

Pitch model for noise estimation

Номер: US0007925502B2

Pitch is tracked for individual samples, which are taken much more frequently than an analysis frame. Speech is identified based on the tracked pitch and the speech components of the signal are removed with a time-varying filter, leaving only an estimate of a time-varying speech signal. This estimate is then used to generate a time-varying noise model which, in turn, can be used to enhance speech related systems.

Подробнее
09-06-2016 дата публикации

METHOD FOR SPEECH CODING, METHOD FOR SPEECH DECODING AND THEIR APPARATUSES

Номер: US20160163325A1
Принадлежит: BlackBerry Limited

High quality speech is reproduced with a small data amount in speech coding and decoding that perform compression coding and decoding of a speech signal into a digital signal. In a speech coding method based on code-excited linear prediction (CELP) speech coding, the noise level of the speech in the coding period of interest is evaluated by using a code or coding result of at least one of spectrum information, power information, and pitch information, and different excitation codebooks are used based on the evaluation result ...

Подробнее
13-09-2011 дата публикации

Speech codecs

Номер: US0008019599B2
Автор: Jari Makinen
Принадлежит: Nokia Corporation

A method and apparatus include a voice activity detection module configured to detect silent frames, and a codec mode selection module configured to determine a codec mode. The voice activity detection module includes a receiver configured to receive a frame, a first determiner configured to determine a first set of parameters from the frame, and a providing unit configured to provide the first set of parameters to the codec mode selection module. The codec mode selection module includes a second determiner configured to determine a second set of parameters in dependence on the first set of parameters, and a selector configured to select a codec mode in dependence on the second set of parameters.

Подробнее
29-10-2009 дата публикации

IDENTIFYING FEATURES IN A PORTION OF A SIGNAL REPRESENTING SPEECH

Номер: US2009271197A1
Принадлежит:

Methods, systems, and machine-readable media are disclosed for processing a signal representing speech. According to one embodiment, processing a signal representing speech can comprise receiving a region of the signal representing speech. The region can comprise a portion of a frame of the signal representing speech classified as a voiced frame. The region can be marked based on one or more pitch estimates for the region. A cord can be identified within the region based on occurrence of one or more events within the region of the signal. For example, the one or more events can comprise one or more glottal pulses. In such cases, cord can begin with onset of a first glottal pulse and extend to a point prior to an onset of a second glottal pulse. The cord may exclude a portion of the region of the signal prior to the onset of the second glottal pulse.

Подробнее
30-05-2019 дата публикации

SOUND PROCESSING DEVICE AND METHOD

Номер: US20190164534A1
Принадлежит:

The present technology relates to a sound processing device and a method that can present the progress of sound reproduction. The sound processing device includes a control unit for controlling a sound output that aurally expresses the progress of sound reproduction with respect to the entirety of the sound reproduction according to the reproduction of a sound. The present technology can be applied to a sound speech progress presentation UI system.
1. A sound processing device, comprising a control unit configured to control a sound output that aurally expresses progress of sound reproduction with respect to an entirety of the sound reproduction according to reproduction of a sound.
2. The sound processing device according to claim 1, wherein the sound is a spoken sound based on a speech text.
3. The sound processing device according to claim 1, wherein the control unit controls the sound output that expresses the progress by using a sound image position.
4. The sound processing device according to claim 3, wherein the control unit controls the sound output in which an orientation position of a sound image differs in each reproduction section including a speech of a presentation item and the sound image moves toward a predetermined direction according to the progress of the sound reproduction.
5. The sound processing device according to claim 4, wherein the control unit identifies the reproduction section corresponding to a specified direction on the basis of metadata including information indicating the reproduction start time of the reproduction section of the sound and information related to the direction of the sound image in the reproduction section, and operates to start reproducing the sound from the specified reproduction section.
6. The sound processing device according to claim 5, wherein a range including the direction of the sound image in the reproduction section is defined for each reproduction section so that the reproduction section including the presentation item with ...

Подробнее
08-04-2014 дата публикации

System, method and program for voice detection

Номер: US0008694308B2

A system for voice detection includes a feature value calculation unit that calculates a feature value from an input signal sliced on a per frame basis, a provisional voice/non-voice decision unit that provisionally decides a voiced interval and a non-voiced interval from the feature value calculated on a per frame basis, and a voice/non-voice decision unit that determines a voiced interval duration threshold value or a non-voiced interval duration threshold value, using a ratio of the feature value found on a per frame basis to a threshold value for the feature value and that re-decides the voiced interval and the non-voiced interval, using the voiced interval duration threshold value determined and the non-voiced interval duration threshold value determined. By determining the voiced interval duration threshold value and the non-voiced interval duration threshold value, using the feature value found on a per frame basis and the threshold value for the feature value, the constraint of ...

Подробнее
07-01-2014 дата публикации

Voice activity detection based on plural voice activity detectors

Номер: US0008626498B2
Автор: Te-Won Lee, LEE TE-WON

A voice activity detection (VAD) system includes a first voice activity detector, a second voice activity detector and control logic. The first voice activity detector is included in a device and produces a first VAD signal. The second voice activity detector is located externally to the device and produces a second VAD signal. The control logic combines the first and second VAD signals into a VAD output signal. Voice activity may be detected based on the VAD output signal. The second VAD signal can be represented as a flag included in a packet containing digitized audio. The packet can be transmitted to the device from the externally located VAD over a wireless link.
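Combining the internal decision with the externally received flag might look like the sketch below, which assumes the external VAD flag is carried in bit 0 of a per-packet flag byte (an assumption for the example, not the patent's packet format):

```python
def combined_vad(internal_vad: bool, packet_flag_byte: int, mode: str = "or") -> bool:
    """Combine the device's own VAD decision with the flag from a received packet."""
    external_vad = bool(packet_flag_byte & 0x01)   # assumed flag position
    if mode == "and":
        return internal_vad and external_vad       # conservative: both must agree
    return internal_vad or external_vad            # sensitive: either detector suffices

# Example: local detector says no speech, but the external headset VAD flags speech.
print(combined_vad(False, 0x01))   # True with the default "or" combination
```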

Подробнее
09-02-2012 дата публикации

NOISE ESTIMATION APPARATUS, NOISE ESTIMATION METHOD, AND NOISE ESTIMATION PROGRAM

Номер: US20120035920A1
Принадлежит: FUJITSU LIMITED

A noise estimation apparatus includes a correlation calculator configured to calculate a correlation value of a spectrum between a plurality of frames in sound information obtained using one or more microphones, a power calculator configured to calculate a power value indicating a sound level of one target frame among the plurality of frames, an update determiner configured to determine an update degree indicating a degree to which the sound information of the target frame is to be reflected in a noise model stored in a storage, or determine whether or not the noise model is to be updated to another noise model, based on the power value of the target frame and the correlation value, and an updater configured to generate the other noise model based on a determined result, the sound information of the target frame, and the noise model.
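A compact sketch of such an update rule follows; the thresholds and the mapping from power and correlation to the update degree are invented for the example and are not the patented determiner.

```python
import numpy as np

def update_noise_model(noise_psd, frame_psd, corr, power, power_thr, corr_thr):
    """Blend the current frame into the noise model with an adaptive update degree.

    corr  -- inter-frame spectral correlation of the target frame
    power -- sound level of the target frame
    """
    if power > power_thr and corr > corr_thr:
        return noise_psd                    # frame looks like speech: do not update
    # Update degree shrinks for louder or more correlated frames (assumed mapping).
    degree = 0.2 * (1.0 - min(corr, 1.0)) * (power_thr / max(power, power_thr))
    return (1.0 - degree) * np.asarray(noise_psd) + degree * np.asarray(frame_psd)
```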

Подробнее
07-08-2008 дата публикации

Device for recovering missing frequency components

Номер: US2008189102A1
Автор: TAKADA MASASHI
Принадлежит:

A band recovering device recovers frequency components lying in a frequency band lost due to band-limitation of a sound signal. The device includes a peak-limiting amplifier for amplifying an input narrow-band signal while preventing the resulting amplified signal from exceeding a maximum amplitude. A peak-limitation detector detects the level of the amplified signal output. An amplification controller increases the amplification factor and/or the amount of amplification of the peak-limiting amplifier in accordance with the level of the amplified signal. A band recovering circuit generates, based on the amplified signal output from the peak-limiting amplifier and input narrow-band signal, a band-recovered signal including the frequency components lying in the missing band.

Подробнее
28-07-2022 дата публикации

METHOD AND SYSTEM FOR PROVIDING VOICE RECOGNITION TRIGGER AND NON-TRANSITORY COMPUTER-READABLE RECORDING MEDIUM

Номер: US20220238134A1
Автор: Seok Joong KIM
Принадлежит: VTOUCH CO., LTD.

A method for providing a voice recognition trigger and a system therefor, and a non-transitory computer-readable recording medium are provided. The method for providing the voice recognition trigger includes: calculating a distance change between a device and an object on the basis of proximity information detected by the device; and determining whether or not the voice recognition trigger of the device is provided with reference to the distance change between the device and the object.
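A minimal version of that trigger logic, with assumed distances in millimetres and assumed activation and approach thresholds, could look like this:

```python
def should_trigger(distances_mm, approach_mm=150, min_drop_mm=50):
    """Provide the voice-recognition trigger when the object has clearly moved
    closer to the device and is now within the activation distance."""
    if len(distances_mm) < 2:
        return False
    drop = distances_mm[0] - distances_mm[-1]      # positive if the object approached
    return distances_mm[-1] <= approach_mm and drop >= min_drop_mm

# Example: proximity samples (mm) reported by the device's sensor over the last second.
print(should_trigger([420, 360, 240, 130]))   # True: approached and is now close
```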

Подробнее
03-06-2015 дата публикации

HEAD-MOUNTED SOUND CAPTURE DEVICE

Номер: EP2878136A1
Принадлежит:

Подробнее
07-03-2012 дата публикации

Номер: JP0004890792B2
Автор:
Принадлежит:

Подробнее
10-09-2016 дата публикации

CODING OF GENERIC AUDIO SIGNALS AT LOW BIT RATES AND LOW DELAY

Номер: RU2596584C2

The invention relates to mixed time-domain/frequency-domain coding for encoding an input audio signal. The technical result is reduced processing delay when classifying the audio signal and converting it to the frequency domain. A cutoff frequency for the time-domain excitation contribution is also calculated in response to the input audio signal, and the frequency extent of the time-domain excitation contribution is adjusted relative to this cutoff frequency. After the frequency-domain excitation contribution is calculated in response to the input audio signal, the adjusted time-domain excitation contribution and the frequency-domain excitation contribution are summed to form a mixed time-domain/frequency-domain excitation constituting an encoded version of the input audio signal. When calculating the time-domain excitation contribution, the input audio signal may be ...

Подробнее
20-02-2010 дата публикации

RADIO COMMUNICATION APPARATUS AND RADIO COMMUNICATION METHOD

Номер: RU2382509C2

The invention relates to radio communication. When a sound presence/absence detector detects a silent interval, an unmodulated carrier signal is output in the region used for transmitting the speech data that is contained in the frame-structured transmission data and corresponds to the silent interval. That is, a frequency-shift-keying (FSK) modulator causes the transmission circuit to output an unmodulated carrier signal from the radio communication apparatus in the region used for transmitting the speech data corresponding to the silent interval. Meanwhile, in regions other than the silent interval, a 4-level FSK-modulated waveform signal is output. The channel identification information region contained in the frame-structured transmission data carries distinctive no-modulation data that allows the silent interval to be identified, enabling the communication terminal apparatus at the receiving end to avoid any ...

Подробнее
10-11-1998 дата публикации

METHOD AND APPARATUS FOR REDUCING NOISE IN A SPEECH SIGNAL

Номер: RU2121719C1
Принадлежит: Sony Corporation (JP)

A method and apparatus for noise attenuation in a speech signal, capable of suppressing noise in the input signal and simplifying the processing. The apparatus includes a fast Fourier transform unit 3 for converting the input speech signal into a frequency-domain signal, and an Hn value calculation unit 7 for controlling the filter characteristics used for the filtering that removes noise from the input speech signal. The apparatus also includes a spectrum correction unit 10 for attenuating the input speech signal by filtering in accordance with the filter characteristics produced by the Hn value calculation unit 7. The Hn value calculation unit 7 calculates the Hn value according to a quantity derived from the frame-based maximum signal-to-noise (SN) ratio of the input signal spectrum obtained by the fast Fourier transform unit 3 and an estimated noise level, and controls the noise removal processing in the spectrum correction unit 10 according to the ...

Подробнее
20-03-2007 дата публикации

METHOD FOR IMPROVING SPEECH QUALITY AND APPARATUS FOR IMPLEMENTING THE SAME

Номер: RU2005127995A
Принадлежит:

... 1. A speech quality improvement method comprising: separating input speech into voiced speech and unvoiced speech; performing adaptive filtering on the voiced speech to remove noise from the voiced speech; and performing spectral subtraction on the unvoiced speech. 2. The method of claim 1, further comprising performing an adaptive linear filtering process, using the adaptive filtering on the voiced speech, to remove noise from the voiced speech. 3. The method of claim 2, wherein the spectral subtraction uses an average of the noise spectra estimated by the adaptive linear filtering process from predetermined frames corresponding to previous voiced speech. 4. The method of claim 1, wherein the adaptive filtering uses a pitch period extracted from the frame corresponding to the voiced speech. 5. The method of claim 1, further comprising performing at least one of low-pass filtering and high-pass filtering on ...

Подробнее
10-04-2016 дата публикации

METHOD (VARIANTS) FOR FILTERING A NOISY SPEECH SIGNAL UNDER COMPLEX INTERFERENCE CONDITIONS

Номер: RU2580796C1

The inventions relate to the field of digital communication and to technologies for processing speech under noisy conditions. The technical result is an increased signal-to-noise ratio of the cleaned speech signal. Methods for filtering a noisy speech signal under complex interference conditions are applied. For this purpose, the results of polyspectral analysis are used to accurately estimate the spectral characteristics of the noise. In the claimed methods, spectral subtraction is performed with additional signal correction based on an empirical mode decomposition procedure and adaptive digital low-pass filtering using a bicorrelation coefficient obtained by analyzing the total bicorrelation in the zones where the low-density region of the biamplitude of the processed segment of the noisy speech signal is concentrated. 3 independent claims, 10 figures.

Подробнее
10-09-2010 дата публикации

SYSTEMS AND METHODS FOR MODIFYING A WINDOW WITH A FRAME ASSOCIATED WITH AN AUDIO SIGNAL

Номер: RU2009107161A
Принадлежит:

... 1. A method for modifying a window with a frame associated with an audio signal, comprising: receiving a signal; partitioning the signal into a plurality of frames; determining whether a frame in the plurality of frames is associated with a non-speech signal; applying a modified discrete cosine transform (MDCT) window function to the frame to generate a first zero-padding region and a second zero-padding region if it was determined that the frame is associated with a non-speech signal; and encoding the frame. 2. The method of claim 1, wherein the frame is encoded using an MDCT-based coding scheme. 3. The method of claim 1, wherein the frame has a length of 2M, where M represents the number of samples in the frame. 4. The method of claim 1, wherein the first zero-padding region is located at the beginning of the frame. 5. The method of claim 1, wherein the second zero-padding region is located at the end of the frame. 6. The method of claim 1, wherein the first zero-padding region and the second region have a length of (M-L)/2, where ...

Подробнее
27-06-2015 дата публикации

NOISE-ROBUST CLASSIFICATION OF SPEECH CODING MODES

Номер: RU2013157194A
Принадлежит:

... 1. A method of noise-robust speech classification, comprising: inputting classification parameters into a speech classifier from external components; generating, in the speech classifier, internal classification parameters from at least one of the input classification parameters; setting at least one normalized autocorrelation coefficient function (NACF) threshold based on comparing a noise estimate of multiple frames of input speech with a noise estimate threshold; and determining a speech mode classification based on the internal classification parameters and the at least one NACF threshold. 2. The method of claim 1, wherein the setting comprises lowering a voicing threshold for classifying the current frame as voiced if the noise estimate exceeds the noise estimate threshold, wherein the voicing threshold is not adjusted if the noise estimate is below the noise estimate threshold. 3. The method of ...
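The threshold-setting step (claim 2) can be sketched as a simple rule that lowers the voiced-classification NACF threshold only when the noise estimate exceeds its own threshold; all numeric values below are placeholders, not the values used in the patent.

```python
def nacf_voiced_threshold(noise_estimate, noise_thr=20.0, clean_thr=0.6, noisy_thr=0.5):
    """Return the NACF threshold for classifying a frame as voiced."""
    # Lower the voicing threshold only when the multi-frame noise estimate is high.
    return noisy_thr if noise_estimate > noise_thr else clean_thr

def classify_frame(nacf, noise_estimate):
    """Classify one frame from its NACF value under the current noise conditions."""
    return "voiced" if nacf >= nacf_voiced_threshold(noise_estimate) else "unvoiced"
```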

Подробнее
23-12-1976 дата публикации

METHOD FOR EVALUATING VOICED AND UNVOICED STATES OF A SPEECH SIGNAL

Номер: DE0002626793A1
Принадлежит:

Подробнее
09-02-2012 дата публикации

Information Processing Apparatus, Information Processing Method, and Program

Номер: US20120035927A1
Принадлежит: Sony Corp

An information processing apparatus includes a plurality of information input units that inputs observation information of a real space, an event detection unit that generates event information including estimated position information and estimated identification (ID) information of a user present in the real space based on analysis of the information input from the information input unit, and an information integration processing unit that inputs the event information, and generates target information including a position and user ID information of each user based on the input event information and signal information representing a probability value for an event generating source. Here, the information integration processing unit includes an utterance source probability calculation unit having an identifier, and calculates an utterance source probability based on input information using the identifier in the utterance source probability calculation unit.

Подробнее
06-12-2012 дата публикации

Method And Apparatus For Voice Activity Determination

Номер: US20120310641A1
Принадлежит: Nokia Oyj

In accordance with an example embodiment of the invention, there is provided an apparatus for detecting voice activity in an audio signal. The apparatus comprises a first voice activity detector for making a first voice activity detection decision based at least in part on the voice activity of a first audio signal received from a first microphone. The apparatus also comprises a second voice activity detector for making a second voice activity detection decision based at least in part on an estimate of a direction of the first audio signal and an estimate of a direction of a second audio signal received from a second microphone. The apparatus further comprises a classifier for making a third voice activity detection decision based at least in part on the first and second voice activity detection decisions.

Подробнее
21-03-2013 дата публикации

NON-SPEECH SECTION DETECTING METHOD AND NON-SPEECH SECTION DETECTING DEVICE

Номер: US20130073281A1
Принадлежит: FUJITSU LIMITED

A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not containing voice data based on speech uttered by a person, the device including: a calculating part calculating a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis; a judging part judging whether the bias is greater than or equal to a given threshold or alternatively smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a bias greater than or equal to the threshold or alternatively smaller than or equal to the threshold; and a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value.
1. A non-speech section detecting device generating a plurality of frames having a given time length on the basis of sound data obtained by sampling sound, and detecting a non-speech section having a frame not including voice data based on speech uttered by a person, the device comprising: a calculating part calculating, as an index, at least one of a bias of a spectrum obtained by converting sound data of each frame into components on a frequency axis and a power and a pitch of sound data of each frame; a second calculating part calculating an amount of variation relative to the preceding frame with respect to the calculated index; a judging part judging whether the calculated variation amount is smaller than or equal to a given threshold; a counting part counting the number of consecutive frames judged as having a variation amount smaller than or equal to the threshold; a count judging part judging whether the obtained number of consecutive frames is greater than or equal to a given value; and a detecting part detecting, when the obtained number of consecutive frames is judged as greater than or equal to the ...

25-04-2013 publication date

OPERATING METHODS FOR VOICE ACTIVITY DETECTION/SILENCE SUPPRESSION SYSTEM

Number: US20130103395A1
Author: Chen Bing, James James H.
Assignee: AT&T INTELLECTUAL PROPERTY II, L.P.

A Voice Activity Detection/Silence Suppression (VAD/SS) system is connected to a channel of a transmission pipe. The channel provides a pathway for the transmission of energy. A method for operating a VAD/SS system includes detecting the energy on the channel, and activating or suppressing activation of the VAD/SS system depending upon the nature of the energy detected on the channel. 1. A method of controlling a communication channel , the method comprising:determining, using a processing device, whether there is an absence of voice energy associated with the communication channel;determining, using the processing device, whether noise energy associated with the communication channel is constant based on a spectrum constancy criterion; andactivating silence suppression using the processing device, the silence suppression being activated in response to the noise energy associated with the communication channel being constant based on the spectrum constancy criterion and the voice energy associated with the communication channel being absent, the silence suppression comprising associating rerouted voice energy with the communication channel.2. The method according to claim 1 , wherein determining whether the noise energy is constant based on the spectrum constancy criterion comprises analyzing fluctuation in a power frequency spectrum associated with the noise energy.3. The method according to claim 2 , wherein analyzing fluctuation in the power frequency spectrum comprises analyzing the power frequency spectrum for a duration of 100 msec.4. The method according to claim 2 , wherein analyzing the power frequency spectrum comprises determining whether the power frequency spectrum is flat during a predetermined time period claim 2 , the method further comprising determining that the noise energy associated with the communication channel is constant based on the power frequency spectrum being flat during the predetermined time period.5. The method according to claim 1 , ...
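
A hedged sketch of the activation rule set out in the claims: silence suppression is enabled only when voice energy is absent from the channel and the noise power spectrum stays approximately constant over the observation window. The function name, the window implied by the input shape, the energy threshold and the flatness tolerance are illustrative assumptions.

    import numpy as np

    def activate_silence_suppression(frame_spectra, voice_energy,
                                     voice_thresh=1e-3, flatness_tol=0.1):
        """frame_spectra: (n_frames, n_bins) power spectra covering the analysis window.
        voice_energy: scalar estimate of voice energy on the channel."""
        voice_absent = voice_energy < voice_thresh
        mean_spectrum = frame_spectra.mean(axis=0)
        # "Constant" noise: every frame stays close to the window-average spectrum.
        fluctuation = np.max(np.abs(frame_spectra - mean_spectrum) / (mean_spectrum + 1e-12))
        return voice_absent and fluctuation < flatness_tol

    # Steady noise floor and no voice energy -> suppression is activated.
    noise = np.tile(np.array([1.0, 0.8, 0.6, 0.5]), (10, 1))
    print(activate_silence_suppression(noise, voice_energy=0.0))  # True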

30-05-2013 publication date

Switching Off DTX for Music

Number: US20130138433A1
Assignee: TELEFONAKTIEBOLAGET L M ERICSSON (PUBL)

The invention relates to a method for disabling the discontinuous transmission mode (DTX) of a speech encoder if a music signal is detected in a call input signal. The music signal is detected by determining an activity factor corresponding to the relation of sound signal periods relative to silence signal periods. If the activity factor is higher than a specified activity factor, the DTX is disabled. 1-14. (canceled) 15. A system for controlling a discontinuous transmission mode of a speech encoder, the system comprising: the speech encoder configured to encode a call input signal and output an encoded call output signal; an activity detector configured to determine a presence of sound signal periods in the call input signal relative to the presence of silence signal periods in the call input signal; a discontinuous transmission enabling/disabling unit configured to determine an activity factor based on the sound and silence signal periods in the call input signal, and to enable and disable the discontinuous transmission mode of the encoded call output signal; wherein the discontinuous transmission enabling/disabling unit disables the discontinuous transmission mode if the determined activity factor is higher than a specified activity factor; wherein the discontinuous transmission enabling/disabling unit is configured to determine the activity factor by determining a relation of speech and music signal samples relative to silence signal samples in the call input signal. 16. The system of wherein the discontinuous transmission enabling/disabling unit is configured to use information of a voice activity detector configured to distinguish speech from speech pauses for determining the activity factor. 17. The system of wherein the discontinuous transmission enabling/disabling unit is configured to disable the discontinuous transmission mode if the activity factor is higher than 75%. 18. The system of wherein the discontinuous transmission enabling/disabling unit is configured to ...
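
A minimal Python sketch of the activity-factor rule above: the fraction of active (speech/music) frames relative to all frames is computed from per-frame activity flags (assumed to come from an ordinary voice activity detector), and DTX is disabled when that fraction exceeds a limit; the 75% figure is the one quoted in the claims, everything else is illustrative.

    def dtx_enabled(active_flags, activity_limit=0.75):
        """active_flags: iterable of booleans, True for frames carrying sound
        (speech or music), False for silence frames. Returns False when DTX
        should be disabled because the signal looks like continuous music."""
        flags = list(active_flags)
        if not flags:
            return True
        activity_factor = sum(flags) / len(flags)
        return activity_factor <= activity_limit

    # Music is almost continuously active, so DTX gets switched off:
    print(dtx_enabled([True] * 95 + [False] * 5))   # False -> DTX disabled
    print(dtx_enabled([True] * 40 + [False] * 60))  # True  -> DTX stays enabled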

06-06-2013 publication date

Bandwidth Extender

Number: US20130144614A1
Assignee: Nokia Corporation

An apparatus for extending the bandwidth of an audio signal, the apparatus being configured to: generate an excitation signal from an audio signal, wherein the audio signal comprises a plurality of frequency components; extract a feature vector from the audio signal, wherein the feature vector comprises at least one frequency domain component feature and at least one time domain component feature; determine at least one spectral shape parameter from the feature vector, wherein the at least one spectral shape parameter corresponds to a sub band signal comprising frequency components which belong to a further plurality of frequency components; and generate the sub band signal by filtering the excitation signal through a filter bank and weighting the filtered excitation signal with the at least one spectral shape parameter. 1. A method comprising: generating an excitation signal from an audio signal, wherein the audio signal comprises a plurality of frequency components; extracting a feature vector from the audio signal, wherein the feature vector comprises at least one frequency domain component feature and at least one time domain component feature; determining at least one spectral shape parameter from the feature vector, wherein the at least one spectral shape parameter corresponds to a sub band signal comprising frequency components which belong to a further plurality of frequency components; and generating the sub band signal by filtering the excitation signal through a filter bank and weighting the filtered excitation signal with the at least one spectral shape parameter. 2. The method as claimed in claim 1, wherein generating the excitation signal comprises: generating a residual signal by filtering the audio signal with an inverse linear predictive filter; filtering the residual signal with a post filter stage comprising an auto regressive moving average filter based on the linear predictive filter; and generating the excitation signal by up sampling and ...

13-06-2013 publication date

ADAPTIVE VOICE ACTIVITY DETECTION

Number: US20130151246A1
Assignee: Core Wireless Licensing S.a.r.l.

Encoding audio signals by selecting an encoding mode for encoding the signal, categorizing the signal into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode, and encoding at least the active segments using the selected encoding mode. 1. A method comprising: dividing an audio signal temporally into segments; selecting an encoding mode for encoding the segments; categorizing the segments into active segments having voice activity and non-active segments having substantially no voice activity by using categorization parameters depending on the selected encoding mode; encoding at least the active segments using the selected encoding mode. 2. The method of claim 1, wherein the categorization parameters are such that for a low quality of the encoding mode a lower number of temporal sections are detected as active sections than for a high quality of the encoding mode. 3. The method of claim 1, wherein the categorization parameters depend on the encoding bitrate of the encoding mode. 4. The method of claim 1, further comprising obtaining network traffic of a network for which the audio signal is encoded and setting the categorization parameters depending on the obtained network traffic. 5. The method of claim 1, further comprising obtaining background noise within the audio signal and setting the categorization parameters depending on the obtained background noise. 6. The method of claim 1, wherein an energy threshold value is a categorization parameter and wherein categorizing the segments comprises comparing energy information of the audio signal to at least the energy threshold value. 7. The method of claim 1, wherein a signal-to-noise threshold value is a categorization parameter and wherein categorizing the segments comprises comparing signal-to-noise information of the audio signal to at least the signal-to-noise threshold value. 8. The method of ...
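
A small sketch of mode-dependent categorization of the kind described above: the energy threshold used to mark a segment as active is derived from the selected encoding mode's bitrate, so a lower-quality mode classifies fewer segments as active. The function names, the bitrates and the threshold values are illustrative assumptions.

    def energy_threshold_for_mode(bitrate_kbps):
        # Illustrative mapping: lower-bitrate modes get a stricter (higher) threshold.
        return 0.02 if bitrate_kbps >= 12.2 else 0.06

    def categorize_segments(segment_energies, bitrate_kbps):
        thr = energy_threshold_for_mode(bitrate_kbps)
        return ["active" if e >= thr else "non-active" for e in segment_energies]

    energies = [0.01, 0.04, 0.20]
    print(categorize_segments(energies, bitrate_kbps=12.2))  # ['non-active', 'active', 'active']
    print(categorize_segments(energies, bitrate_kbps=5.9))   # ['non-active', 'non-active', 'active']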

12-09-2013 publication date

Method and Apparatus for Speech Segmentation

Number: US20130238328A1
Author: Du Robert, Tao Ye, Zu Daren
Assignee:

Machine-readable media, methods, apparatus and system for speech segmentation are described. In some embodiments, a fuzzy rule may be determined to discriminate a speech segment from a non-speech segment. An antecedent of the fuzzy rule may include an input variable and an input variable membership. A consequent of the fuzzy rule may include an output variable and an output variable membership. An instance of the input variable may be extracted from a segment. An input variable membership function associated with the input variable membership and an output variable membership function associated with the output variable membership may be trained. The instance of the input variable, the input variable membership function, the output variable, and the output variable membership function may be operated, to determine whether the segment is the speech segment or the non-speech segment. 118.-. (canceled)19. A method comprising:applying a fuzzy rule to determine whether a media segment is speech segment or a non-speech segment and further to discriminate the speech segment from the non-speech segment based on one or more of characteristics of media data, prior knowledge relating to speech data, and speech-likelihood of the media segment, wherein the fuzzy rule to further determine whether the media segment takes one or more forms including an antecedent and a consequent;extracting an instance of the input variable from a segment;training an input variable membership function associated with the input variable membership and an output variable membership function associated with the output variable membership; andoperating the instance of the input variable, the input variable membership function, the output variable, and the output variable membership function, to determine whether the segment is the speech segment or the non-speech segment.20. The method of claim 19 , wherein the antecedent admits a first partial degree that the input variable belongs to the input ...

24-10-2013 publication date

METHOD AND APPARATUS FOR PERFORMING VOICE ACTIVITY DETECTION

Number: US20130282367A1
Author: Wang Zhe
Assignee:

This application relates to a voice activity detection (VAD) apparatus configured to provide a voice activity detection decision for an input audio signal. The VAD apparatus includes a state detector and a voice activity calculator. The state detector is configured to determine, based on the input audio signal, a current working state of the VAD apparatus among at least two different working states. Each of the at least two different working states is associated with a corresponding working state parameter decision set which includes at least one voice activity decision parameter. The voice activity calculator is configured to calculate a voice activity detection parameter value for the at least one voice activity decision parameter of the working state parameter decision set associated with the current working state, and to provide the voice activity detection decision by comparing the calculated voice activity detection parameter value with a threshold. 1. A voice activity detection (VAD) apparatus , comprising:a receiving unit, configured to receive an input audio signal;a state detector, configured to determine a current working state of the VAD apparatus based on the input audio signal, wherein the VAD apparatus has at least two different working states, each of the at least two different working states is associated with a corresponding working state parameter decision set (WSPDS), and each WSPDS includes at least one voice activity decision parameter (VADP);a voice activity calculator, configured to calculate a value for the at least one VADP of the WSPDS associated with the current working state, and to generate a voice activity detection decision (VADD) by comparing the calculated VADP value with a threshold; andan output unit, configured to output the VADD.2. The VAD apparatus according to claim 1 , wherein the VADD is generated by the voice activity calculator by using sub-band segmental signal to noise ratio (SNR) based voice activity decision parameters ...
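
A hedged sketch of a sub-band segmental-SNR decision of the kind referenced in the claims: per-band SNR contributions are accumulated and the sum is compared against a threshold selected for the detected working state. The band layout, the two working states and their thresholds are assumptions for illustration only.

    import numpy as np

    STATE_THRESHOLDS = {"clean": 2.0, "noisy": 4.0}  # illustrative per-state thresholds

    def vad_decision(band_energies, band_noise_estimates, working_state):
        """Sum log-domain per-band SNR contributions and compare the segmental SNR
        with the threshold associated with the current working state."""
        snr_per_band = np.log10(1.0 + band_energies / (band_noise_estimates + 1e-12))
        segmental_snr = snr_per_band.sum()
        return segmental_snr > STATE_THRESHOLDS[working_state]

    bands = np.array([2.0, 1.5, 0.8, 0.3])
    noise = np.array([0.2, 0.2, 0.1, 0.1])
    print(vad_decision(bands, noise, "clean"))  # True for this speech-like frame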

24-10-2013 publication date

Systems and methods for audio signal processing

Number: US20130282373A1
Assignee: Qualcomm Inc

A method for restoring a processed speech signal by an electronic device is described. The method includes obtaining at least one audio signal. The method also includes performing bin-wise voice activity detection based on the at least one audio signal. The method further includes restoring the processed speech signal based on the bin-wise voice activity detection.

31-10-2013 publication date

Voiced Interval Command Interpretation

Number: US20130290000A1
Author: David Edward Newman
Assignee: ZANAVOX

A method is disclosed for controlling a voice-activated device by interpreting a spoken command as a series of voiced and non-voiced intervals. A responsive action is then performed according to the number of voiced intervals in the command. The method is well-suited to applications having a small number of specific voice-activated response functions. Applications using the inventive method offer numerous advantages over traditional speech recognition systems including speaker universality, language independence, no training or calibration needed, implementation with simple microcontrollers, and extremely low cost. For time-critical applications such as pulsers and measurement devices, where fast reaction is crucial to catch a transient event, the method provides near-instantaneous command response, yet versatile voice control.

21-11-2013 publication date

VOICED SOUND INTERVAL DETECTION DEVICE, VOICED SOUND INTERVAL DETECTION METHOD AND VOICED SOUND INTERVAL DETECTION PROGRAM

Number: US20130311183A1
Author: Onishi Yoshifumi
Assignee: NEC Corporation

This invention provides a voiced sound interval detection device which enables appropriate detection of a voiced sound interval of an observation signal even when a volume of sound from a sound source varies or when the number of sound sources is unknown or when different kinds of microphones are used together. 1. A voiced sound interval detection device comprising:a vector calculation unit which calculates, from a power spectrum time series of voice signals collected by a plurality of microphones, a multidimensional vector series as a vector series of a power spectrum having as many dimensions as the number of said microphones;a clustering unit which clusters said multidimensional vector series;a voiced sound index calculation unit which calculates, at each time of said multidimensional vector series sectioned by an arbitrary time length, a center vector of a noise cluster and a center vector of a cluster to which a vector of said voice signal at the time in question belongs and after projecting the center vector of said noise cluster and the vector of said voice signal at the time in question toward a direction of the center vector of the cluster to which the vector of said voice signal at the time in question belongs, calculates a signal noise ratio as a voiced sound index; anda voiced sound interval determination unit which determines whether the vector of said voice signal is in a voiced sound interval or a voiceless sound interval by comparing said voiced sound index with a predetermined threshold value.2. The voiced sound interval detection device according to claim 1 , whereinsaid clustering unit executes stochastic clustering, andsaid voiced sound index calculation unit calculates an expected value of said voiced sound index from said clustering result.3. The voiced sound interval detection device according to claim 1 , wherein said multidimensional vector series is a vector series of a logarithm power spectrum.4. A voiced sound interval detection method of ...

28-11-2013 publication date

SPARSE SIGNAL DETECTION WITH MISMATCHED MODELS

Number: US20130317821A1
Assignee: QUALCOMM INCORPORATED

Various arrangements for detecting a type of sound, such as speech, are presented. A plurality of audio snippets may be sampled. A period of time may elapse between consecutive audio snippets. A hypothetical test may be performed using the sampled plurality of audio snippets. Such a hypothetical test may include weighting one or more hypothetical values greater than one or more other hypothetical values. Each hypothetical value may correspond to an audio snippet of the plurality of audio snippets. The hypothetical test may further include using at least the greater weighted one or more hypothetical values to determine whether at least one audio snippet of the plurality of audio snippets comprises the type of sound. 1. A method for detecting a type of sound , the method comprising:sampling a plurality of audio snippets; weighting one or more hypothetical values greater than one or more other hypothetical values, wherein each hypothetical value corresponds to an audio snippet of the plurality of audio snippets; and', 'using at least the greater weighted one or more hypothetical values to determine whether at least one audio snippet of the plurality of audio snippets comprises the type of sound., 'performing a hypothetical test using the sampled plurality of audio snippets, the hypothetical test comprising2. The method for detecting the type of sound of claim 1 , wherein sampling the plurality of audio snippets comprises at least a period of time elapsing between consecutive audio snippets of the plurality of audio snippets during which audio is not sampled.3. The method for detecting the type of sound of claim 2 , wherein the period of time elapsing between when consecutive audio snippets of the plurality of audio snippets are captured is at least as long in time as one of the plurality of audio snippets.4. The method for detecting the type of sound of claim 1 , wherein the type of sound is speech.5. The method for detecting the type of sound of claim 1 , wherein one ...

05-12-2013 publication date

Apparatus and method for detecting end point using decoding information

Number: US20130325475A1

An apparatus for detecting an end point using decoding information includes: an end point detector configured to extract a speech signal from an acoustic signal received from outside and detect end points of the speech signal; a decoder configured to decode the speech signal; and an end point detector configured to extract reference information serving as a standard of actual end point discrimination from decoding information generated during the decoding process of the decoder, and discriminate an actual end point among the end points detected by the end point detector based on the extracted reference information.

12-12-2013 publication date

VOICED SOUND INTERVAL CLASSIFICATION DEVICE, VOICED SOUND INTERVAL CLASSIFICATION METHOD AND VOICED SOUND INTERVAL CLASSIFICATION PROGRAM

Number: US20130332163A1
Author: Onishi Yoshifumi
Assignee: NEC Corporation

The voiced sound interval classification device comprises a vector calculation unit which calculates, from a power spectrum time series of voice signals, a multidimensional vector series as a vector series of a power spectrum having as many dimensions as the number of microphones, a difference calculation unit which calculates, with respect to each time of the multidimensional vector series, a vector of a difference between the time and the preceding time, a sound source direction estimation unit which estimates, as a sound source direction, a main component of the differential vector, and a voiced sound interval determination unit which determines whether each sound source direction is in a voiced sound interval or a voiceless sound interval by using a predetermined voiced sound index indicative of a likelihood of a voiced sound interval of the voice signal applied at each time. 1. A voiced sound interval classification device comprising:a vector calculation unit which calculates, from a power spectrum time series of voice signals collected by a plurality of microphones, a multidimensional vector series as a vector series of a power spectrum having as many dimensions as the number of said microphones;a difference calculation unit which calculates, with respect to each time of said multidimensional vector series sectioned by an arbitrary time length, a vector of a difference between the time in question and the preceding time;a sound source direction estimation unit which estimates, as a sound source direction, a main component of said differential vector obtained while allowing the vector to be non-orthogonal and exceed a space dimension; anda voiced sound interval determination unit which determines whether each sound source direction obtained by said sound source direction estimation unit is in a voiced sound interval or a voiceless sound interval by using a predetermined voiced sound index indicative of a likelihood of a voiced sound interval of said voice ...

06-02-2014 publication date

EFFICIENT CONTENT CLASSIFICATION AND LOUDNESS ESTIMATION

Number: US20140039890A1
Assignee: DOLBY INTERNATIONAL AB

The present document relates to methods and systems for encoding an audio signal. The method comprises determining a spectral representation of the audio signal. The determining a spectral representation step may comprise determining modified discrete cosine transform, MDCT, coefficients, or a Quadrature Mirror Filter, QMF, filter bank representation of the audio signal. The method further comprises encoding the audio signal using the determined spectral representation; and classifying parts of the audio signal to be speech or non-speech based on the determined spectral representation. Finally, a loudness measure for the audio signal based on the speech parts is determined. 131-. (canceled)32. A method for encoding an audio signal , the method comprising:determining a spectral representation of the audio signal, the determining a spectral representation comprising determining modified discrete cosine transform, MDCT, coefficients;encoding the audio signal using the determined spectral representation;determining a pseudo spectrum from the MDCT coefficients by averaging MDCT coefficients with adjacent MDCT coefficients;classifying parts of the audio signal to be speech or non-speech based at least in part on the values of the determined pseudo spectrum; anddetermining a loudness measure for the audio signal based on the speech parts.34. The method of claim 32 , wherein the spectral representation is determined for short blocks and/or long blocks claim 32 , the method further comprising:aligning the short block representation with a frame for a long block representation corresponding to a predetermined number of short blocks, thereby reordering MDCT coefficients of the predetermined number of short blocks into the frame for a long block.35. The method claim 32 , further comprising:encoding the audio signal using the determined spectral representation into a bit-stream; andencoding the determined loudness measure into the bit-stream.36. The method of claim 32 , wherein ...
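
A small sketch of the pseudo-spectrum step described above: each MDCT bin is averaged with its adjacent bins to obtain a smoother, magnitude-like spectrum that can feed the speech/non-speech classifier. Taking magnitudes and using a three-tap mean are assumptions for illustration; the document only states that adjacent MDCT coefficients are averaged.

    import numpy as np

    def pseudo_spectrum_from_mdct(mdct_coeffs):
        """Average each |MDCT| bin with its two neighbours (edge bins reuse themselves)."""
        c = np.abs(np.asarray(mdct_coeffs, dtype=float))
        padded = np.pad(c, 1, mode="edge")
        return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

    print(pseudo_spectrum_from_mdct([0.0, 1.0, -1.0, 0.5]))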

06-01-2022 publication date

INERTIAL SENSOR UNIT AND METHOD FOR DETECTING A SPEECH ACTIVITY

Number: US20220005495A1
Assignee:

An inertial sensor unit, including a sensor element for detecting and converting movements and vibrations into an electrical sensor signal, a signal processing element for evaluating the sensor signal, and an interface for signaling a detected speech activity. The signal processing element includes a first processing stage which checks a first criterion for the presence of a speech activity, and a second processing stage which checks a second criterion for the presence of a speech activity, the second processing stage being passed through only if the sensor signal has passed through the first processing stage and the first criterion for the presence of a speech activity has been met. The signal processing element is designed to activate the interface for signaling a speech activity only if the sensor signal has passed through the second processing stage, and the second criterion for the presence of a speech activity has been met. 1. An inertial sensor unit , comprising:a. a sensor element configured to detect and convert movements and vibrations into an electrical sensor signal;b. a signal processing element configured to evaluate the sensor signal with a goal of detecting vibrations induced by a speech activity; andc. an interface configured to signal a detected speech activity, the signal processing element includes a first processing stage and a second processing stage for the sensor signal, the first processing stage being configured to check a first criterion for a presence of a speech activity, and the second processing stage configured to check at least one further, second criterion for the presence of a speech activity,', 'the second processing stage is passed through only if the sensor signal has passed through the first processing stage and the first criterion for the presence of a speech activity has been met, and', 'the signal processing element is configured to activate the interface for signaling a speech activity only if the sensor signal has passed ...

02-01-2020 publication date

Unvoiced Voiced Decision For Speech Processing

Number: US20200005812A1
Author: GAO Yang
Assignee: Huawei Technologies Co., Ltd.

Method and apparatus for speech processing are disclosed. A first unvoicing parameter for a first frame of a speech signal is determined, and further smoothed based on a second unvoicing parameter for a second frame prior to the first frame. A difference between the first unvoicing parameter and the smoothed unvoicing parameter for the first subframe is computed, and an unvoiced/voiced classification of the first frame is determined using the computed difference as a decision parameter. Further processing, such as bandwidth extension (BWE), is performed based on the classification of the first frame. 1. A method for speech processing, comprising: receiving a plurality of frames of a speech signal; determining, for a first frame of the speech signal, a first parameter for a first frequency band from a first energy envelope of the speech signal in a time domain, and a second parameter for a second frequency band from a second energy envelope of the speech signal in the time domain; determining a smoothed first parameter and a smoothed second parameter based on information of a second frame that is prior to the first frame of the speech signal; comparing the first parameter with the smoothed first parameter; comparing the second parameter with the smoothed second parameter; and generating a decision point used for determining whether the first frame comprises unvoiced speech or voiced speech based on the comparison as a decision parameter. 2. The method of claim 1, wherein frequency of the second frequency band is higher than frequency of the first frequency band. 3. A method for speech processing, comprising: receiving a plurality of frames of a speech signal, wherein the plurality of frames comprise a first frame and a second frame prior to the first frame; determining a first parameter for the first frame based on a product of (1−P_voicing) and (1−P_tilt), wherein P_voicing is a periodicity parameter and P_tilt is a spectral tilt parameter; ...
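
A minimal sketch of the decision flow in the second method above: a combined unvoicing parameter is formed from the product (1 − P_voicing)(1 − P_tilt), smoothed across frames, and the difference between the current value and its smoothed history is used as the decision parameter against a threshold. The smoothing factor, the threshold and the function name are illustrative assumptions.

    def unvoiced_voiced_decision(p_voicing, p_tilt, smoothed_prev,
                                 alpha=0.9, threshold=0.1):
        """Returns (is_unvoiced, updated smoothed unvoicing parameter)."""
        # Combined parameter: large when both periodicity and spectral tilt are low.
        p_unvoicing = (1.0 - p_voicing) * (1.0 - p_tilt)
        smoothed = alpha * smoothed_prev + (1.0 - alpha) * p_unvoicing
        decision_param = p_unvoicing - smoothed
        return decision_param > threshold, smoothed

    # A frame whose unvoicing parameter jumps well above its smoothed history:
    print(unvoiced_voiced_decision(p_voicing=0.1, p_tilt=0.2, smoothed_prev=0.3))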

10-01-2019 publication date

HEADSET WITH REDUCTION OF AMBIENT NOISE

Number: US20190014404A1
Assignee:

A headset with an electro-acoustic input transducer arranged to pick up an acoustic signal and convert the acoustic signal to an electric signal. Based on processing a portion of the electric signal, the voice activity detector is configured to: detect proximal voice activity, distal voice activity and no voice activity, at times when respectively present in the acoustic signal picked up by the electro-acoustic transducer, and to select a respective mode, the selection of which is encoded in the control signal. The first processor is controlled by the voice activity detector to reduce, in the output signal, intelligibility of distal voice activity at least at portions of time periods when the control signal indicates the mode of presence of distal voice activity. 1. A headset (101) comprising: an electro-acoustic input transducer (119) arranged to pick up an acoustic signal and convert the acoustic signal to an electric signal (x); a transmitter (109); a voice activity detector (108); a first processor (107) coupled to receive the electric signal (x) and to generate an output signal (y) to the transmitter (109) in response to a control signal (PDN) from the voice activity detector (108); wherein, based on processing a portion of the electric signal (x), the voice activity detector (108) is configured to: detect proximal voice activity, distal voice activity and no voice activity, at times when respectively present in the acoustic signal picked up by the electro-acoustic transducer, and to select a respective mode, the selection of which is indicated in the control signal (PDN); wherein the first processor (107) is controlled by the voice activity detector (108) to reduce, by filtering, in the output signal, intelligibility of distal voice activity at least at portions of time periods when the control signal (PDN) indicates the mode of presence of distal voice activity; a ...

21-01-2016 publication date

Adaptive Vehicle State-Based Hands-Free Phone Noise Reduction With Learning Capability

Number: US20160019904A1
Assignee:

This disclosure generally relates to a system, apparatus, and method for achieving an adaptive vehicle state-based hands free noise reduction feature. A noise reduction tool is provided for adaptively applying a noise reduction strategy on a sound input that uses feedback speech quality measures and machine learning to develop future noise reduction strategies, where the noise reduction strategies include analyzing vehicle operational state information and external information that are predicted to contribute to cabin noise and selecting noise reducing pre-filter options based on the analysis. 1. An apparatus , comprising:a memory configured to store a noise reduction pre-filter and feedback data; receive a sound input;', 'receive training input data;', 'receive the feedback data;', 'determine whether to select the pre-filter based on the training input data and feedback data, and', 'if the pre-filter is selected, apply the selected pre-filter to the sound input., 'a processor in communication with the memory, the processor configured to2. The apparatus of claim 1 , wherein the processor is further configured to:apply a Weiner filter to the sound input after the selected pre-filter has been applied.3. The apparatus of claim 1 , wherein the processor is further configured to:generate a performance measure on the sound input after the selected pre-filter has been applied, wherein the performance measure indicates a speech quality of the sound input after the selected pre-filter has been applied.4. The apparatus of claim 3 , wherein the performance measure is a signal-to-noise measure that identifies an energy level for a speech signal within the sound input after the selected pre-filter has been applied.5. The apparatus of claim 3 , wherein the processor is further configured to:feedback the performance measure as new feedback data, andcause the new feedback data to be stored in the memory.6. The apparatus of claim 1 , wherein the training input data includes vehicle ...

03-02-2022 publication date

AUDIO PROCESSING APPARATUS

Number: US20220038831A1
Author: MIYASAKA Shuji
Assignee:

An audio processing apparatus includes a preprocessor which extracts a voice-band signal from a first electric signal, and outputs a first output signal containing the voice-band signal; a first controller which generates a first amplification coefficient for multiplying with the first output signal to compress a dynamic range of an intensity of the first output signal, and generates a first modified amplification coefficient by smoothing the first amplification coefficient with a first time constant; and a first multiplier which multiplies the first modified amplification coefficient and the first output signal. The first time constant is a first rise time constant when the intensity increases, and is a first decay time constant when the intensity decreases. The first rise time constant is not less than a temporal resolution of hearing of a hearing-impaired person, and is less than a duration time of sound which induces recruitment in the hearing-impaired person. 1. An audio processing apparatus, comprising: a first microphone which converts a first sound to a first electric signal; a preprocessor which extracts a voice-band signal from the first electric signal, and outputs a first output signal containing the voice-band signal; a first controller which generates a first amplification coefficient to be multiplied with the first output signal to compress a dynamic range of an intensity of the first output signal, and generates a first modified amplification coefficient by smoothing the first amplification coefficient with a first time constant; and a first multiplier which multiplies the first modified amplification coefficient and the first output signal, wherein the first time constant is a first rise time constant when the intensity of the first output signal increases, and is a first decay time constant when the intensity of the first output signal decreases, and the first rise time constant is greater than or equal to a temporal resolution of a sense of hearing ...
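
A brief sketch of the asymmetric smoothing described above: the raw amplification coefficient is smoothed with one time constant while the signal intensity is rising and with another while it is falling, which gives the attack/release behaviour of the first controller. The sample rate and the time-constant values are illustrative assumptions, not figures from the application.

    import math

    def smooth_gain(target_gains, intensities, fs=16000, rise_ms=20.0, decay_ms=200.0):
        """One-pole smoothing of per-sample gain targets; the rise constant is used
        while the intensity increases and the decay constant while it decreases."""
        a_rise = math.exp(-1.0 / (fs * rise_ms / 1000.0))
        a_decay = math.exp(-1.0 / (fs * decay_ms / 1000.0))
        smoothed, y, prev_level = [], target_gains[0], intensities[0]
        for g, level in zip(target_gains, intensities):
            a = a_rise if level > prev_level else a_decay
            y = a * y + (1.0 - a) * g
            smoothed.append(y)
            prev_level = level
        return smoothed

    print(smooth_gain([1.0, 0.5, 1.0], [0.1, 0.8, 0.1]))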

23-01-2020 publication date

SYSTEM AND METHOD FOR THREAT DETECTION, CLASSIFICATION, WARNING AND ALERTING OF MOBILE USERS

Number: US20200027475A1
Author: Balan Maya
Assignee:

Embodiments of the invention are designed to detect a threat (such as an imminent automobile collision) and generate an audible warning to override the media distraction of a disengaged user. The user may take evasive action and acknowledge the warning, at which point the system resets and retreats to its vigilance mode. On the other hand, a system-initiated alert is generated as a compensatory follow-up step for a user who is non-responsive (does not acknowledge a system-identified warning), and in such cases the system will transmit a user safety alert comprising the warning data and geolocation parameters to first responders and interested parties such that they may engage the user to determine their current situation, or preemptively act to remedy a life-threatening situation. 1. A system and method for threat detection, classification, warning and alerting of mobile users, comprising: means for classification of encountered sounds by the warning system as threats or as unrelated noise; means for warning the user of detected threats; means for recording and storing potential distress signals such that they may serve to help classify utterances by the user into either distress phrases or ambient noise; means for generating automatic alerts as compensation for lack of user expected input upon receiving a warning; means for generating an alert message designated to be transmitted to interested parties when an alert distress phrase is recognized; and means for the delivery of system generated or user initiated alert messages to configured interested parties. 2. The system and method for threat detection, classification, warning and alerting of mobile users in accordance with claim 1, wherein said means for classification of encountered sounds by the warning system as threats or as unrelated noise comprises a warning classification system. 3. The system and method for threat detection, classification, warning and alerting of mobile ...

24-04-2014 publication date

PITCH ESTIMATOR

Number: US20140114653A1
Assignee: Nokia Corporation

An apparatus comprising an analysis window definer configured to define at least one analysis window for a first audio signal, wherein the at least one analysis window definer is configured to be dependent on the first audio signal and a pitch estimator configured to determine a first pitch estimate for the first audio signal, wherein the pitch estimator is dependent on the first audio signal sample values within the analysis window. 155-. (canceled)57. The method as claimed in claim 56 , wherein defining the at least one analysis window comprises defining at least one of:number of analysis windows;position of analysis window for each analysis window with respect to the first audio signal; andlength of each analysis window.58. The method as claimed in claim 56 , wherein the at least two portions comprise:a first half frame portion;a second half frame portion succeeding the first half frame; anda look ahead frame portion succeeding the second half frame.59. The method as claimed in claim 56 , further comprising determining at least one characteristic of the first audio signal claim 56 , wherein the first audio signal characteristic comprises at least one of:voiced audio;unvoiced audio;voiced onset audio; andvoiced offset audio.60. The method as claimed in claim 56 , wherein defining at least one analysis window for a first audio signal is dependent on a defined structure of the first audio signal and performed prior to receiving the first audio signal sample values.61. The method as claimed in claim 56 , wherein defining the at least one analysis window comprises:defining at least one window in at least one of the portions; anddefining at least one further window in at least one further portion dependent on the at least one window.62. The method as claimed in claim 56 , wherein the determination of the at least one analysis window is further dependent on the processing capacity of the pitch estimator.63. The method as claimed in claim 56 , wherein determining the ...

29-01-2015 publication date

NOISE ESTIMATION APPARATUS, NOISE ESTIMATION METHOD, NOISE ESTIMATION PROGRAM, AND RECORDING MEDIUM

Number: US20150032445A1

A noise estimation apparatus which estimates a non-stationary noise component on the basis of the likelihood maximization criterion is provided. The noise estimation apparatus obtains the variance of a noise signal that causes a large value to be obtained by weighted addition of the sums each of which is obtained by adding the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a speech segment and a speech posterior probability in each frame, and the product of the log likelihood of a model of an observed signal expressed by a Gaussian distribution in a non-speech segment and a non-speech posterior probability in each frame, by using complex spectra of a plurality of observed signals up to the current frame. 1. A noise estimation apparatus which obtains a variance of a noise signal that causes a large value to be obtained by weighted addition of sums each of which is obtained by adding a product of a log likelihood of a model of an observed signal expressed by a Gaussian distribution in a speech segment and a speech posterior probability in each frame , and a product of a log likelihood of a model of an observed signal expressed by a Gaussian distribution in a non-speech segment and a non-speech posterior probability in each frame , by using complex spectra of a plurality of observed signals up to a current frame.2. The noise estimation apparatus according to claim 1 , wherein the variance of the noise signal claim 1 , a speech prior probability claim 1 , a non-speech prior probability claim 1 , and a variance of a desired signal that cause a large value to be obtained by weighted addition of the sums each of which is obtained by adding the product of the log likelihood of the model of the observed signal expressed by the Gaussian distribution in the speech segment and the speech posterior probability in each frame claim 1 , and the product of the log likelihood of the model of the observed signal expressed by the ...

02-02-2017 publication date

METHOD AND DEVICE FOR COLLECTING SOUNDS CORRESPONDING TO SURVEILLANCE IMAGES

Number: US20170032805A1
Assignee: Xiaomi Inc.

Aspects of the disclosure provide a method for collecting sounds associated with surveillance images. The method includes identifying a sound-making surveilled object in the surveillance images, the surveillance images being captured by a camera of a monitoring device; determining whether a sound acquiring device meets a preset condition corresponding to that a signal strength of the sounds collected by the sound acquiring device from the identified surveilled object is greater than a signal strength of the sounds collected by the monitoring device from the identified surveilled object; and, when determining that the sound acquiring device meets the preset condition, controlling the sound acquiring device to collect the sounds from the identified surveilled object. 1. A method for collecting sounds associated with surveillance images , the method comprising:identifying a sound-making surveilled object in the surveillance images, the surveillance images being captured by a camera of a monitoring device;determining whether a sound acquiring device meets a preset condition corresponding to that a signal strength of the sounds collected by the sound acquiring device from the identified surveilled object is greater than a signal strength of the sounds collected by the monitoring device from the identified surveilled object; andwhen determining that the sound acquiring device meets the preset condition, controlling the sound acquiring device to collect the sounds from the identified surveilled object.2. The method according to claim 1 , further comprising:when determining that the sound acquiring device does not meet the preset condition, collecting the sounds from the identified surveilled object by the monitoring device.3. The method according to claim 1 , wherein identifying the sound-making surveilled object in the surveillance images comprises:detecting a movement of the surveilled object that corresponds to a predetermined sound-making movement, the predetermined ...

05-02-2015 publication date

VEHICLE-MOUNTED COMMUNICATION DEVICE

Number: US20150039300A1
Author: Mochiki Naoya
Assignee:

An in-vehicle communication device includes: a noise removal filter and a noise suppressor which are configured to remove running noise superimposed on a voice signal collected by a microphone; a band energy ratio corrector for correcting a band energy ratio reduced by the noise removal filter and the noise suppressor; and a variable bitrate encoder for transmitting a speech voice to the other party via a telephone network, the variable bitrate encoder compressing the speech voice corrected by the band energy ratio corrector. This can reduce the possibility that a voice classifier of the variable bitrate encoder erroneously determines voiced sound as voiceless sound and the voiced sound is erroneously compressed by voiceless sound-use low bitrate encoding. Consequently, even in low average bitrate communications, the speech voice in the in-vehicle environment can be provided to the other party at high quality. 15.-. (canceled)6. An in-vehicle communication device , comprising:voice collection means for collecting a voice of a speaker;noise removal means for removing running noise that is superimposed on the voice of the speaker input to the voice collection means;band energy ratio correction means for correcting a band energy ratio of a voice signal output from the noise removal means; andvariable bitrate encoding means for compressing a speech voice corrected by the band energy ratio correction means.7. An in-vehicle communication device according to claim 6 , wherein the band energy ratio correction means comprises:a bandwidth divider for dividing a bandwidth of the voice signal;a multiplier for correcting a bandwidth ratio of the voice signal;a band energy ratio analyzer for analyzing the band energy ratio of the voice signal;a band energy ratio correction update unit for updating a coefficient of the band energy ratio correction means; anda bandwidth combiner for combining divided bandwidth signals that are corrected for each bandwidth of the voice signal.8. An ...

05-02-2015 publication date

Voice Activity Detection Using A Soft Decision Mechanism

Number: US20150039304A1
Author: Wein Ron
Assignee: VERINT SYSTEMS LTD.

Voice activity detection (VAD) is an enabling technology for a variety of speech-based applications. Herein disclosed is a robust VAD algorithm that is also language independent. Rather than classifying short segments of the audio as either “speech” or “silence”, the VAD as disclosed herein employs a soft-decision mechanism. The VAD outputs a speech-presence probability, which is based on a variety of characteristics. 1. A method of detection of voice activity in audio data, the method comprising: obtaining audio data; segmenting the audio data into a plurality of frames; computing an activity probability for each frame from a plurality of features of each frame; comparing a moving average of activity probabilities to at least one threshold; and identifying speech and non-speech segments in the audio data based upon the comparison. 2. The method of detection of voice activity in audio data of claim 1, wherein calculating any of the plurality of features includes calculating an overall energy speech probability for each frame. 3. The method of detection of voice activity in audio data of claim 1, wherein calculating any of the plurality of features includes calculating a band energy speech probability for each frame. 4. The method of detection of voice activity in audio data of claim 1, wherein calculating any of the plurality of features includes calculating a spectral peakiness speech probability for each frame. 5. The method of detection of voice activity in audio data of claim 1, wherein calculating any of the plurality of features includes calculating a residual energy speech probability for each frame. 6. The method of detection of voice activity in audio data of claim 1, wherein the obtaining step includes obtaining a set of audio data in segmented form. 7. The method of detection of voice activity in audio data of claim 1, wherein each of the plurality of features is a speech probability. 8. A method of detection of voice activity in audio data, the ...
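
A condensed sketch of the soft-decision flow: per-frame speech probabilities derived from several characteristics are combined into a single activity probability, a moving average of that probability is compared against a threshold, and contiguous above-threshold stretches are treated as speech. The averaging rule, the window length and the threshold are illustrative assumptions.

    import numpy as np

    def speech_frame_mask(feature_probs, window=10, threshold=0.6):
        """feature_probs: (n_frames, n_features) per-frame speech probabilities in [0, 1].
        Returns a boolean array marking frames that fall inside speech segments."""
        activity = np.asarray(feature_probs, dtype=float).mean(axis=1)  # combine features
        kernel = np.ones(window) / window
        smoothed = np.convolve(activity, kernel, mode="same")           # moving average
        return smoothed >= threshold

    probs = np.vstack([np.full((20, 4), 0.2), np.full((20, 4), 0.9)])
    print(speech_frame_mask(probs).astype(int))  # zeros, then ones once the average rises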

30-01-2020 publication date

SOUND DETECTION

Number: US20200035261A1
Assignee:

A method for generating a health indicator for at least one person of a group of people, the method comprising: receiving, at a processor, captured sound, where the captured sound is sound captured from the group of people; comparing the captured sound to a plurality of sound models to detect at least one non-speech sound event in the captured sound, each of the plurality of sound models associated with a respective health-related sound type; determining metadata associated with the at least one non-speech sound event; assigning the at least one non-speech sound event and the metadata to at least one person of the group of people; and outputting a message identifying the at least one non-speech event and the metadata to a health indicator generator module to generate a health indicator for the at least one person to whom the at least one non-speech sound event is assigned. 1. A method for generating a health indicator for at least one person of a group of people , the method comprising:receiving, at a processor, captured sound, where the captured sound is sound captured from the group of people;comparing the captured sound to a plurality of sound models to detect at least one non-speech sound event in the captured sound, each of the plurality of sound models associated with a respective health-related sound type;determining metadata associated with the at least one non-speech sound event;assigning the at least one non-speech sound event and the metadata to at least one person of the group of people; andoutputting a message identifying the at least one non-speech event and the metadata to a health indicator generator module to generate a health indicator for the at least one person to whom the at least one non-speech sound event is assigned.2. The method of claim 1 , wherein the metadata comprises at least one of: a time of the non-speech sound event claim 1 , a date of the non-speech sound event claim 1 , a location of the non-speech sound event claim 1 , and a ...

09-02-2017 publication date

AUDIO PROCESSING APPARATUS AND AUDIO PROCESSING METHOD

Number: US20170040030A1
Assignee:

An audio processing apparatus includes a first-section detection unit configured to detect a first section that is a section in which the power of a spatial spectrum in a sound source direction is higher than a predetermined amount of power on the basis of an audio signal of a plurality of channels, a speech state determination unit configured to determine a speech state on the basis of an audio signal within the first section, a likelihood calculation unit configured to calculate a first likelihood that a type of sound source according to an audio signal within the first section is voice and a second likelihood that the type of sound source is non-voice, and a second-section detection unit configured to determine whether or not a second section in which power is higher than the average power of a speech section is a voice section on the basis of the first likelihood and the second likelihood within the second section. 1. An audio processing apparatus comprising: a first-section detection unit configured to detect a first section that is a section in which a power of a spatial spectrum in a sound source direction is higher than a predetermined amount of power on the basis of an audio signal of a plurality of channels; a speech state determination unit configured to determine a speech state on the basis of an audio signal within the first section; a likelihood calculation unit configured to calculate a first likelihood that a type of sound source according to an audio signal within the first section is voice and a second likelihood that the type of sound source is non-voice; and a second-section detection unit configured to determine whether or not a second section in which power is higher than an average power of a speech section is a voice section on the basis of the first likelihood and the second likelihood within the second section. 2. The audio processing apparatus according to claim 1, wherein a vector space of a likelihood vector including the first likelihood and ...

24-02-2022 publication date

METHODS AND SYSTEMS FOR COMPUTER-GENERATED VISUALIZATION OF SPEECH

Number: US20220059116A1
Assignee: SomniQ, Inc.

Methods, systems and apparatuses for computer-generated visualization of speech are described herein. An example method of computer-generated visualization of speech including at least one segment includes: generating a graphical representation of an object corresponding to a segment of the speech; and displaying the graphical representation of the object on a screen of a computing device. Generating the graphical representation includes: representing a duration of the respective segment by a length of the object and representing intensity of the respective segment by a width of the object; and placing, in the graphical representation, a space between adjacent objects. 1. A method of computer-generated visualization of speech including at least one segment , the method comprising: representing a duration of the segment by a length of the object;', 'representing intensity of the segment by a width of the object; and', 'representing a pitch contour of the segment by an angle of inclination of the object with respect to a reference frame; and, 'generating a graphical representation of an object corresponding to a segment of the speech, wherein generating the graphical representation comprisesdisplaying the graphical representation of the object on a screen of a computing device.2. The method of claim 1 , wherein the pitch contour is associated with movement of fundamental frequencies claim 1 , andwherein generating the graphical representation further comprises representing an offset of the fundamental frequencies of the segment by a vertical position of the object with respect to the reference frame.3. The method of wherein the segment is a first segment claim 1 , the method comprising:displaying a first object corresponding to the first segment;displaying a second object corresponding to and a second segment of the speech following the first segment such that the first object and the second object are separated by a space corresponding to an unvoiced period between ...

19-02-2015 publication date

Hierarchical Active Voice Detection

Number: US20150051906A1

One or more audio signals are processed using a multi-stage (hierarchical) voice and/or signal activity detector (VAD/SAD). A first stage is capable of reducing the workload bandwidth by employing an inexpensive VAD/SAD processor. One or more subsequent stages may further process the audio signals from the first stage. Other implementations may include a first stage that also performs continuity preservation between last blocks of audio signal and the first blocks of audio after it is detected that relevant audio signals are resumed. In yet other implementations, the first stage may extract features from audio signals when they are presented in their coded domain, and possibly with little or no decoding of the audio signal. 124-. (canceled)25. A system for processing audio signals , said system comprising:a first stage processor, said first stage processor inputting an audio signal from at least one audio source, wherein said first stage processor is capable of performing preliminary voice or signal activity detection (VAD/SAD) processing upon said audio signal and capable of outputting a first intermediate set of audio signals; wherein said first stage processor is capable of eliminating at least some of the audio signal; anda second stage processor, said second stage processor inputting said first intermediate set of audio signals from said first stage processor, wherein said second stage processor is capable of performing audio processing upon said first intermediate set of audio signals; wherein said second stage processor is capable of performing voice or signal activity detection (VAD/SAD) processing upon said first intermediate set of audio signals; wherein an accuracy for estimating periods of speech or signal activity is higher for the second stage processor than for the first stage processor;wherein said first stage processor is capable of achieving a reduction in bandwidth for the first intermediate set of audio signals which is sent to said second stage ...

26-02-2015 publication date

Decimation Synchronization in a Microphone

Number: US20150055803A1
Assignee:

An external clock signal having a first frequency is received. A division ratio is automatically determined based at least in part upon a second frequency of an internal clock. The second frequency is greater than the first frequency. A decimation factor is automatically determined based at least in part upon the first frequency of the external clock signal, the second frequency of the internal clock signal, and a predetermined desired sampling frequency. The division ratio is applied to the internal clock signal to reduce the first frequency to a reduced third frequency. The decimation factor is applied to the reduced third frequency to provide the predetermined desired sampling frequency. Data is clocked to a buffer using the predetermined desired sampling frequency. 1. A method , the method comprising:receiving an external clock signal having a first frequency;automatically determining a division ratio based at least in part upon a second frequency of an internal clock, the second frequency being greater than the first frequency;automatically determining a decimation factor based at least in part upon the first frequency of the external clock signal, the second frequency of the internal clock signal, and a predetermined desired sampling frequency;applying the division ratio to the internal clock signal to reduce the first frequency to a reduced third frequency;applying the decimation factor to the reduced third frequency to provide the predetermined desired sampling frequency;clocking data to a buffer using the predetermined desired sampling frequency.2. The method of claim 1 , further comprising subsequently removing the external clock signal.3. The method of wherein the predetermined desired sampling frequency comprises a frequency rate of approximately 16 kHz.4. An apparatus claim 1 , the apparatus comprising:interface circuitry having an input and output, the input configured to receive an external clock signal having a first frequency;processing circuitry, ...
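
A short numeric sketch of the clock arithmetic described above, under assumed frequencies: the internal clock is divided down toward the external clock's rate, and a decimation factor then brings the divided rate to the desired 16 kHz sampling frequency. The helper simply picks integer factors that satisfy those relations; it is an illustration, not the device's algorithm, and the example clock rates are assumptions.

    def clock_factors(external_hz, internal_hz, desired_fs_hz=16000):
        """Choose a division ratio and a decimation factor so that
        internal_hz / division_ratio / decimation == desired_fs_hz (when divisible)."""
        division_ratio = internal_hz // external_hz      # bring the internal clock near the external rate
        reduced_hz = internal_hz // division_ratio
        decimation = reduced_hz // desired_fs_hz         # reduce further to the target sampling rate
        return division_ratio, decimation, reduced_hz // decimation

    # Assumed clocks: 768 kHz external, 6.144 MHz internal.
    print(clock_factors(external_hz=768_000, internal_hz=6_144_000))  # (8, 48, 16000)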

26-02-2015 publication date

Microphone and Corresponding Digital Interface

Number: US20150058001A1
Assignee:

Analog signals are received from a sound transducer. The analog signals are converted into digitized data. A determination is made as to whether voice activity exists within the digitized signal. Upon the detection of voice activity, an indication of voice activity is sent to a processing device. The indication is sent across a standard interface, and the standard interface is configured to be compatible to be coupled with a plurality of devices from potentially different manufacturers. 1. A method , the method comprising:at a microphone:receiving analog signals from a sound transducer;converting the analog signals into digitized data;determining whether voice activity exists within the digitized signal;upon the detection of voice activity, sending an indication of voice activity to a processing device, wherein the indication is sent across a standard interface, the standard interface configured to be compatible to be coupled with a plurality of devices from potentially different manufacturers.2. The method of wherein the microphone is operated in multiple operating modes claim 1 , such that the microphone selectively operate in and moves between a first microphone sensing mode and a second microphone sensing mode based upon one of more of whether an external clock is being received from a processing device claim 1 , or whether power is being supplied to the microphone;wherein within the first microphone sensing mode, the microphone utilizes an internal clock, receives first analog signals from a sound transducer, converts the first analog signals into first digitized data, determines whether voice activity exists within the first digitized signal, upon the detection of voice activity, sends an indication of voice activity to the processing device an subsequently switches from using the internal clock and receives an external clock;wherein within the second microphone sensing mode, the microphone receives second analog signals from a sound transducer, converts the ...

15-05-2014 publication date

METHODS AND APPARATUS FOR IDENTIFYING FRAUDULENT CALLERS

Number: US20140136194A1
Assignee: Mattersight Corporation

The methods, apparatus, and systems described herein are designed to identify fraudulent callers. A voice print of a call is created and compared to known voice prints to determine if it matches one or more of the known voice prints. The methods include a pre-processing step to separate speech from non-speech, selecting a number of elements that affect the voice print the most, and/or computing an adjustment factor based on the scores of each received voice print against known voice prints. 1. A method of voice print matching which comprises:receiving a telephonic communication;separating a first portion of the telephonic communication into silent and non-silent segments;evaluating the non-silent segments to determine which portions thereof are speech or non-speech;generating a plurality of parameters that determine what is speech and non-speech in the non-silent segments; andusing the generated parameters to determine what is speech and non-speech for at least the remainder of the telephonic communication.2. The method of claim 1 , wherein the first portion comprises a pre-selected time period.3. The method of claim 2 , wherein the pre-selected time period is about the first 30 seconds to 1 minute of the telephonic communication.4. The method of claim 1 , wherein the plurality of parameters are generated for each communication received.5. The method of claim 1 , wherein evaluating the non-silent segments comprises treating all non-speech sounds substantially the same.6. The method of claim 1 , which further comprises comparing the speech of the telephonic communication with a plurality of recorded voice prints from a plurality of fraudulent speakers based on a number of selective elements that most influence the speech.7. The method of claim 6 , which further comprises identifying a fraudulent speaker if the speech at least substantially matches any of the voice prints of the plurality of fraudulent speakers.8. The method of claim 7 , wherein identifying the ...

21-02-2019 publication date

DISPLAY APPARATUS CAPABLE OF RELEASING A VOICE INPUT MODE BY SENSING A SPEECH FINISH AND VOICE CONTROL METHOD THEREOF

Number: US20190057702A1
Assignee: SAMSUNG ELECTRONICS CO., LTD.

A voice control method and display apparatus are provided. The voice control method includes converting a voice of a user into text in response to the voice being input during a voice input mode; performing a control operation corresponding to the text; determining whether speech of the user has finished based on a result of the performing the control operation; awaiting input of a subsequent voice of the user during a predetermined standby time in response to determining that the speech of the user has not finished; and releasing the voice input mode in response to determining that the speech of the user has finished. 1. An electronic apparatus comprising:a voice receiver;a communicator configured to communicate with a server; anda processor configured to:based on a voice signal being received by the voice receiver within a standby time of a voice input mode, receive a data corresponding to one or more search results from the server through the communicator,based on the data corresponding to a single search result being received, process the data corresponding to the single search result and switch the voice input mode to a normal mode,based on the data corresponding to a plurality of search results being received, generate a list including the search results corresponding to the voice signal and reset the standby time of the voice input mode,based on a subsequent voice signal, for selecting at least one of the search results, being received within a reset standby time of the voice input mode, transmit a signal to the server through the communicator to request an additional data corresponding to the selected search result, andprocess the additional data received from the server.2. The electronic apparatus as claimed in claim 1 , wherein the processor is configured to provide a graphic object corresponding to the voice input mode while the voice input mode is maintained.3. The electronic apparatus as claimed in claim 2 , wherein the processor is configured to claim ...

02-03-2017 publication date

METHOD FOR INCREASING BATTERY LIFETIME IN A HEARING DEVICE

Number: US20170064461A1
Assignee: OTICON A/S

The disclosure relates to a method for extending the battery lifetime in a hearing device, and to a hearing device with means for extending the battery lifetime of the hearing device. The hearing device comprise an amplifier providing an output signal and a controllable maximum power output limiter configured to limit the output signal from the amplifier when the output signal form the amplifier exceeds a predefined threshold: the hearing device further comprise a speech detector providing a control signal to the controllable maximum power output limiter, where the speech detector is further provided with an input terminal for receiving an input signal, such as the signal provided by a microphone. The controllable maximum power output limiter is further configured to reduce the maximum power output of the amplifier only when the speech detector does not detect the presence of a speech signal in the input signal received by the speech detector. 2. The method according to claim 1 , where the maximum power output is only reduced claim 1 , when it is established that the battery is close to being discharged.3. The method according to claim 1 , where the frequency dependency of the maximum power output reduction is weighted with the speech articulation index claim 1 , whereby reduction of the maximum power output in frequency ranges where the speech information is at a maximum is avoided.6. The hearing device according to claim 4 , where the frequency dependency of the maximum power output reduction is weighted with the speech articulation index claim 4 , whereby reduction of the maximum power output in frequency ranges where the speech information is at a maximum is avoided.7. The method according to claim 2 , where the frequency dependency of the maximum power output reduction is weighted with the speech articulation index claim 2 , whereby reduction of the maximum power output in frequency ranges where the speech information is at a maximum is avoided.8. The hearing ...

12-03-2015 publication date

Unvoiced/Voiced Decision for Speech Processing

Number: US20150073783A1
Author: GAO Yang
Assignee:

In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter. 1. A method for speech processing , the method comprising:determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voicing speech in a current frame of a speech signal comprising a plurality of frames;determining a smoothed unvoicing/voicing parameter to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal;computing a difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter; andgenerating a unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.2. The method of claim 1 , wherein the unvoicing/voicing parameter is a combined parameter reflecting at least two characteristics of unvoiced/voiced speech.3. The method of claim 2 , wherein the combined parameter is a product of a periodicity parameter and a spectral tilt parameter.4. The method of claim 1 , wherein the unvoicing/voicing parameter is an unvoicing parameter (P) reflecting a characteristic of unvoiced speech claim 1 , wherein the smoothed unvoicing/voicing parameter is a smoothed unvoicing parameter (P).5. The method of ...
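
A minimal Python sketch of the decision rule described above, assuming the combined unvoicing/voicing parameter (for example, periodicity times spectral tilt, per the claims) is already available for each frame; the smoothing weight and the decision margin are illustrative values, not those of the patent.

    def unvoiced_voiced_decision(voicing_params, alpha=0.9, margin=0.1):
        """Classify each frame by comparing the current unvoicing/voicing parameter
        with a recursively smoothed version of its history."""
        decisions, smoothed = [], 0.0
        for p in voicing_params:             # e.g. p = periodicity * spectral_tilt per frame
            diff = p - smoothed              # decision parameter: current value vs. history
            decisions.append("voiced" if diff > margin else "unvoiced")
            smoothed = alpha * smoothed + (1.0 - alpha) * p
        return decisions

    print(unvoiced_voiced_decision([0.05, 0.06, 0.7, 0.8, 0.75, 0.1]))
    # ['unvoiced', 'unvoiced', 'voiced', 'voiced', 'voiced', 'unvoiced']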

19-03-2015 publication date

HARMONICITY ESTIMATION, AUDIO CLASSIFICATION, PITCH DETERMINATION AND NOISE ESTIMATION

Number: US20150081283A1

Embodiments are described for harmonicity estimation, audio classification, pitch determination and noise estimation. Measuring harmonicity of an audio signal includes calculation a log amplitude spectrum of audio signal. A first spectrum is derived by calculating each component of the first spectrum as a sum of components of the log amplitude spectrum on frequencies. In linear frequency scale, the frequencies are odd multiples of the component's frequency of the first spectrum. A second spectrum is derived by calculating each component of the second spectrum as a sum of components of the log amplitude spectrum on frequencies. In linear frequency scale, the frequencies are even multiples of the component's frequency of the second spectrum. A difference spectrum is derived subtracting the first spectrum from the second spectrum. A measure of harmonicity is generated as a monotonically increasing function of the maximum component of the difference spectrum within predetermined frequency range. 122-. (canceled)23. A method of measuring harmonicity of an audio signal , comprising:calculating a log amplitude spectrum of the audio signal;deriving a first spectrum by calculating each component of the first spectrum as a sum of components of the log amplitude spectrum on frequencies which, in linear frequency scale, are odd multiples of the component's frequency of the first spectrum;deriving a second spectrum by calculating each component of the second spectrum as a sum of components of the log amplitude spectrum on frequencies which, in linear frequency scale, are even multiples of the component's frequency of the second spectrum;deriving a difference spectrum by subtracting the first spectrum from the second spectrum; andgenerating a measure of harmonicity as a monotonically increasing function of the maximum component of the difference spectrum within a predetermined frequency range.24. The method according to claim 23 , wherein the calculation of the log amplitude ...
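
The odd/even harmonic-sum construction lends itself to a short Python sketch; the FFT size, window, number of multiples and search range below are assumptions, not the parameters of the described embodiments.

    import numpy as np

    def harmonicity(x, max_multiple=8, k_range=(10, 200)):
        """Harmonicity score from a log amplitude spectrum: sum the spectrum at odd
        and at even multiples of each candidate bin and take the largest
        even-minus-odd difference inside a search range."""
        log_amp = np.log(np.abs(np.fft.rfft(x * np.hanning(len(x)))) + 1e-12)
        n = len(log_amp)
        odd_sum, even_sum = np.zeros(n), np.zeros(n)
        for k in range(1, n):
            for m in range(1, max_multiple + 1):
                if m * k >= n:
                    break
                (odd_sum if m % 2 else even_sum)[k] += log_amp[m * k]
        diff = even_sum - odd_sum                   # "second spectrum" minus "first spectrum"
        lo, hi = k_range
        return float(np.max(diff[lo:hi]))

    fs = 8000
    t = np.arange(1024) / fs
    tone = sum(np.sin(2 * np.pi * 250 * (h + 1) * t) / (h + 1) for h in range(4))
    noise = np.random.default_rng(0).standard_normal(1024)
    print(harmonicity(tone) > harmonicity(noise))   # the harmonic tone scores higher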

24-03-2016 publication date

METHOD AND APPARATUS FOR MUTING A DEVICE

Number: US20160085498A1
Assignee:

A method and apparatus for muting a device is provided herein. During operation a device such as a two-way radio detects a user's voice and will mute the radio in response to the voice being detected. During the time period the device is muted, all received transmissions will be stored by the radio. These transmissions will be played back to the user when voice activity has ceased for a predetermined amount of time. In a second embodiment of the present invention, the device is only muted when a particular identified user's voice is detected. 1. A method for muting a device , the method comprising the steps of:receiving, via a receiver, an over-the-air transmission;detecting voice activity;storing the over-the-air transmission when the voice activity has been detected;determining that voice activity has ceased for a predetermined period of time;sending the over-the-air transmission to a speaker when it has been determined that the voice activity has ceased for the predetermined period of time.2. The method of wherein the step of receiving the over-the-air transmission comprises the step of receiving an over-the-air voice transmission.3. The method of wherein the step of detecting voice activity comprises the step of determining if any voice was detected.4. The method of wherein the step of detecting voice activity comprises the step of determining if a voice from a particular user was detected claim 1 , and the step of storing the over-the-air transmission comprises the step of storing the over-the-air transmission only when the voice activity from the particular user has been detected.5. The method of wherein the step of sending the over-the-air transmission to the speaker comprises the step of sending the over-the-air transmission to the speaker when it has been determined that the voice activity from the particular user has ceased.6. The method of wherein the step of detecting voice activity comprises the step of determining if a voice from a particular user was ...
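
A small Python sketch of the mute-and-replay behaviour: received transmissions are buffered while local voice activity is recent and flushed to the speaker once voice has ceased for a hold-off period. The class name, the callable interfaces and the two-second hold-off are assumptions for illustration.

    import time
    from collections import deque

    class MuteAndReplay:
        """Buffer received transmissions while local voice activity is recent and
        replay them once voice has ceased for `holdoff_s` seconds."""
        def __init__(self, play, holdoff_s=2.0):
            self.play = play                        # callable that sends audio to the speaker
            self.holdoff_s = holdoff_s
            self.buffer = deque()
            self.last_voice = None

        def on_local_voice(self, now=None):
            self.last_voice = time.monotonic() if now is None else now

        def on_received(self, audio, now=None):
            now = time.monotonic() if now is None else now
            if self.last_voice is not None and now - self.last_voice < self.holdoff_s:
                self.buffer.append(audio)           # user is talking: mute and store
            else:
                self.play(audio)                    # no recent voice: play immediately

        def tick(self, now=None):
            """Call periodically; flushes stored audio once voice has ceased long enough."""
            now = time.monotonic() if now is None else now
            if self.buffer and (self.last_voice is None or now - self.last_voice >= self.holdoff_s):
                while self.buffer:
                    self.play(self.buffer.popleft())

    radio = MuteAndReplay(play=print)
    radio.on_local_voice(now=0.0)
    radio.on_received("transmission 1", now=1.0)    # stored, not played
    radio.tick(now=3.5)                             # voice ceased long enough: plays it now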

23-03-2017 publication date

VOICE SYNTHESIS APPARATUS AND METHOD FOR SYNTHESIZING VOICE

Number: US20170084266A1
Assignee: SAMSUNG ELECTRONICS CO., LTD.

A voice synthesis apparatus is provided. The voice synthesis apparatus includes: an electrode array configured to, in response to voiceless speeches of a user, detect an electromyogram (EMG) signal from skin of the user; a speech activity detection module configured to detect a voiceless speech period of the user; a feature extractor configured to extract a signal descriptor indicating a feature of the EMG signal for the voiceless speech period; and a voice synthesizer configured to synthesize speeches by using the extracted signal descriptor. 1. A voice synthesis apparatus comprising:an electrode array configured to, in response to voiceless speeches of a user, detect an electromyogram (EMG) signal from skin of the user;a speech activity detection module configured to detect a voiceless speech period of the user;a feature extractor configured to extract a signal descriptor indicating a feature of the EMG signal for the voiceless speech period; anda voice synthesizer configured to synthesize speeches by using the extracted signal descriptor.2. The voice synthesis apparatus of claim 1 , wherein the electrode array comprises an electrode array comprising a plurality of electrodes having preset intervals.3. The voice synthesis apparatus of claim 1 , wherein the speech activity detection module detects the voiceless speech period of the user based on maximum and minimum values of the EMG signal detected from the skin of the user.4. The voice synthesis apparatus of claim 1 , wherein the feature extractor extracts the signal descriptor indicating the feature of the EMG signal in each preset frame for the voiceless speech period.5. The voice synthesis apparatus of claim 1 , further comprising:a calibrator configured to compensate for the EMG signal detected from the skin of the user.6. The voice synthesis apparatus of claim 5 , wherein the calibrator compensates for the detected EMG signal based on a pre-stored reference EMG signal claim 5 , and the voice synthesizer ...

24-03-2016 publication date

SILENCE SIGNATURES OF AUDIO SIGNALS

Number: US20160088160A1
Assignee:

A method performed by a processing system. The method includes generating silence signatures of audio signals from a plurality of device based on energy levels of the audio signals, providing the silence signatures to an interaction service, and outputting interaction information corresponding to the devices. 1. A method performed by a processing system , the method comprising:generating a silence signature of an audio signal from a device based on an energy level of the audio signal;providing the silence signature to an interaction service using a network connection; andoutputting interaction information corresponding to the device from the interaction service.2. The method of claim 1 , wherein generating the silence signature of the audio signal further includes computing a dynamic silence threshold.3. The method of claim 2 , wherein computing the dynamic silence threshold further comprisescomputing an energy value for each of a plurality of time intervals of the audio signal;determining at least a minimum energy value and an average energy value for each of the plurality of time intervals during a predetermined time window of the audio signal; andcomputing at least one dynamic silence threshold for each time interval using at least the minimum energy value and the average energy value.4. The method of claim 3 , further comprising adjusting the at least one dynamic silence threshold at a predetermined decibel value above the minimum energy value.5. The method of claim 3 , wherein generating the silence signature of the audio signal further includes quantizing each time interval of the audio signal into a value based on the at least one dynamic silence threshold for each time interval claim 3 , wherein the value includes at least a silence value or a sound value.6. The method of claim 5 , wherein each time interval having an energy value above the at least one dynamic silence threshold is quantized with a sound value and each time interval having an energy value ...
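
A minimal Python sketch of a per-interval silence signature with a dynamic threshold derived from the window's minimum and average energies; the interval length and the dB offset above the minimum are assumed values, not those of the claims.

    import numpy as np

    def silence_signature(x, fs=16000, interval_ms=50, db_above_min=6.0):
        """Quantize each interval to sound (1) or silence (0) using a dynamic
        threshold placed a few dB above the window's minimum energy, capped by
        the window's average energy."""
        hop = int(fs * interval_ms / 1000)
        frames = x[: len(x) // hop * hop].reshape(-1, hop)
        energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
        threshold = min(energy_db.min() + db_above_min, energy_db.mean())
        return (energy_db > threshold).astype(int)   # 0 = silence, 1 = sound

    rng = np.random.default_rng(0)
    quiet, loud = rng.normal(0, 0.001, 16000), rng.normal(0, 0.1, 16000)
    print(silence_signature(np.concatenate([quiet, loud, quiet])))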

05-05-2022 publication date

METHOD FOR USER VOICE INPUT PROCESSING AND ELECTRONIC DEVICE SUPPORTING SAME

Number: US20220139377A1
Assignee:

According to an embodiment, disclosed is an electronic device including a speaker, a microphone, a communication interface, a processor operatively connected to the speaker, the microphone, and the communication interface, and a memory operatively connected to the processor. The memory stores instructions that, when executed, cause the processor to receive a first utterance through the microphone, to determine a speaker model by performing speaker recognition on the first utterance, to receive a second utterance through the microphone after the first utterance is received, to detect an end-point of the second utterance, at least partially using the determined speaker model. Besides, various embodiments as understood from the specification are also possible.

30-03-2017 publication date

AUDIO SIGNAL PROCESSING DEVICE, AUDIO SIGNAL PROCESSING METHOD, AND RECORDING MEDIUM STORING A PROGRAM

Number: US20170092299A1
Author: MATSUO Naoshi
Assignee: FUJITSU LIMITED

An audio signal processing device that includes: a processor configured to execute a procedure, the procedure comprising: detecting a speech segment of an audio signal; suppressing noise in the audio signal; and adjusting an amount of suppression of noise such that the amount of suppression during a specific period, which starts from a position based on a terminal end of the detected speech segment and is a period shorter than a period spanning from the terminal end of the detected speech segment to a starting end of a next speech segment, becomes greater than in other segments, and a memory configured to store audio signals before and after noise suppression and the amount of suppression before and after adjustment. 1. An audio signal processing device comprising: detecting a speech segment of an audio signal;', 'suppressing noise in the audio signal; and', 'adjusting an amount of suppression of noise such that the amount of suppression during a specific period, which starts from a position based on a terminal end of the detected speech segment and is a period shorter than a period spanning from the terminal end of the detected speech segment to a starting end of a next speech segment, becomes greater than in other segments, and, 'a processor configured to execute a procedure, the procedure comprisinga memory configured to store audio signals before and after noise suppression and the amount of suppression before and after adjustment.2. The audio signal processing device of claim 1 , whereinthe amount of suppression during the specific period is adjusted so as to increase according to an amount acquired based on the audio signal in a non-speech segment between the detected speech segment and the next speech segment.3. The audio signal processing device of claim 1 , wherein:the amount of suppression is adjusted so as to be greater during the specific period than in other segments by further suppressing noise in the noise-suppressed audio signal during the specific ...
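
The suppression-adjustment idea can be sketched as a per-frame gain schedule in Python; the baseline and boosted suppression amounts and the window length are illustrative assumptions rather than the device's settings.

    import numpy as np

    def suppression_gain(num_frames, speech_ends, base_db=-10.0, boost_db=-20.0, boost_frames=15):
        """Per-frame noise-suppression schedule: a baseline amount everywhere and a
        deeper amount for a short window right after each detected speech-segment end."""
        gain_db = np.full(num_frames, base_db)
        for end in speech_ends:                     # frame index where a speech segment ends
            gain_db[end:min(end + boost_frames, num_frames)] = boost_db
        return gain_db

    print(suppression_gain(60, speech_ends=[20, 45])[18:24])   # [-10. -10. -20. -20. -20. -20.]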

16-04-2015 publication date

Efficient Discrimination of Voiced and Unvoiced Sounds

Number: US20150106087A1
Author: Newman David Edward
Assignee: ZANAVOX

A method is disclosed for discriminating voiced and unvoiced sounds in speech. The method detects characteristic waveform features of voiced and unvoiced sounds, by applying integral and differential functions to the digitized sound signal in the time domain. Laboratory tests demonstrate extremely high reliability in separating voiced and unvoiced sounds. The method is very fast and computationally efficient. The method enables voice activation in resource-limited and battery-limited devices, including mobile devices, wearable devices, and embedded controllers. The method also enables reliable command identification in applications that recognize only predetermined commands. The method is suitable as a pre-processor for natural language speech interpretation, improving recognition and responsiveness. The method enables realtime coding or compression of speech according to the sound type, improving transmission efficiency. 1. A method for discriminating voiced sounds from unvoiced sounds in speech , comprising the steps:converting speech sounds to a speech signal comprising sequential digitized values;generating an integral signal by integrating the speech signal;generating a differential signal by differentiating the speech signal;detecting the voiced sounds by comparing the integral signal to an integral-signal threshold;and detecting the unvoiced sounds by comparing the differential signal to a differential-signal threshold.2. The method of wherein:the speech signal includes wavelets, each wavelet comprising a monopolar variation of the speech signal with a duration in the range of 0.05 to 5 milliseconds;the integral signal includes wavelets with durations in the range of 0.25 to 5 milliseconds and substantially excludes wavelets with durations in the range of 0.05 to 0.15 milliseconds;and the differential signal includes wavelets with durations in the range of 0.05 to 0.15 milliseconds and substantially excludes wavelets with durations in the range of 0.25 to 5 ...
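
A time-domain Python sketch of the integral/differential discrimination idea; the frame-level reduction (maximum absolute value) and both thresholds are assumptions tuned only for the toy signals below, not the method's own parameters.

    import numpy as np

    def classify_voiced_unvoiced(frame, integral_thresh=10.0, differential_thresh=0.04):
        """Voiced sounds build up a large running integral (slow, high-energy wavelets);
        unvoiced sounds produce large sample-to-sample differences (fast wavelets)."""
        integral = np.cumsum(frame - np.mean(frame))        # remove DC before integrating
        differential = np.diff(frame)
        is_voiced = np.max(np.abs(integral)) > integral_thresh
        is_unvoiced = np.max(np.abs(differential)) > differential_thresh
        if is_voiced and not is_unvoiced:
            return "voiced"
        if is_unvoiced and not is_voiced:
            return "unvoiced"
        return "mixed-or-silence"

    fs = 16000
    t = np.arange(320) / fs                                 # one 20 ms frame
    voiced_like = 0.5 * np.sin(2 * np.pi * 150 * t)         # slow and periodic
    unvoiced_like = 0.05 * np.random.randn(320)             # fast and noise-like
    print(classify_voiced_unvoiced(voiced_like), classify_voiced_unvoiced(unvoiced_like))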

23-04-2015 publication date

Headset Interview Mode

Number: US20150112671A1
Assignee: Plantronics, Inc.

Methods and apparatuses for headsets are disclosed. In one example, a headset includes a processor, a communications interface, a user interface, and a speaker. The headset includes a microphone array including two or more microphones arranged to detect sound and output two or more microphone output signals. The headset further includes a memory storing an application executable by the processor configured to operate the headset in a first mode utilizing a first set of signal processing parameters to process the two or more microphone output signals and operate the headset in a second mode utilizing a second set of signal processing parameters to process the two or more microphone output signals. 1. A headset comprising:a processor;a communications interface;a user interface;a speaker arranged to output audible sound to a headset wearer ear;a microphone array comprising two or more microphones arranged to detect sound and output two or more microphone output signals; anda memory storing an application executable by the processor configured to operate the headset in a first mode utilizing a first set of signal processing parameters to process the two or more microphone output signals and operate the headset in a second mode utilizing a second set of signal processing parameters to process the two or more microphone output signals.2. The headset of claim 1 , wherein the first set of signal processing parameters are configured to eliminate a signal component corresponding to a voice in proximity to a headset wearer and the second set of signal processing parameters are configured to detect and propagate the signal component corresponding to the voice in proximity to the headset wearer for recording at the headset or transmission to a remote device.3. The headset of claim 2 , wherein the second set of signal processing parameters comprise a beam forming algorithm to isolate the voice in proximity to the headset wearer and a noise reduction algorithm to reduce ambient ...

20-04-2017 publication date

Unvoiced/Voiced Decision for Speech Processing

Number: US20170110145A1
Author: GAO Yang
Assignee:

A method for speech processing includes determining an unvoicing parameter for a first frame of a speech signal and determining a smoothed unvoicing parameter for the first frame by weighting the unvoicing parameter of the first frame and a smoothed unvoicing parameter of a second frame. The unvoicing parameter reflects a speech characteristic of the first frame. The smoothed unvoicing parameter of the second frame is weighted less heavily when the smoothed unvoicing parameter of the second frame is greater than the unvoicing parameter of the first frame. The method further includes computing a difference, by a processor, between the unvoicing parameter of the first frame and the smoothed unvoicing parameter of the first frame, and determining a classification of the first frame according to the computed difference. The classification includes unvoiced speech or voiced speech. The first frame is processed in accordance with the classification of the first frame. 1. A method for speech processing , the method comprising:determining an unvoicing parameter for a first frame of a speech signal, wherein the unvoicing parameter reflects a speech characteristic of the first frame;determining a smoothed unvoicing parameter for the first frame by weighting the unvoicing parameter of the first frame and a smoothed unvoicing parameter of a second frame, wherein the smoothed unvoicing parameter of the second frame is weighted less heavily when the smoothed unvoicing parameter of the second frame is greater than the unvoicing parameter of the first frame;computing a difference, by a processor, between the unvoicing parameter of the first frame and the smoothed unvoicing parameter of the first frame;determining a classification of the first frame according to the computed difference, wherein the classification comprises unvoiced speech or voiced speech; andprocessing the first frame by the processor in accordance with the classification of the first frame.2. The method of claim 1 ...
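
The asymmetric weighting in the claim can be sketched in a few lines of Python; the two smoothing weights are illustrative assumptions, and the rule simply gives the previous smoothed value less weight whenever it exceeds the current parameter, so the smoothed track falls quickly but rises slowly.

    def smooth_unvoicing(params, fast_alpha=0.4, slow_alpha=0.9):
        """Give the previous smoothed value a smaller weight whenever it exceeds the
        current unvoicing parameter."""
        smoothed, out = 0.0, []
        for p in params:
            alpha = fast_alpha if smoothed > p else slow_alpha
            smoothed = alpha * smoothed + (1.0 - alpha) * p
            out.append(round(smoothed, 2))
        return out

    print(smooth_unvoicing([0.8, 0.8, 0.1, 0.1, 0.8]))   # [0.08, 0.15, 0.12, 0.11, 0.18]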

19-04-2018 publication date

HEARING AID ADJUSTMENT VIA MOBILE DEVICE

Number: US20180109889A1
Assignee:

Subject matter disclosed herein may relate to hearing aids, and may relate more particularly to adjusting one or more parameters for one or more hearing aids based, at least in part, on one or more detected audio interactions between a user and one or more second parties. 1. A method , at a mobile device , comprising:detecting an audio interaction between a user and one or more second parties; andadjusting one or more parameters to one or more hearing aids to enhance audibility or intelligibility to the user of an audio signal based, at least in part, on an identity of at least one of the one or more second parties.2. The method of claim 1 , further comprising identifying the at least one of the one or more second parties at least in part by comparing one or more signals and/or states derived from the audio interaction with voice print content stored in a memory of the mobile device.3. The method of claim 1 , further comprising identifying the at least one of the one or more second parties at least in part by user indication of the one or more second parties via a graphical user interface of the mobile device.4. The method of claim 2 , wherein the adjusting the one or more parameters of one or more hearing aids comprises communicating one or more signals and/or states comprising particular voice print content associated with the identified at least one of the one or more second parties between the mobile device and the one or more hearing aids.5. The method of claim 4 , wherein the communicating the one or more signals and/or states between the mobile device and the one or more hearing aids comprises communicating the one or more signals and/or states via a wireless type connection.6. The method of claim 5 , wherein the voiceprint content stored in the memory is stored as at least part of one or more records identifying one or more second parties.7. The method of claim 6 , wherein the one or more records comprise one or more contact lists claim 6 , and wherein the ...

18-04-2019 publication date

ROBUSTNESS OF SPEECH PROCESSING SYSTEM AGAINST ULTRASOUND AND DOLPHIN ATTACKS

Number: US20190115046A1
Author: Lesso John Paul

A method for improving the robustness of a speech processing system having at least one speech processing module comprises: receiving an input sound signal comprising audio and non-audio frequencies; separating the input sound signal into an audio band component and a non-audio band component; and identifying possible interference within the audio band from the non-audio band component. Based on such an identification, the operation of a downstream speech processing module is adjusted. 1. A method for improving the robustness of a speech processing system having at least one speech processing module, the method comprising: receiving an input sound signal comprising audio and non-audio frequencies; separating the input sound signal into an audio band component and a non-audio band component; identifying possible interference within the audio band from the non-audio band component; and adjusting the operation of a downstream speech processing module based on said identification. 2. The method of claim 1, wherein identifying possible interference within the audio band from the non-audio band component comprises determining whether a power level of the non-audio band component exceeds a threshold value and, if so, identifying possible interference within the audio band from the non-audio band component. 3. The method of claim 1, wherein identifying possible interference within the audio band from the non-audio band component comprises comparing the audio band and non-audio band components. 4. The method of claim 3, wherein the step of identifying possible interference within the audio band from the non-audio band component comprises: measuring a signal power in the audio band component P_a; measuring a signal power in the non-audio band component P_b; and if (P_a/P_b) ...
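
A minimal Python sketch of the band-power comparison in the later claims (flagging possible interference when the audio-to-non-audio power ratio P_a/P_b is low); the sample rate, the 20 kHz split point and the ratio threshold are assumptions for illustration.

    import numpy as np

    def ultrasound_interference_suspected(x, fs=96000, split_hz=20000, ratio_thresh=10.0):
        """Flag possible interference when power above the audio band is
        disproportionately strong relative to power inside it."""
        spectrum = np.abs(np.fft.rfft(x)) ** 2
        freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
        p_audio = np.sum(spectrum[freqs <= split_hz])
        p_non_audio = np.sum(spectrum[freqs > split_hz]) + 1e-12
        return (p_audio / p_non_audio) < ratio_thresh    # low ratio -> suspect the non-audio band

    t = np.arange(9600) / 96000
    clean = np.sin(2 * np.pi * 1000 * t)
    attacked = clean + 2.0 * np.sin(2 * np.pi * 30000 * t)
    print(ultrasound_interference_suspected(clean), ultrasound_interference_suspected(attacked))
    # False True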

17-07-2014 publication date

System and Method for Speech Recognition Using Pitch-Synchronous Spectral Parameters

Number: US20140200889A1
Author: Chen Chengjun Julian
Assignee:

The present invention defines a pitch-synchronous parametrical representation of speech signals as the basis of speech recognition, and discloses methods of generating the said pitch-synchronous parametrical representation from speech signals. The speech signal is first going through a pitch-marks picking program to identify the pitch periods. The speech signal is then segmented into pitch-synchronous frames. An ends-matching program equalizes the values at the two ends of the waveform in each frame. Using Fourier analysis, the speech signal in each frame is converted into a pitch-synchronous amplitude spectrum. Using Laguerre functions, the said amplitude spectrum is converted into a unit vector, referred to as the timbre vector. By using a database of correlated phonemes and timbre vectors, the most likely phoneme sequence of an input speech signal can be decoded in the acoustic stage of a speech recognition system. 1. A method of automatic speech recognition to convert speech signal into text using one or more processors comprising:A) segmenting the speech signal into pitch-synchronous frames, wherein for voiced sections each said frame is a single pitch period;B) for each frame, equalizing the two ends of the waveform using an ends-matching program;C) generating an amplitude spectrum of each said frame using Fourier analysis;D) transforming each said amplitude spectrum into a timbre vector using Laguerre functions;E) performing acoustic decoding to find a list of most likely phonemes or sub-phoneme units for each said timbre vector by comparing with a timbre vector database;F) decoding the sequence of the list of the most likely phonemes or sub-phoneme units using a language-model database to find out the most likely text.2. The method in claim 1 , wherein segmenting of the speech-signal is based on the glottal closure instants derived from simultaneously recorded electroglottograph signals and by analyzing the sections of speech signal where glottal closure ...

25-08-2022 publication date

SPEECH CODING METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM

Number: US20220270622A1
Author: LIANG Junbin
Assignee:

This application relates to a speech coding method, an electronic device, and a storage medium. The method includes: extracting a first speech frame feature corresponding to a first to-be-encoded speech frame, and obtaining a first speech frame criticality level corresponding to the first to-be-encoded speech frame based on the first speech frame feature; extracting a second speech frame feature corresponding to the subsequent speech frame, and obtaining a second speech frame criticality level corresponding to the subsequent speech frame based on the second speech frame feature; obtaining a criticality trend feature based on the first speech frame criticality level and the second speech frame criticality level, and determining, using the criticality trend feature, an encoding bit rate corresponding to the to-be-encoded speech frame; and encoding the first to-be-encoded speech frame based on the encoding bit rate to obtain an encoding result. 1. A speech coding method , executed by an electronic device , the method comprising:obtaining a first to-be-encoded speech frame and a subsequent speech frame;extracting a first speech frame feature corresponding to the first to-be-encoded speech frame, and obtaining a first speech frame criticality level corresponding to the first to-be-encoded speech frame based on the first speech frame feature;extracting a second speech frame feature corresponding to the subsequent speech frame, and obtaining a second speech frame criticality level corresponding to the subsequent speech frame based on the second speech frame feature;obtaining a criticality trend feature based on the first speech frame criticality level and the second speech frame criticality level, and determining, using the criticality trend feature, an encoding bit rate corresponding to the first to-be-encoded speech frame, the encoding bit rate corresponding to each to-be-encoded speech frame being controlled adaptively based on criticality trend strength represented by ...

25-08-2022 publication date

UTTERANCE SECTION DETECTION DEVICE, UTTERANCE SECTION DETECTION METHOD, AND PROGRAM

Number: US20220270637A1

An utterance section detection device which is capable of detecting an utterance section with high accuracy on the basis of whether or not an end of a speech section is an end of utterance. The utterance section detection device includes a speech/non-speech determination unit configured to perform speech/non-speech determination which is determination as to whether a certain frame of an acoustic signal is speech or non-speech, an utterance end determination unit configured to perform utterance end determination which is determination as to whether or not an end of a speech section is an end of utterance for each speech section which is a section determined as speech as a result of the speech/non-speech determination, a non-speech section duration threshold determination unit configured to determine a threshold regarding a duration of a non-speech section on the basis of a result of the utterance end determination, and an utterance section detection unit configured to detect an utterance section by comparing a duration of a non-speech section following the speech section with the corresponding threshold. 1. An utterance section detection device comprising:processing circuitry configured toperform speech/non-speech determination which is determination as to whether a certain frame of an acoustic signal is speech or non-speech;perform utterance end determination which is determination as to whether or not an end of a speech section is an end of utterance for each speech section which is a section determined as speech as a result of the speech/non-speech determination;determine a threshold regarding a duration of a non-speech section on a basis of a result of the utterance end determination; anddetect an utterance section by comparing a duration of a non-speech section following the speech section with the corresponding threshold.2. An utterance section detection device which is the speech/non-speech determination device according to claim 1 , make the corresponding ...
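
A Python sketch of the adaptive gap threshold: a speech section judged to be an utterance end is closed after a short non-speech gap, otherwise a longer gap is required. The frame format, frame length and the two gap values below are assumptions, not the device's thresholds.

    def detect_utterances(frames, frame_s=0.02, short_gap_s=0.3, long_gap_s=1.0):
        """Close an utterance when trailing non-speech exceeds a threshold that is
        short if the last speech frames looked like an utterance end, long otherwise.
        Each frame is ('speech', end_of_utterance_flag) or ('nonspeech', None)."""
        utterances, start, gap, threshold_s = [], None, 0, long_gap_s
        for i, (label, end_flag) in enumerate(frames):
            if label == 'speech':
                if start is None:
                    start = i
                gap = 0
                threshold_s = short_gap_s if end_flag else long_gap_s
            elif start is not None:
                gap += 1
                if gap * frame_s >= threshold_s:
                    utterances.append((start, i - gap))
                    start, gap = None, 0
        if start is not None:
            utterances.append((start, len(frames) - 1 - gap))
        return utterances

    first = [('speech', False)] * 10 + [('speech', True)] * 5    # judged to end an utterance
    second = [('speech', False)] * 15                            # no utterance end detected
    frames = first + [('nonspeech', None)] * 20 + second + [('nonspeech', None)] * 60
    print(detect_utterances(frames))                             # [(0, 14), (35, 49)]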

25-08-2022 publication date

METHOD AND APPARATUS FOR PROCESSING LIVE STREAM AUDIO, AND ELECTRONIC DEVICE AND STORAGE MEDIUM

Number: US20220270638A1
Author: XING Wenhao, Zhang Chen
Assignee:

A method for processing live stream audio, and an electronic device and a storage medium are provided. The method is applied to a live streamer end, and includes: acquiring a first audio signal formed by mixing a guest audio signal with a background audio signal of the live streamer end; obtaining a second audio signal by performing echo cancellation on the guest audio signal in the first audio signal according to the guest audio signal; detecting a voice activity state of a guest end according to the guest audio signal, the first audio signal and the second audio signal; obtaining a third audio signal by performing echo cancellation on the first audio signal in a mixed audio signal according to the voice activity state and the first audio signal; synthesizing and pushing the second audio signal and the third audio signal to the guest end. 1. A method for processing live stream audio , applied to a live streamer end , the method comprising:obtaining a first audio signal formed by mixing a guest audio signal with a background audio signal of the live streamer end;obtaining a second audio signal by performing echo cancellation on the guest audio signal in the first audio signal according to the guest audio signal;detecting a voice activity state of a guest end according to the guest audio signal, the first audio signal and the second audio signal;obtaining a third audio signal by performing echo cancellation on the first audio signal in a mixed audio signal according to the voice activity state and the first audio signal, wherein the mixed audio signal is a signal consisted of the first audio signal and a live streamer audio signal collected by a microphone of the live streamer end;synthesizing and pushing the second audio signal and the third audio signal to the guest end.2. The method according to claim 1 , wherein said detecting the voice activity state of the guest end according to the guest audio signal claim 1 , the first audio signal and the second audio signal ...

01-09-2022 publication date

VOICE/NON-VOICE DETERMINATION DEVICE, VOICE/NON-VOICE DETERMINATION MODEL PARAMETER LEARNING DEVICE, VOICE/NON-VOICE DETERMINATION METHOD, VOICE/NON-VOICE DETERMINATION MODEL PARAMETER LEARNING METHOD, AND PROGRAM

Number: US20220277767A1

A voice/non-voice determination device robust with respect to an acoustic signal in a high-noise environment is provided. The voice/non-voice determination device includes an acoustic scene classification unit including a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected, a speech enhancement unit including a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement, and a voice/non-voice determination unit including a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section. 1. A voice/non-voice determination device comprising:processing circuitry configured toinclude a first model which receives input of an acoustic signal and outputs acoustic scene information which is information regarding a scene where the acoustic signal is collected and configured to receive input of the acoustic signal and a parameter of the first model learned in advance and output the acoustic scene information;include a second model which receives input of the acoustic signal and outputs speech enhancement information which is information regarding the acoustic signal after enhancement and configured to receive input of the acoustic signal and a parameter of the second model learned in advance and output the speech enhancement information; andinclude a third model which receives input of the acoustic signal, the acoustic scene information and the speech enhancement information and outputs a voice/non-voice label which is information regarding a label of either a speech section or a non-speech section and configured to receive input of the acoustic signal, the acoustic ...

21-05-2015 publication date

PRE-PROCESSING APPARATUS AND METHOD FOR SPEECH RECOGNITION

Number: US20150142430A1
Author: Kwon Min Ho
Assignee:

A pre-processing apparatus for speech recognition may include: a trailing silence period detection unit configured to detect the length of a trailing silence period contained in a speech signal; a reference trailing silence period storage unit configured to store the length of a reference trailing silence period; and a trailing silence period adjusting unit configured to adjust the length of the trailing silence period contained in the speech signal based on the length of the reference trailing silence period. 1. A pre-processing apparatus for speech recognition , comprising:a trailing silence period detection unit configured to detect the length of a trailing silence period contained in a speech signal;a reference trailing silence period storage unit configured to store the length of a reference trailing silence period; anda trailing silence period adjusting unit configured to adjust the length of the trailing silence period contained in the speech signal based on the length of the reference trailing silence period.2. The pre-processing apparatus of claim 1 , wherein the trailing silence period comprises a silence period required until an actual user's speech in the speech signal inputted through a speech input unit is inputted after a speech recognition start sound is outputted.3. The pre-processing apparatus of claim 1 , wherein the trailing silence period adjusting unit increases the length of the trailing silence period of the speech signal to the length of the reference trailing silence period claim 1 , when the length of the trailing silence period detected in the speech signal is smaller than the length of the reference trailing silence period.4. The pre-processing apparatus of claim 3 , wherein the trailing silence period adjusting unit decreases the length of the trailing silence period of the speech signal to the length of the reference trailing silence period claim 3 , when the length of the trailing silence period detected in the speech signal is larger ...
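
A Python sketch of trailing-silence normalisation: the silent tail is measured frame by frame and then padded or trimmed to a reference length; the energy threshold, frame size and reference length are assumptions for illustration.

    import numpy as np

    def adjust_trailing_silence(x, fs=16000, reference_s=0.5, silence_db=-50.0, frame_ms=10):
        """Measure the trailing silence of a recording and pad or trim it so its
        length matches a reference value before recognition."""
        hop = int(fs * frame_ms / 1000)
        frames = x[: len(x) // hop * hop].reshape(-1, hop)
        energy_db = 10 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
        trailing = 0
        for e in energy_db[::-1]:                  # count silent frames at the end
            if e < silence_db:
                trailing += 1
            else:
                break
        have, want = trailing * hop, int(reference_s * fs)
        if have < want:
            return np.concatenate([x, np.zeros(want - have)])    # pad with silence
        return x[: len(x) - (have - want)]                       # trim the excess

    speech = 0.1 * np.random.randn(16000)
    padded = adjust_trailing_silence(np.concatenate([speech, np.zeros(1600)]))
    print(len(padded))                             # 24000 samples: 0.1 s tail grown to 0.5 s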

28-05-2015 publication date

METHOD AND APPARATUS FOR DETECTING SPEECH/NON-SPEECH SECTION

Number: US20150149166A1
Author: Jang In Seon, LIM Woo Taek

Provided is an apparatus for detecting a speech/non-speech section. The apparatus includes an acquisition unit which obtains inter-channel relation information of a stereo audio signal, a classification unit which classifies each element of the stereo audio signal into a center channel element and a surround element on the basis of the inter-channel relation information, a calculation unit which calculates an energy ratio value between a center channel signal composed of center channel elements and a surround channel signal composed of surround elements, for each frame, and an energy ratio value between the stereo audio signal and a mono signal generated on the basis of the stereo audio signal, and a judgment unit which determines a speech section and a non-speech section from the stereo audio signal by comparing the energy ratio values. 1. An apparatus for detecting a speech/non-speech section , the apparatus comprising:an acquisition unit which obtains inter-channel relation information of a stereo audio signal;a classification unit which classifies each element of the stereo audio signal into a center channel element and a surround element on the basis of the inter-channel relation information;a calculation unit which calculates an energy ratio value between a center channel signal composed of center channel elements and a surround channel signal composed of surround elements, for each frame, and an energy ratio value between the stereo audio signal and a mono signal generated on the basis of the stereo audio signal; anda judgment unit which determines a speech section and a non-speech section from the stereo audio signal by comparing the energy ratio values.2. The apparatus of claim 1 , wherein the inter-channel relation information comprises information on a level difference between channels of the stereo audio signal and information on a phase difference between channels.3. The apparatus of claim 2 , wherein the inter-channel relation information further ...
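
As a stand-in for the centre/surround decomposition, the Python sketch below uses a simple mid/side split and marks frames with a high centre-to-surround energy ratio as speech-like; the mid/side approximation, frame size and threshold are assumptions rather than the described classification based on inter-channel relation information.

    import numpy as np

    def speech_frames_from_stereo(left, right, fs=48000, frame_ms=20, ratio_thresh=4.0):
        """Mark frames whose centre (mid) energy dominates the surround (side)
        energy as speech-like; dialogue is usually panned to the centre."""
        hop = int(fs * frame_ms / 1000)
        n = min(len(left), len(right)) // hop * hop
        mid = ((left[:n] + right[:n]) / 2).reshape(-1, hop)
        side = ((left[:n] - right[:n]) / 2).reshape(-1, hop)
        ratio = (np.mean(mid ** 2, axis=1) + 1e-12) / (np.mean(side ** 2, axis=1) + 1e-12)
        return ratio > ratio_thresh

    rng = np.random.default_rng(1)
    dialogue = rng.normal(0, 0.1, 48000)           # identical in both channels
    ambience = rng.normal(0, 0.1, (2, 48000))      # uncorrelated between channels
    flags = speech_frames_from_stereo(np.concatenate([dialogue, ambience[0]]),
                                      np.concatenate([dialogue, ambience[1]]))
    print(flags[:3], flags[-3:])                   # [ True  True  True] [False False False]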

08-09-2022 publication date

VOICE ACTIVITY DETECTION APPARATUS, LEARNING APPARATUS, AND VOICE ACTIVITY DETECTION METHOD

Number: US20220284921A1
Author: Kim Uihyun
Assignee: KABUSHIKI KAISHA TOSHIBA

According to one embodiment, a voice activity detection apparatus comprises a processing circuit. The processing circuit calculates an acoustic feature based on an acoustic signal; calculates a non-acoustic feature based on a non-acoustic signal; calculates a correlation coefficient based on the acoustic feature and the non-acoustic feature; and detects a voice section and/or a non-voice section based on a comparison of the correlation coefficient with a threshold, the voice section being a time section in which voice is presence, the non-voice section being a time section in which voice is absence. 1. A voice activity detection apparatus comprising a processing circuit configured to:calculate an acoustic feature based on an acoustic signal;calculate a non-acoustic feature based on a non-acoustic signal;calculate a correlation coefficient based on the acoustic feature and the non-acoustic feature; anddetect a voice section and/or a non-voice section based on a comparison of the correlation coefficient with a threshold, the voice section being a time section in which voice is presence, the non-voice section being a time section in which voice is absence.2. The voice activity detection apparatus according to claim 1 , wherein the non-acoustic signal is an image signal temporally synchronized with the acoustic signal.3. The voice activity detection apparatus according to claim 1 , wherein the processing circuit acquires the acoustic signal and the non-acoustic signal relating to same voice generation source.4. The voice activity detection apparatus according to claim 1 , wherein the processing circuit calculates the acoustic feature from the acoustic signal using a first trained model claim 1 , and calculates the non-acoustic feature from the non-acoustic signal using a second trained model claim 1 ,the correlation coefficient includes a first correlation coefficient calculated based on the acoustic feature and the non-acoustic feature temporally synchronized with each ...

09-05-2019 publication date

Voice Activity Detection Feature Based on Modulation-Phase Differences

Number: US20190139567A1
Assignee:

Speech processing methods may rely on voice activity detection (VAD) that separates speech from noise. Example embodiments of a computationally low complex VAD feature that is robust against various types of noise is introduced. By considering an alternating excitation structure of low and high frequencies, speech is detected with a high confidence. The computationally low complex VAD feature can cope even with the limited spectral resolution that may be typical for a communication system, such as an in-car-communication (ICC) system. Simulation results confirm the robustness of the computationally low complex VAD feature and show an increase in performance relative to established VAD features. 1. A method for detecting speech in an audio signal , the method comprising:identifying a pattern of at least one occurrence of time-separated first and second distinctive feature values of first and second feature values, respectively, in at least two different frequency bands of an electronic representation of an audio signal of speech that includes voiced and unvoiced phonemes and noise, the identifying including associating the first distinctive feature values with the voiced phonemes and the second distinctive feature values with the unvoiced phonemes, the first and second distinctive feature values representing information distinguishing the speech from the noise, the time-separated first and second distinctive feature values being non-overlapping, temporally, in the at least two different frequency bands; andproducing a speech detection result for a given frame of the electronic representation of the audio signal based on the pattern identified, the speech detection result indicating a likelihood of a presence of the speech in the given frame.2. The method of claim 1 , wherein:the first feature values represent power over time of the electronic representation of the audio signal in a first frequency band of the at least two frequency bands and wherein the first ...

04-06-2015 publication date

Voice Activity Detection (VAD) for a Coded Speech Bitstream without Decoding

Number: US20150154981A1
Assignee: Nuance Communications Inc

A system, method and computer program product are described for voice activity detection (VAD) within a digitally encoded bitstream. A parameter extraction module is configured to extract parameters from a sequence of coded frames from a digitally encoded bitstream containing speech. A VAD classifier is configured to operate with input of the digitally encoded bitstream to evaluate each coded frame based on bitstream coding parameter classification features to output a VAD decision indicative of whether or not speech is present in one or more of the coded frames.

04-06-2015 publication date

DETECTING PAUSE IN AUDIBLE INPUT TO DEVICE

Number: US20150154983A1
Assignee: Lenovo (Singapore) Pte. Ltd.

A device includes a processor and a memory accessible to the processor and bearing instructions executable by the processor to process an audible input sequence provided by a user of the device, determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device, cease to process the audible input sequence responsive to a determination that the pause has occurred, determine that providing the audible input sequence has resumed based at least partially based on a second signal from the camera, and resume processing of the audible input sequence responsive to a determination that providing the audible input sequence has resumed. 1. A device comprising:a processor;a memory accessible to the processor and bearing instructions executable by the processor to:process an audible input sequence, the audible input sequence being provided by a user of the device;determine that a pause in providing the audible input sequence has occurred at least partially based on a first signal from at least one camera communicating with the device;responsive to a determination that the pause has occurred, cease to process the audible input sequence;determine that providing the audible input sequence has resumed based at least partially based on a second signal from the camera; andresponsive to a determination that providing the audible input sequence has resumed, resume processing of the audible input sequence.2. The device of claim 1 , wherein the pause includes an audible sequence separator that is unintelligible to the device.3. The device of claim 2 , wherein the instructions are further executable by the processor to determine to cease to process the audible input sequence responsive to processing a signal from an accelerometer on the device except when also at least substantially concurrently therewith receiving the audible sequence separator.4. The device of claim 2 , wherein the ...

04-06-2015 publication date

ELECTRONIC APPARATUS AND RECORDING FILE TRANSMISSION METHOD

Number: US20150155001A1
Assignee:

According to one embodiment, an electronic apparatus includes a memory and a processor. Each of the files comprises positional information and time information, and the files are prepared by a plurality of apparatuses. The processing circuitry searches, from among files comprising a first file, for a second file corresponding to the first file based on the positional information and the time information associated with each of the files, determines whether the recorded content of the second file comprises a part of the recorded content of the first file, and transmits a third file comprising a part of the second file to a first apparatus when the recorded content of the second file comprises a part of the recorded content of the first file.

25-05-2017 publication date

SPEECH RECOGNITION METHOD AND SPEECH RECOGNITION APPARATUS TO IMPROVE PERFORMANCE OR RESPONSE OF SPEECH RECOGNITION

Number: US20170148442A1
Author: Nishikawa Tsuyoki
Assignee:

In a speech recognition method, a criteria value is determined to determine the length of a silent section included in a processing section, and a processing mode to use is determined in accordance with the criteria value. The criteria value is used to obtain audio information of the processing section. Audio processing is executed on the audio information in the processing section, using the processing mode that has been determined. Speech recognition processing is executed on the audio information in the processing section that has been subjected to audio processing. 1. A speech recognition method comprising:determining a criteria value to determine a length of a first silent section included in a processing section;determining a processing mode out of multiple processing modes for audio processing of which the processing amounts differ from each other, in accordance with the criteria value;obtaining audio information in the processing section including a section of interest and the first silent section following the section of interest, out of audio information in an input section including the processing section, using the criteria value;executing audio processing on the audio information in the section of interest, out of the audio information in the processing section, using the processing mode that has been determined; andexecuting speech recognition processing on the audio information in the section of the interest where the audio processing has been performed.2. The speech recognition method according to claim 1 , further comprising:detecting a silent section from audio information in the input section,wherein, in the determining of the criteria value, a threshold value is determined as the criteria value that is information for determining an end of the processing section and that indicates the length of the first silent section,wherein, in the determining of the processing mode, the processing mode is determined based on the threshold value, andwherein, ...

Подробнее
25-05-2017 дата публикации

VECTOR QUANTIZATION

Номер: US20170148455A1
Принадлежит:

It is inter alia disclosed to determine a first quantized representation of an input vector, and to determine a second quantized representation of the input vector based on a codebook depending on the first quantized representation 1. A method performed by an apparatus , said method comprising:grouping a plurality of vector components of an input vector into at least two groups of vector components in accordance with a rule based on the vector components, wherein each group of the at least two groups of vector components comprises at least one vector component of the plurality of vector components; anddetermining, for at least one of the at least two groups of vector components, a quantized representation of the respective group of vector components based on a codebook associated with the group of vector components.2. The method according to claim 1 , wherein said rule is one of:a rule based on energy values associated with the vector components; anda rule based on a predefined norm associated with the vector components.3. The method according to claim 2 , wherein said rule specifies that the vector components of each of the at least two groups of vector components must fulfill a predefined energy characteristic.4. The method according to claim 3 , wherein said rule specifies that an energy value associated with a respective vector component of each vector component of a first group of the at least two groups of vector components is higher than an energy value associated with a respective vector component of each vector component of each group of the remaining groups of the at least two groups of vector components.6. The method according to claim 1 , wherein said grouping said plurality of vector components of the input vector into at least two groups of vector components comprises:splitting the plurality of vector component of the input vector into said at least two groups of vector components; andswapping a vector component of a first group of the at least two ...
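
The claims describe an energy-based grouping rule: the vector components with the highest associated energy form the first group, and each group is then quantized against its own codebook. Below is a minimal Python sketch of that two-step idea, assuming toy random codebooks and a simple top-K energy split; the names (group_by_energy, quantize) and the codebook sizes are illustrative, not taken from the application.

```python
import numpy as np

def group_by_energy(x, k):
    """Split x into a high-energy group (the k components with the largest energy)
    and a remainder group; also return the index sets needed to undo the grouping."""
    order = np.argsort(x ** 2)[::-1]
    high_idx, low_idx = order[:k], order[k:]
    return x[high_idx], x[low_idx], high_idx, low_idx

def quantize(group, codebook):
    """Nearest-neighbour quantization of one group against a codebook whose rows
    are candidate vectors of the same length as the group."""
    best = int(np.argmin(np.sum((codebook - group) ** 2, axis=1)))
    return best, codebook[best]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=8)
    high, low, hi_idx, lo_idx = group_by_energy(x, k=3)

    cb_high = rng.normal(size=(16, 3))        # per-group codebooks (random here, trained in practice)
    cb_low = rng.normal(size=(16, 5))
    i_high, q_high = quantize(high, cb_high)
    i_low, q_low = quantize(low, cb_low)

    x_hat = np.empty_like(x)                  # reassemble in the original component order
    x_hat[hi_idx], x_hat[lo_idx] = q_high, q_low
    print("codebook indices:", (i_high, i_low), "error:", float(np.linalg.norm(x - x_hat)))
```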

Подробнее
25-05-2017 дата публикации

WEARABLE DEVICE, WEARABLE DEVICE SYSTEM AND METHOD FOR CONTROLLING WEARABLE DEVICE

Номер: US20170149948A1
Принадлежит:

A wearable device comprising: a band unit; and a vibration generating unit coupled to the band unit. The band unit comprises a communication unit receiving a first signal by a predetermined communication method and a control unit determining an amplification extent of the first signal received from the communication unit and generating a second signal based on the determined amplification extent, and the vibration generating unit receives the second signal from the control unit and generates a vibration corresponding to the second signal. 1. A wearable device comprisinga band unit; anda vibration generating unit coupled to the band unit,wherein the band unit comprises a communication unit receiving a first signal by a predetermined communication method and a control unit determining an amplification extent of the first signal received from the communication unit and generating a second signal based on the determined amplification extent, and the vibration generating unit receives the second signal from the control unit and generates a vibration corresponding to the second signal, wherein the control unit determines the amplification extent of the first signal based on user information, and the user information includes at least one of user's sex, height, weight, finger length or wrist-to-finger length.2. The wearable device of claim 1 , further comprising:a first transform unit performing discrete fourier transform (DFT) on the first signal;an amplification unit obtaining a gain value based on the user information, applying the gain value to the DFT performed first signal and generating a third signal; anda second transform unit performing an inverse discrete fourier transform (IDFT) on the third signal and generating the second signal.3. The wearable device of claim 2 , further comprising a memory unit storing the user information claim 2 , wherein the amplification unit obtains the gain value based on the amplification extent stored in the memory unit.4. The ...
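
Claim 2 outlines the signal path of the amplification unit: a DFT of the received first signal, a gain obtained from user information applied in the frequency domain, and an inverse DFT producing the second signal that drives the vibration generator. A minimal numpy sketch of that path follows; the mapping from user information to gain is a made-up placeholder, since the application does not spell it out here.

```python
import numpy as np

def frequency_domain_gain(signal, sample_rate, gain_db_for_bin):
    """DFT the signal, apply a per-bin gain, and return the inverse DFT (claim 2 style path)."""
    spectrum = np.fft.rfft(signal)                        # first transform unit (DFT)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    gains = 10.0 ** (gain_db_for_bin(freqs) / 20.0)       # amplification unit
    return np.fft.irfft(spectrum * gains, n=len(signal))  # second transform unit (IDFT)

# Placeholder mapping from frequency to gain; in the application the gain would be
# looked up from stored user information (for example wrist-to-finger length).
def example_gain_db(freqs):
    return np.where(freqs < 500.0, 6.0, 0.0)              # boost low frequencies by 6 dB

if __name__ == "__main__":
    fs = 8000
    t = np.arange(fs) / fs
    first_signal = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 1500 * t)
    second_signal = frequency_domain_gain(first_signal, fs, example_gain_db)
    print(second_signal.shape)
```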

Подробнее
02-06-2016 дата публикации

Computer Implemented System and Method for Identifying Significant Speech Frames Within Speech Signals

Номер: US20160155441A1
Принадлежит: TATA CONSULTANCY SERVICES LTD.

The present disclosure envisages a computer implemented system for identifying significant speech frames within speech signals for facilitating speech recognition. The system receives an input speech signal having a plurality of feature vectors which is passed through a spectrum analyzer. The spectrum analyzer divides the input speech signal into a plurality of speech frames and computes a spectral magnitude of each of the speech frames. There is provided a suitability engine which is enabled to compute a suitability measure for each of the speech frames corresponding to spectral flatness measure (SFM), energy normalized variance (ENV), entropy, signal-to-noise ratio (SNR) and similarity measure. The suitability engine further computes a weighted suitability measure for each of the speech frames. 1. A computer implemented system for identifying significant speech frames within speech signals for facilitating speech recognition , said system comprising:an input module configured to accept at least an input speech signal, wherein the speech signal is represented by a plurality of feature vectors;a spectrum analyzer cooperating with said input module to receive the input speech signal, said spectrum analyzer comprising a divider configured to divide the input speech signal into plurality of speech frames, said spectrum analyzer further configured to compute a spectral magnitude of each of the speech frames;an extractor cooperating with said spectrum analyzer to receive said speech frames, and configured to extract at least a feature vector from each of the speech frames; a spectral flatness module configured to receive the spectral magnitude of each of the speech frames and compute a spectral flatness measure to determine suitability measure for each of said speech frames;', 'an energy normalized variance module configured to receive the spectral magnitude of each of the said speech frames and compute an energy normalized variance to determine suitability measure for ...
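
The suitability engine combines several per-frame measures computed from the spectral magnitude, among them spectral flatness (SFM), energy normalized variance (ENV) and entropy, into a weighted suitability measure. The sketch below shows one way to compute those three measures and a weighted sum, assuming equal weights; the actual weighting and the SNR and similarity terms are omitted.

```python
import numpy as np

def frame_measures(frame):
    """Spectral flatness, energy-normalized variance and spectral entropy of one frame."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) + 1e-12
    power = mag ** 2
    sfm = np.exp(np.mean(np.log(power))) / np.mean(power)   # geometric over arithmetic mean
    env = np.var(power) / (np.mean(power) ** 2)              # energy-normalized variance
    p = power / power.sum()                                   # spectrum as a distribution
    entropy = -np.sum(p * np.log2(p))
    return sfm, env, entropy

def weighted_suitability(frame, weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted combination of the per-frame measures (illustrative equal weights)."""
    sfm, env, entropy = frame_measures(frame)
    # Low flatness and low entropy suggest voiced structure, so invert them here.
    scores = np.array([1.0 - sfm, env, 1.0 / (1.0 + entropy)])
    return float(np.dot(weights, scores))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    t = np.arange(400) / 16000
    voiced = np.sin(2 * np.pi * 150 * t) + 0.3 * np.sin(2 * np.pi * 450 * t)
    noise = rng.normal(size=400)
    print("voiced:", weighted_suitability(voiced), "noise:", weighted_suitability(noise))
```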

Подробнее
15-09-2022 дата публикации

METHOD OF DETECTING SPEECH AND SPEECH DETECTOR FOR LOW SIGNAL-TO-NOISE RATIOS

Номер: US20220293127A1
Принадлежит: GN Hearing A/S

The present disclosure relates in a first aspect to a method of detecting speech of incoming sound at a portable communication device. A microphone signal is divided into a plurality of separate frequency band signals from which respective power envelope signals are derived. Onsets of voiced speech of a first frequency band signal are determined based on a first stationary noise power signal and a first clean power signal and onsets of unvoiced speech in a second frequency band signal are determined based on a second stationary noise power signal and second clean power signal. 1. A method performed by a portable communication device , the method comprising:generating a microphone signal by a microphone arrangement of the portable communication device based on incoming sound;dividing the microphone signal into a plurality of separate frequency band signals comprising at least a first frequency band signal and a second frequency band signal;determining a first power envelope signal of the first frequency band signal and a second power envelope signal of the second frequency band signal;deriving a first stationary noise power signal and a first non-stationary noise power signal from the first power envelope signal;deriving a first clean power signal by subtracting the first stationary noise power signal and the first non-stationary noise power signal from the first power envelope signal;deriving a second stationary noise power signal and a second non-stationary noise power signal from the second power envelope signal;deriving a second clean power signal by subtracting the second stationary noise power signal and the second non-stationary noise power signal from the second power envelope signal;determining an onset of voiced speech that is associated with the first frequency band signal based on the first stationary noise power signal and the first clean power signal; anddetermining an onset of unvoiced speech that is associated with the second frequency band signal ...
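
The method tracks stationary noise in each band's power envelope, subtracts it to obtain a clean power signal, and declares speech onsets per band. The single-band sketch below illustrates that chain under simplifying assumptions: a frame-wise mean-power envelope, a minimum-tracking noise floor, and a fixed onset factor. A full implementation would run one such path on a low band (voiced onsets) and one on a high band (unvoiced onsets), and would also track non-stationary noise.

```python
import numpy as np

def power_envelope(band_signal, frame_len=160):
    """Mean power per frame of one frequency-band signal."""
    n = len(band_signal) // frame_len
    frames = band_signal[: n * frame_len].reshape(n, frame_len)
    return np.mean(frames ** 2, axis=1)

def detect_onsets(envelope, alpha=0.99, onset_factor=4.0):
    """Flag frames whose 'clean' power (envelope minus a tracked stationary noise
    floor) jumps above onset_factor times the floor."""
    noise = envelope[0]
    onsets = np.zeros(len(envelope), dtype=bool)
    for i, p in enumerate(envelope):
        if p < noise:
            noise = p                                # follow dips at once: stationary noise floor
        else:
            noise = alpha * noise + (1 - alpha) * p  # let the floor rise only slowly
        clean = max(p - noise, 0.0)                  # power left after noise subtraction
        onsets[i] = clean > onset_factor * (noise + 1e-12)
    return onsets

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    fs = 16000
    sig = 0.01 * rng.normal(size=fs)                 # one second of background noise
    sig[8000:9600] += np.sin(2 * np.pi * 200 * np.arange(1600) / fs)  # a voiced burst
    print(np.nonzero(detect_onsets(power_envelope(sig)))[0])          # frames flagged as onsets
```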

Подробнее
31-05-2018 дата публикации

METHOD AND DEVICE FOR SEARCHING ACCORDING TO SPEECH BASED ON ARTIFICIAL INTELLIGENCE

Номер: US20180151183A1
Автор: LI CHAO, LI Xiangang, SUN Jue
Принадлежит:

A method and a device for searching according to a speech based on artificial intelligence are provided. The method includes: identifying an input speech of a user to determine whether the input speech is a child speech; filtrating a searched result obtained according to the input speech to obtain a filtrated searched result, if the input speech is the child speech; and feeding the filtrated searched result back to the user. 1. A method for searching according to a speech based on artificial intelligence , comprising:identifying, by at least one computing device, an input speech of a user to determine whether the input speech is a child speech;filtrating, by the at least one computing device, a searched result obtained according to the input speech to obtain a filtrated searched result, if the input speech is the child speech; andfeeding, by the at least one computing device, the filtrated searched result back to the user.2. The method according to claim 1 , wherein filtrating claim 1 , by the at least one computing device claim 1 , a searched result obtained according to the input speech comprises:converting, by the at least one computing device, the input speech into a text content;obtaining, by the at least one computing device, the searched result by searching according to the text content; andfiltrating, by the at least one computing device, the searched result to remove a sensitive content unsuitable for a child.3. The method according to claim 2 , wherein obtaining claim 2 , by the at least one computing device claim 2 , the searched result by searching according to the text content comprises:searching, by the at least one computing device, according to the text content in a first database pre-established for children; andsearching, by the at least one computing device, according to the text content in a second database to obtain the searched result, if no content related to the input speech is searched in the first database.4. The method according to claim 1 ...

Подробнее
11-06-2015 дата публикации

Spectral Comb Voice Activity Detection

Номер: US20150162021A1
Принадлежит: MALASPINA LABS (BARBADOS) Inc

The various implementations described enable voice activity detection and/or pitch estimation for speech signal processing in, for example and without limitation, hearing aids, speech recognition and interpretation software, telephony, and various applications for smartphones and/or wearable devices. In particular, some implementations include systems, methods and/or devices operable to detect voice activity in an audible signal by determining a voice activity indicator value that is a normalized function of signal amplitudes associated with at least two sets of spectral locations associated with a candidate pitch. In some implementations, voice activity is considered detected when the voice activity indicator value breaches a threshold value. Additionally and/or alternatively, in some implementations, analysis of the audible signal provides a pitch estimate of detectable voice activity.
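
The voice activity indicator is described as a normalized function of signal amplitudes at two sets of spectral locations tied to a candidate pitch. A minimal sketch follows, assuming the two sets are the harmonic bins of the candidate pitch and the bins halfway between harmonics; the normalization and the threshold are illustrative.

```python
import numpy as np

def comb_indicator(frame, fs, f0):
    """Normalized contrast between spectral energy on the harmonic comb of a candidate
    pitch f0 and energy halfway between the harmonics."""
    mag = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    bin_hz = fs / len(frame)
    harmonics = np.arange(f0, fs / 2, f0)
    on_bins = np.round(harmonics / bin_hz).astype(int)
    off_bins = np.round((harmonics + f0 / 2) / bin_hz).astype(int)
    off_bins = off_bins[off_bins < len(mag)]
    on, off = mag[on_bins].sum(), mag[off_bins].sum()
    return (on - off) / (on + off + 1e-12)               # normalized to [-1, 1]

def detect_voice(frame, fs, candidates=np.arange(80, 400, 5), threshold=0.4):
    """Evaluate the indicator over candidate pitches; report detection and the best pitch."""
    scores = [comb_indicator(frame, fs, f0) for f0 in candidates]
    best = int(np.argmax(scores))
    return scores[best] > threshold, float(candidates[best])

if __name__ == "__main__":
    fs, n = 16000, 1024
    t = np.arange(n) / fs
    voiced = sum(np.sin(2 * np.pi * k * 140 * t) / k for k in range(1, 6))
    noise = np.random.default_rng(3).normal(size=n)
    print(detect_voice(voiced, fs), detect_voice(noise, fs))
```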

Подробнее
09-06-2016 дата публикации

METHOD AND DEVICE FOR PROCESSING A SOUND SIGNAL

Номер: US20160163335A1
Принадлежит: SAMSUNG ELECTRONICS CO., LTD.

A method of processing a sound signal is disclosed. The method of processing a sound signal includes receiving a sound signal from the outside of a device, converting the sound signal into a first frequency domain signal, determining whether or not the sound signal is a voice signal using the first frequency domain signal acquired through the conversion, converting the first frequency domain signal into a second frequency domain signal based on the determination, and recognizing the sound signal using the second frequency domain signal acquired through the conversion. 1. A method of processing a sound signal , the method comprising:receiving, by a device, a sound signal from an external source;converting the sound signal into a first frequency domain signal in a first frequency domain, and determining whether or not the sound signal is a voice signal using the first frequency domain signal;converting the first frequency domain signal into a second frequency domain signal in a second frequency domain based on a result of the determining; andanalyzing the sound signal using the second frequency domain signal.2. The method of claim 1 , wherein the converting the first frequency domain signal into the second frequency domain signal based on the result of the determining comprises determining that the sound signal is a voice signal.3. The method of claim 1 , wherein the converting the first frequency domain signal into the second frequency domain signal based on the result of the determining comprises using at least one method selected from upsampling claim 1 , downsampling claim 1 , interpolation claim 1 , mirroring claim 1 , and phase shifting.4. The method of claim 1 , wherein the first frequency domain includes at least one selected from a discrete Fourier transform (DFT) domain claim 1 , a discrete cosine transform (DCT) domain claim 1 , a discrete sine transform (DST) domain claim 1 , and a modified discrete cosine transform (MDCT) domain.5. The method of claim 1 , ...

Подробнее
22-09-2022 дата публикации

SYSTEMS AND METHODS FOR DETECTING COGNITIVE CHANGE BASED ON VOICE AND SMARTPHONE SENSORS

Номер: US20220301581A1
Автор: BARUCHI, Itay
Принадлежит: M.You Cognitive Technologies Ltd

Generally, systems and methods for determining a change of a cognitive capability of a user are disclosed. The method may include: receiving at least one sensor signal acquired by at least one sensor (such as an accelerometer, gyro and/or magnetometer) mounted within a mobile phone of the user; determining a voice activity dataset based on the at least one sensor signal; and determining a change of a cognitive capability of the user based on the voice activity dataset. Advantageously, the disclosed systems and methods may enable determining anomalies and trends in the cognition of the user based on the sensor(s) mounted within the mobile phone of the user, without collecting and/or recording the voice of the user. 1. A system for determining a change of cognitive capability of a user , the system comprising:a storage module configured to receive at least one sensor signal acquired by at least one sensor during a phone conversation of the user, wherein the at least one sensor is mounted within a mobile phone of the user and selected from a group consisting of: an accelerometer, gyro and magnetometer;a voice activity determination module configured to determine a voice activity dataset based on the at least one sensor signal, wherein the voice activity dataset comprises a plurality of data values each representing a specific time window and indicating a voice detection or an absence thereof at the time window thereof; anda cognitive capability determination module configured to determine a change of a cognitive capability of the user based on the voice activity dataset.2. The system of claim 1 , further comprising a voice complexity determination module configured to determine a voice complexity based on the voice activity dataset.3. The system of claim 2 , wherein the cognitive capability determination module is configured to determine the change of the cognitive capability of the user based on the voice complexity.4. The system of claim 1 , wherein the at least one ...

Подробнее
22-09-2022 дата публикации

METHOD AND APPARATUS FOR DETERMINING SPEECH PRESENCE PROBABILITY AND ELECTRONIC DEVICE

Номер: US20220301582A1
Автор: Liang Min, WANG Fabing

A method and apparatus for determining a speech presence probability and an electronic device are provided. According to present disclosure, a metric parameter of a signal to noise ratio of a signal of a first channel and a metric parameter of a signal power level difference between the first channel and the second channel are introduced in determining the speech presence probability, the normalization and non-linear transformation processing is performed on the above-mentioned metric parameters, and the speech presence probability is obtained by fitting the product term and a first power term of a power exponent of the above-mentioned parameters. Therefore, the calculation amount of calculating the speech presence probability is reduced, the calculation result has good robustness to parameter fluctuations, and the disclosure can be widely applied to various application scenarios of dual-microphone speech enhancement systems. 1. A method for determining a speech presence probability , applied to a first microphone and a second microphone configured with an End-fire structure , comprising:calculating a first metric parameter and a second metric parameter according to a signal of a first channel collected by the first microphone and a signal of a second channel collected by the second microphone, wherein the first metric parameter is a signal to noise ratio of the signal of the first channel, and the second metric parameter is a signal power level difference between the first channel and the second channel;performing normalization and non-linear transformation processing on the first metric parameter and the second metric parameter respectively to obtain a third metric parameter and a fourth metric parameter; andcalculating a speech presence probability according to the third metric parameter, the fourth metric parameter, and a predetermined formula for calculating a speech presence probability, wherein the calculating formula is obtained by fitting the product term ...
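
The probability is obtained by normalizing and non-linearly transforming two metrics, the SNR of the first channel and the power level difference between the channels, and fitting the result from a product term and a power term. The sketch below mirrors that structure with made-up fit coefficients and a logistic squashing as the non-linear transform; the application's actual formula and constants are not reproduced here.

```python
import numpy as np

def frame_power(x):
    return float(np.mean(np.asarray(x, dtype=float) ** 2)) + 1e-12

def speech_presence_probability(primary, secondary, noise_power,
                                a=1.0, b=0.5, p_exp=2.0):
    """Toy two-microphone speech presence estimate.

    primary/secondary : frames from the front and rear microphones of an end-fire pair
    noise_power       : running estimate of the noise power on the primary channel
    a, b, p_exp       : illustrative fit coefficients (real ones would come from offline fitting)
    """
    snr_db = 10 * np.log10(frame_power(primary) / noise_power)              # metric 1: SNR
    pld_db = 10 * np.log10(frame_power(primary) / frame_power(secondary))   # metric 2: level difference

    # Normalization plus a non-linear (logistic) transformation of both metrics.
    m3 = 1.0 / (1.0 + np.exp(-snr_db / 10.0))
    m4 = 1.0 / (1.0 + np.exp(-pld_db / 3.0))

    # Probability fitted from a product term and a power term of the transformed metrics.
    p = a * (m3 * m4) + b * (m3 ** p_exp)
    return float(np.clip(p, 0.0, 1.0))

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    noise_power = 0.01
    near_speech = 0.5 * rng.normal(size=256)         # strong at the front microphone
    rear_copy = 0.2 * near_speech                    # attenuated at the rear microphone
    print(speech_presence_probability(near_speech, rear_copy, noise_power))
    quiet = 0.03 * rng.normal(size=256)              # no nearby talker
    print(speech_presence_probability(quiet, quiet, noise_power))
```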

Подробнее
29-09-2022 дата публикации

SUPERIMPOSING HIGH-FREQUENCY COPIES OF EMITTED SOUNDS

Номер: US20220310111A1
Принадлежит:

An audio emitter configured to emit a sound creates a high-frequency copy of the sound to be emitted. The high-frequency copy of the sound is superimposed over the sound, resulting in a composite signal. The composite signal is emitted by the emitter. The high-frequency copy is at a frequency inaudible to humans, enabling a receiver to identify the emitter and/or the sound. 1. A method , comprising:obtaining an audio signal to be emitted from an emitter device;creating an inaudible frequency copy of the audio signal, wherein the creating comprises applying a frequency change to the audio signal; andtransmitting the audio signal and the inaudible frequency copy of the audio signal concurrently.2. The method of claim 1 , further comprising:applying a configured amplitude reduction to the inaudible frequency copy of the audio signal to reduce the volume of the inaudible frequency copy.3. The method of claim 1 , wherein applying the frequency change changes the copy of the audio signal to an inaudible frequency that does not overlap with the audio signal.4. The method of claim 1 , wherein the creating further comprises inverting audio signal and wherein the inaudible frequency copy of the audio signal in an inverted inaudible frequency copy.5. The method of claim 1 , wherein the creating further comprises:applying a plurality of frequency changes at different frequencies concurrently to result in a plurality of inaudible frequency copies; andwherein transmitting the audio signal and the inaudible frequency copy of the audio signal transmits the audio signal and each of the plurality of inaudible frequency copies.6. The method of claim 1 , wherein the creating further comprises:applying a plurality of frequency changes at different frequencies sequentially to change the applied frequency change over time.7. The method of claim 1 , further comprising configuring the frequency change for the emitter device and communicating the configuration to a receiver device.8. A ...
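
Claim 1 amounts to shifting a copy of the audio into an inaudible band, optionally attenuating it, and emitting both at once so a receiver can identify the emitter or the sound. A minimal sketch follows, assuming a simple cosine-modulation frequency shift (which produces sum and difference images; a real implementation would filter to keep only the inaudible one) and a configured amplitude reduction.

```python
import numpy as np

def add_inaudible_copy(audio, fs, shift_hz=19000.0, attenuation_db=-20.0):
    """Return audio plus a frequency-shifted, attenuated copy of itself.

    Cosine modulation moves the copy's spectrum to shift_hz plus/minus the original band;
    shift_hz is chosen here so the images land above the audible range of this toy signal.
    """
    t = np.arange(len(audio)) / fs
    carrier = np.cos(2 * np.pi * shift_hz * t)
    gain = 10.0 ** (attenuation_db / 20.0)           # configured amplitude reduction
    inaudible_copy = gain * audio * carrier          # frequency-shifted copy
    return audio + inaudible_copy                    # composite signal to emit

if __name__ == "__main__":
    fs = 48000                                       # a rate able to represent the shifted band
    t = np.arange(fs) / fs
    tone = 0.5 * np.sin(2 * np.pi * 440 * t)         # audible content
    composite = add_inaudible_copy(tone, fs)
    spectrum = np.abs(np.fft.rfft(composite))
    freqs = np.fft.rfftfreq(len(composite), 1 / fs)
    print(freqs[np.argsort(spectrum)[-3:]])          # expect 440 Hz and 19000 -/+ 440 Hz
```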

Подробнее
01-07-2021 дата публикации

BACKGROUND NOISE ESTIMATION AND VOICE ACTIVITY DETECTION SYSTEM

Номер: US20210201936A1
Принадлежит:

A method includes selecting a frame of an audio signal. The method further includes determining a first power spectral density (PSD) distribution of the frame. The method further includes generating a first reference PSD distribution indicating an estimate of background noise in the frame based on a non-linear weight, a second reference PSD distribution of a previous frame of the audio signal, and a second PSD distribution of the previous frame. The method further includes determining whether voice activity is detected in the frame based on the first PSD distribution of the frame and the first reference PSD distribution. 1. A method comprising:selecting a frame of an audio signal;determining a first power spectral density (PSD) distribution of the frame;generating a first reference PSD distribution indicating an estimate of background noise in the frame based on a non-linear weight, a second reference PSD distribution of a previous frame of the audio signal, and a second PSD distribution of the previous frame; anddetermining whether voice activity is detected in the frame based on the first PSD distribution of the frame and the first reference PSD distribution.2. The method of claim 1 , further comprising generating the non-linear weight based on a divergence between the second PSD distribution and the second reference PSD distribution.3. The method of claim 2 , wherein the divergence corresponds to a Kullback-Leibler divergence.4. The method of claim 1 , further comprising:generating a first entropy of the first PSD distribution;generating a second entropy of the first reference PSD distribution;generating a first energy of the first reference PSD distribution; andgenerating a second energy of the first reference PSD distribution, wherein determining whether voice activity is detected in the frame based on the first PSD distribution of the frame and the first reference PSD distribution includes determining whether voice activity is detected in the frame based on ...
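
Claims 1 to 3 update a reference PSD, the background-noise estimate, with a non-linear weight derived from the Kullback-Leibler divergence between the previous frame's PSD and the previous reference, and then compare the current PSD with the reference to decide voice activity. A minimal sketch follows, assuming band-averaged PSDs, an exponential mapping from divergence to weight and a simple divergence threshold for the decision; the entropy and energy comparisons of claim 4 are left out.

```python
import numpy as np

def psd_distribution(frame, n_bands=16):
    """Band-averaged power spectral density of a frame, normalized to sum to one."""
    p = np.abs(np.fft.rfft(frame * np.hanning(len(frame)))) ** 2 + 1e-12
    band_power = np.array([band.sum() for band in np.array_split(p, n_bands)])
    return band_power / band_power.sum()

def kl_divergence(p, q):
    return float(np.sum(p * np.log(p / q)))

def run_vad(frames, decision_threshold=0.5):
    """Per-frame VAD with a reference PSD that follows the input in noise and freezes in speech."""
    reference = psd_distribution(frames[0])
    prev = reference
    decisions = []
    for frame in frames:
        current = psd_distribution(frame)
        # Non-linear weight from the divergence between the previous frame and the reference:
        # a small divergence (noise-like frame) gives a small weight, so the reference
        # moves toward the input; a large divergence keeps the old reference.
        weight = 1.0 - np.exp(-kl_divergence(prev, reference))
        reference = weight * reference + (1.0 - weight) * prev
        decisions.append(kl_divergence(current, reference) > decision_threshold)
        prev = current
    return np.array(decisions)

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    fs, n = 16000, 512
    t = np.arange(n) / fs
    noise_frames = [0.1 * rng.normal(size=n) for _ in range(20)]
    speech_frames = [np.sin(2 * np.pi * 220 * t) + 0.1 * rng.normal(size=n) for _ in range(5)]
    print(run_vad(noise_frames + speech_frames))
```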

Подробнее
23-06-2016 дата публикации

HEARING DEVICE WITH IMAGE CAPTURE CAPABILITIES

Номер: US20160183014A1
Принадлежит: OTICON A/S

The present disclosure relates to hearing devices including an image capture device. Features of sound processing may be based on analysis of images from the image capture device. 1. Hearing device having a first part configured to be positioned behind the pinna of a user and a second part configured to be positioned in the ear canal of the user , a third part configured to mechanically connect the first part to the second part , the first part comprising: 'a processor for processing the received sound signal,', 'an input transducer for receiving sound signal,'} the processor is configured to detect presence of a face via the image capture device, and determine time instants of voice presence and voice absence from the face, and the processor is adapted to operate signal processing algorithms based on the detection, and', 'the hearing device including an output transducer for providing the processed sound signal perceivable as sound to the user, the output transducer being positioned in the first part or the second part., 'an image capture device in communication with the processor, the image capture device positioned in the housing so that the image capture captures images in the direction of the nose of the wearer, and'}2. The hearing device according to claim 1 , wherein the determination of time instants of the voice presence and voice absence is based on a combination of the image capture device and the input transducer.3. The hearing device according to claim 1 , wherein the image capture device is positioned remote from a housing of the hearing device and the image capture device is in wired or wireless communication with the processor.4. The hearing device according to claim 1 , wherein the processor is configured to detect lip movement in image sequences captured by the image capture device claim 1 , such as the processor is configured to identifying presence of vowels and/or consonants in speech claim 1 , and/or the processor is configured to determining ...

Подробнее
30-06-2016 дата публикации

APPARATUS AND METHOD FOR AUTOMATICALLY CREATING AND RECORDING MINUTES OF MEETING

Номер: US20160189107A1
Автор: LIU YOUNG-WAY
Принадлежит:

A computing device for automatically acquiring and revising minutes of a meeting and a method thereof includes the steps of: identifying one or more silences or notional silences (unvoiced segments) in voice data; determining a segment as being a satisfactory unvoiced segment if the gap of silence lasts for a time period equal to or larger than a predetermined period; dividing the audio data or text representing the audio data into one or more passages of text according to the satisfactory unvoiced segment, and creating an original minutes of the meeting according to the audio data or the representative text being divided into passages and a meeting minutes template stored in the non-transitory storage medium. 1. A electronic apparatus-based method for automatically creating minutes of a meeting the electronic device having at least one processor and a non-transitory storage medium coupled to the at least one processor and configured to store instructions , the method comprising:identifying, by the at least one processor, one or more unvoiced segments of audio data;determining, by the at least one processor, a segment as being a satisfactory unvoiced segment if the gap of silence lasts for a time period equal to or larger than a predetermined period;dividing, by the at least one processor, the audio data or text associating with the audio data into one or more passages of text according to the satisfactory unvoiced segment; andcreating, by the at least one processor, an original minutes of the meeting according to the divided audio data or the divided text and a meeting minutes template stored in the non-transitory storage medium.2. The method as claimed in claim 1 , further comprising: editing the original minutes of the meeting according to at least one predetermined revising and editing rule claim 1 , to obtain a minutes of the meeting.3. The method as claimed in claim 1 , wherein:the “identifying” comprises: identifying the one or more unvoiced segments of audio ...

Подробнее
30-06-2016 дата публикации

APPARATUS AND METHOD FOR AUTOMATICALLY CREATING AND RECORDING MINUTES OF MEETING

Номер: US20160189713A1
Автор: LIU YOUNG-WAY
Принадлежит:

An electronic apparatus for automatically acquiring and revising minutes of a meeting and a method thereof includes the steps of identifying one or more speakers from audio signals which are recorded during a meeting, based on pre-sampled audio signals and a voice feature table stored in a non-transitory storage medium. The audio signals are converted to text and divided into paragraphs, one paragraph being attributable to one speaker, and each speaker has a given user name. An original minutes of the meeting, based on the text and a meeting minutes template stored in the non-transitory storage medium, is prepared and revised and issued to all relevant persons. 1. An electronic apparatus-based method for automatically creating minutes of a meeting , the electronic device having at least one processor and a non-transitory storage medium coupled to the at least one processor and configured to store instructions , the method comprising:identifying, by the at least one processor, one or more users corresponding to audio signals of speech received during a meeting, based on the audio signals and a voice feature table stored in the non-transitory storage medium;converting, by the at least one processor, the audio signals of speech to a text comprising one or more user names of the identified one or more users, each user having a user name; andcreating, by the at least one processor, an original minutes of the meeting according to the text and a meeting minutes template stored in the non-transitory storage medium.2. The method as claimed in claim 1 , further comprising: editing the original minutes of the meeting according to at least one predetermined revising and editing rule claim 1 , to obtain a minutes of the meeting.3. The method as claimed in claim 2 , wherein the at least one predetermined revising and editing rule is to divide the text into one or more passages or paragraphs claim 2 , at the beginning of each is user name of an attendee of the meeting.4. The ...

Подробнее
13-06-2019 дата публикации

KEYWORD CONFIRMATION METHOD AND APPARATUS

Номер: US20190180734A1
Автор: Liu Yong, Yao Haitao
Принадлежит:

A keyword confirmation method and apparatus are provided. A keyword confirmation method includes: obtaining first audio data, the first audio data being recognized as a keyword; obtaining a pronunciation similarity probability of a similar pronunciation unit corresponding to at least one fragment of the first audio data and second audio data; determining that multiple contiguous silence fragments exist in second audio data contiguous in time with the first audio data; utilizing the silence probability, as well as a pronunciation similarity probability corresponding to fragment(s) of the first audio data and/or a pronunciation similarity probability corresponding to fragment(s) of the second audio data, evaluating whether the second audio data is silence; and confirming the first audio data as an effective keyword. 1. A method comprising:obtaining first audio data, the first audio data being recognized as a keyword;determining that multiple contiguous silence fragments exist in second audio data contiguous in time with the first audio data; andconfirming the first audio data as an effective keyword.2. The method of claim 1 , wherein the step of determining that multiple contiguous silence fragments exist in second audio data contiguous in time with the first audio data comprises:determining a pronunciation similarity probability of the fragments, the pronunciation similarity probability being the largest similarity probability of the fragments among multiple pronunciation units;determining a silence probability of the fragments, the silence probability being a similarity probability between the fragments and silence units;determining the fragments as silence fragments upon determining that a relationship between the pronunciation similarity probability and the silence probability satisfies a preset condition; anddetermining, based on the determined silence fragments, that the second audio data has multiple contiguous silence fragments therein.3. The method of claim 1 ...

Подробнее
16-07-2015 дата публикации

DETECTION OF CHOPPED SPEECH

Номер: US20150199979A1
Принадлежит:

Methods and systems are provided for detecting chop in an audio signal. A time-frequency representation, such as a spectrogram, is created for an audio signal and used to calculate a gradient of mean power per frame of the audio signal. Positive and negative gradients are defined for the signal based on the gradient of mean power, and a maximum overlap offset between the positive and negative gradients is determined by calculating a value that maximizes the cross-correlation of the positive and negative gradients. The negative gradient values may be combined (e.g., summed) with the overlap offset, and the combined values then compared with a threshold to estimate the amount of chop present in the audio signal. The chop detection model provided is low-complexity and is applicable to narrowband, wideband, and superwideband speech. 1. A method for detecting chop in an audio signal , the method comprising:creating a time-frequency representation for an audio signal;calculating a gradient of mean power per frame of the audio signal based on the time-frequency representation;determining an overlap offset between positive values of the gradient and negative values of the gradient;combining the positive values of the gradient or the negative values of the gradient with the overlap offset; andestimating an amount of chop in the audio signal based on a comparison of the combined values to a threshold.2. The method of claim 1 , further comprising defining positive and negative gradient signals based on the calculated gradient of mean power claim 1 , wherein the positive gradient signal includes the positive values of the gradient and the negative gradient signal includes the negative values of the gradient.3. The method of claim 2 , wherein determining the overlap offset between the positive values of the gradient and the negative values of the gradient includes calculating a value that maximizes the cross-correlation of the positive gradient signal and the negative gradient ...
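
The pipeline is: per-frame mean power from a time-frequency representation, its gradient split into positive and negative parts, the lag that maximizes their cross-correlation, and a thresholded combination at that lag. The sketch below follows that pipeline with an illustrative 6 dB threshold; the frame length and the exact combination rule are assumptions.

```python
import numpy as np

def mean_power_per_frame(signal, frame_len=256):
    """Mean power per frame taken from a short-time spectral (time-frequency) representation."""
    n = len(signal) // frame_len
    frames = signal[: n * frame_len].reshape(n, frame_len)
    spectra = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1)) ** 2
    return spectra.mean(axis=1)

def chop_score(signal, frame_len=256, threshold_db=6.0):
    """Combine the negative power gradient with the positive one at the best cross-correlation
    lag and sum whatever exceeds the threshold; larger scores suggest more chop."""
    power_db = 10 * np.log10(mean_power_per_frame(signal, frame_len) + 1e-12)
    grad = np.diff(power_db)
    pos = np.where(grad > 0, grad, 0.0)              # power rising (ends of dropouts)
    neg = np.where(grad < 0, grad, 0.0)              # power falling (starts of dropouts)
    xcorr = np.correlate(pos, -neg, mode="full")     # how well falls line up with rises
    offset = int(np.argmax(xcorr)) - (len(grad) - 1)
    combined = pos + np.roll(neg, -offset)           # negative gradient shifted onto the positive one
    return float(np.sum(np.abs(combined[np.abs(combined) > threshold_db])))

if __name__ == "__main__":
    fs = 8000
    t = np.arange(2 * fs) / fs
    speech_like = np.sin(2 * np.pi * 180 * t) * (1 + 0.3 * np.sin(2 * np.pi * 3 * t))
    chopped = speech_like.copy()
    for start in range(0, len(chopped), 4000):       # zero out short stretches to simulate chop
        chopped[start:start + 800] = 0.0
    print("clean:", chop_score(speech_like), "chopped:", chop_score(chopped))
```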

Подробнее
06-07-2017 дата публикации

ACOUSTIC MAP COMMAND CONTEXTUALIZATION AND DEVICE CONTROL

Номер: US20170194008A1
Принадлежит:

Provided is a system where users performing a coordinated process are localized in a complex environment based upon audio input. Audio commands are detected and executed based on system user vocalization. Available commands are limited by user status, location, process type and process progress. Command execution is limited by the presence and locations of system users, non-users, or extraneous equipment. 1. A system comprising:acoustical sensors disposed as an array, coupled to an analog to digital converter and a digital data output transceiver, forming an acoustic array sensor;at least one acoustic array sensor operatively connected to an audio analysis processor;the audio analysis processor operatively connected to a command database and a control module;the command and control module operatively connected to one or more automated machines.2. The system of where in multiple array sensor are distributed throughout a designated control area.3. The system of wherein the audio analysis processor is configured to receive inputs from multiple audio acquisition arrays claim 1 , and parse the inputs to determine at least one of the presence claim 1 , identity claim 1 , and location of multiple users relative to the positions of the audio acquisition arrays.4. The system of wherein the database is interchangeably localized to specific languages.5. The system of wherein the database is interchangeably configured to contain commands relevant to a particular process.6. The system of wherein the acoustic array sensors are distributed throughout multiple distinct environments containing at least one of a multiplicity of individual users and machines.7. The system of wherein non-vocal audio signals are used to at least one of identify claim 1 , localize claim 1 , and command automated machines responsive to the command and control module.8. The system of comprising at least two audio sensors oriented to cover a separated designated primary area claim 1 , further oriented to ...

Подробнее
06-07-2017 дата публикации

BAND EXPANDER, RECEPTION DEVICE, BAND EXPANDING METHOD FOR EXPANDING SIGNAL BAND

Номер: US20170194013A1
Автор: ONODA Tatsuya
Принадлежит: JVC KENWOOD CORPORATION

An oversampling LPF unit receives a sound signal. A differentiator differentiates the sound signal. An overtone computation unit generates an overtone signal by multiplying a signal differentiated by the differentiator by the sound signal from the oversampling LPF unit. A HPF unit filters the overtone signal generated by the overtone computation unit. A combiner combines the overtone signal filtered by the HPF unit and the sound signal from the oversampling LPF unit. 1. A band expander , comprising:a differentiator that differentiates an input signal;an overtone computation unit that generates an overtone signal by multiplying a signal differentiated by the differentiator by the input signal;a high-pass filter unit that filters the overtone signal generated by the overtone computation unit; anda combiner that combines the overtone signal filtered by the high-pass filter unit and the input signal.2. A reception device provided with a band expansion function supplied with a received sound signal as an input signal and expanding frequency band of the input signal , comprising:a differentiator that differentiates the input signal;an overtone computation unit that generates an overtone signal by multiplying a signal differentiated by the differentiator by the input signal;a high-pass filter unit that filters the overtone signal generated by the overtone computation unit; anda combiner that combines the overtone signal filtered by the high-pass filter unit and the input signal.3. The band expander according to claim 1 , whereinthe input signal is sampled at a first sampling frequency and the sampled at a second sampling frequency higher than the first sampling frequency, anda cutoff frequency of the high-pass filter unit is configured to be equal to or less than ½ of the first sampling frequency.4. The band expander according to claim 1 , further comprising:a voiced/unvoiced detector that detects whether the input signal represents a voiced sound or an unvoiced sound; ...
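
The overtone path is: differentiate the input, multiply the derivative by the input (for a sine this yields a component at twice the frequency), high-pass filter the product, and add it to the original signal. A minimal sketch using np.gradient as the differentiator and a Butterworth high-pass from scipy; the cutoff, gain and normalization are illustrative, and the voiced/unvoiced switching of the later claims is not shown.

```python
import numpy as np
from scipy.signal import butter, lfilter

def expand_band(x, fs, cutoff_hz=3000.0, overtone_gain=0.5):
    """Generate overtones as x * dx/dt, keep only their high-frequency part, and mix them back in."""
    dx = np.gradient(x)                               # differentiator
    overtone = x * dx                                 # product holds a double-frequency component
    overtone /= np.max(np.abs(overtone)) + 1e-12      # crude level alignment
    b, a = butter(4, cutoff_hz / (fs / 2), btype="high")
    overtone = lfilter(b, a, overtone)                # HPF unit: drop the low part of the product
    return x + overtone_gain * overtone               # combiner

if __name__ == "__main__":
    fs = 48000                                        # already oversampled, as after the oversampling LPF
    t = np.arange(fs // 10) / fs
    narrowband = np.sin(2 * np.pi * 2500 * t)
    expanded = expand_band(narrowband, fs)
    spectrum = np.abs(np.fft.rfft(expanded))
    freqs = np.fft.rfftfreq(len(expanded), 1 / fs)
    print(freqs[np.argsort(spectrum)[-2:]])           # expect energy at 2500 Hz and near 5000 Hz
```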

Подробнее
21-07-2016 дата публикации

SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM

Номер: US20160210987A1
Автор: Sugiyama Akihiko
Принадлежит:

This invention enables effective detection of an abrupt change in a signal. The signal processing apparatus includes a converter that converts an input signal into a phase component signal in a frequency domain, a first calculator that calculates a first phase gradient as a gradient of the phase of the phase component signal, a second calculator that calculates a second phase gradient using the first phase gradients at a plurality of frequencies, and a determiner that determines the presence of an abrupt change in the input signal based on the first phase gradients and the second phase gradient. 1. A signal processing apparatus comprising: a converter that converts an input signal into a phase component signal in a frequency domain; a first calculator that calculates a first phase gradient of the phase component signal for each of a plurality of frequencies; a second calculator that calculates a second phase gradient at a plurality of frequencies using the first phase gradients; and a determiner that determines presence of an abrupt change in the input signal based on the first phase gradient and the second phase gradient. 2. The signal processing apparatus according to claim 1, wherein said second calculator calculates the second phase gradient at a plurality of frequencies using the first phase gradients and an amplitude or a power. 3. The signal processing apparatus according to claim 1, wherein said determiner determines the presence of an abrupt change in the input signal based on a similarity between the first phase gradient and the second phase gradient. 4. The signal processing apparatus according to claim 3, wherein said determiner determines that the abrupt change in the signal exists at a frequency at which a difference between the first phase gradient and the second phase gradient does not exceed a predetermined value. 5. The signal processing apparatus according to claim 1, wherein said second calculator calculates an average value of the ...
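
Claim 4's test, flagging frequencies where the first phase gradient stays within a predetermined distance of the second, aggregated gradient, can be illustrated briefly. The sketch reads the first gradient as the unwrapped phase difference across the frequency bins of one frame (a group-delay estimate) and the second as its amplitude-weighted average (claim 2 style); the tolerance and the agreement ratio are made-up constants.

```python
import numpy as np

def detect_abrupt_change(frame, agreement_tol=0.05, agreement_ratio=0.6):
    """Flag a frame as holding an impulse-like event when the per-frequency phase gradient
    agrees with its amplitude-weighted average at most frequencies."""
    spectrum = np.fft.rfft(frame)
    amplitude = np.abs(spectrum)[1:]                        # skip the DC bin
    phase = np.unwrap(np.angle(spectrum))
    first_gradient = np.diff(phase)                         # phase slope across frequency bins
    weights = amplitude / (amplitude.sum() + 1e-12)
    second_gradient = float(np.sum(weights * first_gradient))   # aggregated slope
    agree = np.abs(first_gradient - second_gradient) < agreement_tol
    return float(np.mean(agree)) > agreement_ratio

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    click = np.zeros(512)
    click[200] = 1.0                                        # an impulse: common group delay everywhere
    noise = rng.normal(size=512)
    print("click:", detect_abrupt_change(click), "noise:", detect_abrupt_change(noise))
```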

Подробнее
02-10-2014 дата публикации

INTELLIGENT INTERACTIVE VOICE COMMUNICATION SYSTEM AND METHOD

Номер: US20140297272A1
Автор: Saleh Fahim
Принадлежит:

The present invention generally relates to intelligent voice communication systems. Specifically, this invention relates to systems and methods for providing intelligent interactive voice communication services to users of a telephony means. Preferred embodiments of the invention are directed to providing interactive voice communication services in the form of intelligent and interactive automated prank calling services. 1. A web-based system for providing intelligent interactive voice communications , the system comprising:a voice processing module comprising computer-executable code stored in non-volatile memory;a response processing module comprising computer-executable code stored in non-volatile memory;a processor; anda communications means,wherein said voice processing module, said response processing module, said processor, and said communications means are operably connected and are configured to:receive a voice communication from a call participant;identify one or more complex speech elements from said voice communication, wherein said one or more complex speech elements are selected from the group comprising tone, pitch, inflection, pause, tempo, volume, consistency and fluidity;generate a speech analysis based on said one or more complex speech elements;determine a response, wherein said response is based at least in part on said speech analysis;transmit said response via said communications means.2. The system of claim 1 , wherein said response is a complex response type claim 1 , selected from the group comprising an interruption claim 1 , a sound response claim 1 , a third-party contact inclusion and a switch in voice response.3. The system of claim 2 , wherein said complex response type is an interruption that is transmitted concurrently with receipt of said voice communication.4. The system of claim 2 , wherein said speech analysis comprises information selected from the group comprising call participant gender claim 2 , call participant tone claim 2 ...

Подробнее
27-07-2017 дата публикации

ROOM PRIVACY DEVICE

Номер: US20170213440A1
Автор: OLarte David J.
Принадлежит:

A room privacy device sounds an audible alarm when a door to a room is not closed and sound is detected in the room. This helps prompt people in the room to close the door to maintain the privacy of the conversation and/or prevent sound in the room from disturbing others outside the room. In some embodiments, the room privacy device includes a door sensor for detecting whether a door is open, a microphone for capturing sound, and a speaker for providing an audible alarm. The room privacy device also includes a processor that causes the speaker to sound an audible alarm based on a signal from the door sensor indicative of the door being open and a signal from the microphone indicative of sound being detected. 1. A device for providing a room privacy alarm , the device comprising:a door sensor;a microphone;a speaker; anda processor configured to receive a first signal from the door sensor, receive a second signal from the microphone, and cause the speaker to sound an audible alarm in response to determining a door is open based on the first signal in conjunction with identifying sound being detected by the microphone with a volume that satisfies a threshold volume level based on the second signal.2. The device of claim 1 , wherein the door sensor comprises two electrical contacts.3. The device of claim 1 , wherein the door sensor comprises a magnetic switch and a magnet.4. The device of claim 1 , wherein the door sensor comprises a photoelectric sensor and photoelectric transmitter.5. The device of claim 1 , wherein the microphone comprises a directional microphone.6. The device of claim 1 , wherein the device includes a main body having an enclosure that houses the microphone claim 1 , the speaker claim 1 , and the processor.7. The device of claim 1 , wherein the processor causes the speaker to sound the audible alarm when the sound of the second signal detected using the microphone is determined to comprise human speech.8. The device of claim 15 , wherein the device ...

Подробнее
26-07-2018 дата публикации

METHOD FOR IMPROVING A PICKED-UP SIGNAL IN A HEARING SYSTEM AND BINAURAL HEARING SYSTEM

Номер: US20180213337A1
Принадлежит:

A method improves a picked-up signal in a hearing system. The hearing system has at least one hearing device, particularly a hearing aid. The hearing aid device has an associated first directional microphone that has an adjustable first directional characteristic with a preferential direction. The first directional microphone converts sound into a first signal that is adopted in the picked-up signal. A speech activity of a user of the hearing system is monitored, and recognition of a speech activity of the user prompts the preferential direction of the first directional characteristic to be adjusted in comparison with a frontal direction of the user such that the sound sensitivity of the first directional microphone undergoes attenuation in the frontal direction. 1. A method for improving a picked-up signal in a hearing system having at least one hearing device , the at least one hearing aid device containing an associated first directional microphone having an adjustable first directional characteristic with a preferential direction , which comprises the steps of:converting, via the first directional microphone, sound into a first signal that is adopted in the picked-up signal;monitoring speech activity of a user of the hearing system; andprompting the preferential direction of the first directional characteristic to be adjusted in comparison with a frontal direction of the user such that sound sensitivity of the first directional microphone undergoes attenuation in the frontal direction in dependence on recognition of the speech activity of the user.2. The method according to claim 1 , which further comprises adjusting the preferential direction of the first directional characteristic at an angle of between 5° and 20° in comparison with the frontal direction of the user.3. The method according to claim 1 , which further comprises monitoring the picked-up signal for a hearing situation having a directional main sound source claim 1 , and wherein without recognition ...

Подробнее
16-10-2014 дата публикации

Information Processing Apparatus and Control Method

Номер: US20140309997A1
Автор: Hanyu Tomohiro
Принадлежит: KABUSHIKI KAISHA TOSHIBA

One embodiment provides an information processing apparatus, including: a sound synthesizer configured to output a combined sound signal of an alarm sound signal and a remaining sound signal other than the alarm sound signal; a sound separator configured to separate the combined sound signal into a human voice signal and a background sound signal; an alarm sound detector configured to detect whether a background sound corresponding to the background sound signal output from the sound separator includes an alarm sound or not; and an alarm sound receiver configured to, when the alarm sound detector detects that the background sound includes the alarm sound, read the alarm sound signal corresponding to the detected alarm sound, and further combine the read alarm sound signal with the combined sound signal to thereby output an adjusted combined sound signal. 1. An information processing apparatus, comprising: a sound synthesizer configured to output a combined sound signal of an alarm sound signal and a remaining sound signal other than the alarm sound signal; a sound separator configured to separate the combined sound signal into a human voice signal and a background sound signal; an alarm sound detector configured to detect whether a background sound corresponding to the background sound signal output from the sound separator includes an alarm sound or not; and an alarm sound receiver configured to, when the alarm sound detector detects that the background sound includes the alarm sound, read the alarm sound signal corresponding to the detected alarm sound, and further combine the read alarm sound signal with the combined sound signal to thereby output an adjusted combined sound signal. 2. The apparatus of claim 1, wherein the alarm sound detector detects whether the alarm sound is being emitted or not from the combined sound signal. 3. The apparatus of claim 1, further comprising: an alarm sound data storage configured to hold alarm sound data, wherein the alarm ...

Подробнее
20-08-2015 дата публикации

APPARATUS FOR DETECTING VOICE AND CONTROLLING METHOD THEREOF

Номер: US20150235656A1
Автор: Kim Joo Hyun
Принадлежит: SAMSUNG ELECTRO-MECHANICS CO., LTD.

Embodiments of the invention provide a method and apparatus for detecting a voice. The apparatus, according to various embodiments of the invention, includes a driving signal processing unit configured to generate a driving signal using a first output voltage induced from an initial voice signal sensed by a piezoelectric and a first preset reference voltage. The apparatus, according to various embodiments of the invention, further includes a controlling unit configured to be turned-on only in the case in which the driving signal is applied from the driving signal processing unit to receive a voice signal from a microphone and determining whether or not a specific word is included in the voice signal. 1. An apparatus for detecting a voice , the apparatus comprising:a driving signal processing unit configured to generate a driving signal using a first output voltage induced from an initial voice signal sensed by a piezoelectric and a first preset reference voltage; anda controlling unit configured to be turned-on only in the case in which the driving signal is applied from the driving signal processing unit to receive a voice signal from a microphone and determining whether or not a specific word is included in the voice signal.2. The apparatus for detecting the voice as set forth in claim 1 , wherein claim 1 , when the driving signal is applied claim 1 , the controlling unit is configured to vary the first reference voltage in the case in which the specific word is not detected from the voice signal for a predetermined time claim 1 , and claim 1 , when the driving signal is not applied claim 1 , the controlling unit is configured to compare the first reference voltage and a second reference voltage with each other at a predetermined period to vary the first reference voltage.3. The apparatus for detecting the voice as set forth in claim 1 , wherein the driving signal processing unit comprises:an initial voice signal sensing module configured to amplify a voltage ...

Подробнее
25-07-2019 дата публикации

AUDIO CIRCUIT AND METHOD FOR DETECTING SOUND ACTIVITY

Номер: US20190227614A1
Автор: Grand Emmanuel
Принадлежит:

A circuit for sound activity detection includes a transducer adapted to generate an electrical signal based on detected sound; a variable gain amplifier adapted to amplify the electrical signal to generate an amplified electrical signal; a comparator adapted to compare the amplified electrical signal with at least one first threshold level to generate a comparison signal indicating comparator events; and a control circuit adapted to generate, based on the comparison signal, a gain control signal for controlling the gain of the variable gain amplifier, and a sound activity alert signal indicating the detection of sound activity. 1. A circuit for sound activity detection comprising:a transducer adapted to generate an electrical signal based on detected sound;a variable gain amplifier adapted to amplify the electrical signal to generate an amplified electrical signal;a comparator adapted to compare the amplified electrical signal with at least one first threshold level to generate a comparison signal indicating comparator events; anda control circuit adapted to generate, based on the comparison signal (COMP), a gain control signal for controlling the gain of the variable gain amplifier, and a sound activity alert signal indicating the detection of sound activity, wherein the control circuit is adapted to assert the sound activity alert signal if the gain control signal remains below a second threshold for more than a second time duration.2. The circuit of claim 1 , wherein the control circuit is configured to generate the gain control signal in order to decrease the gain if the comparison signal indicates a comparator event having a duration greater than a first time duration.3. The circuit of claim 2 , wherein the control circuit is configured to generate the gain control signal in order to decrease the gain in response to each comparator event indicated by the comparison signal.4. The circuit of . wherein the control circuit is configured to generate the second ...

Подробнее
13-11-2014 дата публикации

Method and Apparatus for Performing Voice Activity Detection

Номер: US20140337020A1
Автор: Wang Zhe
Принадлежит:

A voice activity detection (VAD) apparatus configured to provide a voice activity detection decision for an input audio signal. The VAD apparatus includes a state detector and a voice activity calculator. The state detector is configured to determine, based on the input audio signal, a current working state of the VAD apparatus among at least two different working states. Each of the at least two different working states is associated with a corresponding working state parameter decision set which includes at least one voice activity detection parameter. The voice activity calculator is configured to calculate a voice activity detection parameter value for the at least one voice activity detection parameter of the working state parameter decision set associated with the current working state, and to provide the voice activity detection decision by comparing the calculated voice activity detection parameter value with a threshold. 1. A voice activity detection (VAD) apparatus comprising:a receiving unit configured to receive an input audio signal;a state detector configured to determine a current working state of the VAD apparatus based on the input audio signal, wherein the VAD apparatus has at least two different working states, wherein each of the at least two different working states is associated with a corresponding working state parameter decision set (WSPDS), and wherein each WSPDS includes at least one voice activity detection parameter (VADP);a voice activity calculator configured to calculate a value for the at least one VADP of the WSPDS associated with the current working state, and to generate a voice activity detection decision (VADD) by comparing the calculated VADP value with a threshold; andan output unit configured to output the VADD.2. The VAD apparatus according to claim 1 , wherein the VADD is generated by the voice activity calculator by using sub-band segmental signal to noise ratio (SNR) based voice activity detection parameters (VADPs).3. ...

Подробнее
30-08-2018 дата публикации

Method and device for detecting voice activity based on image information

Номер: US20180247651A1

Provided is a method of detecting a voice section, including detecting from at least one image an area where lips exist, obtaining a feature value of movement of the lips in the detected area based on a difference between pixel values of pixels included in the detected area, and detecting the voice section from the at least one image based on the feature value.
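
The feature of lip movement is obtained from differences between pixel values inside the detected lip area, and the voice section is the stretch where that feature is high. A minimal sketch on grayscale lip-region crops, assuming lip detection has already happened and using an arbitrary movement threshold.

```python
import numpy as np

def lip_movement_feature(prev_lip_region, lip_region):
    """Mean absolute pixel difference between consecutive crops of the lip area."""
    diff = np.abs(lip_region.astype(float) - prev_lip_region.astype(float))
    return float(diff.mean())

def detect_voice_sections(lip_regions, threshold=8.0):
    """Per-frame voice/no-voice flags from lip movement; the threshold is illustrative."""
    flags = [False]
    for prev, cur in zip(lip_regions, lip_regions[1:]):
        flags.append(lip_movement_feature(prev, cur) > threshold)
    return flags

if __name__ == "__main__":
    rng = np.random.default_rng(8)
    still = [np.full((32, 48), 120, dtype=np.uint8)] * 5                     # unmoving lips
    moving = [rng.integers(80, 160, size=(32, 48), dtype=np.uint8) for _ in range(5)]
    print(detect_voice_sections(still + moving))
```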

Подробнее
30-08-2018 дата публикации

Detector and Method for Voice Activity Detection

Номер: US20180247661A1
Автор: SEHLSTEDT Martin
Принадлежит:

A signal activity detector (SAD) combines at least three decision signals to generate a combined decision signal as input to a hangover addition circuit of the SAD. Each of the decision signals indicating whether or not activity is detected in the input signal according to respective decision criteria. The SAD sends the combined decision signal to the hangover addition circuit to generate a final decision signal of the SAD as to whether or not activity is detected in the input signal. 1. A method , implemented in a signal activity detector (SAD) , for detecting activity in an input signal , the method comprising:combining at least three decision signals to generate a combined decision signal as input to a hangover addition circuit of the SAD, each of the decision signals indicating whether or not activity is detected in the input signal according to respective decision criteria;sending the combined decision signal to the hangover addition circuit to generate a final decision signal of the SAD as to whether or not activity is detected in the input signal.2. The method of claim 1 , further comprising receiving the decision signals from one or more processing circuits of the SAD configured to apply the decision criteria to the input signal to generate the decision signals.3. The method of claim 1 , wherein the decision criteria of at least one of the decision signals is without regard to hangover.4. The method of claim 1 , wherein the decision criteria of at least one of the decision signals is based on hangover.5. The method of claim 1 , wherein the combining comprises combining by a logical AND of at least two of the decision signals.6. The method of claim 1 , wherein the combining comprises combining by a logical OR of at least two of the decision signals.7. The method of claim 1 , further comprising selecting a combination logic for the combining based on properties of the input signal.8. The method of claim 1 , wherein the combining corrects a false indication of ...

Publication date: 20-11-2014

Method, Apparatus, and Speech Synthesis System for Classifying Unvoiced and Voiced Sound

Number: US20140343934A1
Author: TANG Zongyao

A method, apparatus, and speech synthesis system are disclosed for classifying unvoiced and voiced sound. The method includes: setting an unvoiced and voiced sound classification question set; using speech training data and the unvoiced and voiced sound classification question set for training a sound classification model of a binary decision tree structure, where the binary decision tree structure includes non-leaf nodes and leaf nodes, the non-leaf nodes represent questions in the unvoiced and voiced sound classification question set, and the leaf nodes represent unvoiced and voiced sound classification results; and receiving speech test data, and using the trained sound classification model to decide whether the speech test data is unvoiced sound or voiced sound.

1. A method for classifying unvoiced and voiced sound, comprising: setting an unvoiced and voiced sound classification question set; using speech training data and the unvoiced and voiced sound classification question set for training a sound classification model of a binary decision tree structure, wherein the binary decision tree structure comprises non-leaf nodes and leaf nodes, the non-leaf nodes represent questions in the unvoiced and voiced sound classification question set, and the leaf nodes represent unvoiced and voiced sound classification results; and receiving speech test data, and using the trained sound classification model to decide whether the speech test data is unvoiced sound or voiced sound.
2. The method of claim 1, further comprising: setting an excitation signal of the speech test data to be an impulse response sequence when the speech test data is decided to be a voiced sound; and setting the excitation signal of the speech test data to be a white noise when the speech test data is decided to be an unvoiced sound.
3. The method of claim 1, wherein using speech training data and the unvoiced and voiced sound classification question set for training the sound classification model of a ...
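
A toy version of the decision flow, a binary question tree that labels a frame voiced or unvoiced and then selects the excitation signal accordingly, is sketched below. The questions, thresholds, pitch period and feature names are invented for illustration; they are not the patent's trained model.

    # Sketch: a tiny binary question tree for voiced/unvoiced classification and the
    # corresponding excitation choice. Questions and thresholds are illustrative.
    import numpy as np

    def classify_vuv(features):
        # Non-leaf nodes ask yes/no questions; leaf nodes carry the V/UV result.
        if features["zero_crossing_rate"] > 0.3:     # root question
            return "unvoiced"
        if features["energy"] > 0.01:                # second-level question
            return "voiced"
        return "unvoiced"

    def make_excitation(label, n, pitch_period=80, rng=np.random.default_rng(0)):
        if label == "voiced":
            excitation = np.zeros(n)
            excitation[::pitch_period] = 1.0         # impulse sequence for voiced sound
            return excitation
        return rng.standard_normal(n)                # white noise for unvoiced sound

    label = classify_vuv({"zero_crossing_rate": 0.12, "energy": 0.05})
    excitation = make_excitation(label, n=400)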

Publication date: 24-09-2015

ADAPTIVE MICROPHONE SAMPLING RATE TECHNIQUES

Number: US20150269954A1
Assignee:

An apparatus for adjusting a microphone sampling rate, the apparatus including an input to receive an audio signal from a microphone and a front-end processing module. The front-end processing module is to generate a plurality of frames from the audio signal received by the microphone, determine a noise profile using the plurality of frames, and adjust a sampling rate of the microphone based on the determined noise profile.

1. An apparatus for adjusting a microphone sampling rate, the apparatus comprising: an input to receive an audio signal from a microphone; and a front-end processing module to generate a plurality of frames from the audio signal received by the microphone; determine a noise profile using the plurality of frames; and adjust a sampling rate of the microphone based on the determined noise profile.
2. The apparatus of claim 1, wherein the front-end processing module is to: calculate at least one Mel-frequency cepstral coefficient (MFCC) for each of the plurality of frames; and detect a presence of a human voice based on the at least one MFCC.
3. The apparatus of claim 2, wherein the front-end processing module is to detect a presence of a human voice based on the at least one MFCC and the noise profile is configured to: detect a presence of the human voice when the at least one MFCC is above a threshold value.
4. The apparatus of claim 2, wherein to adjust the sampling rate of the microphone based on the determined noise profile, the front-end processing module is to adjust the sampling rate of the microphone from a first sampling rate to a second sampling rate based on the determined noise profile.
5. The apparatus of claim 4, wherein the first sampling rate is higher than the second sampling rate, and wherein the front-end processing module is to adjust the sampling rate of the microphone from the first sampling rate to the second sampling rate when the determined noise profile is below a threshold.
6. The apparatus of claim 2, ...
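
The control loop implied by the claims (frame the captured signal, estimate a noise profile, and lower the microphone sampling rate when the environment is quiet) might look like the sketch below. The 48 kHz/16 kHz rate pair, the -50 dBFS threshold and the very simple energy-based noise profile are assumptions for illustration.

    # Sketch: choose a microphone sampling rate from a frame-based noise profile.
    # The rate pair, threshold and noise estimate are illustrative assumptions.
    import numpy as np

    def frame_signal(x, frame_len=480, hop=480):
        return [x[i:i + frame_len] for i in range(0, len(x) - frame_len + 1, hop)]

    def noise_profile_db(frames):
        # Crude noise profile: average energy of the quietest tenth of the frames.
        energies = sorted(10 * np.log10(np.mean(f ** 2) + 1e-12) for f in frames)
        return float(np.mean(energies[:max(1, len(energies) // 10)]))

    def choose_sampling_rate(noise_db, high_rate=48000, low_rate=16000, threshold_db=-50.0):
        # Quiet environment -> drop to the lower rate to save power.
        return low_rate if noise_db < threshold_db else high_rate

    audio = 1e-4 * np.random.default_rng(0).standard_normal(48000)      # a quiet capture
    rate = choose_sampling_rate(noise_profile_db(frame_signal(audio)))  # -> 16000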

Publication date: 28-10-2021

SIGNAL PROCESSING APPARATUS, SIGNAL PROCESSING METHOD, AND SIGNAL PROCESSING PROGRAM

Number: US20210335379A1
Assignee:

This invention provides a signal processing apparatus capable of obtaining an output signal of sufficiently high quality even if the phase of an input signal is largely different from the phase of a true voice. The signal processing apparatus includes a voice detector that receives a mixed signal including a voice and a signal other than the voice and obtains existence of the voice as a voice flag, a corrector that receives the mixed signal and the voice flag and obtains a corrected mixed signal generated by correcting the mixed signal in accordance with a state of the voice flag, and a shaper that receives the corrected mixed signal and shapes the corrected mixed signal.

1. A signal processing apparatus comprising: a voice detector that receives a mixed signal including a voice and a signal other than the voice and obtains existence of the voice as a voice flag; a corrector that receives the mixed signal and the voice flag and obtains a corrected mixed signal generated by correcting the mixed signal in accordance with a state of the voice flag; and a shaper that receives the corrected mixed signal and shapes the corrected mixed signal.
2. A signal processing apparatus comprising: a converter that receives a mixed signal including a voice and a signal other than the voice and obtains amplitudes and phases corresponding to a plurality of frequency components; a voice detector that obtains existence of a voice included in the amplitude as a voice flag; an amplitude corrector that receives the mixed signal and the voice flag and obtains a corrected amplitude generated by correcting the amplitude in accordance with a state of the voice flag; an inverse converter that receives the corrected amplitude and the phase and converts the corrected amplitude and the phase into a time domain signal; and a shaper that shapes the time domain signal.
3. The signal processing apparatus according to claim 2, further comprising: an impact sound detector that receives the amplitude and the phase and ...
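
Claim 2's pipeline (transform to per-frequency amplitudes and phases, correct the amplitude according to a voice flag, then transform back) can be sketched per frame as below. The simple energy-based voice detector and the attenuation factor applied when the flag is off are assumptions; the shaper stage is omitted.

    # Sketch: frequency-domain amplitude correction gated by a per-frame voice flag.
    # The detector and the attenuation factor are illustrative assumptions.
    import numpy as np

    def process_frame(frame, voice_threshold_db=-40.0, attenuation=0.2):
        spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
        amplitude, phase = np.abs(spectrum), np.angle(spectrum)
        # Voice flag: does the frame carry enough spectral energy?
        voice_flag = 10 * np.log10(np.mean(amplitude ** 2) + 1e-12) > voice_threshold_db
        corrected = amplitude if voice_flag else attenuation * amplitude
        # Inverse conversion back to a time-domain signal using the original phase.
        return np.fft.irfft(corrected * np.exp(1j * phase), n=len(frame)), voice_flag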

Publication date: 21-09-2017

SPEECH SIGNAL PROCESSING CIRCUIT

Number: US20170270946A1
Assignee:

A speech-signal-processing-circuit configured to receive a time-frequency-domain-reference-speech-signal and a time-frequency-domain-degraded-speech-signal. The time-frequency-domain-reference-speech-signal comprises: an upper-band-reference-component with frequencies that are greater than a frequency-threshold-value; and a lower-band-reference-component with frequencies that are less than the frequency-threshold-value. The time-frequency-domain-degraded-speech-signal comprises: an upper-band-degraded-component with frequencies that are greater than the frequency-threshold-value; and a lower-band-degraded-component with frequencies that are less than the frequency-threshold-value. The speech-signal-processing-circuit comprises: a disturbance calculator configured to determine one or more SBR-features based on the time-frequency-domain-reference-speech-signal and the time-frequency-domain-degraded-speech-signal by: (i) for each of a plurality of frames: determining a reference-ratio based on the ratio of (i) the upper-band-reference-component to (ii) the lower-band-reference-component; determining a degraded-ratio based on the ratio of (i) the upper-band-degraded-component to (ii) the lower-band-degraded-component; and determining a spectral-balance-ratio based on the ratio of the reference-ratio to the degraded-ratio; and (ii) determining the one or more SBR-features based on the spectral-balance-ratio for the plurality of frames.
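
The per-frame ratios follow directly from the description above. The sketch below assumes a 2 kHz frequency-threshold-value and adds a small floor to avoid division by zero; both are illustrative choices.

    # Sketch: per-frame spectral-balance-ratio (SBR) between a reference frame and
    # a degraded frame. The 2 kHz split frequency and the floor are assumptions.
    import numpy as np

    def spectral_balance_ratio(ref_frame, deg_frame, fs=16000, split_hz=2000.0, eps=1e-12):
        freqs = np.fft.rfftfreq(len(ref_frame), d=1.0 / fs)
        ref = np.abs(np.fft.rfft(ref_frame)) ** 2
        deg = np.abs(np.fft.rfft(deg_frame)) ** 2
        upper, lower = freqs > split_hz, freqs <= split_hz
        ref_ratio = (ref[upper].sum() + eps) / (ref[lower].sum() + eps)   # reference-ratio
        deg_ratio = (deg[upper].sum() + eps) / (deg[lower].sum() + eps)   # degraded-ratio
        return ref_ratio / deg_ratio                                      # spectral-balance-ratio

An SBR-feature over an utterance could then be, for example, the mean or variance of this per-frame ratio, which is one plausible reading of "based on the spectral-balance-ratio for the plurality of frames".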

Publication date: 22-09-2016

METHODS AND SYSTEMS FOR ENHANCING PITCH ASSOCIATED WITH AN AUDIO SIGNAL PRESENTED TO A COCHLEAR IMPLANT PATIENT

Number: US20160277849A1
Assignee:

An exemplary sound processor 1) identifies at least one frequency bin, included in a plurality of frequency bins included in a frequency spectrum of an audio signal that is presented to a cochlear implant patient, that contains spectral energy above a modified spectral envelope, 2) identifies each frequency bin that contains spectral energy below the modified spectral envelope, 3) enhances the spectral energy contained in the at least one frequency bin identified as containing spectral energy above the modified spectral envelope, and 4) compresses the spectral energy contained in each frequency bin identified as containing spectral energy below the modified spectral envelope.

1. A system comprising: a sound processor that identifies at least one frequency bin, included in a plurality of frequency bins included in a frequency spectrum of an audio signal that is presented to a cochlear implant patient, that contains spectral energy above a modified spectral envelope of the frequency spectrum of the audio signal, identifies each frequency bin included in the plurality of frequency bins that contains spectral energy below the modified spectral envelope, enhances the spectral energy contained in the at least one frequency bin identified as containing spectral energy above the modified spectral envelope by applying a positive gain to the spectral energy contained in the at least one frequency bin identified as containing spectral energy above the modified spectral envelope, and compresses the spectral energy contained in each frequency bin identified as containing spectral energy below the modified spectral envelope by applying a negative gain to the spectral energy contained in each frequency bin identified as containing spectral energy below the modified spectral envelope.
2. The system of claim 1, wherein the sound processor enhances the spectral energy contained in the at least one frequency bin identified as containing spectral energy above the modified ...
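
The enhance/compress rule relative to a modified spectral envelope can be sketched as follows. The moving-average envelope and the fixed +/-6 dB gains are assumptions standing in for whatever envelope modification and gain rules the processor actually uses.

    # Sketch: boost bins above a smoothed spectral envelope, compress bins below it.
    # The moving-average envelope and the +/-6 dB gains are illustrative assumptions.
    import numpy as np

    def enhance_peaks(frame, boost_db=6.0, cut_db=-6.0, env_width=9):
        spectrum = np.fft.rfft(frame * np.hanning(len(frame)))
        magnitude = np.abs(spectrum)
        # Stand-in for the "modified spectral envelope": a moving average of the magnitudes.
        kernel = np.ones(env_width) / env_width
        envelope = np.convolve(magnitude, kernel, mode="same")
        # Positive gain above the envelope (spectral peaks), negative gain below it.
        gain = np.where(magnitude > envelope, 10 ** (boost_db / 20.0), 10 ** (cut_db / 20.0))
        return np.fft.irfft(gain * spectrum, n=len(frame))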

Publication date: 18-11-2021

CONTROLLER, CONTROLLED APPARATUS, CONTROL METHOD, AND RECORDING MEDIUM

Number: US20210354300A1
Assignee:

A controller includes at least one memory, and at least one processor. The at least one processor is configured to acquire speech, recognize the speech, determine whether the speech is uttered in a quiet voice, and control a movable part of a controlled apparatus in accordance with a result of the speech recognition. The at least one processor is configured to control the movable part of the controlled apparatus such that a sound pressure level of a sound generated by the movable part of the controlled apparatus is lower when it is determined that the speech is uttered in the quiet voice than when it is determined that the speech is not uttered in the quiet voice.

1. A controller comprising: at least one memory; and at least one processor configured to acquire speech, recognize the speech, determine whether the speech is uttered in a quiet voice, and control a movable part of a controlled apparatus in accordance with a result of the speech recognition, wherein the at least one processor is configured to control the movable part of the controlled apparatus such that a sound pressure level of a sound generated by the movable part of the controlled apparatus is lower when it is determined that the speech is uttered in the quiet voice than when it is determined that the speech is not uttered in the quiet voice.
2. The controller according to claim 1, wherein the at least one processor is configured to control the movable part of the controlled apparatus such that an operating speed of the movable part of the controlled apparatus is lower when it is determined that the speech is uttered in the quiet voice than when it is determined that the speech is not uttered in the quiet voice.
3. The controller according to claim 1, wherein the at least one processor is configured to stop at least one movable element of the movable part of the controlled apparatus when it is determined that the speech is uttered in the quiet voice.
4. The controller according to claim 1, ...
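
The control rule itself is small: once the recognizer has decided that the command was given in a quiet voice, the actuator is driven so that it makes less noise. The sketch below assumes an RMS-based quiet-voice test and arbitrary speed values; the patent's own quiet-voice decision is not specified here.

    # Sketch: pick an actuator speed depending on whether the command was whispered.
    # The RMS threshold and the speed values are illustrative assumptions.
    import numpy as np

    def is_quiet_voice(speech_samples, threshold_rms=0.02):
        samples = np.asarray(speech_samples, dtype=float)
        return float(np.sqrt(np.mean(samples ** 2))) < threshold_rms

    def actuator_speed(speech_samples, normal_speed=1.0, quiet_speed=0.4):
        # A lower operating speed lowers the sound pressure level of the moving part.
        return quiet_speed if is_quiet_voice(speech_samples) else normal_speed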

Publication date: 29-09-2016

VOICE ACTIVITY DETECTION TECHNOLOGIES, SYSTEMS AND METHODS EMPLOYING THE SAME

Number: US20160284363A1
Assignee: Intel Corporation

Voice activity detection technologies are disclosed. In some embodiments, the voice activity detection technologies determine whether the voice of a user of an electronic device is active based at least in part on biosignal data. Based on the determination, an audio sensor may be activated to facilitate the recording of audio signals containing audio data corresponding to an acoustic environment proximate the electronic device. The audio data may be fed to a speech recognition system to facilitate voice command operations, and/or it may be used to confirm or deny a prior determination that user voice activity is present. Devices, systems, methods, and computer readable media utilizing such technologies are also described.

1. A voice activity detection system, comprising: a processor; a memory; a biosensor; an audio sensor; and a voice activity detection module (VADM), wherein the VADM is to receive biosignal data recorded by said biosensor, determine whether a voice of a user of an electronic device is active based at least in part on an analysis of said biosignal data; and when said VADM determines that the voice of said user is active, said VADM is to cause said audio sensor to capture audio data from an acoustic environment proximate said electronic device.
2. The voice activity detection system of claim 1, wherein: said biosensor is in wired or wireless communication with said voice activity detection system and produces a biosignal containing said biosignal data; and said VADM is further to receive said biosignal and determine whether the voice of said user is active based at least in part on the biosignal data in said biosignal.
3. The voice activity detection system of claim 1, wherein: the VADM is to determine whether the voice of said user is active based at least in part on a comparison of a value of at least one characteristic of said biosignal data to a first threshold; and when the VADM determines that the value of said at least one characteristic of ...
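
The gating behaviour of claim 1 (compare a characteristic of the biosignal with a threshold, and only then let the audio sensor capture data) reduces to a few lines. The choice of characteristic (RMS of the biosignal) and the threshold value are assumptions; capture_fn stands in for whatever platform call actually records from the audio sensor.

    # Sketch: gate audio capture on a biosignal-derived voice-activity decision.
    # The RMS characteristic, the threshold and capture_fn are illustrative assumptions.
    import numpy as np

    def biosignal_voice_active(biosignal, threshold=0.5):
        samples = np.asarray(biosignal, dtype=float)
        return float(np.sqrt(np.mean(samples ** 2))) > threshold

    def maybe_capture_audio(biosignal, capture_fn):
        # Only wake the audio sensor when the biosignal suggests the user is speaking.
        if biosignal_voice_active(biosignal):
            return capture_fn()
        return None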

Publication date: 25-11-2021

VOWEL SENSING VOICE ACTIVITY DETECTOR

Number: US20210366508A1
Author: Schiro Arthur Leland
Assignee: Plantronics, Inc.

Methods and apparatuses for detecting user speech are described. In one example, a method for detecting user speech includes receiving a microphone output signal corresponding to sound received at a microphone and identifying a spoken vowel sound in the microphone signal. The method further includes outputting an indication of user speech detection responsive to identifying the spoken vowel sound.

1. A method for detecting user speech comprising: receiving a microphone output signal corresponding to a sound received at a microphone; converting the microphone output signal to a digital audio signal; identifying a spoken vowel sound in the sound received at the microphone from the digital audio signal, wherein identifying the spoken vowel sound in the sound received at the microphone from the digital audio signal comprises finding a circular autocorrelation of an absolute value of a short time hamming windowed audio spectrum; and outputting an indication of user speech detection responsive to identifying the spoken vowel sound.
2. The method of claim 1, further comprising reducing an impact of a stationary noise by applying a non-linear median filter to a result of the circular autocorrelation of the absolute value of the short time hamming windowed audio spectrum.
3. The method of claim 1, wherein identifying the spoken vowel sound in the sound received at the microphone from the digital audio signal further comprises filtering the digital audio signal using a band pass filter with a lower break frequency of 300 Hz and a higher break frequency of 2 kHz prior to finding the circular autocorrelation of the absolute value of the short time hamming windowed audio spectrum.
4. The method of claim 1, wherein identifying the spoken vowel sound in the sound received at the microphone from the digital audio signal further comprises phase shifting frequency components of the digital audio signal to zero phase prior to finding the circular autocorrelation of the absolute value of ...
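
The core measurement in claim 1, a circular autocorrelation of the absolute value of a short-time Hamming-windowed spectrum, with the roughly 300 Hz to 2 kHz band limit of claim 3, can be written directly with NumPy. The zeroing of out-of-band bins as a stand-in for the band-pass filter and the peak-based vowel test at the end are assumptions.

    # Sketch: circular autocorrelation of the absolute short-time spectrum of a
    # Hamming-windowed, band-limited frame. The final peak test is an assumption.
    import numpy as np

    def spectrum_autocorrelation(frame, fs=16000, low_hz=300.0, high_hz=2000.0):
        spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
        freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
        spectrum[(freqs < low_hz) | (freqs > high_hz)] = 0.0   # crude band limit
        # Circular autocorrelation via the Wiener-Khinchin relation.
        acf = np.fft.ifft(np.abs(np.fft.fft(spectrum)) ** 2).real
        return acf / (acf[0] + 1e-12)

    def looks_like_vowel(frame, min_peak=0.3):
        acf = spectrum_autocorrelation(frame)
        # A vowel's harmonic spectrum is quasi-periodic, so the autocorrelation of
        # the spectrum shows a strong secondary peak.
        return bool(np.max(acf[2:len(acf) // 2]) > min_peak)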

Publication date: 06-10-2016

AUDIO CAPTURING ENHANCEMENT METHOD AND AUDIO CAPTURING SYSTEM USING THE SAME

Number: US20160295320A1
Assignee:

Accordingly, the present disclosure is directed to an audio capturing enhancement method and an audio capturing system using the same method. The audio capturing system includes at least but not limited to two microphones for recording an audio data, an amplifier coupled to the at least two microphones for adjusting the audio data by applying automatic gain control (AGC) in order to generate a gain adjusted data that is within a predefined level, and a processing circuit coupled to the amplifier for calculating a linear predictive coding (LPC) residue of the gain adjusted data, determining from the LPC residue a first source at a first direction relative to the at least two microphones based on time difference of arrival (TDOA), and attenuating any source at a second direction that is outside of a predefined direction.

1. An audio capturing enhancement method comprising: recording an audio data by using at least two microphones; adjusting the audio data by applying automatic gain control (AGC) in order to generate a gain adjusted data that is within a predefined level; calculating a linear predictive coding (LPC) residue of the gain adjusted data; determining from the LPC residue a first source at a first direction relative to the at least two microphones based on time difference of arrival (TDOA); and attenuating any source at a second direction that is outside of a predefined direction.
2. The method of claim 1, wherein adjusting the audio data by applying AGC in order to generate the gain adjusted data that is within a predefined level comprising: adjusting the audio data by applying AGC in order to generate the gain adjusted data having a predefined upper limit and a predefined lower limit, wherein the center of the predefined upper limit and the predefined lower limit is at a fraction of a dynamic range of the AGC.
3. The method of claim 1, wherein the fraction of the dynamic range of the AGC is 0.25.
4. The method of claim 1, wherein determining from the LPC residue ...
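
A compact two-microphone sketch of the chain described above (AGC toward a target level, LPC residue, cross-correlation of the residues for the time difference of arrival, and attenuation of off-axis sources) is shown below. The AGC target, the LPC order, the allowed lag window and the attenuation factor are assumptions, and the LPC fit uses the plain autocorrelation method rather than the patent's exact procedure.

    # Sketch: AGC, LPC residue, TDOA from cross-correlation, and attenuation of
    # sources outside an allowed lag window. Parameters are illustrative assumptions.
    import numpy as np

    def agc(x, target_rms=0.1, eps=1e-12):
        x = np.asarray(x, dtype=float)
        return x * (target_rms / (np.sqrt(np.mean(x ** 2)) + eps))

    def lpc_residue(x, order=10):
        # Autocorrelation-method LPC; the residue is the prediction error.
        r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
        R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
        a = np.linalg.solve(R + 1e-9 * np.eye(order), r[1:order + 1])
        prediction = np.convolve(x, np.concatenate(([0.0], a)))[:len(x)]
        return x - prediction

    def tdoa_lag(res1, res2):
        xcorr = np.correlate(res1, res2, mode="full")
        return int(np.argmax(xcorr)) - (len(res2) - 1)   # positive: mic 1 lags mic 2

    def keep_or_attenuate(frame1, frame2, max_lag=4, attenuation=0.1):
        r1, r2 = lpc_residue(agc(frame1)), lpc_residue(agc(frame2))
        lag = tdoa_lag(r1, r2)
        gain = 1.0 if abs(lag) <= max_lag else attenuation   # off-axis -> attenuate
        return gain * np.asarray(frame1, dtype=float), lag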

Publication date: 05-10-2017

WEARABLE DEVICE, WEARABLE DEVICE SYSTEM AND METHOD FOR CONTROLLING WEARABLE DEVICE

Number: US20170289330A1
Assignee:

A wearable device comprising: a band unit; and a vibration generating unit coupled to the band unit. The band unit comprises a communication unit receiving a first signal by a predetermined communication method and a control unit determining an amplification extent of the first signal received from the communication unit and generating a second signal based on the determined amplification extent, and the vibration generating unit receives the second signal from the control unit and generates a vibration corresponding to the second signal.

1.-13. (canceled)
14. A wearable device system comprising: a wearable device worn on a user's body; and a wireless communication device providing the wearable device with a first signal, wherein the wearable device comprises: a band unit determining an amplification extent of the first signal and generating a second signal based on the determined amplification extent; and a vibration generating unit coupled to the band unit, receiving the second signal and providing the user's body with a vibration corresponding to the second signal, wherein the band unit determines the amplification extent of the first signal based on user information, and the user information includes at least one of user's sex, height, weight, finger length and wrist-to-finger length.
15. The wearable device system of claim 14, wherein the first signal is a voice signal.
16. The wearable device system of claim 14, wherein the wearable device is worn on the wrist of the user's body and the vibration generating unit provides the user's wrist with the vibration.
17. The wearable device system of claim 14, wherein the wearable device further comprises a coupling member coupling the band unit and the vibration generating unit to each other.
18. The wearable device system of claim 17, wherein the coupling member includes a plurality of support units having resilience.
19. The wearable device system of claim 17, wherein the band unit further includes a groove portion in which ...
