Voice recognition for information system access and transaction processing

Publication date: 16-02-2000

Number:

CN0001244984A

Inventors: E. J. Devinney, Jr., M. Sharma, C. Keyser

Assignee: T- Neitikesi Co

Application number: 28-14-9718

Application date: 21-11-1997

[1]

Cross-reference to related applications

[2]

This application claims priority from United States Provisional Application Serial No. 60/031638, filed 22 November 1996 and entitled "User identification for information system access and transaction processing".

[3]

Background of the invention

[4]

The present invention is a verification system for ensuring that transactions are completed securely. The invention uses voice recognition principles to authorize a user to complete a transaction.

[5]

1. Field of the invention

[6]

The invention relates to the fields of signal processing, communications, speaker recognition, and secure and reliable transactions.

[7]

2. Description of the related prior art

[8]

With the growing use of credit cards, the security of computer transaction processing has once again become a matter of increasing concern. Credit cards typically carry a magnetic stripe that is read by a card reader at the point of sale. Information stored on the card, such as account information, is sent to the credit card company's credit verification service, for example over a telephone connection. An X.25 connection, for instance, may be used to reach the credit card validation system. A response from the validation system tells the seller whether the customer's credit card is valid and whether the customer has sufficient credit. One example of such a system is produced by VeriFone of Redwood City, California. However, these prior-art systems have a disadvantage: a card may be recognized as valid and as having sufficient credit even when it is being used by someone who is not authorized to use it.

[9]

One way of confirming, at the merchant's site, the identity of the person presenting a credit card is the signature. The authorized user signs the back of the credit card when it is issued. At the time of purchase, the merchant compares the signature on the back of the card with the customer's actual signature. If, in the merchant's judgment, the signatures match, the transaction is allowed to continue.

[10]

Other prior-art systems place a photograph of the authorized user on the credit card. At the time of the transaction, the merchant compares the photograph on the card with the face of the person presenting it. If they match, the transaction is allowed to continue.

[11]

Although signatures and photographs are personal characteristics of the user, they are not very effective. Signatures are easily forged, and differences between signatures or photographs may be overlooked by the merchant's clerk. These systems are manual and therefore prone to human error. Furthermore, they cannot be used for credit card transactions in which no person is present, such as transactions conducted by telephone.

[12]

Systems such as access systems, local area networks, computer networks, and databases (for example, the Internet) usually rely on a password entered from a keyboard (the so-called personal identification number, or "PIN") as the method of securing access. Computer passwords have the drawback that they can be stolen, intercepted, or copied by third parties. Computer programs can guess ("hack") passwords. Furthermore, a computer password or PIN is not a personal characteristic, which means that it is not complex and is easily produced by a third party who knows nothing of the authorized individual's personal characteristics.

[13]

With the advent of electronic transactions on the Internet, consumers increasingly submit credit card or other "confirming" information over the Internet to sellers of goods and services. So far, the security available for transactions initiated by an Internet-connected user has been limited. For example, a retailer that receives a credit card number from a user over the Internet cannot determine whether the person supplying the number is authorized to use the card, or whether the number was obtained through illegitimate channels.

[14]

As computers come to play an ever more important role in daily life, security has become a prominent concern. Whether the goal is to keep children out of their parents' tax returns (desktop access), to prevent employees from stealing trade secrets (network access), or to restrict access to a value-added web site (remote network access), there is a clear need to be able to determine that the user is who he or she claims to be.

[15]

Cellular telephone systems and prison telephone systems are further fields in which a high level of security is needed. In cellular systems, telephone fraud is a recurring problem. In prison systems, the identity of an inmate must be closely monitored for the purpose of authorizing certain transactions, such as telephone calls.

[16]

What is needed is a system and method for reliable local and remote access that uses a personal characteristic of the user to identify and/or verify the user.

[17]

Summary of the invention

[18]

The invention is an improved method and system for increasing the security of transaction processing in credit card transactions, prison inmate transactions, database access requests, Internet transactions, and other transactions that require a high degree of security. According to the invention, voice prints and speaker recognition techniques are used to verify transactions and confirm the user.

[19]

Speaker recognition (also referred to as voice recognition) has two aspects: speaker identification and speaker verification. A speaker identification system determines the identity of an individual within a population from a sample of his or her voice. Identification can be carried out by comparing the user's voice sample against the voice data in a database and selecting the closest match in that database. By contrast, a speaker verification system uses a voice sample to determine whether an individual's claim of identity (that is, the claim that the person is who he or she purports to be) is valid. A speaker verification system learns the claimed identity from directory information such as the individual's name, credit card number, or social security number. A speaker verification system therefore generally compares the user's voice with a single set of voice data stored in the database, namely the set identified by the directory information.

[20]

Speaker recognition has an advantage over other security measures such as passwords (including personal identification numbers) and personal information, because an individual's voice is uniquely tied to his or her identity. Speaker verification therefore provides a strong method of enhancing security.

[21]

Speaker verification consists of determining whether a voice sample provides a sufficiently close match to the claimed identity. The voice sample can be text-dependent or text-independent. A text-dependent verification system recognizes the speaker after a password phrase has been spoken. The password phrase is chosen during enrollment, and the same password is used in subsequent verification procedures. Typically, the password phrase is drawn from a restricted vocabulary (for example, a string of digits). A text-independent speaker verification system does not use any predetermined password phrase. However, because the vocabulary is unrestricted, text-independent speaker verification is computationally much more complex than text-dependent speaker verification.

[22]

In a multimedia network environment, the present invention uses voice biometrics as a natural interface for authorizing the user, which is preferable to passwords that are easily compromised.

[23]

According to the invention, security can be applied at at least three access stages: at the desktop computer, at a common network server (NT, NOVELL, or UNIX), and at a web server (Internet/intranet/extranet). The security mechanism can control access to a workstation, a network file server, or a web site, or can secure a particular transaction. Nesting these security levels can provide additional security; for example, a company can choose to secure its local workstations with the desktop security mechanism and to protect shared data with the security mechanism for NT, NOVELL, or FTP file servers.

[24]

By using speaker recognition, and thus biometric data, different levels of security can be provided according to the user's needs. Biometrics establish the true identity of the user; other popular high-security methods, such as token cards, can be compromised, for example if the token card is stolen from its owner. A system can use any of these methods at any access level. In all of the inventive methods described here, the user must know one additional piece of identifying information. Whether that information is publicly available, such as the user's name, or private, such as a PIN, social security number, or account number, the security system is not compromised.

[25]

According to the invention, security can be provided by "simple" security systems and methods (a voice password alone), multi-layer security systems (multiple layers of voice passwords with random prompts), and voice tokens (utterances obtained on the spot through random prompting). These security systems and methods can be used to improve the security of: point-of-sale systems, local authorization systems, systems for establishing a call to a called party (including prison telephone systems), Internet access systems, web site access systems, systems for obtaining access to protected computer networks, systems for restricting access to hyperlinks, desktop computer security systems, and systems for obtaining access to a networked server.

[26]

Brief description of the drawings

[27]

Figure 1 is a block diagram of a voice recognition unit.

[28]

Figure 2 shows the unit of Figure 1 in more detail.

[29]

Figure 3 shows a "simple" security method and system.

[30]

Figure 4A is a block diagram of a multi-layer security method and system.

[31]

Figure 4B is a block diagram of a conditional multi-layer security method and system.

[32]

Figure 4C is a block diagram of a randomly prompted voice token method and system.

[33]

Figure 5A is a schematic diagram of the general structure of a speaker verification method and system.

[34]

Figure 5B is a more specific schematic diagram of the method and system of Figure 5A.

[35]

Figure 6 is a schematic diagram of a speaker recognition method and system for a point-of-sale system.

[36]

Figure 7 is a schematic diagram of an embodiment for obtaining local authorization through a call center.

[37]

Figure 8 is a schematic diagram of an embodiment that uses speaker recognition to establish a call to a called party.

[38]

Figure 9 is a schematic diagram of an embodiment that uses speaker recognition to establish a connection to the Internet.

[39]

Figure 10A is a schematic diagram of an embodiment that uses speaker recognition to establish a connection to a web site.

[40]

Figure 10B is a schematic diagram of an embodiment that uses speaker recognition to establish a connection to a protected network.

[41]

Figure 10C is a schematic diagram of an embodiment that uses speaker recognition to establish a connection to a restricted hyperlink on a web server.

[42]

Figure 11 shows an embodiment that uses speaker recognition to secure a desktop computer.

[43]

Figure 12A shows a system that uses speaker recognition to obtain access to a network server.

[44]

Figure 12B shows a method that uses speaker recognition to obtain access to a network server.

[45]

Description of the preferred embodiment

[46]

The present invention combines voice recognition with various security and communication systems and methods. The result is a novel voice verification and/or recognition system that can be accessed remotely and is fully automatic.

[47]

1. The voice recognition unit

[48]

Figure 1 illustrates a voice recognition system 201. A detected voice 202 from the user is input to a voice recognition unit 204, which contains a database 208 of stored voice data. A prompt 203 is presented to the user, asking the user to speak a password or to enter directory information. In a speaker verification system, directory information 206 is normally supplied so that the voice recognition unit 204 can find the matching user's data in database 208. In a speaker identification system, directory information 206 is normally not entered; the voice recognition unit 204 cycles through all the voice data stored in the database to search for the best match and identifies the user corresponding to that match. Furthermore, if a certain threshold is not met, the voice recognition unit 204 may decide that no match exists.
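
The difference between the two modes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the names `verify`, `identify`, and `cosine`, the use of fixed-length feature vectors, cosine similarity as the match score, and the threshold of 0.9 are all hypothetical and do not come from the patent, which leaves the comparison method open.

```python
from math import sqrt

def cosine(a, b):
    """Similarity score between two feature vectors (assumed scoring rule)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def verify(voice, claimed_id, database, threshold=0.9):
    """Verification: compare only with the record named by the directory information."""
    enrolled = database.get(claimed_id)
    if enrolled is None:
        return False, 0.0
    score = cosine(voice, enrolled)
    return score >= threshold, score

def identify(voice, database, threshold=0.9):
    """Identification: cycle through every record and keep the best match."""
    best_id, best_score = None, -1.0
    for user_id, enrolled in database.items():
        score = cosine(voice, enrolled)
        if score > best_score:
            best_id, best_score = user_id, score
    if best_score < threshold:
        return None, best_score  # threshold not met: no match exists
    return best_id, best_score
```

Both functions return the score alongside the decision, so a caller can treat it as a confidence value rather than relying on the yes/no result alone.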

[49]

In either case, the voice recognition unit 204 uses a comparison processing unit 210 to compare the detected voice 202 with the voice data stored in database 208. The stored voice data may be extracted voice features, models, recordings, voice characteristics, analog or digital voice samples, or any kind of information related to or derived from the voice. The voice recognition unit 204 then outputs a decision 216 that either verifies (or does not verify) the user or identifies (or does not identify) the user. Alternatively, the "decision" 216 from the voice recognition unit includes a confidence level, with or without the verification/identification decision. This confidence level can be data indicating how closely the voice matched, or other information about how successfully the voice recognition unit obtained a match. The "decision" 216, which may be an identification, a verification, and/or a confidence level, is then used to "recognize" the user, meaning to identify or verify the user, or to perform some other type of recognition. Either verification or identification can be carried out with the system 201 shown in Figure 1. For identification, database 208 is cycled through in order to obtain the best match.

[50]

Voice recognition systems that can be used to implement the system of Figure 1 are disclosed in U.S. Patent 5522012, issued 28 May 1996 and entitled "Voice identification and verification system"; in U.S. Patent Application No. 08/479012, entitled "Voice verification system"; in U.S. Patent Application No. 08/--, entitled "Model adaptation system and method for speaker verification", filed 3 December 1997 by Kevin Farrell and William Mistretta; and in U.S. Patent Application No. 08/--, entitled "Voice print system and method", filed 21 December 1997 by Richard J. Mammone, Xiaoyu Zhang, and Manish Sharma. The entire contents of each are incorporated herein by reference.

[51]

Referring to Figure 1, the voice recognition unit 204 can include a preprocessor unit 212 that preprocesses the voice before any comparison. Preprocessing includes analog-to-digital conversion of the voice signal. The analog-to-digital conversion can be performed with standard telephony boards such as those manufactured by Dialogic. Voice samples can be encoded with a speech coding method such as the ITU G.711 μ-law and A-law standards. A sampling rate of 8000 Hz is preferred.
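
As a rough illustration of μ-law coding, the following Python sketch implements the continuous μ-law companding curve with μ = 255. It is not a bit-exact G.711 codec: the standard itself specifies a segmented approximation of this curve together with a particular 8-bit layout, and the function names here are hypothetical.

```python
import math

MU = 255               # companding constant used by mu-law
SAMPLE_RATE_HZ = 8000  # preferred sampling rate stated in the text

def mu_law_compress(x, mu=MU):
    """Map a sample in [-1, 1] onto the mu-law curve, also in [-1, 1]."""
    x = max(-1.0, min(1.0, x))
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu=MU):
    """Inverse of mu_law_compress."""
    return math.copysign(((1.0 + mu) ** abs(y) - 1.0) / mu, y)

def quantize_8bit(y):
    """Uniformly quantize a companded value in [-1, 1] to an integer 0..255."""
    return int(round((y + 1.0) / 2.0 * 255))
```

Companding before quantization spends more of the 8 bits on quiet samples, which is why it suits telephone-quality speech.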

[52]

The preprocessing unit can apply any of a number of techniques for removing silence or noise from the detected voice, including the following known techniques:

[53]

* Digital filtering for pre-emphasis. In this case a digital filter H(z) = 1 - αz^-1 is used, where α is set between 0.9 and 1.0.

[54]

* Silence removal using energy and zero-crossing statistics.

[55]

The success of this technique rests on finding a short interval of silence (usually a period of milliseconds found at the beginning of the recording, before the speaker actually starts talking).

[57]

* Silence removal based on an energy histogram. In this method, a histogram of the frame energies is built. A threshold energy value is determined on the assumption that the largest peak in the low-energy region of the histogram corresponds to the energy of background (silent) frames. This threshold energy value is then used to discriminate speech from silence.
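
Two of the techniques listed above, pre-emphasis and histogram-based silence removal, can be sketched as follows. The filter is the stated H(z) = 1 - αz^-1, which in the time domain is y[n] = x[n] - αx[n-1]. Everything else is an assumption made for illustration: the function names, the 80-sample frame (10 ms at 8000 Hz), the 20 histogram bins, the choice of the lower half of the histogram as the "low-energy region", and the rule of placing the threshold at the upper edge of the peak bin.

```python
def pre_emphasize(samples, alpha=0.95):
    """Apply H(z) = 1 - alpha*z^-1, i.e. y[n] = x[n] - alpha*x[n-1]."""
    out, prev = [], 0.0
    for x in samples:
        out.append(x - alpha * prev)
        prev = x
    return out

def frame_energies(samples, frame_len=80):
    """Sum of squares for each non-overlapping frame."""
    return [sum(s * s for s in samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def silence_threshold(energies, bins=20):
    """Threshold just above the largest peak in the low-energy half of the histogram."""
    lo, hi = min(energies), max(energies)
    if hi == lo:
        return hi
    width = (hi - lo) / bins
    counts = [0] * bins
    for e in energies:
        counts[min(int((e - lo) / width), bins - 1)] += 1
    low_region = counts[:bins // 2]
    peak = low_region.index(max(low_region))
    return lo + (peak + 1) * width  # upper edge of the background-noise bin

def speech_frames(samples, frame_len=80):
    """Indices of frames whose energy exceeds the silence threshold."""
    energies = frame_energies(samples, frame_len)
    threshold = silence_threshold(energies)
    return [i for i, e in enumerate(energies) if e > threshold]
```

A practical front end would likely also smooth the decision over neighbouring frames so that short pauses inside a word are not cut out.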

[58]

Furthermore, the voice recognition unit can optionally include a microprocessor-based feature extraction unit 214 that extracts features of the voice before the comparison. Spectral features of the speech are represented by a speech feature vector for each processed frame of the speech signal. In the feature extraction unit 214, the spectral feature vectors can be obtained by conventional methods such as linear prediction (LP) analysis to determine LP cepstral coefficients, Fourier transform analysis, and filter bank analysis. One type of feature extraction is disclosed in the previously cited U.S. Patent 5522012, issued 28 May 1996 and entitled "Voice identification and verification system", the entire contents of which are incorporated herein by reference.
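
A minimal sketch of LP cepstral analysis for one frame is given below, using the textbook autocorrelation method, the Levinson-Durbin recursion, and the standard recursion from predictor coefficients to cepstral coefficients. It is not the procedure of the cited patent; the function names and the convention x[n] ≈ Σ a_k x[n-k] are choices made here for illustration.

```python
def autocorrelation(frame, max_lag):
    """r[k] = sum over n of frame[n] * frame[n + k], for k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for predictor coefficients a_1..a_p and the residual energy."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:  # silent or degenerate frame: stop early
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err

def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum of 1 / (1 - sum a_k z^-k) via the standard recursion."""
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c

def lp_cepstrum(frame, order=12, n_ceps=12):
    """Feature vector of LP cepstral coefficients for one frame."""
    a, _ = levinson_durbin(autocorrelation(frame, order), order)
    return lpc_to_cepstrum(a, n_ceps)
```

Calling `lp_cepstrum` on each preprocessed frame yields one feature vector per frame, which is the form of input that a comparison stage would consume.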

[59]

The voice recognition unit 204 can be implemented on a general-purpose computer with a central processing unit (CPU) on a 10MHz Intel Pentium platform, at least about 10 MB of associated RAM, and a hard disk or fixed drive for storage. Alternatively, another embodiment may use a Dialogic Antares card.

[60]

Although the voice recognition systems incorporated by reference are preferred, the invention can use other voice recognition systems. The type of voice recognition system is not critical to the invention; any well-known voice recognition system can be used. The invention applies voice recognition systems to the field of security in order to raise the security level of existing, unsecured systems.

[61]

2. Security methods and systems

[62]

According to the invention, speaker recognition can provide various levels of security depending on the user's requirements. Biometrics such as voice verification establish the true identity of the user. Other common high-security methods, such as token cards, can be compromised, for example if the token card is stolen from its owner. With speaker recognition, the user needs to know only a single piece of information, namely what to say; the voice itself supplies another piece of information. The invention contemplates at least three levels of security: "simple" security, multi-layer security, and randomly prompted voice tokens.

[63]

Figure 2 shows a more detailed general depiction of a speaker recognition system 215. As shown in Figure 2, a user speaks a password 217 to the voice recognition unit 204. The spoken password is preferably input through a microphone (not shown) at the user's location or at the voice recognition unit 204. The password can also be obtained from a telephone or other voice communication device (not shown). In response to the password, or to continuous data, the voice recognition unit 204 outputs a decision 216 that may be or include a confidence level. To increase security, an optional user directory input unit 218 can be included to obtain directory information such as a credit card number, social security number, or PIN. The user directory input unit 218 can be a keyboard, card reader, joystick, mouse, or other input device. Depending on the security level, the directory information may be confidential or public. An optional prompting input unit 220 can be included to prompt the user for the voice password or the directory information. The prompting input unit can be a display, speaker, or other audio/visual device.

[64]

Figure 3 shows the "simple" security method, which can be used in the system of Figure 1 or 2. The "simple" security system requires only a password and the voice biometric. This type of authorization provides a level of security above that of current token-only systems. In Figure 3, a spoken password is obtained at 224 and directory information at 226. The password and directory information can be obtained from the user in response to a prompt 228. The voice recognition unit 204 then processes this information. The voice recognition unit 204 can recognize 230 the speaker of the password (and, if directory information was entered, the person it identifies). If the speaker is recognized, the person can be authorized or identified at 232. If the speaker is not recognized, authorization is refused (in other words, a disapproval or "not identified" result is generated at 234). Optionally, the decision 216 of the voice recognition unit is or includes a confidence level.

[65]

Figure 4A is a flow diagram of multi-layer security. The method of Figure 4A can be used in the system of Figure 1 or 2. The method 241 illustrated in Figure 4A uses multiple levels of spoken passwords to further enhance security. For example, the user is required to speak the password of his or her choice that is normally used for authorization, plus an additional randomly prompted piece of information, for example the mother's maiden name, date of birth, home town, or SSN. With a multi-layer system, randomness is added to guard against intrusion by mechanical means such as a digital recording, and the effectiveness of the biometric is enhanced. For example, if the system generally verifies with 99.5% accuracy, a two-layer system will verify at 99.9975% and a three-layer system will reach 99.999988%. Furthermore, a multi-layer system verifies multiple pieces of knowledge and multiple biometric samples. Because speech is an easy-to-use, natural interface, the burden that a multi-layer system places on the user is far smaller than the burden that additional tokens would place on the customer. The system can be language-dependent or language-independent.
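
The accuracy figures quoted above are consistent with a simple model in which each layer fails independently with the same error rate, so that an error must occur at every layer for the whole check to fail. That independence is an assumption made here to reproduce the arithmetic; the text does not state it. A short Python check, with a hypothetical function name:

```python
def multilayer_accuracy(single_layer_accuracy, layers):
    """Combined accuracy if every layer errs independently at the same rate."""
    error = 1.0 - single_layer_accuracy
    return 1.0 - error ** layers

for n in (1, 2, 3):
    print(n, f"{multilayer_accuracy(0.995, n) * 100:.7f}%")
```

With a single-layer error of 0.5%, two layers give 1 - 0.005^2 = 99.9975% and three layers give 1 - 0.005^3 = 99.9999875%, which rounds to the 99.999988% stated in the text.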

[66]

As shown in Figure 4A, a first voice password is obtained from the user at 242. Directory information can optionally be obtained from the user at 244. After receiving the first voice password and, optionally, the directory information, the voice recognition unit 204 prompts for a second (random) password at 246. This prompt can be presented by the prompting input unit 220 of Figure 2. The second voice password is then obtained at 248. The voice recognition unit 204 then determines at 250 whether the first password is recognized. If the first password is not recognized, authorization and identification are denied at 252. If the first password is recognized, the voice recognition unit determines at 251 whether the second password is recognized. If the second password is not recognized, authorization and identification are denied at 252. If the second password is recognized, an authorization and/or identification is generated at 254. Optionally, a confidence level is output as the decision 216 or included in the decision 216.
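
The flow of Figure 4A can be sketched as a small Python function. The callables `ask` (prompt the user and return the utterance) and `recognize` (decide whether an utterance for a given prompt matches the enrolled speaker) are hypothetical stand-ins for the prompting input unit 220 and the voice recognition unit 204; the reference numerals in the comments follow the description above.

```python
import random

def multilayer_authorize(recognize, ask, first_prompt, random_prompts, rng=random):
    """Two-layer check: both spoken passwords must be recognized."""
    first = ask(first_prompt)                   # 242: obtain first voice password
    second_prompt = rng.choice(random_prompts)  # 246: random second prompt
    second = ask(second_prompt)                 # 248: obtain second voice password
    if not recognize(first_prompt, first):      # 250: first password recognized?
        return False                            # 252: deny
    if not recognize(second_prompt, second):    # 251: second password recognized?
        return False                            # 252: deny
    return True                                 # 254: authorize and/or identify
```

Both utterances are collected before any decision is made, so a caller who fails the first layer is not told which layer failed.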

[67]

A two-layer system can be made conditional on rejection of the first password. Figure 4B shows a conditional two-layer system 261. As shown in Figure 4B, a first voice password is obtained at 262. Optionally, directory information is obtained at 264. The voice recognition unit 204 then determines at 266 whether the first password is recognized. If the first password is recognized, authorization and identification are generated at 268.

[68]

If the voice recognition unit does not recognize the first password, a second (random) password is generated at 270. The second password is generated randomly by the voice recognition unit 204. At 271 the password prompt is displayed on the prompting input unit 220 (Figure 2). The second voice password is obtained at 272, and if the second password is recognized at 274, authorization or identification is granted at 278. If the second password is not recognized, no authorization or identification is generated. Optionally, the decision 216 can be or include a confidence level.
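
A minimal sketch of the conditional flow of Figure 4B follows. As before, `ask` and `recognize` are hypothetical callables standing in for the prompting input unit and the voice recognition unit, and the second return value (the number of layers used) is added here purely for illustration.

```python
import random

def conditional_authorize(recognize, ask, first_prompt, random_prompts, rng=random):
    """Ask for a second, random password only if the first is not recognized."""
    first = ask(first_prompt)                   # 262: obtain first voice password
    if recognize(first_prompt, first):          # 266: first password recognized?
        return True, 1                          # 268: authorize after one layer
    second_prompt = rng.choice(random_prompts)  # 270: generate random second password
    second = ask(second_prompt)                 # 271/272: prompt and obtain it
    return recognize(second_prompt, second), 2  # 274: decide on the second password
```

Compared with the unconditional two-layer flow, most genuine users are prompted only once, while a borderline first attempt gets a second chance with an unpredictable prompt.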

[69]

Figure 4C shows a method 281 for a randomly prompted voice token. In a randomly prompted voice token system, the system models specific discrete features of the voice, such as the particular way specific vowels are spoken. The system then randomly selects a word or phrase from a large database 283 of thousands, or even tens of thousands, of entries and prompts the user to speak it. The system then isolates the features of interest from the word and verifies those particular features. This provides completely random word selection, achieving a high level of immunity to digital recordings without requiring the user to remember a password.

[70]

As shown in Figure 4C, the voice recognition unit 204 selects from database 283, at 282, a model of specific discrete features of a particular spoken sound. The user is then prompted to speak a word or phrase that contains information relevant to that model; the prompt can be presented at 284 by the prompting input unit 220 (Figure 2). The voice password is then obtained at 286. In this case, the voice password contains the prompted voice features.

[71]

After receiving the voice password at 286, the voice recognition unit 204 extracts the voice features of the password at 288. The voice recognition unit 204 then determines at 290 whether these features are consistent with the features of the selected model. If the features are recognized, authorization and/or identification is granted at 292. If the features are not recognized, authorization or identification is denied at 294. Optionally, a confidence level can be included in the decision 216.
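
The randomly prompted voice token can be illustrated with a deliberately crude Python sketch. Every detail is an assumption: the four-word `LEXICON` stands in for the large database 283, written vowels stand in for spoken vowel sounds, per-vowel feature vectors are assumed to come from a front end that is not shown, and Euclidean distance with a fixed limit stands in for whatever comparison a real recognizer would use.

```python
import random
from math import dist

LEXICON = ["banana", "secure", "window", "mobile"]  # stand-in for database 283
VOWELS = "aeiou"

def pick_prompt(rng=random):
    """284: choose a random word for the user to speak."""
    return rng.choice(LEXICON)

def verify_token(word, spoken_features, enrolled_models, max_distance=0.5):
    """288/290: compare the features of each vowel in the word with the enrolled models."""
    needed = {ch for ch in word if ch in VOWELS}
    for vowel in needed:
        if vowel not in spoken_features or vowel not in enrolled_models:
            return False  # a required sound is missing: 294
        if dist(spoken_features[vowel], enrolled_models[vowel]) > max_distance:
            return False  # features do not match the model: 294
    return bool(needed)   # all required sounds matched: 292
```

Because the prompt changes on every attempt, a recording of an earlier session is unlikely to contain the word that is asked for, which is the property the method relies on.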

[72]

"Simple" systems, multi-layer systems, and randomly prompted voice token systems can be combined with one another in alternative embodiments. For example, a voice password and a randomly prompted voice token can be used together in a single-layer or multi-layer system. Other types of common security methods, voice-based or not, such as smart card systems or password systems, can also be used in the invention. The invention adds the benefits of voice recognition to known systems and methods.

[73]

3. Other embodiments

[74]

The invention can be used in several embodiments, described in detail below. "Simple" systems, multi-layer systems, randomly prompted voice token systems, and/or other systems can be combined with the embodiments shown.

[75]

3.1 Speaker recognition system/service - general

[76]

Figure 5A is a diagram of the general structure of a voice verification method and system 50a. As shown in Figure 5A, a user end 52 is connected through a connection 54 to a voice recognition system/service 56. Connection 54 can be a voice connection (such as a telephone connection), a data connection (such as a modem connection), or a combined voice and data connection (such as an ISDN connection). The voice recognition system/service 56 establishes a connection 57 with a voice identification database unit (VIDB) 16. The VIDB 16 stores information such as voices or voice prints.

[77]

If connection 54 is a voice connection, the voice verification system 56 matches the voice sample from the user end 52 against the voice samples in the VIDB 16. If a data connection is established, the user end 52 converts the user's sampled voice into feature data at the user's site. The data sent over connection 54 may optionally be encrypted. The voice recognition system/service 56 matches the data from the user end 52 against the data held in the VIDB 16, thereby performing voice recognition on the user's voice.

[78]

Figure 5B shows the user end 52, the voice recognition system/service 56, and the VIDB 16 of Figure 5A in more detail. The preprocessor unit 212 and the feature extraction unit 214 of Figure 1 are included in the user end 52 of Figure 5B. The comparison processing unit 210 of Figure 1 is preferably included in the voice recognition system/service 56 of Figure 5B, but can alternatively be provided in the VIDB 16 of Figure 5B as 210 [...]. The database 208 of Figure 1 is preferably located in the VIDB 16.

[79]

The preferred system of Figure 5B further shows where additional elements are located. The user end 52 typically includes a voice input unit 402, a data input unit 404, a voice output unit 406, and a data output unit 408. The voice input unit can be a microphone that supplies an analog voice signal to an A/D conversion unit 410. The data input unit 404 can be a keyboard, mouse, or card reader through which the user enters data. This data may or may not require A/D conversion; for clarity, the data input unit 404 is shown connected to the A/D converter.

[80]

The voice output unit 406 is used to provide prompts and other information to the user. The voice output unit 406 can be a speaker or headset. The data output unit 408 provides data and/or prompts to the user. The data output unit 408 can be a cathode ray tube, LCD display, LED display, or other visual indicator. Many types of data output require the information to be in analog form, so a digital-to-analog converter 412 is connected to the inputs of the voice output unit 406 and the data output unit 408. An AUX unit 414 is also provided. The AUX unit 414 can be a switch or other device that operates on the basis of a successful or unsuccessful verification or identification, or on the basis of a certain confidence level. The AUX unit 414 may or may not require digital-to-analog conversion before operating.

[81]

The user end 52 is used to obtain voice input information and/or data input information (such as directory information). This information can be supplied directly to the communication unit 416 for transmission to the voice recognition system/service 56. Preferably, however, the voice/data information undergoes A/D conversion and, if necessary, further preprocessing in the preprocessing unit 212. This preprocessing can take place as described above with reference to Figure 1. Likewise, after preprocessing, feature extraction takes place in the feature extraction unit 214. The feature extraction unit extracts the digitized features of interest from the voice information, as described above with reference to Figure 1. The extracted features are complex, so the voice data cannot be reconstructed once the data has left the user end.

[82]

After feature extraction, the information preferably passes to an encryption/decryption unit 418. The encryption/decryption unit 418 digitally encrypts the information so that it can be transmitted securely to the voice recognition system/service 56.

[83]

The communication unit 416 at the user end may be a telephone communication device, a modem, an Internet access device, a cellular telephone, a digital PCS transmitter, or any known local or remote voice/data interface, including known buses and interfaces.

[84]

The voice recognition system/service 56 includes a first communication unit 420, a comparison/processing unit 210, and a second communication unit 422. The first communication unit 420 receives transmissions from the user end 52 or from other sources. Transmissions from the user end 52 are received on line 54, and transmissions from other sources are received on line 424. The communication unit at the user end communicates with the voice recognition system/service on line 54 and with other sources on line 426.

[85]

The comparison/processing unit 210 performs voice recognition using voice information obtained from the database 208 through the VIDB 16. The comparison/processing unit 210 forms a recognition decision 216 by comparing the user's voice features with the voice data stored in the database 208. Both speaker verification and speaker identification can be performed.
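
The difference between the two modes mentioned above can be illustrated with a short sketch. Verification tests the voice against one claimed identity, while identification searches all enrolled identities. The dictionary `VIDB`, the Euclidean distance measure, and the threshold value are assumptions made for this illustration; the source does not specify the comparison method at this point.

```python
import math

# Hypothetical stand-in for database 208: user ID -> stored feature template.
VIDB = {
    "alice": [0.9, 0.1, 0.4],
    "bob": [0.2, 0.8, 0.5],
}

def distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(claimed_id, features, threshold=0.25):
    """Speaker verification: does the voice match the claimed identity?"""
    template = VIDB.get(claimed_id)
    if template is None:
        return False
    return distance(features, template) <= threshold

def identify(features, threshold=0.25):
    """Speaker identification: which enrolled user, if any, is speaking?"""
    best_id, best_dist = None, float("inf")
    for user_id, template in VIDB.items():
        d = distance(features, template)
        if d < best_dist:
            best_id, best_dist = user_id, d
    return best_id if best_dist <= threshold else None

sample = [0.85, 0.15, 0.45]  # features extracted from a spoken password
```

In either mode the boolean or identity returned plays the role of the recognition decision 216.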

[86]

If the user end 52 does not include the A/D converter 410, the preprocessor 212, or the feature extraction unit 214, the voice recognition system/service 56 includes these components (not shown). The voice recognition system/service 56 preferably includes an encryption/decryption unit 428, which encrypts or decrypts the information exchanged with the user end 52. The voice recognition system/service 56 communicates with the VIDB 16 through the second communication unit 422. This communication unit can also communicate on line 430 with any other destination, including the user end 52.

[87]

The VIDB 16 includes a communication unit 432 and the database 208. Optionally, the VIDB 16 comprises a comparison/processing unit 210 [...]. This comparison/processing unit 210 [...] is needed in the VIDB only when the voice recognition system/service 56 acts as a switching network that forwards all incoming information to the VIDB 16. If the voice recognition system/service 56 transmits encrypted information to the VIDB 16, the VIDB 16 can also include an encryption/decryption unit (not shown). It is assumed, however, that the communication line 57 between the voice recognition service and the VIDB is secure, or that the VIDB 16 and the voice recognition system/service 56 are co-located. In that case, secure transmission on line 57 is not needed.

[88]

The system of Figures 5A and 5B is used to obtain the user's voice and/or data input, to perform remote or local voice recognition, and to communicate the success or failure of the recognition to the user. Voice recognition is performed at the voice recognition system/service 56, and the recognition decision 216 is conveyed to the user through the voice output 406 or the data output 408. Alternatively, as shown in Figure 5B, the recognition decision 216 can be communicated to a third party through the voice recognition system/service 56 and line 430. As a further alternative, as illustrated in Figure 5A or 5B, if the user is attempting to enter a system that requires recognition, the user's communication device can transmit the success or failure of the recognition directly to a third party on line 426. As yet another alternative, as illustrated in Figure 5B, the VIDB can include a comparison/processing unit and can therefore transmit the recognition decision 216 directly to the user end 52 on line 434, to the voice recognition system/service 56 on line 57, or to a third party on line 434.

[89]

Other types of information may also be communicated among the user end 52, the voice recognition system/service 56, and the VIDB 16. For example, the user end 52 may inform the voice recognition system/service 56 and/or the VIDB 16 where the recognition decision should be transmitted, whether locally or to some other part of the system.

[90]

As an example, if the user 11 wishes to access a database (not shown), the user 11 provides a spoken password to be matched against a voice identity in the VIDB 16. After the spoken password has been compared with the matching voice stored in the VIDB 16, the voice recognition system/service 56 provides the user 11 with a conclusion on whether the password is accepted or rejected. This conclusion is then automatically transmitted on line 426 to the database provider. Alternatively, the conclusion can be transmitted directly to the database provider on line 424 if the user end 52 so indicates. The database provider can be, for example, one of the services provided by ORACLE.

[91]

The voice recognition system/service can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password is obtained through the voice input 402, index information (if necessary) is obtained through the data input 404, and instructions are conveyed through the voice output 406 or the data output 408. Therefore, for a general system, the arrangement of Figures 5A and 5B can provide a very high level of security.

[92]

3.2 Credit card validation

[93]

Figure 6 is a schematic diagram of a credit card validation system 10 that uses the voice recognition method and system of the present invention. In this credit card validation system 10, the user 11 is validated at a point of sale 12 located at the place of sale. The point of sale 12 can be configured as the user end 52 shown in Figure 5B. In this case, the credit card number is read by a card reader 450. Other information, such as the item the user wishes to purchase, is entered on a keyboard 452. The user's spoken password is input through a microphone 454. The card reader 450 and keyboard 452 correspond to the data input 404 of Figure 5B, and the microphone 454 corresponds to the voice input 402. The credit card number, other relevant information (if any), and the spoken password are transmitted to a validation service 14 over a general connection 13 such as a telephone line. The validation service 14 can be configured as the voice recognition system/service 56 shown in Figure 5B.

[94]

The validation service 14 has a general link 15 to the voice identification database (VIDB) 16. The VIDB can be configured as shown in Figure 5B. The VIDB 16 receives the account information from the validation service 14 in order to retrieve the stored voice identity or voiceprint corresponding to that account information. Furthermore, the VIDB 16 can confirm from account data stored in a database (not shown) that the user's account is valid and will not be exceeded by the requested purchase. Alternatively, the VIDB 16 or the validation service 14 can contact an external credit card bureau over lines 460 and 462, respectively, to determine whether the user's account is valid and whether it would be exceeded by the requested purchase.

[95]

The validation service 14 performs speaker verification on the spoken password to determine whether it matches the voice data stored in the database for the person identified by the index information. As discussed earlier, the validation service 14 can also obtain the result from the credit card bureau.

[96]

The validation decision 216 and the credit card bureau result (if any) are fed back to the point of sale 12 over link 13. Alternatively, if the comparison/processing unit 210 [...] is located in the VIDB 16, the decision is transmitted over a direct connection 464 between the VIDB 16 and the point of sale 12. The point of sale has a display 456 corresponding to the data output 408 of Figure 5B. The display 456 informs the seller whether the user is authorized, whether the user has exceeded the credit limit of the credit card account, and/or whether the credit card is valid.
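
As an illustration of how the three outcomes named above might be combined for the seller, the following sketch maps them to a single display message. The `ValidationResult` type, its field names, and the message strings are hypothetical; the source states only which facts the display conveys.

```python
from dataclasses import dataclass

@dataclass
class ValidationResult:
    voice_accepted: bool   # decision 216 on the spoken password
    card_valid: bool       # credit card bureau result, if consulted
    within_limit: bool     # purchase does not exceed the account limit

def display_message(result):
    """Compose the text a point-of-sale display such as 456 might show."""
    if not result.voice_accepted:
        return "DECLINED: user not authorized"
    if not result.card_valid:
        return "DECLINED: card not valid"
    if not result.within_limit:
        return "DECLINED: credit limit exceeded"
    return "APPROVED"
```

Checking the voice decision first reflects the point made in the background section: a card that is valid and has sufficient credit should still be declined when the speaker is not its authorized user.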

[97]

Preferably, the point of sale 12 in the credit card validation system 10 uses a preprocessing unit, a feature extraction unit, and an encryption/decryption unit (not shown). These elements function as described with reference to Figure 5B.

[98]

The credit card validation system 10 can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password is obtained through the microphone 454, index information is obtained through the keyboard 452, and prompts are presented to the user on the display 456. The invention can therefore provide significantly higher security than prior art credit card validation systems.

[99]

3.3 Call center for local authorization

[100]

In another embodiment, shown in Figure 7, the user 11 can establish a connection 21 between the user end 52 and a call center 20 to provide local authorization of credit card transactions. In the system 9 shown in Figure 7, the user end 52 is configured as described and shown above with reference to Figures 5A and 5B.

[101]

Referring to Figure 7, the user end 52 can be connected over the telephone line 21, through a local distribution network 470, to a network of in-state and out-of-state sales call centers 20. The user 11 provides account information (which can serve as index information) through a data input device such as a keyboard, and provides a spoken password for voice recognition to the user end 52 through the voice input unit 402, such as a microphone. The display 456 shows decisions and prompts. The user end 52 is connected to the call center 20 over the telephone line 21 or another standard connection. The call center 20 can transmit the voice and index information (if any) to the voice recognition system/service 56 over a standard connection 23, such as a telephone line.

[102]

The voice recognition system/service 56 is configured as described above with reference to Figures 5A and 5B. After receiving the voice and index information (if any), the voice recognition system/service 56 requests voice data from the voice information database unit (VIDB) 16. The VIDB is configured as described and shown in Figures 5A and 5B.

[103]

Optionally, a connection 28 is established between the voice recognition system/service 56 and the user end 52 to present, on the display unit, the result of the decision made by the voice recognition system/service 56 on whether to accept or reject the user 11. An alternative connection 29 can be established between the VIDB 16 and the call center 20, in which case the VIDB should include the comparison/processing unit 210 [...] shown in Figure 5B.

[104]

From a marketing perspective, a profile of the purchases, preferences, or similar characteristics of the user 11 can be provided at the voice recognition system/service 56 or the VIDB 16.

[105]

In another embodiment, the user end 52 can be connected to the call center 20 through a seller's retail service bridge 30. The user end 52 can establish a connection 32 with the seller's retail service bridge 30; this connection is either a telephone connection or a modem connection to a computer at the retailer that is connected to the retail service bridge 30. The seller's retail service bridge 30 is connected over link 34 to the call center 20 to receive the decision 216 on whether to accept or reject the user 11. The decision 216 from the voice recognition system/service 56 is sent to the call center 20 over link 23, and can then be sent on to the user end 52 over link 21 or to the seller's retail service bridge 30 over link 34.

[106]

Preferably, the user end 52 in the local call center embodiment uses a preprocessor, a feature extractor, and an encryptor (not shown). These components function as previously described with reference to Figure 5B.

[107]

The call center embodiment 9 can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password is obtained through the voice input 402, index information is obtained through the data input 404, and prompts are presented to the user on the display 456. A call center that uses the principles of the present invention can therefore provide higher security.

[108]

3.4 Telephone call verification/identification

[109]

Figure 8 illustrates a method and system 60 of the present invention in which voice recognition is applied to a user who is placing a call over the telephone network 72. This arrangement has particular security benefits for calls placed by prison inmates to parties outside the prison system. A prison inmate may be denied telephone privileges, and the system can ensure that such an inmate cannot place a telephone call to a user.

[110]

In the Figure 8 embodiment, a calling party 61, who can be a prison inmate, uses a telephone device 62 to access telephone interface hardware 64. The telephone interface hardware 64 is connected to a host computer system 66. The host computer system 66 establishes a connection 67 with the voice recognition system 56.

[111]

A voice sample from the calling party 61 passes from the telephone 62 through the telephone interface hardware 64 and the host computer system 66 to the voice recognition system/service 56. The voice sample can be produced by sampling the voice data at the host computer system 66.

[112]

In this embodiment, the host computer system 66 includes the components of the user end 52 shown in Figure 5B and uses a switch 480 as the AUX unit 414. The voice recognition system/service 56 is preferably configured as shown in Figure 5B. The voice recognition system/service 56 establishes a link 69 with the VIDB 16 to look up, by index (if index data exists), the stored voice identity or voiceprint of the calling party 61. The index data may be entered manually on the telephone keypad by the inmate or calling party 61 at the start of the call.

[113]

The voice recognition system/service 56 makes a decision 216 on whether to accept or reject the calling party 61. This decision 216 is communicated to the host computer system 66. If the decision is positive, the host computer system 66 establishes a connection to the telephone network 72 through the switch 480. The telephone network 72 then establishes a connection to the user receiver 74 so that the calling party 61 can communicate.

[114]

Either the host computer system 66 or the voice recognition system 56 can contact a credit bureau over lines 482, 484 to determine whether the calling party has sufficient credit to complete the call. Furthermore, the host computer system 66 or the voice recognition system 56 can connect to a prison database 486 to determine whether the identified caller is authorized for general calling privileges or is barred from calling a particular telephone number. The prison database 486 can optionally be included in the VIDB 16.

[115]

The user receiver system 60 can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password is obtained through the telephone 62, index information is obtained through a push-button or rotary dial, and prompts are presented to the user through the audible voice output 406 using voice or tones.

[116]

Therefore, for an inmate to place a telephone call, the index (if required) and the spoken password must be communicated to the host computer system 66. If the voice is not recognized, or if the appropriate access criteria are not met, the switch is not enabled and the call is not allowed to proceed. Thus, by updating the database 486, prison officials can control the inmates' ability to place telephone calls.
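
The gating logic of this section can be sketched as a single predicate that a host computer might evaluate before enabling the switch. The `CallerRecord` type, its fields, and the function name `may_connect` are hypothetical names introduced for this illustration, and the credit check is omitted for brevity.

```python
from dataclasses import dataclass, field

@dataclass
class CallerRecord:
    """Hypothetical entry in a database such as prison database 486."""
    has_call_privileges: bool = True
    blocked_numbers: set = field(default_factory=set)

def may_connect(voice_accepted, record, dialed_number):
    """Return True only if the switch should connect the call."""
    if not voice_accepted:                 # decision 216 was negative
        return False
    if record is None or not record.has_call_privileges:
        return False                       # no general calling privileges
    return dialed_number not in record.blocked_numbers

record = CallerRecord(blocked_numbers={"555-0100"})
```

Because the privileges and blocked numbers live in the database record rather than in the predicate, updating the record is enough to change what a given caller is allowed to do.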

[117]

3.5 Internet access

[118]

Figure 9 is a schematic diagram of a method and system of the present invention in which voice recognition is used in establishing an Internet connection. The user 11 provides a voice sample through a PC 602 that is configured as the user end 52 shown in Figure 5B. Alternatively, the PC 602 can be replaced by a network television configured as the user end 52 shown in Figure 5B.

[119]

The PC 602 communicates with the call center 20 over an access link 604. The seller's call center 20 provides a connection 608 to the seller's web page 606, which has access to the voice recognition system/service 56. The voice recognition system/service 56 is configured as shown in Figure 5B.

[120]

In operation, the user 11 provides a spoken password to the PC 602. Preferably, the PC 602 includes a voice input (i.e., a microphone), a preprocessor, a feature extractor, and an encryptor (not shown). Furthermore, the user can provide a digital identifier as the index information. This digital identifier may be a key assigned to the user for the Internet. One digital identifier that can be used with the present invention is, for example, the "Digital ID" produced by VeriSign of Mountain View, California.

[121]

The voice and index information are communicated to the call center 20, transmitted over connection 608 to the seller's web page 606, and then transmitted to the voice recognition system/service 56. The voice recognition system/service 56 then transmits the recognition decision 216 to the seller's web page 606, and the decision reaches the call center 20 over link 608. The seller's web page is thus informed whether the user has been recognized or verified. The call center 20 can communicate the decision 216 to the PC 602.

[122]

Alternatively, the user 11 sends the spoken password to the voice recognition system/service 56 over a separate connection 612. In this case, the voice recognition system/service includes the voice input (i.e., a microphone), preprocessor, and feature extractor shown in Figure 5B. The voice recognition system/service sends the recognition decision 216 to the call center 20, and the decision is sent over link 608 to the seller's web site.

[123]

Other communication connections are possible. For example, if the comparison/processing unit of Figure 5B is located in the VIDB 16, connections (not shown) can be established between the VIDB 16 and the PC 602, the call center 20, or the seller's web page 606.

[124]

The Internet access embodiment 200 can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password and the index information are obtained through the PC 602. The PC 602 can also display the prompts shown in Figures 3, 4A, 4B, and 4C.

[125]

Internet access can therefore be made highly secure, which increases confidence in the Internet by assuring that only authorized users access a system.

[126]

3.6 E-commerce

[127]

Figures 10A, 10B, and 10C are schematic diagrams of a method and system 300 of the present invention for authentication in a web environment. Speaker verification can be applied in different ways and with different standards to secure Internet access and transactions. These include:

[128]

* Enabling existing secure transaction standards, such as Secure Electronic Transaction (SET) or Certificate Authority (CA) certificates, to support voice biometrics. This is accomplished by embedding the voice pattern, or a reference to the voice pattern, in a certificate or message.

[129]

* Supporting voice biometrics in firewall products, so that access to the protected network can be restricted to users whose voices are recognized at the perimeter. Voice password support can also be added to web server security features, in addition to support for keyed-in passwords, thereby preventing unauthorized access to the web site.

[130]

* Restricting access to certain areas of a web site to voice-password-enabled users by means of voice-protected hyperlinks. This is implemented by a control such as a Java applet or an ActiveX control, which follows the hyperlink after the user has been verified.

[131]

* Creating a proprietary transaction interface to secure transactions, such as purchases, at a web site.

[132]

Referring to Figure 10A, the user 11 operates a PC 602. The PC 602 is configured as the user end shown in Figure 5B. The user provides a spoken password to the PC 602. The PC 602 can include a series of distinctive tones that prompt the user to perform specific actions, for example, to speak the password. These distinctive tones can be used in place of the conventional prompts of the PC 602.

[133]

The PC 602 preferably includes a preprocessor, a feature extractor, and an encryptor (not shown). The encrypted voice features 303 are then transmitted to the web server 302. The web server 302 decrypts the encrypted voice features 303 using a key held at the web server 302. The web server 302 communicates with the recognition server 307 over a connection 305. The recognition server 307 is configured as the voice recognition system/service 56 illustrated in Figure 5B.

[134]

The recognition server 307 establishes a link with the VIDB 304 and makes the decision 216 on whether to accept or reject the user 11. This decision is sent to the web server 302 over link 305. If the user 11 is accepted, access to a web site 306 of the web server is allowed. Alternatively, the web server can establish a connection to another (protected) web site in order to access the protected site (not shown). This access allows the user 11 to reach stored information or to conduct a transaction. For example, the user can gain access to a database that stores information about the user's 401(k) account; to an application for placing orders to buy or sell mutual funds or stocks; or to an information service or an application for placing purchase orders for retail goods, or the like.

[135]

As shown in Figure 10B, a firewall system 620 is modified in accordance with the present invention. When a user at the user end 52 attempts to access a protected network over the Internet, the connection must first pass through a firewall 624. The firewall 624 performs checks at various levels to determine the validity of the user at the initial access and, while the connection is active, maintains the integrity of the connection to ensure that its use is not hostile. Typical authentication methods at the initial access are a login ID/password or a challenge/response token-card-based system.

[136]

Speaker verification is a more robust mechanism for assuring the authenticity of the real user, because it relies neither on a piece of information that is easily compromised nor on a token-generating card that can be stolen. The user end 52 of Figure 10B is preferably configured as the user end 52 shown in Figure 5B. The recognition server 628 is preferably configured as the voice recognition system/service 56 of Figure 5B, and the VIDB 16 is preferably configured as shown in Figure 5B.

[137]

Referring to Figure 10B, at the initial access from the user end, the user is prompted to speak their password. If the user is accessing the network with a browser over the HTTP protocol, this is done through an ActiveX control or an applet. At the user end, the voice data can optionally be reduced to a feature set and is then transmitted to the firewall over an encrypted connection, such as a Secure Sockets Layer (SSL) connection.

[138]

The firewall forwards this data, together with the user's login ID, to the recognition server 628. The recognition server 628 retrieves the user's pattern from the VIDB and compares the voice data with the stored pattern. If the user is verified, the firewall 624 allows the connection to be established; otherwise, it denies the user access.
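
The firewall exchange described above can be sketched in a few lines. The function names, the dictionary used as a VIDB, the mean-absolute-difference score, and the threshold are all assumptions for this illustration; the source does not prescribe how the recognition server compares voice data with the stored pattern.

```python
def recognition_server(login_id, features, vidb, threshold=0.2):
    """Compare submitted features with the pattern stored for login_id."""
    pattern = vidb.get(login_id)
    if not pattern or len(pattern) != len(features):
        return False
    # Mean absolute difference as an illustrative similarity measure.
    score = sum(abs(a - b) for a, b in zip(features, pattern)) / len(pattern)
    return score <= threshold

def firewall_admit(login_id, features, vidb):
    """Forward the login ID and voice data, then allow or refuse the connection."""
    if recognition_server(login_id, features, vidb):
        return "ALLOW"
    return "DENY"

vidb = {"jsmith": [0.3, 0.7, 0.1]}  # hypothetical enrolled pattern
```

The firewall itself holds no voice patterns in this sketch; it only relays the login ID and features and acts on the answer, which matches the division of roles in the paragraph above.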

[139]

The firewall 624 also prevents Internet users from introducing malicious data or programs into the protected network. Speaker verification can also be used to restrict external network access to authorized users.

[140]

Figure 10C shows a voice-protected hyperlink system 630. As shown in Figure 10C, the user end 52, the recognition management server 632, the recognition server 634, and the VIDB 16 are the key components that allow access to a restricted hyperlink 636 of the web server 638. The user end 52 is preferably configured as shown in Figure 5B and runs a verification program 640. The recognition server 634 is preferably configured as the voice recognition system/service 56 of Figure 5B, and the VIDB 16 is preferably configured as shown in Figure 5B.

[141]

Continuing to refer to Figure 10C, a user viewing a web site at the user end selects a voice-protected hyperlink 636. Rather than immediately following the hyperlink to its destination, the user's browser starts the verification program 640, such as a Java applet or an ActiveX control, at the user end. The verification program 640 requires the user to enter an identifier, such as their name or account number. This identifier is used as the index information for verification.

[142]

The verification program 640 at the user end 52 then asks the recognition management server 632 to confirm the user's identifier and, if the identifier is valid, prompts the user to speak their pass phrase. The verification program 640 then records the user speaking the phrase. The program can optionally perform feature extraction to reduce the amount of data that must be transmitted for the voice sample. The voice information is then sent by the verification program 640 to the recognition management server 632, which forwards it to the recognition server 634 together with any optional security level.

[143]

The recognition server 634 retrieves the user's pattern, compares it with the voice data, and transmits the decision or result of the comparison back to the recognition management server 632. If the user is verified, the server 632 transmits the name of the protected hyperlink back to the verification program 640 at the user end 52. The verification program 640 then directs the browser to the restricted hyperlink 636 of the web site 638.
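
The sequence for a voice-protected hyperlink (identifier check, pass phrase recording, comparison, release of the link) can be sketched as below. The class `RecognitionManager`, the function `follow_protected_link`, the example URL, and the equality-based `matcher` are hypothetical placeholders; a real comparison would operate on voice features rather than on strings.

```python
class RecognitionManager:
    """Hypothetical stand-in for a recognition management server such as 632."""

    def __init__(self, enrolled, protected_links, matcher):
        self.enrolled = enrolled                # identifier -> stored pattern
        self.protected_links = protected_links  # link name -> restricted URL
        self.matcher = matcher                  # stand-in for server 634

    def knows(self, identifier):
        return identifier in self.enrolled

    def resolve(self, identifier, link_name, voice_data):
        """Return the restricted URL if the speaker is verified, else None."""
        if not self.knows(identifier):
            return None
        if not self.matcher(voice_data, self.enrolled[identifier]):
            return None
        return self.protected_links.get(link_name)

def follow_protected_link(manager, identifier, link_name, record_phrase):
    """Client-side steps of a verification program such as 640."""
    if not manager.knows(identifier):
        return None                   # identifier rejected, user is not prompted
    voice_data = record_phrase()      # prompt for and record the pass phrase
    return manager.resolve(identifier, link_name, voice_data)

manager = RecognitionManager(
    enrolled={"acct-1001": "pattern-A"},
    protected_links={"statements": "https://example.com/restricted/statements"},
    matcher=lambda voice, pattern: voice == pattern,  # placeholder comparison
)
```

The restricted URL never reaches the client unless verification succeeds, so a page can show the protected link by name without exposing its destination.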

[144]

The Internet security embodiments of Figures 10A, 10B, and 10C can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password and the index information are obtained through the PC 602 or the user end 52. The PC 602 or the user end 52 can also present the prompts shown in Figures 3, 4A, 4B, and 4C through an audio device or a display.

[145]

The security of electronic commerce over the Internet can therefore be greatly improved, which improves users' ability to access information, products, and services.

[146]

3.7 PC security

[147]

Figure 11 shows a desktop security system 650. The desktop security system 650 resides in a desktop workstation 652. In this embodiment, all the elements of Figure 5B are included in the desktop workstation, and the communication units are local interfaces.

[148]

Several components may be included in the desktop workstation to provide voice biometric protection, including:

[149]

* Voice-secured system login. On the desktop workstation, a login prompt replaces the existing security (if any). This login requires voice biometric verification before access to the system is permitted.

[150]

* A voice-secured screen saver. This ensures that the workstation is locked after an extended idle period. A hot key can also start the voice-password-protected screen saver immediately, without waiting for the idle period. When the screen saver is to be deactivated, its logic invokes the voice login. Deactivation is allowed once a valid spoken password is received.

[151]

* An administration application for enrolling users in the system and managing user profiles.

[152]

* File encryption (optional). The system allows an encrypted file to be accessed only by means of the spoken pass phrase. The file encryption key can be derived from the spoken pass phrase; this gives a particularly high level of security for files accessed by a single user, but is not suited to shared encrypted files. Alternatively, after verification, the key can be looked up in an encrypted database or derived from information associated with the file, and the file is then decrypted.
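
The first file encryption option, deriving the key from the pass phrase, can be sketched with a standard key derivation function. This sketch assumes the spoken pass phrase has already been reduced to a repeatable text string, which the source does not specify; the function name `derive_file_key`, the salt handling, and the iteration count are likewise assumptions. PBKDF2 is used here as a common technique, not as the method of the invention.

```python
import hashlib
import os

def derive_file_key(pass_phrase, salt, iterations=200_000):
    """Derive a 32-byte key from a pass phrase with PBKDF2-HMAC-SHA256."""
    return hashlib.pbkdf2_hmac(
        "sha256", pass_phrase.encode("utf-8"), salt, iterations, dklen=32
    )

salt = os.urandom(16)                      # stored alongside the encrypted file
key = derive_file_key("open sesame", salt)
```

Because the key exists only while the phrase is available, nothing needs to be stored that would let another user decrypt the file, which is also why this option does not lend itself to shared files.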

[153]

The desktop workstation embodiment can implement the security methods described above: the "simple" system of Figure 3, the multi-tiered systems 241, 261 of Figures 4A and 4B, and the randomly prompted voice token 281 of Figure 4C. The spoken password and the index information are obtained through the desktop workstation. The desktop workstation can also present the prompts shown in Figures 3, 4A, 4B, and 4C through an audio device or a display.

[154]

These precautions help ensure that only authorized users of the desktop workstation gain access to the workstation and/or its files.

[155]

3.8 Network security

[156]

Figures 12A and 12B show a network security embodiment 660. Figure 12A shows a network arrangement that includes a user end 52 (such as a PC), a network server 662, a verification server 664, and a VIDB 16. The user end 52 is preferably configured as shown in Figure 5B. The verification server 664 is preferably configured as the voice recognition system/service 56 of Figure 5B. The VIDB 16 is preferably configured as shown in Figure 5B.

[157]

The major network servers have built-in security mechanisms that generally restrict access to server resources by means of a login name and password. These servers include Windows NT, Novell, and UNIX-based systems. As attack strategies against these systems become more sophisticated, the need for an alternative method becomes apparent. Voice biometrics provides a more sophisticated mechanism that is harder to compromise than typical server authentication methods.

[158]

The following features can be incorporated into a network server security system and method:

[159]

* Voice-secured server login. For access to the server, a login prompt replaces the existing security (if any). Generally, a server requires a login name and password for access to server resources, and grants a given user a set of privileges and access rights. The biometric login replaces the password login. Once the user has logged in, the underlying security model is still relied upon to provide access control to system resources.

[160]

* An administration application for enrolling users in the system and managing user profiles. In general, administration is integrated into the existing server tools, unless the server's operating system does not allow the tools to be modified.

[161]

As illustrated in Figure 12B, the server security system 660 can run in a mode in which only users with voice pass phrases are allowed to access the server, or in a mixed mode in which users enrolled with a conventional password mechanism can also gain access at a reduced or equal security level. As far as possible, the security administration of users is integrated seamlessly with the standard system administration features; for example, under Windows NT, the look and feel of the domain user and server management programs is maintained.

[162]

Referring to Figure 12B, when the user attempts to access the networked server at 670, the usual name/password prompt asks the user for identification information. The user terminal sends this user information to the networked server. Based on the user identification, the network server decides at 672 whether the user is a voice-password-enabled user.

[163]

If the user is not a voice-password-enabled user and the server is configured to allow access only to voice-password-enabled users, or if the user ID is not found in the user database 676, the login is refused at 678. If the server is not configured to allow access only to voice-password-enabled users, and if the user ID is found in the database 676, the user's authorization is tested at 680.

[164]

If the user is authorized at 680 for non-voice access, and if the entered password matches at 684 the password stored in the database for that user ID, the user is allowed to access the server at 682. If either of these two conditions is not satisfied, the server rejects the authentication.
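
The routing described for steps 670 to 684 can be sketched as a small decision function. This is only an illustrative sketch: the names `UserRecord`, `Decision` and `route_login` are hypothetical, the user database 676 is modelled as a plain dictionary, and passwords are held in plain text purely to keep the example short.

```python
import hmac
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Optional


class Decision(Enum):
    REFUSE = "refuse"          # step 678, or a failed password check
    ALLOW = "allow"            # step 682
    VOICE_PATH = "voice_path"  # continue with voice verification


@dataclass
class UserRecord:              # hypothetical row of user database 676
    voice_enabled: bool        # answer to the decision at 672
    password_authorized: bool  # may log in without voice (test at 680)
    password: str              # plain text only for the sake of the sketch


def route_login(users: Dict[str, UserRecord], voice_only_server: bool,
                user_id: str, password: str) -> Decision:
    """Route one login attempt the way steps 670-684 are described."""
    record: Optional[UserRecord] = users.get(user_id)
    if record is None:
        return Decision.REFUSE            # user ID not in database 676
    if record.voice_enabled:
        return Decision.VOICE_PATH        # handled by the voice branch
    if voice_only_server:
        return Decision.REFUSE            # server admits voice users only
    password_ok = hmac.compare_digest(password.encode(), record.password.encode())
    if record.password_authorized and password_ok:
        return Decision.ALLOW             # both conditions at 680 and 684 hold
    return Decision.REFUSE
```

In mixed mode a conventional user with the right password is admitted, while the same user is refused as soon as `voice_only_server` is set.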

[165]

Returning to the access attempt 640 in Figure 12B, if the user is a voice-enabled user, the system can optionally use a conventional password at 690 to provide a 1st level of identification. If this 1st-level identification is enabled, the system checks the user's password at 692 as the 1st level. If the password is incorrect, access is refused at 694; if the password is correct, the matching between the stored pattern and the recorded pass phrase is performed at 696.
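
As a rough illustration, the optional 1st-level check at 690 to 694 reduces to one condition. The names `Step` and `first_level_check` are hypothetical; the enum values simply reuse the reference numerals of the flow chart.

```python
from enum import Enum


class Step(Enum):
    REFUSE = 694       # wrong password at the 1st level
    VOICE_MATCH = 696  # go on to compare pass phrase and stored pattern


def first_level_check(first_level_enabled: bool, password_ok: bool) -> Step:
    """Apply the optional password check before voice matching."""
    if first_level_enabled and not password_ok:
        return Step.REFUSE
    # Either the 1st level is switched off, or the password was correct.
    return Step.VOICE_MATCH
```

When the 1st level is disabled, the password result is ignored and the flow always proceeds to voice matching.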

[166]

If 1st-level identification is not enabled at 690, the matching between the stored pattern and the recorded pass phrase is performed directly. Continuing to the matching decision at 696, the user is prompted to speak their pass phrase. At this point the user terminal can optionally perform feature extraction on the speech data, both to reduce the amount of data and to put it into a form that is hard to use in an external application. The voice data or features are then encrypted and time-stamped, and transferred to the network server 662. The network server forwards this information to the authentication server, optionally together with a specific security level that indicates how strict a threshold the biometric authentication should use.
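
The client-side preparation described above might look roughly as follows. This is a sketch under stated assumptions, not the method of the patent: the feature (per-frame energy) is a toy stand-in, the function names and the shared `key` are hypothetical, and the payload is only time-stamped and integrity-tagged with an HMAC. The encryption step mentioned in the text is deliberately left out and would have to be supplied separately, for example by an encrypted transport.

```python
import hashlib
import hmac
import json
import time
from typing import List, Optional


def frame_energies(samples: List[int], frame_len: int = 160) -> List[float]:
    """Toy feature extraction: mean squared amplitude of each full frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        feats.append(sum(s * s for s in frame) / frame_len)
    return feats


def package(features: List[float], key: bytes, now: Optional[float] = None) -> bytes:
    """Time-stamp the features and prepend an HMAC-SHA256 tag (no encryption)."""
    ts = time.time() if now is None else now
    body = json.dumps({"ts": ts, "features": features}, sort_keys=True).encode()
    tag = hmac.new(key, body, hashlib.sha256).hexdigest().encode()
    return tag + b"." + body


def unpack(packet: bytes, key: bytes, max_age_s: float = 30.0,
           now: Optional[float] = None) -> List[float]:
    """Server side: verify the tag, reject stale packets, return the features."""
    tag, _, body = packet.partition(b".")
    expected = hmac.new(key, body, hashlib.sha256).hexdigest().encode()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("packet failed integrity check")
    message = json.loads(body)
    current = time.time() if now is None else now
    if abs(current - message["ts"]) > max_age_s:
        raise ValueError("packet time stamp is too old")
    return message["features"]
```

The time stamp lets the server refuse a recording that is replayed later, which is one plausible reason for time-stamping the data before it leaves the terminal.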

[167]

The authentication server 664 retrieves the spoken pass phrase pattern from the VIDB 16, compares the spoken pass phrase data with this pattern, and provides a binary result and, optionally, a confidence score.
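
The comparison step can be pictured with the sketch below. The text does not say which matching algorithm is used, so cosine similarity between feature vectors is swapped in purely as a placeholder; the threshold table, the dictionary standing in for the VIDB, and the function names are all assumptions.

```python
import math
from typing import Dict, List, Tuple

# Hypothetical mapping from a requested security level to a match threshold.
THRESHOLDS = {"low": 0.80, "medium": 0.90, "high": 0.97}


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Similarity in [-1, 1]; returns 0.0 if either vector is all zeros."""
    if len(a) != len(b) or not a:
        raise ValueError("feature vectors must be non-empty and of equal length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def verify(vidb: Dict[str, List[float]], user_id: str, features: List[float],
           security_level: str = "medium") -> Tuple[bool, float]:
    """Return (binary result, confidence score) for one spoken pass phrase."""
    pattern = vidb.get(user_id)
    if pattern is None:
        return False, 0.0                 # no enrolled pattern for this user
    score = cosine_similarity(pattern, features)
    return score >= THRESHOLDS[security_level], score
```

Passing a stricter `security_level` raises the threshold, so the same utterance can be accepted at one level and rejected at another.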

[168]

The network server 662 uses this authentication result to determine whether the recorded pass phrase matches the stored pattern to an acceptable degree. If the degree of matching is acceptable, access is permitted at 698; otherwise access is refused at 699. A configurable number of retries is allowed. If the number of allowed retries is exceeded, the server disables the account.
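
A minimal sketch of the retry rule, assuming the failure count is reset after a successful login (the text does not say) and that a disabled account stays disabled until an administrator intervenes. `RetryGate` and its return strings are hypothetical names.

```python
from typing import Dict, Set


class RetryGate:
    """Tracks failed voice matches per user and disables the account on overflow."""

    def __init__(self, max_retries: int = 2) -> None:
        self.max_retries = max_retries        # configurable number of retries
        self.failures: Dict[str, int] = {}
        self.disabled: Set[str] = set()

    def attempt(self, user_id: str, matched: bool) -> str:
        if user_id in self.disabled:
            return "disabled"
        if matched:
            self.failures.pop(user_id, None)  # assumption: success resets the count
            return "granted"                  # step 698
        count = self.failures.get(user_id, 0) + 1
        self.failures[user_id] = count
        if count > self.max_retries:          # initial attempt plus all retries used up
            self.disabled.add(user_id)
            return "disabled"
        return "refused"                      # step 699, a retry is still allowed
```

With `max_retries=2` a user may fail the first attempt and two retries; the third consecutive failure disables the account, and even a matching pass phrase is then rejected.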

[169]

The patent refers to the field of 'speech analysis or synthesis and speech recognition'. The present invention applies speech recognition technology to remote access, verification, and identification applications. Speech recognition is used to raise the security level of many types of transaction systems, including: point of sale systems (10), home authorization systems (9), systems for establishing a call to a called party (60)(including prison telephone systems), Internet access systems (600), web site access systems (300), systems for obtaining access to protected computer networks (620), systems for accessing a restricted hyperlink (636), desktop computer security systems (650), and systems for gaining access to a networked server (660). A general speech recognition system using communication (54) is also presented. Further, different types of speech recognition methodologies are useful with the present invention, such as 'simple' security methods and systems (221), multi-tiered security methods and systems (241), conditional multi-tiered security methods and systems (261), and randomly prompted voice token methods and systems (281).

1. A system for recognizing a user by speech recognition, comprising:

A user terminal, including:

A sound input for voice data; and

A 1st communication unit connected to the sound input for the voice data;

A voice recognition system, operatively coupled to a voice information database to receive user information, including:

A 2nd communication unit for receiving information related to the voice data from the 1st communication unit; and

A processing unit for providing output information concerning voice recognition between the input voice data and the user information from the voice information database.

2. System according to Claim 1, characterized in that the user terminal further comprises:

A pre-processor connected to the sound input;

A feature extraction unit connected between the pre-processor and the 1st communication unit, wherein the feature extraction unit extracts information from the voice data, and wherein the processing unit uses the extracted information related to the voice data.

3. System according to Claim 1, characterized in that the 1st communication unit can receive the output information, wherein the voice recognition system conveys the output information to the user terminal, and the user terminal has an output unit to display the output information.

4. System according to Claim 1, characterized in that the output unit is a display and wherein the input unit is a microphone.

5. System according to Claim 2, characterized in that the user terminal is a point-of-sale terminal, and the 1st and 2nd communication units are connected through a telephone line.

6. System according to Claim 2, characterized in that the 1st and 2nd communication units are connected through a call center.

7. System according to Claim 2, characterized in that the call center is connected with the 1st communication unit through a merchant's retail service bridge.

8. System according to Claim 1, characterized in that the sound input is a telephone input, the 1st communication unit can receive the output information, the voice recognition system conveys the output information to the user terminal, and wherein the user terminal comprises a switch that connects the telephone input to the telephone network upon successful voice recognition.

9. System according to Claim 8, characterized in that the telephone input is connected to a telephone located in a prison.

10. System according to Claim 2, characterized in that the user terminal is a personal computer, wherein the 1st and 2nd communication units are connected through a seller's website and a call center, and wherein the seller's website provides Internet access to the personal computer upon successful voice recognition.

11. System according to Claim 2, characterized in that the 1st and 2nd communication units are connected through a firewall, wherein the firewall provides the user terminal with access to the protected network upon successful voice recognition.

12. System according to Claim 2, characterized in that the 1st and 2nd communication units are connected through a web server and a recognition management server, and wherein the recognition management server provides access to a restricted hyperlink upon successful voice recognition of the user.

13. System according to Claim 1, characterized in that the 1st and 2nd communication units are a desktop computer interface, and wherein the entire system is set up on a desktop computer.

14. System according to Claim 2, characterized in that the user end is a user terminal, wherein the 1st and 2nd communication units are connected through a network server, and wherein the network server provides the user terminal with access to the protected network upon successful voice recognition.

15. System according to Claim 14, characterized in that a successful voice recognition of the standard 1st authentication.

CPC classification

G G0 G06 G06Q G06Q2 G06Q20 G06Q20/G06Q20/0 G06Q20/00 G06Q20/4 G06Q20/40 G06Q20/401 G06Q20/4014 G06Q20/40145 G07 G07C G07C9 G07C9/G07C9/0 G07C9/00 G07C9/001 G07C9/0015 G07C9/00158 G07C9/0016 G07C9/00166 G07C9/3 G07C9/37 G07C9/38 G1 G10 G10L G10L1 G10L15 G10L15/G10L15/2 G10L15/26 H H0 H04 H04L H04L2 H04L24 H04L246 H04L2463 H04L2463/H04L2463/1 H04L2463/10 H04L2463/102 H04L29 H04L29/H04L29/0 H04L29/06 H04L6 H04L63 H04L63/H04L63/0 H04L63/08 H04L63/086 H04L63/0861 H04L9 H04L9/H04L9/4 H04L9/40

IPC classification

G G0 G06 G06F G06F2 G06F21 G06F21/G06F21/2 G06F21/20 G06Q G06Q2 G06Q20 G06Q20/G06Q20/0 G06Q20/00 G07 G07C G07C9 G07C9/G07C9/0 G07C9/00 G1 G10 G10L G10L1 G10L15 G10L15/G10L15/0 G10L15/00 G10L15/2 G10L15/26 G10L17 G10L17/G10L17/0 G10L17/00 H H0 H04 H04L H04L2 H04L29 H04L29/H04L29/0 H04L29/06
