CS3340:   Intro OOP and Design

 

Training

 

Training is the process of collecting speech samples and using them to tune the recognition algorithm for better performance.

 

There are two basic ways in which data is gathered for training. One is referred to as "supervised" and the other is "unsupervised".

Supervised Training

Here the user is asked to say specific, known utterances, and each sample is stored together with the identity of the utterance it contains.

Example for Speech Verification

  • When a speaker attempts to verify their identity with this system, the incoming signal is compared to a stored "key" signal.

  • This key should be a signal that produces a high correlation for both magnitude and pitch data when the authorized user utters the password, but not in cases where:
    • the user says the wrong word (the password is forgotten)
    • an intruder says either the password or a wrong word

  • To develop such a key, the system is trained to recognize the speaker. In this instance, the speaker first chooses a password, which is then recorded five separate times. The pitch and magnitude information are stored for each recording. The recording that best matches the other four in both pitch and magnitude is chosen as the key.
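
The key-selection step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system's actual implementation: each recording is assumed to be a pair of equal-length magnitude and pitch contours, "matches best" is assumed to mean the highest total Pearson correlation with the other recordings, and the names correlation, match_score, choose_key, and verify, along with the 0.8 threshold, are hypothetical.

```python
from math import sqrt
from typing import List, Sequence, Tuple

# One recording of the password: (magnitude contour, pitch contour).
# Equal-length contours are an assumption made to keep the sketch short.
Sample = Tuple[Sequence[float], Sequence[float]]


def correlation(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def match_score(s: Sample, t: Sample) -> float:
    """Weaker of the magnitude and pitch correlations, so both must be high."""
    return min(correlation(s[0], t[0]), correlation(s[1], t[1]))


def choose_key(samples: List[Sample]) -> int:
    """Index of the recording with the highest total match score against the others."""
    def total(i: int) -> float:
        return sum(
            match_score(samples[i], samples[j])
            for j in range(len(samples))
            if j != i
        )
    return max(range(len(samples)), key=total)


def verify(key: Sample, attempt: Sample, threshold: float = 0.8) -> bool:
    """Accept the attempt only if it correlates with the key in both features."""
    return match_score(key, attempt) >= threshold
```

With five training recordings in a list, choose_key returns the index of the recording to store as the key, and verify compares a later attempt against it. Taking the minimum of the two correlations is one simple way to require a high match in both magnitude and pitch; a real system would also need to align recordings of different lengths before comparing them.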

 

Unsupervised Training

Here the user is asked to speak but is not given a predefined, known text to say. This kind of training is more challenging to use, since the system has no reference text to match the samples against.

 

© Lynne Grewe