Automatic speaker identification

From Glottopedia
Revision as of 14:09, 23 May 2013 by KPolitt (talk | contribs) (cross reference)

Depending on the application, speaker recognition systems can be divided into speaker identification systems and speaker verification systems. Speaker identification assigns the input speech signal to one person from a known group, while speaker verification confirms or rejects the identity claimed by the user of the system.
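The difference between the two tasks can be illustrated with a minimal sketch. It assumes that each enrolled speaker is represented by a single reference feature vector and that similarity is measured by Euclidean distance; the names `identify`, `verify`, `enrolled` and the threshold value are hypothetical, and real systems use richer statistical models.

```python
import math

def distance(a, b):
    # Euclidean distance between two feature vectors of equal length.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def identify(features, enrolled):
    # Identification: choose the closest speaker among the known group.
    return min(enrolled, key=lambda name: distance(features, enrolled[name]))

def verify(features, claimed, enrolled, threshold):
    # Verification: accept or reject one claimed identity
    # by comparing the distance to a threshold (assumed value).
    return distance(features, enrolled[claimed]) <= threshold

# Hypothetical reference vectors for two enrolled speakers.
enrolled = {"alice": [1.0, 2.0, 0.5], "bob": [3.0, 0.5, 1.5]}
sample = [1.1, 1.9, 0.6]

print(identify(sample, enrolled))
print(verify(sample, "alice", enrolled, threshold=0.5))
print(verify(sample, "bob", enrolled, threshold=0.5))
```

Identification always returns one member of the known group, so its difficulty grows with the number of enrolled speakers, whereas verification is a yes/no decision whose behaviour depends on the chosen threshold.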


The two largest factors affecting automatic speaker identification performance are the size of the population to be distinguished among and the degradation of the signal by noise or by the transmission channel (e.g. telephone transmission). Automatic speaker recognition is fundamental in all systems that deliver restricted services or confidential information, particularly when a high degree of security is necessary.

Possible applications include retrieval of private information, automatic financial transactions, and control of access to secure or restricted areas. Another range of possible applications lies in criminal investigation, e.g. identifying telephone speakers in cases of sexual harassment or bomb threats.

The usual approach to speaker recognition is based on the classification of acoustic parameters derived from the speech signal. Generally, the parameters are obtained via short-time spectral analysis and contain both phonetic information, related to the uttered text, and individual information, related to the speaker. Since the task of separating the phonetic information from the individual information has not yet been solved, many speaker recognition systems behave in a text-dependent way (i.e. the user must utter a predefined sentence).
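Short-time spectral analysis can be sketched as follows: the signal is cut into overlapping frames, each frame is multiplied by a window function, and a magnitude spectrum is computed per frame. This is only an illustration under assumptions not stated in the article: the frame size, hop size, Hamming window and the function names are all hypothetical choices, and a plain discrete Fourier transform is used instead of an optimised FFT.

```python
import cmath
import math

def frames(signal, size, hop):
    # Split the signal into overlapping frames of `size` samples.
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, hop)]

def hamming(n):
    # Hamming window, a common (assumed) choice for speech analysis.
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]

def magnitude_spectrum(frame):
    # Windowed DFT magnitudes for the non-negative frequency bins.
    n = len(frame)
    x = [s * w for s, w in zip(frame, hamming(n))]
    return [
        abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]

def short_time_spectra(signal, size=64, hop=32):
    # One magnitude spectrum per frame.
    return [magnitude_spectrum(f) for f in frames(signal, size, hop)]

# Synthetic test signal: a 1000 Hz tone sampled at 8000 Hz (assumed values).
rate = 8000
tone = [math.sin(2 * math.pi * 1000 * t / rate) for t in range(256)]
spectra = short_time_spectra(tone)

# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spectra[0])), key=spectra[0].__getitem__)
print(len(spectra), peak_bin, peak_bin * rate / 64)
```

Each resulting spectrum mixes what was said with who said it, which is why a classifier trained on such parameters tends to work best when the uttered text is fixed.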


Utrecht Lexicon of Linguistics