Authors V.Yu. Shelepov, A.V. Nicenko
Month, Year 04, 2016
Index UDC 004.89:004.93
Abstract The paper discusses a method for recognizing words from very large vocabularies of Russian word forms using the authors’ system for automatic segmentation of the speech signal. The method can be applied both to DTW recognition and to recognition by means of hidden Markov models. However, where the practical recognition of speech units is described, we have in mind the diphone DTW recognition method being developed by the authors. We therefore introduce the concept of a formalized diphone. The notions of quasi-word stem and quasi-inflection are then defined. A general algorithm for constructing quasi-word stems from a given list of word forms is proposed. An algorithm for recognizing Russian participles using quasi-word stems is presented. To accelerate recognition over a large vocabulary of quasi-word stems, we propose an algorithm for recognizing the initial sound of a word (or a sufficiently narrow class to which it belongs). In conclusion, it should be noted that robust recognition of short words is an objective problem. Quasi-word stems are therefore recognized well when they are sufficiently long. In the general case, the weak point of using quasi-word stems is that the speech segments to be recognized are shorter than the original words (the above-mentioned procedure for classifying the first sound is a step toward overcoming this difficulty). Nevertheless, the use of quasi-word stems seems reasonable for recognizing very large vocabularies of Russian word forms.

Keywords Segmentation, quasi-word stem, parts of the first sound, recognition classes, transitions depending on intermediate recognition results
