Version: 2024.08
G10L 13/00 | Speech synthesis; Text to speech systems [2013-01] |
G10L 13/02 | . | Methods for producing synthetic speech; Speech synthesisers [2013-01] |
G10L 2013/021 | . . | {Overlap-add techniques} [2013-01] |
G10L 13/027 | . . | Concept to speech synthesisers; Generation of natural phrases from machine-based concepts (generation of parameters for speech synthesis out of text G10L 13/08) [2013-01] |
G10L 13/033 | . . | Voice editing, e.g. manipulating the voice of the synthesiser [2013-01] |
G10L 13/0335 | . . . | {Pitch control} [2013-01] |
G10L 13/04 | . . | Details of speech synthesis systems, e.g. synthesiser structure or memory management [2013-01] |
G10L 13/047 | . . . | Architecture of speech synthesisers [2013-01] |
G10L 13/06 | . | Elementary speech units used in speech synthesisers; Concatenation rules [2013-01] |
G10L 13/07 | . . | Concatenation rules [2013-01] |
G10L 13/08 | . | Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination [2013-01] |
G10L 2013/083 | . . | {Special characters, e.g. punctuation marks} [2013-01] |
G10L 13/086 | . . | {Detection of language} [2013-01] |
G10L 13/10 | . . | Prosody rules derived from text; Stress or intonation [2013-01] |
G10L 2013/105 | . . . | {Duration} [2013-01] |
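For orientation, the indexing entry G10L 2013/021 above names overlap-add techniques, the standard way concatenative synthesisers splice windowed speech units back into a waveform. A minimal sketch in Python/NumPy follows; the function name, frame length, and hop size are illustrative assumptions, not part of the scheme.

```python
import numpy as np

def overlap_add(frames, hop):
    """Resynthesise a signal from windowed frames by overlap-add,
    the splicing idea indexed under G10L 2013/021 (illustrative sketch)."""
    frame_len = len(frames[0])
    out = np.zeros(hop * (len(frames) - 1) + frame_len)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + frame_len] += frame   # overlapping regions sum
    return out

# Toy usage: chop a 220 Hz tone into Hann-windowed 25 ms frames at 50% overlap,
# then rebuild it; Hann windows at 50% overlap sum to a nearly constant gain.
sr, flen, hop = 16000, 400, 200
x = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)
frames = [x[s:s + flen] * np.hanning(flen) for s in range(0, len(x) - flen, hop)]
y = overlap_add(frames, hop)
```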
G10L 15/00 | Speech recognition [2013-01] |
G10L 15/005 | . | {Language recognition} [2013-01] |
G10L 15/01 | . | Assessment or evaluation of speech recognition systems [2013-01] |
G10L 15/02 | . | Feature extraction for speech recognition; Selection of recognition unit [2013-01] |
G10L 2015/022 | . . | {Demisyllables, biphones or triphones being the recognition units} [2013-01] |
G10L 2015/025 | . . | {Phonemes, fenemes or fenones being the recognition units} [2013-01] |
G10L 2015/027 | . . | {Syllables being the recognition units} [2013-01] |
G10L 15/04 | . | Segmentation; Word boundary detection [2013-01] |
G10L 15/05 | . . | Word boundary detection [2013-01] |
G10L 15/06 | . | Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice (G10L 15/14 takes precedence) [2013-01] |
G10L 15/063 | . . | {Training} [2013-01] |
G10L 2015/0631 | . . . | {Creating reference templates; Clustering} [2013-01] |
G10L 2015/0633 | . . . . | {using lexical or orthographic knowledge sources} [2013-01] |
G10L 2015/0635 | . . . | {Updating or merging of old and new templates; Mean values; Weighting} [2013-01] |
G10L 2015/0636 | . . . . | {Threshold criteria for the updating} [2013-01] |
G10L 2015/0638 | . . . | {Interactive procedures} [2013-01] |
G10L 15/065 | . . | Adaptation [2013-01] |
G10L 15/07 | . . . | to the speaker [2013-01] |
G10L 15/075 | . . . . | {supervised, i.e. under machine guidance} [2013-01] |
G10L 15/08 | . | Speech classification or search [2013-01] |
G10L 2015/081 | . . | {Search algorithms, e.g. Baum-Welch or Viterbi} [2013-01] |
G10L 15/083 | . . | {Recognition networks (G10L 15/142, G10L 15/16 take precedence)} [2013-01] |
G10L 2015/085 | . . | {Methods for reducing search complexity, pruning} [2013-01] |
G10L 2015/086 | . . | {Recognition of spelled words} [2013-01] |
G10L 2015/088 | . . | {Word spotting} [2013-01] |
G10L 15/10 | . . | using distance or distortion measures between unknown speech and reference templates [2013-01] |
G10L 15/12 | . . | using dynamic programming techniques, e.g. dynamic time warping [DTW] [2013-01] |
G10L 15/14 | . . | using statistical models, e.g. Hidden Markov Models [HMMs] [2013-01] |
G10L 15/142 | . . . | {Hidden Markov Models [HMMs]} [2013-01] |
G10L 15/144 | . . . . | {Training of HMMs} [2013-01] |
G10L 15/146 | . . . . . | {with insufficient amount of training data, e.g. state sharing, tying, deleted interpolation} [2013-01] |
G10L 15/148 | . . . . | {Duration modelling in HMMs, e.g. semi HMM, segmental models or transition probabilities} [2013-01] |
G10L 15/16 | . . | using artificial neural networks [2013-01] |
G10L 15/18 | . . | using natural language modelling [2013-01] |
G10L 15/1807 | . . . | {using prosody or stress} [2013-01] |
G10L 15/1815 | . . . | {Semantic context, e.g. disambiguation of the recognition hypotheses based on word meaning} [2013-01] |
G10L 15/1822 | . . . | {Parsing for meaning understanding} [2013-01] |
G10L 15/183 | . . . | using context dependencies, e.g. language models [2013-01] |
G10L 15/187 | . . . . | Phonemic context, e.g. pronunciation rules, phonotactical constraints or phoneme n-grams [2013-01] |
G10L 15/19 | . . . . | Grammatical context, e.g. disambiguation of the recognition hypotheses based on word sequence rules [2013-01] |
G10L 15/193 | . . . . . | Formal grammars, e.g. finite state automata, context free grammars or word networks [2013-01] |
G10L 15/197 | . . . . . | Probabilistic grammars, e.g. word n-grams [2013-01] |
G10L 15/20 | . | Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech (G10L 21/02 takes precedence) [2013-01] |
G10L 15/22 | . | Procedures used during a speech recognition process, e.g. man-machine dialogue [2013-01] |
G10L 2015/221 | . . | {Announcement of recognition results} [2013-01] |
G10L 15/222 | . . | {Barge-in, i.e. overridable guidance for interrupting prompts} [2013-01] |
G10L 2015/223 | . . | {Execution procedure of a spoken command} [2013-01] |
G10L 2015/225 | . . | {Feedback of the input speech} [2013-01] |
G10L 2015/226 | . . | {using non-speech characteristics} [2020-08] |
G10L 2015/227 | . . . | {of the speaker; Human-factor methodology} [2013-01] |
G10L 2015/228 | . . . | {of application context} [2013-01] |
G10L 15/24 | . | Speech recognition using non-acoustical features [2013-01] |
G10L 15/25 | . . | using position of the lips, movement of the lips or face analysis [2013-01] |
G10L 15/26 | . | Speech to text systems [2017-08] |
G10L 15/28 | . | Constructional details of speech recognition systems [2013-01] |
G10L 15/285 | . . | {Memory allocation or algorithm optimisation to reduce hardware requirements} [2013-01] |
G10L 15/30 | . . | Distributed recognition, e.g. in client-server systems, for mobile phones or network applications [2013-01] |
G10L 15/32 | . . | Multiple recognisers used in sequence or in parallel; Score combination systems therefor, e.g. voting systems [2013-01] |
G10L 15/34 | . . | Adaptation of a single recogniser for parallel processing, e.g. by use of multiple processors or cloud computing [2013-01] |
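Several entries in the G10L 15/00 tree name concrete algorithms; G10L 15/12 covers dynamic-programming matching such as dynamic time warping [DTW]. A minimal DTW distance in Python/NumPy is sketched below; the function name and toy feature sequences are illustrative assumptions.

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences,
    the dynamic-programming technique named in G10L 15/12 (sketch)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])    # local frame distance
            D[i, j] = cost + min(D[i - 1, j],             # stretch the template
                                 D[i, j - 1],             # stretch the input
                                 D[i - 1, j - 1])         # advance both
    return float(D[n, m])

# Toy usage: the same contour spoken "slower" still matches cheaply.
t = np.array([[0.0], [1.0], [2.0], [1.0], [0.0]])
u = np.array([[0.0], [0.0], [1.0], [2.0], [2.0], [1.0], [0.0]])
print(dtw_distance(t, u))   # small; a mismatched template would score higher
```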
G10L 17/00 | Speaker identification or verification techniques [2024-01] |
G10L 17/02 | . | Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction [2013-01] |
G10L 17/04 | . | Training, enrolment or model building [2013-01] |
G10L 17/06 | . | Decision making techniques; Pattern matching strategies [2013-01] |
G10L 17/08 | . . | Use of distortion metrics or a particular distance between probe pattern and reference templates [2013-01] |
G10L 17/10 | . . | Multimodal systems, i.e. based on the integration of multiple recognition engines or fusion of expert systems [2013-01] |
G10L 17/12 | . . | Score normalisation [2013-01] |
G10L 17/14 | . . | Use of phonemic categorisation or speech recognition prior to speaker recognition or verification [2013-01] |
G10L 17/16 | . | Hidden Markov models [HMM] [2023-02] |
G10L 17/18 | . | Artificial neural networks; Connectionist approaches [2013-01] |
G10L 17/20 | . | Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions [2013-01] |
G10L 17/22 | . | Interactive procedures; Man-machine interfaces [2013-01] |
G10L 17/24 | . . | the user being prompted to utter a password or a predefined phrase [2013-01] |
G10L 17/26 | . | Recognition of special voice characteristics, e.g. for use in lie detectors; Recognition of animal voices [2013-01] |
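Within G10L 17/00, the decision entries G10L 17/06 and G10L 17/12 name pattern matching and score normalisation. The sketch below scores a probe embedding against an enrolled speaker by cosine similarity and applies a z-norm against an impostor cohort; the embeddings, cohort, and threshold logic are illustrative assumptions, not a prescribed method.

```python
import numpy as np

def cosine_score(probe, enrolled):
    """Match score between probe and enrolled speaker embeddings (G10L 17/06)."""
    return float(probe @ enrolled / (np.linalg.norm(probe) * np.linalg.norm(enrolled)))

def znorm(score, impostor_scores):
    """Z-norm, one of the score-normalisation ideas under G10L 17/12:
    centre and scale a raw score by an impostor-cohort distribution."""
    return (score - np.mean(impostor_scores)) / np.std(impostor_scores)

# Toy usage with made-up 4-dimensional embeddings.
rng = np.random.default_rng(0)
enrolled = np.array([0.9, 0.1, 0.0, 0.4])
probe = np.array([0.8, 0.2, 0.1, 0.5])
cohort = [cosine_score(rng.standard_normal(4), enrolled) for _ in range(100)]
raw = cosine_score(probe, enrolled)
print(raw, znorm(raw, cohort))   # accept if the normalised score clears a threshold
```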
G10L 19/00 | Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis (in musical instruments G10H) [2017-08] |
G10L 2019/0001 | . | {Codebooks} [2013-01] |
G10L 2019/0002 | . . | {Codebook adaptations} [2013-01] |
G10L 2019/0003 | . . | {Backward prediction of gain} [2013-01] |
G10L 2019/0004 | . . | {Design or structure of the codebook} [2013-01] |
G10L 2019/0005 | . . . | {Multi-stage vector quantisation} [2013-01] |
G10L 2019/0006 | . . . | {Tree or trellis structures; Delayed decisions} [2013-01] |
G10L 2019/0007 | . . | {Codebook element generation} [2013-01] |
G10L 2019/0008 | . . . | {Algebraic codebooks} [2013-01] |
G10L 2019/0009 | . . . | {Orthogonal codebooks} [2013-01] |
G10L 2019/001 | . . . | {Interpolation of codebook vectors} [2013-01] |
G10L 2019/0011 | . . | {Long term prediction filters, i.e. pitch estimation} [2013-01] |
G10L 2019/0012 | . . | {Smoothing of parameters of the decoder interpolation} [2013-01] |
G10L 2019/0013 | . . | {Codebook search algorithms} [2013-01] |
G10L 2019/0014 | . . . | {Selection criteria for distances} [2013-01] |
G10L 2019/0015 | . . . | {Viterbi algorithms} [2013-01] |
G10L 2019/0016 | . . | {Codebook for LPC parameters} [2013-01] |
G10L 19/0017 | . | {Lossless audio signal coding; Perfect reconstruction of coded audio signal by transmission of coding error (G10L 19/24 takes precedence)} [2013-01] |
G10L 19/0018 | . | {Speech coding using phonetic or linguistical decoding of the source; Reconstruction using text-to-speech synthesis} [2013-01] |
G10L 19/002 | . | {Dynamic bit allocation aspects} [2013-01] |
G10L 19/005 | . | Correction of errors induced by the transmission channel, if related to the coding algorithm [2013-01] |
G10L 19/008 | . | Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing [2020-08] |
G10L 19/012 | . | Comfort noise or silence coding [2013-01] |
G10L 19/018 | . | Audio watermarking, i.e. embedding inaudible data in the audio signal [2013-01] |
G10L 19/02 | . | using spectral analysis, e.g. transform vocoders or subband vocoders [2013-01] |
G10L 19/0204 | . . | {using subband decomposition} [2013-01] |
G10L 19/0208 | . . . | {Subband vocoders} [2013-01] |
G10L 19/0212 | . . | {using orthogonal transformation} [2013-01] |
G10L 19/0216 | . . . | {using wavelet decomposition} [2013-01] |
G10L 19/022 | . . | Blocking, i.e. grouping of samples in time; Choice of analysis windows; Overlap factoring [2013-01] |
G10L 19/025 | . . . | Detection of transients or attacks for time/frequency resolution switching [2013-01] |
G10L 19/028 | . . | Noise substitution, i.e. substituting non-tonal spectral components by noisy source (comfort noise for discontinuous speech transmission G10L 19/012) [2013-01] |
G10L 19/03 | . . | Spectral prediction for preventing pre-echo; Temporal noise shaping [TNS], e.g. in MPEG-2 or MPEG-4 [2013-01] |
G10L 19/032 | . . | Quantisation or dequantisation of spectral components [2013-01] |
G10L 19/035 | . . . | Scalar quantisation [2013-01] |
G10L 19/038 | . . . | Vector quantisation, e.g. TwinVQ audio [2013-01] |
G10L 19/04 | . | using predictive techniques [2013-01] |
G10L 19/06 | . . | Determination or coding of the spectral characteristics, e.g. of the short-term prediction coefficients [2013-01] |
G10L 19/07 | . . . | Line spectrum pair [LSP] vocoders [2013-01] |
G10L 19/08 | . . | Determination or coding of the excitation function; Determination or coding of the long-term prediction parameters [2013-01] |
G10L 19/083 | . . . |
G10L 19/087 | . . . | using mixed excitation models, e.g. MELP, MBE, split band LPC or HVXC [2013-01] |
G10L 19/09 | . . . | Long term prediction, i.e. removing periodical redundancies, e.g. by using adaptive codebook or pitch predictor [2013-01] |
G10L 19/093 | . . . | using sinusoidal excitation models [2013-01] |
G10L 19/097 | . . . | using prototype waveform decomposition or prototype waveform interpolation [PWI] coders [2013-01] |
G10L 19/10 | . . . | the excitation function being a multipulse excitation [2013-01] |
G10L 19/107 | . . . . | Sparse pulse excitation, e.g. by using algebraic codebook [2013-01] |
G10L 19/113 | . . . . | Regular pulse excitation [2013-01] |
G10L 19/12 | . . . | the excitation function being a code excitation, e.g. in code excited linear prediction [CELP] vocoders [2013-01] |
G10L 19/125 | . . . . | Pitch excitation, e.g. pitch synchronous innovation CELP [PSI-CELP] [2013-01] |
G10L 19/13 | . . . . | Residual excited linear prediction [RELP] [2013-01] |
G10L 19/135 | . . . . | Vector sum excited linear prediction [VSELP] [2013-01] |
G10L 19/16 | . . | Vocoder architecture [2013-01] |
G10L 19/167 | . . . | {Audio streaming, i.e. formatting and decoding of an encoded audio signal representation into a data stream for transmission or storage purposes} [2013-01] |
G10L 19/173 | . . . | {Transcoding, i.e. converting between two coded representations avoiding cascaded coding-decoding} [2013-01] |
G10L 19/18 | . . . | Vocoders using multiple modes [2013-01] |
G10L 19/20 | . . . . | using sound class specific coding, hybrid encoders or object based coding [2013-01] |
G10L 19/22 | . . . . | Mode decision, i.e. based on audio signal content versus external parameters [2013-01] |
G10L 19/24 | . . . . | Variable rate codecs, e.g. for generating different qualities using a scalable representation such as hierarchical encoding or layered encoding [2013-01] |
G10L 19/26 | . . | Pre-filtering or post-filtering [2013-01] |
G10L 19/265 | . . . | {Pre-filtering, e.g. high frequency emphasis prior to encoding} [2013-01] |
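G10L 19/06 covers coding of the short-term prediction coefficients, classically obtained by Levinson-Durbin recursion on the frame autocorrelation. A minimal sketch follows; the frame, predictor order, and windowing are illustrative assumptions.

```python
import numpy as np

def lpc(x, order):
    """Short-term (LPC) prediction coefficients via Levinson-Durbin on the
    autocorrelation sequence, the analysis coded under G10L 19/06 (sketch)."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                                   # prediction error energy
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1 : 0 : -1])
        k = -acc / err                           # reflection coefficient
        a[1 : i + 1] = a[1 : i + 1] + k * a[i - 1 :: -1][:i]
        err *= 1.0 - k * k
    return a, err

# Toy usage: 10th-order predictor for a voiced-like 30 ms frame at 8 kHz.
t = np.arange(240) / 8000
frame = np.sin(2 * np.pi * 150 * t) + 0.5 * np.sin(2 * np.pi * 300 * t)
a, e = lpc(frame * np.hamming(240), order=10)    # residual energy e drives excitation coding
```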
G10L 21/00 | Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility (G10L 19/00 takes precedence) [2024-01] |
G10L 21/003 | . | Changing voice quality, e.g. pitch or formants [2013-01] |
G10L 21/007 | . . | characterised by the process used [2013-01] |
G10L 21/01 | . . . | Correction of time axis [2013-01] |
G10L 21/013 | . . . | Adapting to target pitch [2013-01] |
G10L 2021/0135 | . . . . | {Voice conversion or morphing} [2013-01] |
G10L 21/02 | . | Speech enhancement, e.g. noise reduction or echo cancellation [2013-01] |
G10L 21/0208 | . . | Noise filtering [2013-01] |
G10L 2021/02082 | . . . | {the noise being echo or reverberation of the speech} [2013-01] |
G10L 2021/02085 | . . . | {Periodic noise} [2013-01] |
G10L 2021/02087 | . . . | {the noise being separate speech, e.g. cocktail party} [2013-01] |
G10L 21/0216 | . . . | characterised by the method used for estimating noise [2013-01] |
G10L 2021/02161 | . . . . | {Number of inputs available containing the signal or the noise to be suppressed} [2013-01] |
G10L 2021/02163 | . . . . . | {Only one microphone} [2013-01] |
G10L 2021/02165 | . . . . . | {Two microphones, one receiving mainly the noise signal and the other one mainly the speech signal} [2013-01] |
G10L 2021/02166 | . . . . . | {Microphone arrays; Beamforming} [2013-01] |
G10L 2021/02168 | . . . . | {the estimation exclusively taking place during speech pauses} [2013-01] |
G10L 21/0224 | . . . . | Processing in the time domain [2013-01] |
G10L 21/0232 | . . . . | Processing in the frequency domain [2013-01] |
G10L 21/0264 | . . . | characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques [2013-01] |
G10L 21/0272 | . . | Voice signal separating [2013-01] |
G10L 21/028 | . . . | using properties of sound source [2013-01] |
G10L 21/0308 | . . . | characterised by the type of parameter measurement, e.g. correlation techniques, zero crossing techniques or predictive techniques [2013-01] |
G10L 21/0316 | . . | by changing the amplitude [2021-08] |
G10L 21/0324 | . . . | Details of processing therefor [2013-01] |
G10L 21/0332 | . . . . | involving modification of waveforms [2013-01] |
G10L 21/034 | . . . . | Automatic adjustment [2013-01] |
G10L 21/0356 | . . . | for synchronising with other signals, e.g. video signals [2013-01] |
G10L 21/0364 | . . . | for improving intelligibility [2021-08] |
G10L 2021/03643 | . . . . | {Diver speech} [2021-08] |
G10L 2021/03646 | . . . . | {Stress or Lombard effect} [2021-08] |
G10L 21/038 | . . | using band spreading techniques [2013-01] |
G10L 21/0388 | . . . | Details of processing therefor [2013-01] |
G10L 21/04 | . | Time compression or expansion [2013-01] |
G10L 21/043 | . . | by changing speed [2013-01] |
G10L 21/045 | . . . | using thinning out or insertion of a waveform [2013-01] |
G10L 21/047 | . . . . | characterised by the type of waveform to be thinned out or inserted [2013-01] |
G10L 21/049 | . . . . | characterised by the interconnection of waveforms [2013-01] |
G10L 21/055 | . . | for synchronising with other signals, e.g. video signals [2013-01] |
G10L 21/057 | . . | for improving intelligibility [2013-01] |
G10L 2021/0575 | . . . | {Aids for the handicapped in speaking} [2013-01] |
G10L 21/06 | . | Transformation of speech into a non-audible representation, e.g. speech visualisation or speech processing for tactile aids (G10L 15/26 takes precedence) [2013-01] |
G10L 2021/065 | . . | {Aids for the handicapped in understanding} [2013-01] |
G10L 21/10 | . . | Transforming into visible information [2017-08] |
G10L 2021/105 | . . . | {Synthesis of the lips movements from speech, e.g. for talking heads} [2013-01] |
G10L 21/12 | . . . | by displaying time domain information [2013-01] |
G10L 21/14 | . . . | by displaying frequency domain information [2013-01] |
G10L 21/16 | . . | Transforming into a non-visible representation (devices or methods enabling ear patients to replace direct auditory perception by another kind of perception A61F 11/04) [2017-08] |
G10L 21/18 | . . | Details of the transformation process [2013-01] |
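Under G10L 21/02, the entries G10L 21/0208 and G10L 21/0232 cover noise filtering performed in the frequency domain, with G10L 2021/02168 indexing noise estimates taken during speech pauses. The sketch below is plain spectral subtraction under those assumptions; the signal lengths, spectral floor, and "pause" frame are illustrative.

```python
import numpy as np

def spectral_subtraction(frame, noise_mag, floor=0.05):
    """Frequency-domain noise filtering (G10L 21/0208, G10L 21/0232):
    subtract an estimated noise magnitude spectrum, keep the noisy phase."""
    spec = np.fft.rfft(frame)
    mag = np.maximum(np.abs(spec) - noise_mag, floor * noise_mag)  # spectral floor
    return np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=len(frame))

# Toy usage: estimate the noise spectrum from a leading speech pause
# (single-microphone, pause-based estimation, cf. G10L 2021/02168).
rng = np.random.default_rng(0)
noise = 0.1 * rng.standard_normal(512)
speech = np.sin(2 * np.pi * 200 * np.arange(512) / 8000)
noise_mag = np.abs(np.fft.rfft(noise))        # "pause" frame contains noise only
cleaned = spectral_subtraction(speech + noise, noise_mag)
```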
G10L 25/00 | Speech or voice analysis techniques not restricted to a single one of groups G10L 15/00 - G10L 21/00 (muting semiconductor-based amplifiers when some special characteristics of a signal are sensed by a speech detector, e.g. sensing when no signal is present, H03G 3/34) [2020-08] |
G10L 25/03 | . | characterised by the type of extracted parameters [2013-01] |
G10L 25/06 | . . | the extracted parameters being correlation coefficients [2013-01] |
G10L 25/09 | . . | the extracted parameters being zero crossing rates [2013-01] |
G10L 25/12 | . . | the extracted parameters being prediction coefficients [2013-01] |
G10L 25/15 | . . | the extracted parameters being formant information [2013-01] |
G10L 25/18 | . . | the extracted parameters being spectral information of each sub-band [2013-01] |
G10L 25/21 | . . | the extracted parameters being power information [2013-01] |
G10L 25/24 | . . | the extracted parameters being the cepstrum [2013-01] |
G10L 25/27 | . | characterised by the analysis technique [2013-01] |
G10L 25/30 | . . | using neural networks [2013-01] |
G10L 25/33 | . . | using fuzzy logic [2013-01] |
G10L 25/36 | . . | using chaos theory [2013-01] |
G10L 25/39 | . . | using genetic algorithms [2013-01] |
G10L 25/45 | . | characterised by the type of analysis window [2013-01] |
G10L 25/48 | . | specially adapted for particular use [2013-01] |
G10L 25/51 | . . | for comparison or discrimination [2013-01] |
G10L 25/54 | . . . | for retrieval [2013-01] |
G10L 25/57 | . . . | for processing of video signals [2013-01] |
G10L 25/60 | . . . | for measuring the quality of voice signals [2013-01] |
G10L 25/63 | . . . | for estimating an emotional state [2013-01] |
G10L 25/66 | . . . | for extracting parameters related to health condition (detecting or measuring for diagnostic purposes A61B 5/00) [2013-01] |
G10L 25/69 | . . | for evaluating synthetic or decoded voice signals [2013-01] |
G10L 25/72 | . . | for transmitting results of analysis [2013-01] |
G10L 25/75 | . | for modelling vocal tract parameters [2013-01] |
G10L 25/78 | . | Detection of presence or absence of voice signals (switching of direction of transmission by voice frequency in two-way loud-speaking telephone systems H04M 9/10) [2013-01] |
G10L 2025/783 | . . | {based on threshold decision} [2013-01] |
G10L 2025/786 | . . . | {Adaptive threshold} [2013-01] |
G10L 25/81 | . . | for discriminating voice from music [2013-01] |
G10L 25/84 | . . | for discriminating voice from noise [2013-01] |
G10L 25/87 | . . | Detection of discrete points within a voice signal [2013-01] |
G10L 25/90 | . | Pitch determination of speech signals [2013-01] |
G10L 2025/903 | . . | {using a laryngograph} [2013-01] |
G10L 2025/906 | . . | {Pitch tracking} [2013-01] |
G10L 25/93 | . | Discriminating between voiced and unvoiced parts of speech signals (G10L 25/90 takes precedence) [2013-01] |
G10L 2025/932 | . . | {Decision in previous or following frames} [2013-01] |
G10L 2025/935 | . . | {Mixed voiced class; Transitions} [2013-01] |
G10L 2025/937 | . . | {Signal energy in various frequency bands} [2013-01] |
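The G10L 25/00 tree names the classic extracted parameters directly: zero-crossing rate (G10L 25/09), power (G10L 25/21), and threshold-based detection of voice presence (G10L 25/78, G10L 2025/783). A minimal sketch combining them follows; the thresholds and toy signals are illustrative assumptions.

```python
import numpy as np

def frame_features(frame):
    """Short-time power (G10L 25/21) and zero-crossing rate (G10L 25/09)."""
    power = float(np.mean(frame ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)   # crossings per sample
    return power, zcr

def is_voice(frame, power_thr=1e-3, zcr_thr=0.25):
    """Threshold decision on voice presence in the spirit of G10L 25/78 and
    G10L 2025/783: loud enough and not noise-like counts as voice."""
    power, zcr = frame_features(frame)
    return power > power_thr and zcr < zcr_thr

# Toy usage: a 200 Hz tone passes, low-level white noise does not.
sr = 8000
tone = 0.3 * np.sin(2 * np.pi * 200 * np.arange(240) / sr)
noise = 0.01 * np.random.default_rng(1).standard_normal(240)
print(is_voice(tone), is_voice(noise))   # True False
```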
G10L 99/00 | Subject matter not provided for in other groups of this subclass [2013-01] |