Title: SPEECH PROCESSING
Catalog Description: Speech production theory, acoustic tube model, linear prediction model, cepstrum analysis, homomorphic speech processing, vector quantization and speech coding, speech enhancement, text-to-speech synthesis, hidden Markov models and their application to speech recognition.
Coordinator: Murat Saraçlar, Asst. Prof.
Goals: To learn the necessary tools for analyzing digital speech signals. To understand the acoustic model for speech production. To learn the basics of automatic speech recognition.
At the end of this course, students will be able to:
• Analyze a speech signal in terms of its frequency content
• Extract certain acoustic features from a speech signal
• Understand the basics of the human speech production mechanism
• Understand which speech coding methods are used and why
Textbook: Huang, Acero and Hon, Spoken Language Processing, Prentice Hall.
References:
• Benesty, Sondhi, and Huang (Eds.), Springer Handbook of Speech Processing, Springer.
• Quatieri, Discrete-Time Speech Signal Processing, Prentice Hall.
• Rabiner and Schafer, Digital Processing of Speech Signals, Prentice Hall.
• Rabiner and Juang, Fundamentals of Speech Recognition, Prentice Hall.
• Deller, Proakis, and Hansen, Discrete-Time Processing of Speech Signals.
Prerequisite by Topic:
• Sampling Theorem
• Digital Filter Design
• Fourier Analysis
• Wave Theory
• Linear Algebra
• Signals and Systems
Topics:
• Sound and Human Speech Production and Perception (1 week)
• Phonetics and Phonology, Acoustical Transducers (1 week)
• DSP Review (1 week)
• Speech Production Mechanism (1 week)
• Short-Term Processing of Speech (1 week)
• Linear Prediction Analysis (1 week)
• Cepstral Analysis (1 week)
• Perceptually Motivated Representations (1 week)
• Filterbanks and Wavelets (1 week)
• Formants and Pitch (1 week)
• Applications in Speech Coding, Speech Enhancement, Speech Synthesis and Speech Recognition (3 weeks)
The class meets for two lectures a week: one two-hour session and one one-hour session. Each semester project is assigned to one or two students. In the project, students are expected to first prepare a literature survey on the topic and then apply the techniques learned in class.
Computer Usage:
• Students use Wavesurfer or Praat for speech analysis.
• Homework assignments require MATLAB programming.
• The semester project requires MATLAB, C, or C++ programming.
Grading:
• Final Examination 30%
• Homework 30%
• Semester project 40%
Semester Project Topics:
• Pitch contour estimation
• Formant estimation and tracking
• Speech enhancement
• Speaker verification
• LPC vocoder
• Acoustic event classification
• Isolated word recognition
• Keyword spotting
Course Outcomes:
a. an ability to apply knowledge of mathematics, DSP, and computer programming. Acoustics and the wave equation are also used in the course to explain speech production theory.
b. an ability to analyze and interpret speech signals in terms of their frequency characteristics
c. an ability to design a speech coder that meets specific requirements
g. an ability to communicate effectively. In the semester project, students are asked to present their work with a PowerPoint presentation and to write a project report. This gives them a chance to improve both oral and written communication skills.
k. an ability to use the techniques, skills, and modern engineering tools necessary for engineering practice. This is assessed in the semester project. Project topics require either MATLAB programming or C/C++ programming skills.
Prepared By: Levent M. Arslan
Modified By: Murat Saraçlar