Project Development: April 2010

## Thursday, April 22, 2010

### Speech Processing, Proj05

//Purdue Cal - ECE 595C.
//Spring 2010.
//Project 05.
//All code is in Matlab.

1. Computing the Mel-Frequency Cepstral Coefficients (MFCCs).
1.1 Apply overlapping triangular windows to the spectrum.
1.2 Calculate the energy in each filter band (16 in total).
1.3 Set all values to zero, except the energy values at these 16 frequencies.
1.4 Apply the Discrete Cosine Transform.
Plot the first feature across all the frames of one utterance (its trajectory).
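
The steps above can be sketched for a single frame as follows. This is a minimal Python sketch, not the original Matlab code. The helper names (`hz_to_mel`, `mel_filterbank`, `dct2`, `mfcc_frame`), the mel formula `2595 * log10(1 + f/700)`, and the log applied before the DCT are all assumptions; the log is conventional for MFCCs but is not stated in the steps. The sketch also takes the DCT of the 16 band values directly instead of a zero-padded spectrum as in step 1.3.

```python
import math

def hz_to_mel(f_hz):
    # One common mel formula (an assumption; the post does not say which was used).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, fs):
    """Overlapping triangular filters over FFT bins 0..n_fft//2."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(fs / 2.0)
    # n_filters + 2 edge points, equally spaced on the mel scale.
    edges_hz = [mel_to_hz(mel_max * i / (n_filters + 1))
                for i in range(n_filters + 2)]
    bank = []
    for k in range(1, n_filters + 1):
        lo, mid, hi = edges_hz[k - 1], edges_hz[k], edges_hz[k + 1]
        weights = []
        for b in range(n_bins):
            f = b * fs / n_fft
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))   # falling edge
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank

def dct2(x):
    # Unnormalized DCT-II.
    n = len(x)
    return [sum(x[j] * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n))
            for k in range(n)]

def mfcc_frame(power_spectrum, bank, eps=1e-12):
    # Step 1.2: energy per band; then log (assumed) and DCT (step 1.4).
    energies = [sum(w * p for w, p in zip(weights, power_spectrum))
                for weights in bank]
    return dct2([math.log(e + eps) for e in energies])
```

Given a list `frames` of per-frame power spectra (a hypothetical variable), the trajectory of one coefficient would be `[mfcc_frame(ps, bank)[0] for ps in frames]`.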

2. Dynamic Time Warping (DTW) with simple Euclidean and cosine distances.
I did not add any constraints to the DTW.
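
A minimal sketch of unconstrained DTW over two sequences of feature vectors, in Python and not the original Matlab. The function names and the three-way step pattern (insert, delete, match) are assumptions, as is returning 1.0 for the cosine distance of a zero vector.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 1.0  # convention chosen here for zero vectors (an assumption)
    return 1.0 - dot / (na * nb)

def dtw(seq_a, seq_b, dist=euclidean):
    """Accumulated cost of the best alignment, with no band or slope constraints."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three neighbouring partial paths.
            cost[i][j] = d + min(cost[i - 1][j],      # stretch seq_b's frame
                                 cost[i][j - 1],      # stretch seq_a's frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

Passing `dist=cosine_distance` switches the local distance without changing the warping itself.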

3. Computing features from the magnitudes at equally spaced frequencies.

Using equally spaced frequencies is certainly no better than using mel-scale frequencies. This is why we use MFCCs instead: they better approximate the human auditory system.
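
To make the difference concrete, the sketch below compares where 16 band centers fall under the two spacings. It is a Python illustration, not the original Matlab code. The mel formula and the helper name `center_frequencies` are assumptions.

```python
import math

def hz_to_mel(f_hz):
    # One common mel formula (an assumption; the post does not say which was used).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def center_frequencies(n_bands, f_max, scale="mel"):
    """Center frequencies in Hz, equally spaced in Hz or on the mel scale."""
    if scale == "linear":
        return [f_max * k / (n_bands + 1) for k in range(1, n_bands + 1)]
    mel_max = hz_to_mel(f_max)
    return [mel_to_hz(mel_max * k / (n_bands + 1)) for k in range(1, n_bands + 1)]

linear = center_frequencies(16, 4000.0, scale="linear")
mel = center_frequencies(16, 4000.0, scale="mel")
# Count how many band centers land below 1 kHz under each spacing.
low_linear = sum(1 for f in linear if f < 1000.0)
low_mel = sum(1 for f in mel if f < 1000.0)
```

With mel spacing, more of the 16 bands sit at low frequencies, where the ear resolves frequency more finely.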

## Wednesday, April 14, 2010

### Speech Processing, Proj04

//Purdue Cal - ECE 595C.
//Spring 2010.
//Project 04.
//All code is in Matlab.

1. Get the real cepstrum of the windowed speech.
2. Separate the vocal tract part from the glottal part by using a low-time lifter.
3. Take the DFT of h[n], the low-quefrency part kept by the lifter.
4. Get H(w) by applying exp (the inverse of the 'log' operation).
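
The four steps can be sketched as below, in Python and not the original Matlab. The Hamming window, the symmetric form of the lifter, and the function names are assumptions, and the naive O(N²) DFT is only there to keep the example self-contained.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def hamming(frame):
    # Window choice is an assumption; the post only says "windowed speech".
    n = len(frame)
    return [frame[t] * (0.54 - 0.46 * math.cos(2 * math.pi * t / (n - 1)))
            for t in range(n)]

def real_cepstrum(frame, eps=1e-12):
    # Step 1: inverse DFT of the log magnitude spectrum.
    log_mag = [math.log(abs(v) + eps) for v in dft(hamming(frame))]
    return [v.real for v in idft(log_mag)]

def low_time_lifter(cepstrum, cutoff):
    # Step 2: keep quefrencies below `cutoff` and their mirror images, zero the rest.
    n = len(cepstrum)
    return [c if (t < cutoff or t > n - cutoff) else 0.0
            for t, c in enumerate(cepstrum)]

def vocal_tract_magnitude(frame, cutoff):
    h = low_time_lifter(real_cepstrum(frame), cutoff)
    log_H = [v.real for v in dft(h)]      # step 3: back to log spectrum
    return [math.exp(v) for v in log_H]   # step 4: undo the log
```

The `cutoff` should sit below the quefrency of the first excitation peak, so that the pitch information stays out of the smoothed envelope.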

Hint:
1. How to read the pitch frequency: measure the spacing between two adjacent excitation peaks; the pitch frequency is the reciprocal of that spacing.
2. How to read F1: the formants (F1, F2, ...) can be read directly from the peaks of the result of step 4 above.
3. The pitch frequency is unrelated to H(w) of the vocal tract (VT).
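
Both readings can be sketched in a few lines of Python. The function names, the sample-index inputs, and the simple local-maximum rule for formant candidates are assumptions made for illustration.

```python
def pitch_from_peaks(peak_a, peak_b, fs):
    """Pitch in Hz from the sample indices of two adjacent excitation peaks."""
    period_samples = abs(peak_b - peak_a)
    if period_samples == 0:
        raise ValueError("peaks must be at different positions")
    return fs / period_samples

def formant_candidates(magnitude, fs):
    """Frequencies (Hz) of local maxima in the first half of a full-length |H(w)|."""
    n = len(magnitude)
    peaks = [k for k in range(1, n // 2)
             if magnitude[k - 1] < magnitude[k] >= magnitude[k + 1]]
    return [k * fs / n for k in peaks]
```

For example, peaks 80 samples apart at an 8 kHz sampling rate correspond to a pitch of 100 Hz.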