Project Development: April 2010

Thursday, April 22, 2010

Speech Processing, Proj05

//Purdue Cal - ECE 595C.
//Spring 2010.
//Project 05.
//Copyright @ 2010 antonio081014  ;
//All code is in Matlab.

Report Link:

1. Computing the Mel-Frequency Cepstral Coefficients (MFCCs).
    1.1 Use overlapping triangular windows.
    1.2 Calculate the energy in each filter band (16 in total).
    1.3 Set all values to zero, except the energy values at these 16 frequencies.
    1.4 Apply the Discrete Cosine Transform.
   Plot the trajectory of the first feature across all frames of one utterance.
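
The steps above can be sketched roughly as follows. This is not the project's Matlab code (which is not shown here) but an independent Python illustration using only the standard library. Several things are assumptions on my part: the mel formula 2595*log10(1 + f/700), the log taken before the DCT (standard MFCC practice, though not listed in the steps), taking the DCT of the 16 band values directly instead of a zero-filled full-length vector as in step 1.3, and all function names.

```python
import math

def hz_to_mel(f):
    # One common mel-scale formula (an assumption; the post does not say which was used).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(num_filters, num_bins, sample_rate):
    """Overlapping triangular filters over spectrum bins covering 0..sample_rate/2."""
    top = hz_to_mel(sample_rate / 2.0)
    # num_filters + 2 edge points, equally spaced on the mel axis.
    edges_hz = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bin_hz = (sample_rate / 2.0) / (num_bins - 1)
    bank = []
    for b in range(num_filters):
        lo, mid, hi = edges_hz[b], edges_hz[b + 1], edges_hz[b + 2]
        weights = []
        for k in range(num_bins):
            f = k * bin_hz
            if lo < f <= mid:
                weights.append((f - lo) / (mid - lo))   # rising edge
            elif mid < f < hi:
                weights.append((hi - f) / (hi - mid))   # falling edge
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank

def dct2(values):
    """Unnormalized DCT-II."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
            for k in range(n)]

def mfcc_frame(power_spectrum, sample_rate, num_filters=16):
    """MFCC-style features for one frame, given its one-sided power spectrum."""
    bank = mel_filterbank(num_filters, len(power_spectrum), sample_rate)
    energies = [sum(w * p for w, p in zip(weights, power_spectrum)) for weights in bank]
    # Small floor avoids log(0) for empty bands.
    log_energies = [math.log(e + 1e-12) for e in energies]
    return dct2(log_energies)
```

Calling mfcc_frame on every frame of an utterance and collecting the first coefficient of each result gives the trajectory mentioned above.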

2. Dynamic Time Warping (DTW) with simple Euclidean and cosine distances.
    Here, I did not add any constraints to the DTW.
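
For illustration, an unconstrained DTW with the two distance measures might look like the sketch below. It is plain Python rather than the original Matlab, and the function names, the three-way step pattern (vertical, horizontal, diagonal) and the convention of returning 1.0 for a zero-length vector in the cosine distance are my assumptions.

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    # Convention (assumed): distance 1.0 when either vector has zero length.
    return 1.0 - dot / norm if norm > 0 else 1.0

def dtw(seq_a, seq_b, dist=euclidean):
    """Accumulated cost of the best alignment between two feature sequences.

    No slope or window constraints are applied; only the end points are fixed.
    """
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(seq_a[i - 1], seq_b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # seq_a advances alone
                                 cost[i][j - 1],      # seq_b advances alone
                                 cost[i - 1][j - 1])  # both advance
    return cost[n][m]
```

Passing dist=cosine_distance switches the local distance without changing the warping itself.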

3. Getting features from the magnitudes at equally spaced frequencies.

    Using equally spaced frequencies is certainly no better than using Mel-scale frequencies. This is why we use MFCCs instead: they better approximate the human auditory system.
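
To make the difference concrete, the sketch below compares where 16 band centres fall under the two spacings. It is a Python illustration, not the project code; the mel formula, the 4000 Hz upper limit used in the usage note and the function names are assumptions.

```python
import math

def hz_to_mel(f):
    # One common mel-scale formula (assumed here).
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def linear_centers(num_bands, max_hz):
    """Band centres equally spaced in Hz."""
    step = max_hz / (num_bands + 1)
    return [step * (i + 1) for i in range(num_bands)]

def mel_centers(num_bands, max_hz):
    """Band centres equally spaced on the mel axis, converted back to Hz."""
    step = hz_to_mel(max_hz) / (num_bands + 1)
    return [mel_to_hz(step * (i + 1)) for i in range(num_bands)]
```

With 16 bands up to 4000 Hz, the mel-spaced centres crowd into the low frequencies and spread out higher up, which is roughly how the ear resolves frequency; the linear centres give every region the same resolution.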

Wednesday, April 14, 2010

Speech Processing, Proj04

//Purdue Cal - ECE 595C.
//Spring 2010.
//Project 04.
//Copyright @ 2010 antonio081014  ;
//All code is in Matlab.

Report Link:

Task one:
1. Get the real cepstrum of the windowed speech.
2. Separate the vocal tract part from the glottal part using a low-time lifter.
3. Take the DFT of the liftered h[n] (the low-quefrency part).
4. Recover H(w) by applying exp, the inverse of the 'log' operation.
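
The four steps can be sketched as below. This is a small Python illustration, not the original Matlab: it uses a direct O(N^2) DFT to stay dependency-free, it assumes the frame passed in is already windowed, and the function names and the cutoff parameter (the lifter length in samples) are hypothetical.

```python
import cmath
import math

def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def real_cepstrum(frame):
    """Step 1: inverse DFT of the log magnitude spectrum."""
    # Small floor avoids log(0) at spectral nulls.
    log_mag = [math.log(abs(v) + 1e-12) for v in dft(frame)]
    return [c.real for c in idft(log_mag)]

def low_time_lifter(cepstrum, cutoff):
    """Step 2: keep quefrencies below cutoff (and their mirror images), zero the rest."""
    n = len(cepstrum)
    return [c if (q < cutoff or q > n - cutoff) else 0.0
            for q, c in enumerate(cepstrum)]

def vocal_tract_magnitude(frame, cutoff):
    """Steps 3 and 4: DFT of the liftered cepstrum, then exp to undo the log."""
    h = low_time_lifter(real_cepstrum(frame), cutoff)
    return [math.exp(v.real) for v in dft(h)]
```

The mirror-image term in the lifter is there because the real cepstrum of a real frame is symmetric, so the low-quefrency part appears at both ends of the array.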

1. How to read the pitch frequency: use the spacing between two adjacent excitation peaks.
2. How to read F1: the formants (F1, F2, ...) can be read directly from the result of step 4 above.
3. The pitch frequency is unrelated to H(w) of the vocal tract (VT).
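
As a small worked example of item 1, and assuming the two peaks are read as sample indices on the time or quefrency axis (if they are read off a spectrum in Hz instead, their spacing is already the pitch frequency), the conversion is just the sampling rate divided by the spacing. The function name and the numbers in the usage note are made up for illustration.

```python
def pitch_from_peaks(first_peak, second_peak, sample_rate):
    """Pitch frequency in Hz from two adjacent excitation peaks given as sample indices."""
    period_samples = abs(second_peak - first_peak)
    if period_samples == 0:
        raise ValueError("the two peaks must be at different positions")
    return sample_rate / period_samples
```

For instance, peaks 80 samples apart at an 8000 Hz sampling rate correspond to a pitch of 100 Hz.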

Task Two:
1. Use the DFT to get the spectrum of H(w).
2. Use LPC to get the envelope of the spectrum of H(w).
3. Use the cepstrum to get the spectrum of H(w).
4. Compare F1, F2, ... and the pitch frequency across the methods.
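
For the LPC step, one possible sketch is the autocorrelation method with the Levinson-Durbin recursion, shown below in Python. The post does not say which LPC variant or order was used, so the method, the function names and the parameters are assumptions; a silent (all-zero) frame is not handled.

```python
import cmath
import math

def autocorrelation(frame, max_lag):
    n = len(frame)
    return [sum(frame[t] * frame[t + lag] for t in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve for A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order.

    Returns the coefficient list a (with a[0] == 1) and the prediction error.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def lpc_envelope(frame, order, num_points):
    """Magnitude of gain / A(e^jw) at num_points frequencies from 0 up to (not including) pi."""
    r = autocorrelation(frame, order)
    a, err = levinson_durbin(r, order)
    gain = math.sqrt(err)
    envelope = []
    for p in range(num_points):
        w = math.pi * p / num_points
        denom = sum(a[k] * cmath.exp(-1j * w * k) for k in range(order + 1))
        envelope.append(gain / abs(denom))
    return envelope
```

The peaks of this envelope are the formant candidates to compare against those read from the DFT and cepstrum results. Unlike those two, the LPC envelope carries no pitch harmonics, so the pitch frequency has to come from the other methods.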

Task Three:
1. Use an unvoiced frame (a fricative) to compute the cepstrum.
2. Use a voiced frame to compute the cepstrum.
3. Check how F1, F2, ... differ between these two frames.