Project Development: Speech Processing, Proj02

Sunday, March 21, 2010

//Purdue Cal - ECE 595C.
//Spring 2010.
//Project 02.
//Copyright @ 2010 antonio081014
//All code is in Matlab.


Report Link:


Task 1: Obtain the estimated F0 using the log harmonic product spectrum with K = 4 and K = 10.

PS: The log of the harmonic product spectrum equals the sum of the log harmonic spectra, so the product turns into a summation.
1. Downsample the log spectrum by K = 4 and by K = 10 to get the compressed spectrum for each case.
2. Sum them together with the original log spectrum.
3. Find the highest peak in the summed spectrum. Its frequency is the estimated F0.
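
The steps above can be sketched on a synthetic frame as follows. This is a minimal sketch, not the project code: it uses the conventional form of the log harmonic product spectrum, which sums the log spectrum compressed by every factor k = 1..K (the code in this post adds only the factor-4 and factor-10 versions to the original). The test signal, fs = 16000 Hz, and all variable names are assumptions made for illustration.

```matlab
% Log harmonic product spectrum on a synthetic harmonic frame (assumed data).
fs = 16000;                   % assumed sampling rate in Hz
N  = 1024;                    % FFT size, as in the project code
L  = 320;                     % frame length, as in the project code
f0 = 250;                     % true pitch of the synthetic frame in Hz
K  = 4;                       % number of compressed spectra combined

n = (0:L-1)';
frame = zeros(L, 1);
for h = 1:10                  % ten harmonics with 1/h amplitudes
    frame = frame + (1/h) * cos(2*pi*h*f0*n/fs);
end
w = 0.54 - 0.46*cos(2*pi*n/(L-1));       % Hamming window, written out by hand
logSpec = log(abs(fft(frame .* w, N)) + eps);
logSpec = logSpec(1:N/2);

% Sum the log spectrum compressed by k = 1..K. Only the first M bins
% exist in every compressed copy.
M = floor(N/2 / K);
hps = zeros(M, 1);
for k = 1:K
    compressed = logSpec(1:k:end);       % keeps bins 0, k, 2k, ... (0-based)
    hps = hps + compressed(1:M);
end

% Skip the lowest bins (below about 50 Hz) so the DC region is ignored.
minBin = ceil(50 * N / fs) + 1;
[~, rel] = max(hps(minBin:end));
peakIdx = rel + minBin - 1;              % 1-based index into hps
f0Est = (peakIdx - 1) * fs / N;          % convert bin index to Hz
fprintf('Estimated F0: %.1f Hz\n', f0Est);
```

Note the index shift in the last step: Matlab index 1 is 0 Hz, so a peak at index p corresponds to (p - 1) * fs / N Hz.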

Code:
frameIdx = 15;                                     % The 15th frame in the data.
fdat = fft(dat(:,frameIdx).*hamming(320), 1024);
tmp = log(abs(fdat(1:512)));
figure;
subplot(4,1,1);
plot(tmp);
title(['The frequency plot of the frame #:', num2str(frameIdx)]);
xlabel('Freq index');
ylabel('Log-Magnitude');
hold on;
tmp1 = downsample(tmp, 4);
z = zeros(512 - length(tmp1),1);
tmp1 = [tmp1; z];
tmp2 = downsample(tmp, 10);
z = zeros(512 - length(tmp2),1);
tmp2 = [tmp2; z];

subplot(4,1,2);
plot(tmp1, 'r');
xlabel('Freq index, K=4');
ylabel('Log-Magnitude');

subplot(4,1,3);
plot(tmp2, 'g');
xlabel('Freq index, K=10');
ylabel('Log-Magnitude');

product = tmp + tmp1 + tmp2;
subplot(4,1,4);
plot(product, 'r');
xlabel('Freq index');
ylabel('Log-Magnitude');
[x, y] = ginput(3);
% x = [11.79; 57.60; 93.18]; These are the indices in the frequency domain.

Task 2: Using the overlap-add method, reconstruct the given utterance after filtering each frame with a bandpass filter from 50 Hz to 2000 Hz.

1. Filter each frame.
2. Reconstruct the utterance using the overlap-add method.
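
The two steps above can be tried end to end on a synthetic signal. The following is a minimal sketch under stated assumptions: fs = 16000 Hz, a test input made of one in-band tone (1000 Hz) and one out-of-band tone (4000 Hz), and hypothetical variable names. It is not the project data or the exact project code.

```matlab
% Frame-wise FFT bandpass filtering followed by overlap-add (synthetic input).
fs  = 16000;                  % assumed sampling rate in Hz
N   = 1024;                   % FFT size
L   = 320;                    % frame length
hop = 160;                    % frame shift (50% overlap)
nFrames = 99;
total = (nFrames - 1) * hop + L;         % samples covered by all frames

t = (0:total-1)' / fs;
sig = cos(2*pi*1000*t) + cos(2*pi*4000*t);   % in-band plus out-of-band tone

n = (0:L-1)';
w = 0.54 - 0.46*cos(2*pi*n/(L-1));       % Hamming window, written out by hand

% Build a conjugate-symmetric pass mask so the inverse FFT stays real.
kLo = ceil(50 * N / fs);                 % lowest kept bin (0-based)
kHi = floor(2000 * N / fs);              % highest kept bin (0-based)
mask = zeros(N, 1);
mask(kLo+1 : kHi+1) = 1;                 % positive frequencies
mask(N-kHi+1 : N-kLo+1) = 1;             % mirrored negative frequencies

out = zeros(total, 1);
for i = 1:nFrames
    idx = (i-1)*hop + (1:L)';
    spec = fft(sig(idx) .* w, N) .* mask;
    y = real(ifft(spec));                % full N-point inverse transform
    out(idx) = out(idx) + y(1:L);        % keep the first L samples, then add
end

% Compare how much of each tone survives, away from the signal edges.
mid = (L+1 : total-L)';
a1 = abs(sum(out(mid) .* exp(-1j*2*pi*1000*t(mid))));
a4 = abs(sum(out(mid) .* exp(-1j*2*pi*4000*t(mid))));
fprintf('1000 Hz / 4000 Hz magnitude ratio: %.1f\n', a1 / a4);
```

Keeping only the first L samples of the inverse transform discards the part of the filter response that rings past the frame, so this is an approximation of true linear filtering rather than an exact one.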

Code:
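% Filter:
fs = 16000;                                    % Sampling rate in Hz (same value as used for the spectrograms).
N = 1024;                                      % FFT size.
recdat = zeros(320, 100);
for i=1:100                                    % There are 100 frames here.
    fmdat = dat(:,i);
    ffmdat = fft(fmdat.*hamming(320), N);      % Take the Fourier transform using 1024 points.
    LF = ceil(N / fs * 50);                    % Lowest kept bin (0-based), at or above 50 Hz.
    HF = floor(N / fs * 2000);                 % Highest kept bin (0-based), at or below 2000 Hz.
    ffmdat(1:LF) = 0;                          % Below 50 Hz.
    ffmdat(HF+2:N-HF) = 0;                     % Above 2000 Hz, positive and negative frequencies.
    ffmdat(N-LF+2:N) = 0;                      % Mirror image of the region below 50 Hz.
    tmpfm = real(ifft(ffmdat));                % Full 1024-point inverse; ifft(ffmdat,320) would truncate the spectrum instead.
    recdat(:,i) = tmpfm(1:320);                % Keep the first 320 samples of the frame.
    clear ffmdat;
end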
% Reconstruct
update_sp = zeros(1,16000);
start = 1;
stop = start + 320 - 1;
for i=1:100
    update_sp(start:stop) = update_sp(start:stop) + recdat(:,i)';    % overlap-add;
    start = start + 160;
    stop = start + 320 - 1;
    if stop > 16000
        break
    end
end

figure;
plot(update_sp);
title('The filtered speech.');
% soundsc(update_sp);
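
Because every frame was multiplied by a Hamming window before filtering, the overlap-add sum carries the gain of the overlapped windows. The sketch below checks that gain for the frame length and shift used above. It assumes nothing about the speech data, and its variable names are hypothetical.

```matlab
% Sum of 50%-overlapped Hamming windows, as seen by an overlap-add loop.
L   = 320;                    % frame length, as in the project code
hop = 160;                    % frame shift, as in the project code
nFrames = 10;                 % a few frames are enough to see the pattern

n = (0:L-1)';
w = 0.54 - 0.46*cos(2*pi*n/(L-1));       % Hamming window, written out by hand

total = (nFrames - 1) * hop + L;
wsum = zeros(total, 1);
for i = 1:nFrames
    idx = (i-1)*hop + (1:L)';
    wsum(idx) = wsum(idx) + w;           % accumulate the shifted windows
end

% Away from the first and last half frame, two windows overlap everywhere.
inner = wsum(hop+1 : total-hop);
fprintf('window sum: min %.4f, max %.4f\n', min(inner), max(inner));
```

The sum stays close to 2 * 0.54 = 1.08 in the interior, so the reconstructed utterance is scaled by roughly that factor. Dividing by it would restore the original level, which matters only if absolute amplitude is important.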

fm = recdat(:,23);
ffm = fft(fm.*hamming(320), 1024);
figure;
plot(abs(ffm(1:512)));
title(['The filtered frame #:', num2str(23)]);
[xx, yy] = ginput(3);
% xx = [35.2534562211982; 72.5806451612903; 107.142857142857];
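
The clicked values are 1-based positions along the 1024-point FFT axis, not frequencies. A short conversion to Hz, assuming fs = 16000 Hz (the value used for the spectrograms) and a hypothetical variable name freqHz, could look like this:

```matlab
% Convert 1-based FFT bin indices (as read off a plot) to frequencies in Hz.
fs = 16000;                   % assumed sampling rate in Hz
N  = 1024;                    % FFT size used for the plotted spectrum
xx = [35.2534562211982; 72.5806451612903; 107.142857142857];  % clicked indices

freqHz = (xx - 1) * fs / N;   % Matlab index 1 corresponds to 0 Hz
fprintf('%.1f Hz\n', freqHz);
```

Each bin is fs / N = 15.625 Hz wide under these assumptions, so fractional indices from ginput map to frequencies between bin centers.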

%%
figure;
x = nn_spch;
window = 128;
noverlap = 64;
nfft = 128;
fs = 16000;
spectrogram(x,window,noverlap,nfft,fs, 'yaxis');
colormap(gray);
title('Power Spectral Density, Original');
xlabel('Time');
ylabel('Frequency');

figure;
x = update_sp;
window = 128;
noverlap = 64;
nfft = 128;
fs = 16000;
spectrogram(x,window,noverlap,nfft,fs, 'yaxis');
colormap(gray);
title('Power Spectral Density, Filtered');
xlabel('Time');
ylabel('Frequency');