PS:
Consider that fa, fb and fc, if we are estimating music, are discrete values.
IIRC 12 values per octave, 8 octaves in all, gives 96 values.
Look on Google for the exact formula that links frequency and notes.
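For reference, the usual equal-temperament relation is f(n) = 440 * 2^((n - 69) / 12), with n the MIDI note number (A4 = 69). Below is a minimal Python sketch of it. The MIDI numbering, the choice of C0 as the start of the 96-note range and the helper names are my assumptions; the clock constant 3579000/2/16 is the `fz` value from the Octave script posted further down in this thread.

```python
# Equal-tempered note frequencies and the matching PSG tone periods.
# Assumptions (not from the thread): A4 = 440 Hz, MIDI note numbers,
# and a 96-note range that starts at C0 (MIDI note 12).
A4_HZ = 440.0
PSG_TONE_CLOCK = 3579000 / 2 / 16   # same constant as "fz" in the Octave script
MAX_PERIOD = 4095                   # the PSG tone period register is 12 bits wide


def note_frequency(midi_note):
    """Frequency in Hz of an equal-tempered note."""
    return A4_HZ * 2.0 ** ((midi_note - 69) / 12.0)


def psg_period(freq_hz):
    """Nearest PSG tone period, or None if it does not fit in 12 bits."""
    period = round(PSG_TONE_CLOCK / freq_hz)
    return period if 1 <= period <= MAX_PERIOD else None


notes = [note_frequency(n) for n in range(12, 12 + 96)]
playable = [f for f in notes if psg_period(f) is not None]
```

With these constants the notes below A0 (27.5 Hz) would need a period above 4095, so slightly fewer than 96 notes are actually reachable on one channel.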
Nice! I wonder how an electric guitar would sound in it. I never saw a decent rendition of a guitar in PSG...
A bit off-topic:
I was experimenting with much simpler tone recognition a long time ago... It was a simple program for the MSX tR that recognized just the frequency you whistled to the tR and played it back on the PSG... I was hoping to add this as a user interface for MoonBlaster, but never got that far... Maybe today it would be easier to patch into the binary, as we have all these wonderful debugging environments in openMSX and blueMSX.
Nice! It could help non-musicians (like me).
Leo, post the MATLAB code (or mail it to me);
it would be nice to get it running.
Anyway, do you maximize abs(fft(data_chunk)).^2 ?
Have you considered minimizing, over fa, fb, fc, Aa, Ab, Ac,
int_on_data_chunk_time( abs( data_chunk-psga(fa,Aa)-psgb(fb,Ab)-psgc(fc,Ac)).^2 )
And, moreover, what about chunks of 1/50 sec?
That is the resolution of a normal PSG player.
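The squared-error idea above can be sketched roughly like this. It is a Python illustration, not the MATLAB/Octave code discussed in the thread; `square_wave`, `chunk_error`, the 11025 Hz sample rate and the 440 Hz test tone are assumptions made up for the example, and every voice starts at phase zero.

```python
import math


def square_wave(freq_hz, amplitude, n_samples, sample_rate):
    """PSG-style square wave: amplitude * sign(sin(2*pi*f*t)).

    Samples where sin() is exactly 0 are treated as positive,
    unlike Octave's sign(), which returns 0 there.
    """
    out = []
    for k in range(n_samples):
        s = math.sin(2.0 * math.pi * freq_hz * k / sample_rate)
        out.append(amplitude if s >= 0.0 else -amplitude)
    return out


def chunk_error(chunk, sample_rate, voices):
    """Sum of squared differences between a chunk and the sum of the voices.

    voices is a list of (freq_hz, amplitude) pairs, one per PSG channel.
    """
    synth = [0.0] * len(chunk)
    for freq_hz, amplitude in voices:
        wave = square_wave(freq_hz, amplitude, len(chunk), sample_rate)
        synth = [a + b for a, b in zip(synth, wave)]
    return sum((x - y) ** 2 for x, y in zip(chunk, synth))


SAMPLE_RATE = 11025
CHUNK = SAMPLE_RATE // 50            # one 1/50 sec chunk = 220 samples
target = square_wave(440.0, 1.0, CHUNK, SAMPLE_RATE)
```

Calling `chunk_error(target, SAMPLE_RATE, [(440.0, 1.0)])` gives zero for the exact match, and any other frequency or amplitude gives a positive error, which is the quantity a search over fa, fb, fc, Aa, Ab, Ac would try to drive down.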
In fact this is a different approach. I coded something close to it in another MATLAB trial: try to approximate the data_chunk with PSG synthesis.
It was very, very slow, because for every data_chunk I had to vary the PSG volume and tone frequency. I built a table of 400 different values; it was still slow (on my Eee PC) and the result was very bad, maybe also the fault of my poorly coded algorithm. I read that pcmenc uses "viterbi"; I wonder if it is faster.
By the way, I am not using MATLAB but "Octave" with the "QtOctave" IDE, which are free and very MATLAB-compatible.
I have tried rates faster than 5/sec, like 50/sec, but the result is awful, because over a given short period of time the strongest signal could perfectly well be a cymbal or another drum rather than the melody.
In fact I think there is an infinite number of solutions, depending on what you want:
1- play a wav as close as possible to the original, but with a CPU load compatible with game coding.
2- retrieve the main theme and re-arrange it by hand, dealing with frequencies and not notes + instruments.
3- recognize the score and notes, so you are able to load them into a tracker or MIDI tool.
I believe 3 is very, very hard; 2 is possible on some carefully chosen songs with some instruments (guitar and other harmonic-rich instruments do not suit it very well; flute is ideal, to give an idea); 1 could also be possible on a few sounds, but my solution might not be the right approach for that: mean square error + a rate of 1/50th sec should be better (on paper).
mail the scripts
I could try to help with them!
Here is the latest code; I have other trials and variations:
%psg
% wav 3 psg
%
clear
a = wavread ('ocarina1b.wav');
Fs = 44100/4;                       % Sampling frequency
a = a * 127;                        % scale amplitude between +/- 127
tseg = round (Fs / 6 + 0.01);       %% define a segment of time of 1/6th sec
tsynth = [1:tseg] / Fs;             % define the sample segment for psg synthesis
integ = zeros(tseg/2,1);            % integration init
mv = zeros(8,1);
mb = zeros(8,1);
id1 = 1;
id2 = tseg/2;                       % only 0-Fs/2 is interesting
n = 1;

% Main loop
for t = [1:tseg:length(a)-tseg]
    ofs = t;
    e = fft (a(ofs:ofs+tseg));      % do fft over little time segment
    f = abs(e(id1:id2));
    id3 = round (0.3 * tseg / 2);
    f(id3:length(f)) = f(id3:length(f)) * 0;
    id3 = round (0.1 * tseg / 2);
    f(1:id3) = f(1:id3) * 0.3;
    integ = integ * (1/4) + f * (3/4);
    melody = integ;
    s = sum(f) / length(f);         % defines kind of threshold
    for i = 1:3
        [mx(i) my(i)] = max(melody);
        if mx(i) > (s*3)
            % trim max & harmonic
            id3 = max (1, my(i)-8);
            id4 = min(my(i)+8,tseg/2);
            melody(id3:id4) = zeros(id4-id3+1,1);
            id3 = max (1, 2*my(i)-8);
            id4 = min(2*my(i)+8,tseg/2);
            melody(id3:id4) = zeros(id4-id3+1,1);
            id3 = round(max (1, my(i)*3-8));
            id4 = round(min(my(i)*3+8,tseg/2));
            melody(id3:id4) = zeros(id4-id3+1,1);
            %export maxs
            mb(i) = (Fs/2) * my(i) / (tseg/2);
            mv(i) = mx(i);
        else
            mv(i) = 0;
            mb(i) = 1;              % to avoid div /0 later on
        end
    end
    % Begin gene synth sine
    a1(n) = mv(1); %* sign(sin (2* pi *tsynth * mb(1) ));
    a2(n) = mv(2); %* sign(sin (2* pi *tsynth * mb(2) ));
    a3(n) = mv(3); % * sign(sin (2* pi *tsynth * mb(3) ));
    t1(n) = mb(1);
    t2(n) = mb(2);
    t3(n) = mb(3);
    n = n+1;
end

ma = max( [max(a1) max(a2) max(a3)]);
n = 1;
a1 = 256*a1 / ma;
a2 = 256*a2 / ma;
a3 = 256*a3 / ma;
la1 = max(round (2*log(a1)/log(2))/2,0.5);
la2 = max(round (2*log(a2)/log(2))/2,0.5);
la3 = max(round (2*log(a3)/log(2))/2,0.5);  % imitate the log scale of PSG volume

% signal re-synthesis in a PSG way
for t = [1:tseg:length(a)-tseg]
    ap1 = 2^la1(n);
    ap2 = 2^la2(n);
    ap3 = 2^la3(n);
    x1 = ap1 * sign(sin (2* pi *tsynth * t1(n) ));
    x2 = ap2 * sign(sin (2* pi *tsynth * t2(n) ));  % square signal is sign(sin(freq))
    x3 = ap3 * sign(sin (2* pi *tsynth * t3(n) ));
    n = n+1;
    zout(t:t+tseg-1) = x1+x2+x3;    % adds the 3x voices
end
rout3 = zout / max (zout);
wavwrite("out.wav",rout3',Fs,8);    % wav sample output

%msxwrite("out.asm",la1,la2,la3,t1,t2,t3);
fid = fopen("out.asm",'w');
n = 1;
ln = 10;
fz = 3579000 / 2 / 16;
for t = [1:length(la3)]
    fprintf(fid," db %d,%d,%d %c%c", ln , 2*la1(n)-1,2*la2(n)-1,2*la3(n)-1,char(10),char(13));
    ln = ln + 10;
    fprintf(fid ," dw %d,%d,%d %c%c", ln , round(fz/t1(n)) , round(fz/t2(n)) , round(fz/t3(n)),char(10),char(13) );
    ln = ln + 10;
    n = n+1;
end
fclose(fid);
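For anyone who finds the volume step hard to read: the `la = max(round(2*log(a)/log(2))/2, 0.5)` lines followed by `2*la-1` map a linear amplitude in 0..256 onto the 16 PSG levels, treating each level as a factor of sqrt(2) apart. Here is a small Python restatement of just that step, as I understand it. The function names, and treating volume 0 as silence in the inverse mapping, are my assumptions.

```python
import math


def psg_volume(amplitude, full_scale=256.0):
    """Map a linear amplitude (0..full_scale) to a PSG volume 0..15."""
    if amplitude <= 0.0:
        return 0                      # log of zero: the script ends up at volume 0
    scaled = 256.0 * amplitude / full_scale
    # Octave's round() rounds halves away from zero; floor(x + 0.5) agrees
    # with it for every value that survives the max() clamp below.
    half_steps = math.floor(2.0 * math.log2(scaled) + 0.5)
    la = max(half_steps / 2.0, 0.5)   # same clamp as in the script
    return int(2 * la - 1)


def psg_amplitude(volume):
    """Relative amplitude of a volume level: 15 -> 1.0, two steps down -> 0.5."""
    return 0.0 if volume == 0 else 2.0 ** ((volume - 15) / 2.0)
```

So full scale lands on volume 15, half of full scale on 13, and anything quieter than about 1/256 of full scale collapses to 0.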
Thanks Leo! I'll look into it ASAP, but the idea seems very promising already.
Consider that a channel of the PSG can generate 96 notes (8 octaves) and that
it could be nice to restrict the optimization to those.
Each channel can have 16 volume levels, so we get
3*16*96 = 4608 possible points to evaluate.
If a chunk is 1/60 sec, at 11 kHz, it lasts about 184 samples,
thus a brute force approach that evaluates an "integral" of the squared differences
would work on 184 samples...
Quite doable on modern PCs.
Another issue I found using this method is that we can't work directly in the time domain, because we also have to take the phase into account, which multiplies the number of possibilities even further.
Doing an FFT allows us not to care about phase, but once you do the FFT you can get the frequency very quickly, without a least-squares approach.
I realized it during my trials: the squared difference has to be "in phase", so you look for freq/ampl/phase...
Do you see the problem? It is not the same if you subtract a waveform in phase, in quadrature or in opposition.
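To make the phase point concrete, here is a small Python sketch. It compares one square wave against copies of itself shifted in phase, and also looks at the magnitude of a single DFT bin. The 441 Hz tone (exactly 25 samples per period at 11025 Hz), the 225-sample window and the small base phase that keeps samples off the zero crossings are all values I picked to keep the numbers clean; they are not from the scripts in this thread.

```python
import cmath
import math

SAMPLE_RATE = 11025
FREQ_HZ = 441.0     # 25 samples per period, chosen for clean numbers
N = 225             # 9 full periods
BASE_PHASE = 0.1    # keeps every sample away from a zero crossing


def square_wave(phase):
    """Unit square wave, sign(sin(2*pi*f*t + phase)), N samples long."""
    return [1.0 if math.sin(2.0 * math.pi * FREQ_HZ * k / SAMPLE_RATE + phase) >= 0.0
            else -1.0
            for k in range(N)]


def squared_error(x, y):
    """Sum of squared sample differences (the time-domain criterion)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def bin_magnitude(x, freq_hz):
    """Magnitude of the DFT of x at freq_hz, computed directly for one bin."""
    w = -2j * math.pi * freq_hz / SAMPLE_RATE
    return abs(sum(v * cmath.exp(w * k) for k, v in enumerate(x)))


target = square_wave(BASE_PHASE)
errors = {}
magnitudes = {}
for name, shift in (("in phase", 0.0),
                    ("quadrature", math.pi / 2),
                    ("opposition", math.pi)):
    guess = square_wave(BASE_PHASE + shift)
    errors[name] = squared_error(target, guess)
    magnitudes[name] = bin_magnitude(guess, FREQ_HZ)

# Error of subtracting nothing at all, for comparison.
silence_error = squared_error(target, [0.0] * N)
```

In this setup the in-phase error is 0, the error against silence is 225 and the error in opposition is 900, so the right frequency and amplitude with the wrong phase scores worse than playing nothing. The three bin magnitudes come out the same, which is why the FFT route does not need the phase.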
Good point. Moreover, my proposal is totally naive,
as the number of values to be tested is
(16*96)^3, which is "slightly" bigger than 3*16*96.
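To put numbers on "slightly", a quick back-of-the-envelope count in Python, using the figures from this thread (16 volumes, 96 notes, 3 channels, 1/60 sec chunks at 11025 Hz). The variable names are mine, and phase is ignored, which would only make it worse.

```python
VOLUMES = 16
NOTES = 96
CHANNELS = 3
SAMPLE_RATE = 11025
CHUNKS_PER_SECOND = 60

per_channel = VOLUMES * NOTES            # settings of one channel: 1536
independent = CHANNELS * per_channel     # channels searched one at a time: 4608
joint = per_channel ** CHANNELS          # all three channels searched together

# Samples in one chunk, rounded up (ceiling division on integers).
samples_per_chunk = -(-SAMPLE_RATE // CHUNKS_PER_SECOND)

# Squared differences to evaluate for one second of audio.
ops_per_second_of_audio = joint * samples_per_chunk * CHUNKS_PER_SECOND

print(independent, joint, samples_per_chunk, ops_per_second_of_audio)
```

That is about 3.6 billion combinations per chunk, or roughly 4e13 squared differences for each second of audio.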
If you have a cluster computer with 1000 cores, you can try my code:
%psg
% wav 3 psg
%
clear
[Y,Fs,NBITS] = wavread ('ocarina1b.wav');
a = Y/max([-Y;Y]) * 127;            % scale amplitude between +/- 127
tseg = ceil(Fs/60);
time = [1:tseg];                    % define the sample segment for psg synthesis
f0 = 440;
n = [0:40] - 4*12;
psgfreq = f0 * (2^(1/12)).^n;
n = [8:15];
psgampl = 2.^-((15-n)/2);

% Main loop
Aseq = [];
Fseq = [];
synseq = [];
for i=0:tseg:length(a)
    t = time + i;
    synt = [];
    cmin = inf;
    for Aa = psgampl
        for Ab = psgampl
            for Ac = psgampl
                for Fa = psgfreq
                    for Fb = psgfreq
                        for Fc = psgfreq
                            psg = Aa*sign(sin(2*pi*t*Fa/Fs));
                            psg = psg+Ab*sign(sin(2*pi*t*Fb/Fs));
                            psg = psg+Ac*sign(sin(2*pi*t*Fc/Fs));
                            c = sum( (a(t)'-psg).^2);
                            if (c