2 weeks ago - last edited 2 weeks ago
Hi, I need your help to count values of a variable but with a condition based on time ranges of another variable.
This is my data:
I have to find in 50 seconds time ranges, if there are 5 or more "Demanda" Category and a "Oferta" just after that or also the opposite case, 5 o more "Oferta" followed by a "Demanda".
Please I have no experience with loops or arrays, I tried to code a loop, but I do not how to make a reference to previous values of another variable. I would really appreciate if you could explain me the lines in your code to understand each step. Thank you very much!
2 weeks ago
So if there are 5 or more "Demanda" category values immediately followed by a "Oferta" in a range of 50 second in the Time variable, what do you want to happen then? An indicator variable of 1?
a week ago
You can use RETAIN.
By this you can shift information one row lower and then make comparisons, set counters and calculate time distances and sum then up.
Below i have coded an example. I have not much experience in using RETAIN, so it might be coded more efficient.
minute = 1; transcode = 'A'; revenue = 11.64;
minute = 4; transcode = 'A'; revenue = 8.34;
minute = 7; transcode = 'A'; revenue = 6.09;
minute = 10; transcode = 'B'; revenue = 5.38;
minute = 15; transcode = 'C'; revenue = 4.34;
if _n_= 1 then do; tmp1= minute; tmp2= transcode; tmp3 = revenue; end;
retain tmp1 tmp2 tmp3;
prae_minute= tmp1; prae_transcode= tmp2; prae_revenue_chain = tmp3;
if prae_transcode = transcode and _n_ > 1 then revenue_chain= prae_revenue_chain + revenue; else revenue_chain = revenue;
tmp1= minute; tmp2= transcode; tmp3 = revenue_chain;
keep minute transcode revenue revenue_chain;
proc print data = a; run;
Hi, I googled about retain to build my code, but it's not perfect yet... This is my code so far:
data data_lay2; set data_lay1; by fecha hora; retain lapse compra cod dif 0; if hora>=lag(hora) then do; if tipo='Demanda' and clase='Ingreso' then compra+1; if codigo ne lag(codigo) then cod+1; dif=intck('second',lag(hora),hora); if lapse=. then lapse=0; lapse=lapse+dif; if compra>=5 and cod>=5 and tipo='Oferta' then flag=1; end; if lapse>50 then lapse=0 and compra=0 and cod=0 and dif=0; run;
This program tracks the time stamp for demand 4 demands prior to the most recent demand, and for offer 4 offers prior to the most recent offer. You can then compare those time stamps to the current time:
data data_lay2; set data_lay1; by fecha hora; retain lagtime4d lagtime4o ; format lagtime4d lagtime4o time8.0 ; if tipo='Demanda' then lagtime4d=lag4(hora); else if tipo='Oferta' then lagtime4o=lag4(hora); if (tipo='Demanda' and lagtime4o>=hora-50) or (tipo='Oferta' and lagtime4d>=hora-50); run;
This program forms 2 queues, one for time stamps of offers, and one for time stamps of demands. Each queue is 4 elements deep. As a result, when the current record is an offer, then the lagtime4d is the time stamp of the 5th preceding demand. I say the 5th preceding demand for an offer, because it is calculated as the 4th preceding demand of the most recent preceding demand.
As a result the subsetting IF statement just compares the current time stamp to either the lagtime4d or lagtime4o, depending on the current record type.
I presume HORA is a time variable, with sas time values.