Hi, I need your help to count values of a variable but with a condition based on time ranges of another variable.
This is my data:
Date | Time | Type |
01-dic | 08:00:10 | Demanda |
01-dic | 08:00:13 | Demanda |
01-dic | 08:00:14 | Oferta |
01-dic | 08:00:16 | Oferta |
01-dic | 08:00:19 | Demanda |
01-dic | 08:00:24 | Demanda |
01-dic | 08:00:24 | Oferta |
01-dic | 08:00:30 | Demanda |
01-dic | 08:00:31 | Demanda |
01-dic | 08:00:34 | Oferta |
01-dic | 08:00:34 | Oferta |
01-dic | 08:00:41 | Demanda |
01-dic | 08:00:42 | Demanda |
01-dic | 08:00:53 | Demanda |
01-dic | 08:00:53 | Demanda |
01-dic | 08:00:53 | Demanda |
01-dic | 08:01:02 | Oferta |
01-dic | 08:01:02 | Oferta |
01-dic | 08:01:12 | Demanda |
01-dic | 08:01:12 | Demanda |
01-dic | 08:01:17 | Oferta |
01-dic | 08:01:21 | Oferta |
01-dic | 08:01:24 | Demanda |
01-dic | 08:01:29 | Oferta |
01-dic | 08:01:35 | Oferta |
01-dic | 08:01:39 | Demanda |
01-dic | 08:01:39 | Oferta |
01-dic | 08:01:39 | Oferta |
01-dic | 08:01:42 | Demanda |
I have to find in 50 seconds time ranges, if there are 5 or more "Demanda" Category and a "Oferta" just after that or also the opposite case, 5 o more "Oferta" followed by a "Demanda".
Please I have no experience with loops or arrays, I tried to code a loop, but I do not how to make a reference to previous values of another variable. I would really appreciate if you could explain me the lines in your code to understand each step. Thank you very much!
So if there are 5 or more "Demanda" category values immediately followed by a "Oferta" in a range of 50 second in the Time variable, what do you want to happen then? An indicator variable of 1?
Yes, a indicator variable in the same row that the opposite category appears.
You can use RETAIN.
By this you can shift information one row lower and then make comparisons, set counters and calculate time distances and sum then up.
Below i have coded an example. I have not much experience in using RETAIN, so it might be coded more efficient.
data a;
minute = 1; transcode = 'A'; revenue = 11.64;
output;
minute = 4; transcode = 'A'; revenue = 8.34;
output;
minute = 7; transcode = 'A'; revenue = 6.09;
output;
minute = 10; transcode = 'B'; revenue = 5.38;
output;
minute = 15; transcode = 'C'; revenue = 4.34;
output;
data a;
set a;
if _n_= 1 then do; tmp1= minute; tmp2= transcode; tmp3 = revenue; end;
retain tmp1 tmp2 tmp3;
prae_minute= tmp1; prae_transcode= tmp2; prae_revenue_chain = tmp3;
if prae_transcode = transcode and _n_ > 1 then revenue_chain= prae_revenue_chain + revenue; else revenue_chain = revenue;
tmp1= minute; tmp2= transcode; tmp3 = revenue_chain;
keep minute transcode revenue revenue_chain;
proc print data = a; run;
Hi, I googled about retain to build my code, but it's not perfect yet... This is my code so far:
data data_lay2;
set data_lay1;
by fecha hora;
retain lapse compra cod dif 0;
if hora>=lag(hora) then do;
if tipo='Demanda' and clase='Ingreso' then compra+1;
if codigo ne lag(codigo) then cod+1;
dif=intck('second',lag(hora),hora);
if lapse=. then lapse=0;
lapse=lapse+dif;
if compra>=5 and cod>=5 and tipo='Oferta' then flag=1;
end;
if lapse>50 then lapse=0 and compra=0 and cod=0 and dif=0;
run;
This program tracks the time stamp for demand 4 demands prior to the most recent demand, and for offer 4 offers prior to the most recent offer. You can then compare those time stamps to the current time:
data data_lay2;
set data_lay1;
by fecha hora;
retain lagtime4d lagtime4o ;
format lagtime4d lagtime4o time8.0 ;
if tipo='Demanda' then lagtime4d=lag4(hora); else
if tipo='Oferta' then lagtime4o=lag4(hora);
if (tipo='Demanda' and lagtime4o>=hora-50)
or (tipo='Oferta' and lagtime4d>=hora-50);
run;
This program forms 2 queues, one for time stamps of offers, and one for time stamps of demands. Each queue is 4 elements deep. As a result, when the current record is an offer, then the lagtime4d is the time stamp of the 5th preceding demand. I say the 5th preceding demand for an offer, because it is calculated as the 4th preceding demand of the most recent preceding demand.
As a result the subsetting IF statement just compares the current time stamp to either the lagtime4d or lagtime4o, depending on the current record type.
I presume HORA is a time variable, with sas time values.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.