BookmarkSubscribeRSS Feed
fri0
Quartz | Level 8

Hi, I need your help to count values of a variable but with a condition based on time ranges of another variable.

 

This is my data:

 

Date Time Type
01-dic 08:00:10 Demanda
01-dic 08:00:13 Demanda
01-dic 08:00:14 Oferta
01-dic 08:00:16 Oferta
01-dic 08:00:19 Demanda
01-dic 08:00:24 Demanda
01-dic 08:00:24 Oferta
01-dic 08:00:30 Demanda
01-dic 08:00:31 Demanda
01-dic 08:00:34 Oferta
01-dic 08:00:34 Oferta
01-dic 08:00:41 Demanda
01-dic 08:00:42 Demanda
01-dic 08:00:53 Demanda
01-dic 08:00:53 Demanda
01-dic 08:00:53 Demanda
01-dic 08:01:02 Oferta
01-dic 08:01:02 Oferta
01-dic 08:01:12 Demanda
01-dic 08:01:12 Demanda
01-dic 08:01:17 Oferta
01-dic 08:01:21 Oferta
01-dic 08:01:24 Demanda
01-dic 08:01:29 Oferta
01-dic 08:01:35 Oferta
01-dic 08:01:39 Demanda
01-dic 08:01:39 Oferta
01-dic 08:01:39 Oferta
01-dic 08:01:42 Demanda

 

I have to find in 50 seconds time ranges, if there are 5 or more "Demanda" Category and a "Oferta" just after that or also the opposite case, 5 o more "Oferta" followed by a "Demanda".

 

Please I have no experience with loops or arrays, I tried to code a loop, but I do not how to make a reference to previous values of another variable. I would really appreciate if you could explain me the lines in your code to understand each step. Thank you very much!

5 REPLIES 5
PeterClemmensen
Tourmaline | Level 20

So if there are 5 or more "Demanda" category values immediately followed by a "Oferta" in a range of 50 second in the Time variable, what do you want to happen then? An indicator variable of 1?

fri0
Quartz | Level 8

Yes, a indicator variable in the same row that the opposite category appears.

atvf
Calcite | Level 5

You can use RETAIN.

By this you can shift information one row lower and then make comparisons, set counters and calculate time distances and sum then up.

Below i have coded an example. I have not much experience in using RETAIN, so it might be coded more efficient.

 

data a;

minute = 1; transcode = 'A'; revenue = 11.64;

output;

minute = 4; transcode = 'A'; revenue = 8.34;

output;

minute = 7; transcode = 'A'; revenue = 6.09;

output;

minute = 10; transcode = 'B'; revenue = 5.38;

output;
minute = 15; transcode = 'C'; revenue = 4.34;

output;


data a;

set a;

if _n_= 1 then do; tmp1= minute; tmp2= transcode; tmp3 = revenue; end;

retain tmp1 tmp2 tmp3;

prae_minute= tmp1; prae_transcode= tmp2; prae_revenue_chain = tmp3;

if prae_transcode = transcode and _n_ > 1 then revenue_chain= prae_revenue_chain + revenue; else revenue_chain = revenue;

tmp1= minute; tmp2= transcode; tmp3 = revenue_chain;

keep minute transcode revenue revenue_chain;

 

proc print data = a; run;

fri0
Quartz | Level 8

Hi, I googled about retain to build my code, but it's not perfect yet... This is my code so far:

 

data data_lay2;
set data_lay1;
by fecha hora;
retain lapse compra cod dif 0;
if hora>=lag(hora) then do;
	if tipo='Demanda' and clase='Ingreso' then compra+1;
	if codigo ne lag(codigo) then cod+1;
	dif=intck('second',lag(hora),hora);
	if lapse=. then lapse=0;
	lapse=lapse+dif;
	if compra>=5 and cod>=5 and tipo='Oferta' then flag=1;
end;
if lapse>50 then lapse=0 and compra=0 and cod=0 and dif=0;
run;
mkeintz
PROC Star

This program tracks the time stamp for demand 4 demands prior to the most recent demand, and for offer 4 offers prior to the most recent offer.  You can then compare those time stamps to the current time:

 

data data_lay2;
  set data_lay1;
  by fecha hora;
  retain lagtime4d lagtime4o ;
  format lagtime4d lagtime4o time8.0 ;

  if tipo='Demanda' then lagtime4d=lag4(hora); else
  if tipo='Oferta'  then lagtime4o=lag4(hora);

  if (tipo='Demanda' and lagtime4o>=hora-50) 
  or (tipo='Oferta'  and lagtime4d>=hora-50);
run;

 

This program forms 2 queues, one for time stamps of offers, and one for time stamps of demands.  Each queue is 4 elements deep.  As a result, when the current record is an offer, then the lagtime4d is the time stamp of the 5th preceding demand.  I say the 5th preceding demand for an offer, because it is calculated as the 4th preceding demand of the most recent preceding demand.

 

As a result the subsetting IF statement just compares the current time stamp to either the lagtime4d or lagtime4o, depending on the current record type.

 

I presume HORA is a time variable, with sas time values.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 2221 views
  • 1 like
  • 4 in conversation