BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hvempati
Obsidian | Level 7


I need to get all the dates where the difference between the previous dates is 30.

ex. 

data cards;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;

with the code below 

data want;
set WORK.cards;
days_between = intck('days',dts,lag(dts));
run;


I can get the difference with the previous date and can filter if it is less than 30.(since the days_between are -ve)

But my desired output should be all dates with greater than 30 days. i.e 


dts
02Jan2017
04Feb2017
31Mar2017
03Jun2017
08Jul2017
23Nov2017
14Mar2018
02May2018

So basically the count resets to 0 after it finds a date. I can do this easily with loops in Python. I am not sure as how to do it in SAS.

Any help is highly Appreciated

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @hvempati  Please ignore my Hash. I feel silly and stupid to have posted that.

 

Here you go

 


data have;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;


data want;
set have;
retain k;
if _n_=1 then do;output;k=dts;end;
else if intck('days',k,dts)>30 then do;
output;
k=dts;
end;
drop k;
run;

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

@hvempati wrote:

 

I can get the difference with the previous date and can filter if it is less than 30.(since the days_between are -ve)

But my desired output should be all dates with greater than 30 days. i.e 


What a shame you don't show your code for this part.

 

If you have

 

if days_between >= -30 then output;

to get the cases where it is less than or equal to 30 days, then the code to get cases where it is more than 30 days is ...

 

if days_between < -30 then output;

 

--
Paige Miller
hvempati
Obsidian | Level 7

sorry If it was not clear but 

data want;
set WORK.cards;
days_between = intck('days',dts,lag(dts));
run;

 

is the code that gives the value

novinosrin
Tourmaline | Level 20

data have;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;


proc sql;
create table temp as
select min(dts) as min, max(dts) as max
from have;
quit;

data want;
if _n_=1 then do;
if 0 then set have;
  dcl hash H (dataset:'have',ordered: "A") ;
   h.definekey  ("dts") ;
   h.definedata ("dts") ;
   h.definedone () ;
   dcl hiter hi('h');
end;
set temp;
dts=min;
output;
do while(hi.next()=0);
if min>max then leave;
if intck('days',min,dts)>30 then do;
output;
min=dts;
end;
end;
keep dts;
run;

 

novinosrin
Tourmaline | Level 20

Hi @hvempati  Please ignore my Hash. I feel silly and stupid to have posted that.

 

Here you go

 


data have;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;


data want;
set have;
retain k;
if _n_=1 then do;output;k=dts;end;
else if intck('days',k,dts)>30 then do;
output;
k=dts;
end;
drop k;
run;
hvempati
Obsidian | Level 7

@novinosrin  Thank you so much!!!! It works like a charm!!!


But is it looping inside the data step!!! Even though it works well, I still did not get the logic as to how is it working.? Please excuse me if I am asking a very basic doubt.

Like if _n_=1, it means the 1st row correct?
like if _n_=1, then put that row a output and put k=dts

Am I correct? 

 

How is dts going to the next value is what I don't understand.

Is this almost equal to looping?



Thank you.

 

novinosrin
Tourmaline | Level 20

Hi @hvempati  Your understanding is indeed correct. I see that your python mastery is putting into effect here with your quick grasp of SAS. I envy you. Have a good one!

novinosrin
Tourmaline | Level 20

HI @hvempati 

 

 

How is dts going to the next value is what I don't understand.?

 

I am basically resetting the current date value as initial value once a difference of >30 is found  and repeat across observations.  It's no biggie.

 

I am still embarrassed about my stupid first post. I am sure that would have given everyone at-least a chuckle on our community. I need Valium. lol  Anyways, it's one of those days. 

hvempati
Obsidian | Level 7

@novinosrin  I know you have a very high standards set for yourself. But because of that, Novice SAS user can learn a bit more about hash function. I am in fact glad that you had posted for me to learn how it works.

 

Thank you so much!! 😄