BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
hvempati
Obsidian | Level 7


I need to get all the dates where the difference between the previous dates is 30.

ex. 

data cards;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;

with the code below 

data want;
set WORK.cards;
days_between = intck('days',dts,lag(dts));
run;


I can get the difference with the previous date and can filter if it is less than 30.(since the days_between are -ve)

But my desired output should be all dates with greater than 30 days. i.e 


dts
02Jan2017
04Feb2017
31Mar2017
03Jun2017
08Jul2017
23Nov2017
14Mar2018
02May2018

So basically the count resets to 0 after it finds a date. I can do this easily with loops in Python. I am not sure as how to do it in SAS.

Any help is highly Appreciated

Thank you.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Hi @hvempati  Please ignore my Hash. I feel silly and stupid to have posted that.

 

Here you go

 


data have;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;


data want;
set have;
retain k;
if _n_=1 then do;output;k=dts;end;
else if intck('days',k,dts)>30 then do;
output;
k=dts;
end;
drop k;
run;

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26

@hvempati wrote:

 

I can get the difference with the previous date and can filter if it is less than 30.(since the days_between are -ve)

But my desired output should be all dates with greater than 30 days. i.e 


What a shame you don't show your code for this part.

 

If you have

 

if days_between >= -30 then output;

to get the cases where it is less than or equal to 30 days, then the code to get cases where it is more than 30 days is ...

 

if days_between < -30 then output;

 

--
Paige Miller
hvempati
Obsidian | Level 7

sorry If it was not clear but 

data want;
set WORK.cards;
days_between = intck('days',dts,lag(dts));
run;

 

is the code that gives the value

novinosrin
Tourmaline | Level 20

data have;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;


proc sql;
create table temp as
select min(dts) as min, max(dts) as max
from have;
quit;

data want;
if _n_=1 then do;
if 0 then set have;
  dcl hash H (dataset:'have',ordered: "A") ;
   h.definekey  ("dts") ;
   h.definedata ("dts") ;
   h.definedone () ;
   dcl hiter hi('h');
end;
set temp;
dts=min;
output;
do while(hi.next()=0);
if min>max then leave;
if intck('days',min,dts)>30 then do;
output;
min=dts;
end;
end;
keep dts;
run;

 

novinosrin
Tourmaline | Level 20

Hi @hvempati  Please ignore my Hash. I feel silly and stupid to have posted that.

 

Here you go

 


data have;
input id dts date9.;
format dts date9.;
datalines;
1 02JAN2017
2 07JAN2017
3 14JAN2017
4 21JAN2017
5 28JAN2017
6 04FEB2017
7 14FEB2017
8 31MAR2017
9 01APR2017
10 03JUN2017
11 10JUN2017
12 01JUL2017
13 08JUL2017
14 23NOV2017
15 14MAR2018
16 02APR2018
17 02MAY2018
18 05MAY2018
;
run;


data want;
set have;
retain k;
if _n_=1 then do;output;k=dts;end;
else if intck('days',k,dts)>30 then do;
output;
k=dts;
end;
drop k;
run;
hvempati
Obsidian | Level 7

@novinosrin  Thank you so much!!!! It works like a charm!!!


But is it looping inside the data step!!! Even though it works well, I still did not get the logic as to how is it working.? Please excuse me if I am asking a very basic doubt.

Like if _n_=1, it means the 1st row correct?
like if _n_=1, then put that row a output and put k=dts

Am I correct? 

 

How is dts going to the next value is what I don't understand.

Is this almost equal to looping?



Thank you.

 

novinosrin
Tourmaline | Level 20

Hi @hvempati  Your understanding is indeed correct. I see that your python mastery is putting into effect here with your quick grasp of SAS. I envy you. Have a good one!

novinosrin
Tourmaline | Level 20

HI @hvempati 

 

 

How is dts going to the next value is what I don't understand.?

 

I am basically resetting the current date value as initial value once a difference of >30 is found  and repeat across observations.  It's no biggie.

 

I am still embarrassed about my stupid first post. I am sure that would have given everyone at-least a chuckle on our community. I need Valium. lol  Anyways, it's one of those days. 

hvempati
Obsidian | Level 7

@novinosrin  I know you have a very high standards set for yourself. But because of that, Novice SAS user can learn a bit more about hash function. I am in fact glad that you had posted for me to learn how it works.

 

Thank you so much!! 😄

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1221 views
  • 1 like
  • 3 in conversation