BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
KS99
Obsidian | Level 7

Hi, SAS community! 

 

While I was working with intnx function, I simply got entangled in a mire. 

What I want to get is the following: 

 

CUSIP ANNDATS mark Oneyear
00036110 6/25/1999   0
00036110 6/28/1999   0
00036110 9/1/1999 1 1
00036110 9/20/1999   1
00036110 10/14/1999   1
00036110 12/17/1999   1
00036110 12/22/1999   1
00036110 12/23/1999   1
00036110 1/13/2000   1
00036110 2/3/2000   1
00036110 7/11/2000   1
00036110 7/11/2000   1
00036110 7/17/2000   1
00036110 7/25/2000   1
00036110 9/1/2000   0
00036110 9/5/2000   0
00040010 3/18/2003   0
00040010 5/27/2003 1 1
00040010 7/15/2003   1
00040010 1/2/2004   1
00040010 1/14/2004   1
00040010 4/13/2004   1
00040010 7/21/2004   0

 

The data is proc sorted by CUSIP and ANNDATS. 

I have occasional mark=1. What I want to accomplish is, starting from mark=1, create Oneyear=1's over the subsequent 1-year period. So, Oneyear has values 1 over 9/1/1999~7/25/2000 for CUSIP 00036110; values 1 over 5/27/2003~4/13/2004 for CUSIP 00040010. 

My code, which is also provided by a SAS community fellow, runs as follows: 

 

data Brokers2(drop = a); do until (last.cusip);
      set Brokers1; by CUSIP; if mark = 1 then a = anndats; end;
   do until (last.cusip); set Brokers1; by CUSIP; 
Oneyear = (anndats <= intnx('year', a, 1, 's')); output; end; run;

However, what I get is a mumble-jumble Oneyear variable, like below: 

 

CUSIP ANNDATS mark Oneyear
00036110 6/25/1999   1
00036110 6/28/1999   1
00036110 9/1/1999 1 1
00036110 9/20/1999   1
00036110 10/14/1999   1
00036110 12/17/1999   1
00036110 12/22/1999   1
00036110 12/23/1999   1
00036110 1/13/2000   1
00036110 2/3/2000   1
00036110 7/11/2000   1
00036110 7/11/2000   1
00036110 7/17/2000   1
00036110 7/25/2000   1
00036110 9/1/2000   1
00036110 9/5/2000   1
00040010 3/18/2003   1
00040010 5/27/2003 1 1
00040010 7/15/2003   1
00040010 1/2/2004   1
00040010 1/14/2004   1
00040010 4/13/2004   1
00040010 7/21/2004   0

 

Any help to fix my code will be greatly appreciated!! 

 

Sincerely, 

KS -, 

 

1 ACCEPTED SOLUTION

Accepted Solutions
mkeintz
PROC Star

I can see why you are getting unintended ONEYEAR=1 for dates preceding the DATE for mark=1.  That's because in your second DO UNTIL ... group you have

    Oneyear = (anndats <= intnx('year', a, 1, 's')); 

while you should have

    Oneyear = (a<=anndats <= intnx('year', a, 1, 's')); 

 

This does not explain why you are getting ONEYEAR=1 for dates after INTNX('year',a,1,'s').

 

But I would suggest replacing what you've written with far simpler code, which doesn't need the double DO UNTIL structure:

 

data want (drop=_:);
  set brokers1;
  by cusip;
  retain _cutoff_date ;

  if first.cusip then _cutoff_date=.;
  if mark=1 then _cutoff_date=intnx('year',anndats,1,'sameday');
  oneyear=(anndats<=_cutoff_date);
run;

This doesn't require testing for a lower bound on anndats - i.e. it doesn't need the equivalent of "a<=" in ("a<=anndats<=intnx....") - because it is not re-reading the CUSIP, and therefore the test is not exposed to dates preceding the MARK=1 date.  

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

View solution in original post

2 REPLIES 2
mkeintz
PROC Star

I can see why you are getting unintended ONEYEAR=1 for dates preceding the DATE for mark=1.  That's because in your second DO UNTIL ... group you have

    Oneyear = (anndats <= intnx('year', a, 1, 's')); 

while you should have

    Oneyear = (a<=anndats <= intnx('year', a, 1, 's')); 

 

This does not explain why you are getting ONEYEAR=1 for dates after INTNX('year',a,1,'s').

 

But I would suggest replacing what you've written with far simpler code, which doesn't need the double DO UNTIL structure:

 

data want (drop=_:);
  set brokers1;
  by cusip;
  retain _cutoff_date ;

  if first.cusip then _cutoff_date=.;
  if mark=1 then _cutoff_date=intnx('year',anndats,1,'sameday');
  oneyear=(anndats<=_cutoff_date);
run;

This doesn't require testing for a lower bound on anndats - i.e. it doesn't need the equivalent of "a<=" in ("a<=anndats<=intnx....") - because it is not re-reading the CUSIP, and therefore the test is not exposed to dates preceding the MARK=1 date.  

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
KS99
Obsidian | Level 7
Thank you, mkeintz, for you help out all the time!
Your code works perfectly 😉

Sincerely,
KS -

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 567 views
  • 0 likes
  • 2 in conversation