BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
KS99
Obsidian | Level 7

Hi, SAS Community, 

 

I should again reach out for help from you. 

My dataset looks like this: 


data have;
  infile cards truncover;
  input CUSIP $ YEAR Couplets;
cards;
00036110	1990	0
00036110	1991	0
00036110	1992	1
00036110	2000	0
00036110	2001	1
00036110	2002	1
00036110	2010	0
00036110	2011	1
00036110	2012	0
00036110	2013	1
00037520	2009	0
00037520	2010	0
00037520	2011	1
00037520	2012	1
00037520	2014	0
00037520	2015	1
00086T10	2005	0
00086T10	2006	0
00086T10	2007	1
00095710	2014	0
00095710	2015	0
00095710	2016	1
;

I proc sorted my data by cusip and year. 

What I want to do is, sieve only the consecutive (by year) 0-1 couplets and weed out others from my dataset. In other words: 

CUSIP YEAR Couplets
00036110 1990 0
00036110 1991 0
00036110 1992 1
00036110 2000 0
00036110 2001 1
00036110 2002 1
00036110 2010 0
00036110 2011 1
00036110 2012 0
00036110 2013 1
00037520 2009 0
00037520 2010 0
00037520 2011 1
00037520 2012 1
00037520 2014 0
00037520 2015 1
…..........

 

Same CUSIP-consecutive year-consecutive 0-1. 

This task looks simple, but for me as an eternal novice in SAS, it is quite a puzzle. 

I tried lag, first.variable, last.variable, descending proc sort, but I cannot solve this. 

 

Sincerely, I appreciate your help! 

 

Sincerely, 

 

KS -, 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

It is hard to look ahead.  Looking back is pretty easy using the LAG() function.

So here is a way to do what you want that does not need to look ahead. (at least uses BY group processing ability to look ahead).

Instead it looks for a one that follows a zero and then uses POINT= option on a separate SET statement to re-read the pair of observations and output them.

data want ;
  set have;
  by CUSIP Couplets notsorted;
  lag_c = lag(Couplets);
  if first.Couplets and Couplets=1 and lag_c=0 and not first.CUSIP then do p=_n_-1,_n_;
    set have point=p;
    output;
  end;
  drop lag_c;
run;

You might also want to check that you don't have gaps in the YEAR values.

View solution in original post

8 REPLIES 8
Tom
Super User Tom
Super User

It is hard to look ahead.  Looking back is pretty easy using the LAG() function.

So here is a way to do what you want that does not need to look ahead. (at least uses BY group processing ability to look ahead).

Instead it looks for a one that follows a zero and then uses POINT= option on a separate SET statement to re-read the pair of observations and output them.

data want ;
  set have;
  by CUSIP Couplets notsorted;
  lag_c = lag(Couplets);
  if first.Couplets and Couplets=1 and lag_c=0 and not first.CUSIP then do p=_n_-1,_n_;
    set have point=p;
    output;
  end;
  drop lag_c;
run;

You might also want to check that you don't have gaps in the YEAR values.

KS99
Obsidian | Level 7

Wow, the code is exquisite! 

Thank you Tom, it works perfectly. 

Trying to understand your code, I remembered I learned from SAS community something like this: 

if <condition> then do p=max(_n_-1,1) to min(_n_+1,nobs);

I undertand this is a step forward application from your code. 

 

Have a nice weekend! 

 

Sincerely, 

 

KS -, 

 

Tom
Super User Tom
Super User

@KS99 wrote:

Wow, the code is exquisite! 

Thank you Tom, it works perfectly. 

Trying to understand your code, I remembered I learned from SAS community something like this: 

if <condition> then do p=max(_n_-1,1) to min(_n_+1,nobs);

I undertand this is a step forward application from your code. 

 

Have a nice weekend! 

 

Sincerely, 

 

KS -, 

 


In this situation _N_ can never be less 2 nor greater than NOBS so the MAX() and MIN() are not need.  And since we only want two values of P there is no need to use an iterative range ( lowerbound TO upperbound BY increment) since we can just use two distinct values.  Remember you can have as many value (or ranges) as you want in the DO loop, just separate them by commas.  Example:  do time=1,2,3,5,10 to 50 by 10, 100 to 1000 by 100;

KS99
Obsidian | Level 7
Thank you Tom, I gotcha.
mkeintz
PROC Star

You can avoid explicit look ahead and the point= option.   If the value of couplet is always a 0 or 1, and you want a pair of consecutive records in which the zero is followed by the one:  

 

data want ;
  set have;
  by CUSIP Couplets notsorted;

  if last.couplets=1  and couplets=0 and last.cusip=0
  or first.couplets=1 and couplets=1 and first.cusip=0;
run;

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
KS99
Obsidian | Level 7

Thank you mkeintz, 

 

 I will copy your code, and use it in a similar context. 

 

Sincerely, 

KS - 

 

Ksharp
Super User

data have;
  infile cards truncover;
  input CUSIP $ YEAR Couplets;
cards;
00036110	1990	0
00036110	1991	0
00036110	1992	1
00036110	2000	0
00036110	2001	1
00036110	2002	1
00036110	2010	0
00036110	2011	1
00036110	2012	0
00036110	2013	1
00037520	2009	0
00037520	2010	0
00037520	2011	1
00037520	2012	1
00037520	2014	0
00037520	2015	1
00086T10	2005	0
00086T10	2006	0
00086T10	2007	1
00095710	2014	0
00095710	2015	0
00095710	2016	1
;
data want;
 merge have have(firstobs=2 keep= CUSIP  YEAR Couplets rename=(
 CUSIP=_CUSIP  YEAR=_YEAR Couplets=_Couplets ));
if (CUSIP=_CUSIP and YEAR=_YEAR-1 and Couplets=0 and _Couplets=1) or
   (CUSIP=lag(CUSIP) and YEAR=lag(YEAR)+1 and Couplets=1 and lag(Couplets)=0) ;
drop _:;
run;
KS99
Obsidian | Level 7

Thank you Ksharp, 

long time no hear. 

I will copy your code. As a novice like me in SAS, "data B merge A  A" is a surprise. I think I can use them in a very efficient way. 

 

Sincerely, 

 

KS -, 

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 607 views
  • 0 likes
  • 4 in conversation