BookmarkSubscribeRSS Feed
xyxu
Quartz | Level 8

I find operations across rows difficult in SAS. For example, I want to select observations that have a non-missing value, while excluding those for which another observation was already selected among the last N rows. Below is an example for a small N (e.g., N=3), but I want to keep N general.  

 

data have;
 	infile datalines delimiter = "," missover;
	input value;
	datalines;
	.
	.
	1
	.
	.
	.
	.
	1
	.
	1
	.
	.
	.
	1
	1
	.
	.
run;

data want;
 	infile datalines delimiter = "," missover;
	input value select;
	datalines;
	., 0
	., 0
	1, 1
	., 0 
	., 0
	., 0
	., 0
	1, 1
	., 0
	1, 0
	., 0
	., 0
	., 0
	1, 1
	1, 0
	., 0 
	., 0
run;

 

 

2 REPLIES 2
mkeintz
PROC Star

 

First, thanks for putting both your sample data and expected results in the form of usable data steps.  So here is a program actually tested against your data, and identical to your desired results:

 

%let N=3;
data want (drop=_:);
  set have;

  retain _most_recent_select .; /*Obsnum of most recent select=1*/

  if not missing(value) and _most_recent_select < (_n_-&n) then do;
    select=1;
    _most_recent_select=_n_;
  end;
  else select=0;
run;

 

Because _most_recent_select is initialized as a missing value, it starts out "less than" all valid numeric values.  So the first non-missing VALUE is always selected.  After that only obs more than &N after a selected obs is eligible for select=1.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Tom
Super User Tom
Super User

It would help a lot if you explained what the overall goal is because this request seems silly.

Does your data only have 1's and missing?  Or could there be other non-missing values?  Does it matter what the first non-missing value is?

 

Why not just use a temporary array to make a rolling list of the previous N values?

%let N=3 ;
data want ;
  set have;
  array previous [&n] _temporary_ ;
  previous[1+mod(_n_,&n)]=value;
  n_previous = n(of previous[*]);
  select = n_previous=1 and not missing(value);
run;

 

PS: Note there is no need to use such convoluted data steps to share such simple data.

data have;
  input value @@ ;
datalines;
. .
1
. . . .
1
.
1
. . .
1 1
. .
;

data expect ;
  input value select ;
datalines;
. 0
. 0
1 1
. 0
. 0
. 0
. 0
1 1
. 0
1 0
. 0
. 0
. 0
1 1
1 0
. 0
. 0
;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 2 replies
  • 1152 views
  • 2 likes
  • 3 in conversation