BookmarkSubscribeRSS Feed
philgo
Calcite | Level 5

All, first time here, please be gentle. Thanks.

 

I have a table of records. For each row, I have an existing variable, x, and a new variable y to be derived.

 

For each row, say with row index r, the new variable y is defined such that it is '1' if I can find a row afterwards, say row r+p, such that ALL the rows from row r to row r+p having the existing variable x larger than a threshold, say, 100, and it is '0' otherwise.

 

How can I do this in SAS? thanks, from a newbie.

 

Is it necessary to use a do-until loop here? or not necessary. 

 

 

4 REPLIES 4
Reeza
Super User

No loops required, but you may need a RETAIN, it depends exactly on what you're trying to do.

 

Can you please  provide a small example that includes the data you start with and what you want as output. It does not need to be your actual data but should reflect the complexities, ie if you need to do this for multiple groups include at least two groups. 

 


@philgo wrote:

All, first time here, please be gentle. Thanks.

 

I have a table of records. For each row, I have an existing variable, x, and a new variable y to be derived.

 

For each row, say with row index r, the new variable y is defined such that it is '1' if I can find a row afterwards, say row r+p, such that ALL the rows from row r to row r+p having the existing variable x larger than a threshold, say, 100, and it is '0' otherwise.

 

How can I do this in SAS? thanks, from a newbie.

 

Is it necessary to use a do-until loop here? or not necessary. 

 

 


 

philgo
Calcite | Level 5

row index           x    (existing)        y (to be derived)

--------------------------------------------------------------------

1                       200                        0 (because I cannot find any row afterwards such that all rows in between have x>100) 

2                       0                            0

3                       200                        0

4                       0                            0

5                       200                        1 (row 5 and row 6 have x>100)

6                       200                        0 

7                       0                            0

8                       200                        1 (row 8 and row 9 have x>100)

9                       200                        1 (row 9 and row 10 have x>100)

10                     200                         no definition

 

thanks a lot for the quick reply. please see the above sample data.

 

to derive the variable y for a row r, i need to look at all rows beyond and see if the condition saitisfies.

mkeintz
PROC Star

All you really seem to want is a dummy variable indicating whether both the current observation ("row") and the next observation have x>100.  This is a ready-made problem for solution by merging a data set with itself, but with the simultaneously access records offset by 1 ("firstobs=2" below):

 

data have;
  input r x;
datalines;
 1    200
 2      0
 3    200
 4      0
 5    200
 6    200
 7      0
 8    200
 9    200
10    200
run;
 
data want (drop=_:);
  merge have  
        have (firstobs=2 keep=x rename=(x=_x_next));
  if _x_next=. then y=.;
  else y=min(x,_x_next)>100;
run;

 

Just as you can merge two separate datasets you can to a self-merge, think of them as two parallel streams of data.   In this case merging record 1 with record 2 of have, record 2 with 3, etc.  But you can't have 2 variables named X at the same time, so X in the "firstobs=2" stream is renamed to _x_next.

 

The rest is logic.  _x_next for the matching of record 10 has no record 11, so it is missing, allowing an easy test for arriving at the end of the merge. 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
philgo
Calcite | Level 5

Thanks very much. That helps a lot.

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 630 views
  • 2 likes
  • 3 in conversation