I want to take the row where `ORD = _11` and subtract the row where `ORD = _31`, and store the difference in `ORD = _32`. So if `ORD = _32` then `A_CNT = 13-6 = 7` and `B_CNT = 17-4 = 13`. I know I could probably just transpose a few times, but I was wondering if difference between rows is possible.
/*Data Have*/
data have;
input ORD $ A_CNT A_PCT B_CNT B_PCT;
cards;
_11 13 20 17 40
_12 . . 19 45
_21 21 32 1 2
_22 8 12 3 7
_23 36 55 19 45
_31 6 9 4 10
;
run;
/*Data Want: add ORD = _32*/
data have;
input ORD $ A_CNT A_PCT B_CNT B_PCT;
cards;
_11 13 20 17 40
_12 . . 19 45
_21 21 32 1 2
_22 8 12 3 7
_23 36 55 19 45
_31 6 9 4 10
_32 7 11 13 31
;
run;
Use multiple set skill:
data want(drop=tmp_:);
set have;
if _n_ = 1 then set have(where=(tmp_ord='_11') rename=(ord=tmp_ord a_cnt=tmp_a_cnt a_pct=tmp_a_pct b_cnt=tmp_b_cnt b_pct=tmp_b_pct));
output;
if ORD = '_31' then do;
A_CNT = tmp_a_cnt - A_CNT ;
A_PCT = tmp_a_pct - A_PCT;
B_CNT = tmp_b_cnt - B_CNT;
B_PCT = tmp_b_pct - B_PCT;
output;
end;
run;
First questions:
I am trying to make a table. The ORD variable has formatting attached to it, where _11 is any subject with at least one hospital readmission, and _31 is any subject with at least one readmission due to a complication. I want _32 to be any subject with no readmission due to complications. The other rows are needed for the table.
Use multiple set skill:
data want(drop=tmp_:);
set have;
if _n_ = 1 then set have(where=(tmp_ord='_11') rename=(ord=tmp_ord a_cnt=tmp_a_cnt a_pct=tmp_a_pct b_cnt=tmp_b_cnt b_pct=tmp_b_pct));
output;
if ORD = '_31' then do;
A_CNT = tmp_a_cnt - A_CNT ;
A_PCT = tmp_a_pct - A_PCT;
B_CNT = tmp_b_cnt - B_CNT;
B_PCT = tmp_b_pct - B_PCT;
output;
end;
run;
I had no idea you could do a set within a set. Thank you!
By the way, what does _N_ mean?
_n_ is an automatic variable
_N_
is initially set to 1. Each time the DATA step loops past the DATA statement, the variable _N_ increments by 1. The value of _N_ represents the number of times the DATA step has iterated.
Oh and I'll add, it's a common mistake to think that _n_ is the observation number in the input dataset.
They are not the same thing
Here's a simple example:
358 data have ;
359 do i=1 to 5 ;
360 output ;
361 end ;
362 run ;
NOTE: The data set WORK.HAVE has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
363
364 data _null_ ;
365 set have (where=(i>2));
366 put _n_= i= ;
367 run ;
_N_=1 i=3
_N_=2 i=4
_N_=3 i=5
NOTE: There were 3 observations read from the data set WORK.HAVE.
WHERE i>2;
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.