I want to take the row where `ORD = _11` and subtract the row where `ORD = _31`, and store the difference in `ORD = _32`. So if `ORD = _32` then `A_CNT = 13-6 = 7` and `B_CNT = 17-4 = 13`. I know I could probably just transpose a few times, but I was wondering if difference between rows is possible.
/*Data Have*/
data have;
input ORD $ A_CNT A_PCT B_CNT B_PCT;
cards;
_11 13 20 17 40
_12 . . 19 45
_21 21 32 1 2
_22 8 12 3 7
_23 36 55 19 45
_31 6 9 4 10
;
run;
/*Data Want: add ORD = _32*/
data have;
input ORD $ A_CNT A_PCT B_CNT B_PCT;
cards;
_11 13 20 17 40
_12 . . 19 45
_21 21 32 1 2
_22 8 12 3 7
_23 36 55 19 45
_31 6 9 4 10
_32 7 11 13 31
;
run;
Use multiple set skill:
data want(drop=tmp_:);
set have;
if _n_ = 1 then set have(where=(tmp_ord='_11') rename=(ord=tmp_ord a_cnt=tmp_a_cnt a_pct=tmp_a_pct b_cnt=tmp_b_cnt b_pct=tmp_b_pct));
output;
if ORD = '_31' then do;
A_CNT = tmp_a_cnt - A_CNT ;
A_PCT = tmp_a_pct - A_PCT;
B_CNT = tmp_b_cnt - B_CNT;
B_PCT = tmp_b_pct - B_PCT;
output;
end;
run;
First questions:
I am trying to make a table. The ORD variable has formatting attached to it, where _11 is any subject with at least one hospital readmission, and _31 is any subject with at least one readmission due to a complication. I want _32 to be any subject with no readmission due to complications. The other rows are needed for the table.
Use multiple set skill:
data want(drop=tmp_:);
set have;
if _n_ = 1 then set have(where=(tmp_ord='_11') rename=(ord=tmp_ord a_cnt=tmp_a_cnt a_pct=tmp_a_pct b_cnt=tmp_b_cnt b_pct=tmp_b_pct));
output;
if ORD = '_31' then do;
A_CNT = tmp_a_cnt - A_CNT ;
A_PCT = tmp_a_pct - A_PCT;
B_CNT = tmp_b_cnt - B_CNT;
B_PCT = tmp_b_pct - B_PCT;
output;
end;
run;
I had no idea you could do a set within a set. Thank you!
By the way, what does _N_ mean?
_n_ is an automatic variable
_N_
is initially set to 1. Each time the DATA step loops past the DATA statement, the variable _N_ increments by 1. The value of _N_ represents the number of times the DATA step has iterated.
Oh and I'll add, it's a common mistake to think that _n_ is the observation number in the input dataset.
They are not the same thing
Here's a simple example:
358 data have ;
359 do i=1 to 5 ;
360 output ;
361 end ;
362 run ;
NOTE: The data set WORK.HAVE has 5 observations and 1 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.00 seconds
363
364 data _null_ ;
365 set have (where=(i>2));
366 put _n_= i= ;
367 run ;
_N_=1 i=3
_N_=2 i=4
_N_=3 i=5
NOTE: There were 3 observations read from the data set WORK.HAVE.
WHERE i>2;
NOTE: DATA statement used (Total process time):
real time 0.00 seconds
cpu time 0.00 seconds
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.