I'm trying to more or less transpose a dataset by groups of variables, and am currently attempting to do so with the output statement in a datastep. However, that method outputs the current value of all variables, and I'd like to only output the values of the variables I'm listing per time period, while others remain missing.
What is a better way to do this or improve my current code?
Have:
w4_wt_lbs w4_ht_in w4_bmi m3_brstfeed m3_supplement m6_brstfeed m6_supplement
121.5 63 21.520408163 1 0 1 1
Want:
week wt_lbs ht_in bmi brstfeed supplement
4 121.5 63.0 21.5204 . .
12 . . . 1 0
24 . . . 1 1
Currently output:
week wt_lbs ht_in bmi brstfeed supplement
4 121.5 63.0 21.5204 . .
12 121.5 63.0 21.5204 1 0
24 121.5 63.0 21.5204 1 1
Currently code:
data want; set have;
week=4;
wt_lbs = w4_wt_lbs ;
ht_in = w4_ht_in ;
bmi = w4_bmi ;
output;
week=12;
brstfeed = m3_brstfeed ;
supplement = m3_supplement ;
output;
week=24;
brstfeed = m6_brstfeed ;
supplement = m6_supplement ;
output;
run;
data want; set have;
week=4;
wt_lbs = w4_wt_lbs ;
ht_in = w4_ht_in ;
bmi = w4_bmi ;
output;
call missing(of _all_);
week=12;
brstfeed = m3_brstfeed ;
supplement = m3_supplement ;
output;
call missing(of _all_);
week=24;
brstfeed = m6_brstfeed ;
supplement = m6_supplement ;
output;
run;
Hm, looks like the code above using the call function seems to brings down the missing values of the last two variables instead of setting them to the intended value for that line:
week wt_lbs ht_in bmi brstfeed supplement
4 121.5 63 21.5204 . .
12 . . . . .
24 . . . . .
data have;
input w4_wt_lbs w4_ht_in w4_bmi m3_brstfeed m3_supplement m6_brstfeed m6_supplement;
cards;
121.5 63 21.520408163 1 0 1 1
;
run;
data want; set have;
week=4;
wt_lbs = w4_wt_lbs ;
ht_in = w4_ht_in ;
bmi = w4_bmi ;
output;
call missing(wt_lbs , ht_in , bmi);
week=12;
brstfeed = m3_brstfeed ;
supplement = m3_supplement ;
output;
week=24;
brstfeed = m6_brstfeed ;
supplement = m6_supplement ;
output;
run;
Can be accomplished programmatically, using the DATA step function VNAME to assign a numeric value for WEEK - based on a SUBSTR(...) and output each variable instance from inside a DO/END loop using an array that references the w#_xxxxx and m#_xxxx variables.
Another possibility is to use DICTIONARY.COLUMNS to generate the required SAS code based on the actual variable names found -- again, without any hardcoded programming where the "#" must already be known.
Scott Barry
SBBWorks, Inc.
Use Xia Keshan's code but add a keep statement to the want dataset.
E.g.
data want(keep=week wt_lbs ht_in bmi brstfeed supplement);
set have;
Regards,
BPD
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.