Hi Folks:
Can you help me transform data from HAVE to DESIRED OUTPUT structure? I don't need to output it to a data set.
Thanks in advance.
DATA HAVE;
INPUT GROUP Desc $ A B;
CARDS;
1 Min 1 4
1 P50 2 5
1 Max 3 6
2 Min 3 7
2 P50 4 2
2 Max 5 3
3 Min 7 5
3 P50 6 4
3 Max 8 6
;
I'm not sure whether proc tabulate or report or transpose the best option.
Combine
proc format;
picture Group low-high = '9' (prefix='GROUP');
run;
proc tabulate data=have;
class group;
class desc / order=data;
var a b;
table
(a b) * desc=''
, group=' ' * sum='' * f=best6. ;
format group group.;
run;
TABULATE:
Can produce the output directly because it can place a VAR variable in the row dimension.
title "TABULATE: (a b) * desc, group";
proc tabulate data=have;
class group;
class desc / order=data;
var a b;
table
(a b) * desc=''
, group * sum='' * f=best6. ;
run;
Output
TRANSPOSE:
The issue of transposing descriptions in 'chunks' of 3 (those MIN,P50,MAX) requires a double transpose approach to reach a final reshape that can be REPORTed. The INDEX= data set option used in the first transpose output avoids a sort that would otherwise be needed for the second transpose.
proc transpose data=have out=stage1(rename=(_name_=varname) index=(varname)); by group; var a b; id desc; label varname=' '; run; proc transpose data=stage1 out=reshaped(rename=_name_=desc) prefix=group; by varname ; var min p50 max; id group; label desc = ' '; run; proc report data=reshaped; title1 "Double transpose + REPORT"; title2 "TRANSPOSE: BY GROUP; VAR A B; ID DESC. _NAME_ to VARNAME"; title3 "TRANSPOSE: BY VARNAME; VAR MIN P50 MAX; ID GROUP. _NAME_ to DESC"; define varname / ' ' order order=data; define desc / ' '; run;
Output
Log showing index use
1140
1141 proc transpose data=have out=stage1(rename=(_name_=varname) index=(varname));
1142 by group;
1143 var a b;
1144 id desc;
1145 label varname=' ';
WARNING: Variable VARNAME not found in data set WORK.HAVE. <---------- Not a problem and expected. Clears the _NAME_ label
1146 run;
NOTE: There were 9 observations read from the data set WORK.HAVE.
NOTE: The data set WORK.STAGE1 has 6 observations and 5 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.11 seconds
cpu time 0.00 seconds
1147 proc transpose data=stage1 out=reshaped(rename=_name_=desc) prefix=group;
1148 by varname ;
NOTE: An index was selected to execute the BY statement. <----------- Index usage means a SORT step was not needed
The observations will be returned in index order rather than in physical order. The
selected index is for the variable(s):
varname
1149 var min p50 max;
1150 id group;
1151 label desc = ' ';
WARNING: Variable DESC not found in data set WORK.STAGE1. <---------- Not a problem and expected. Clears the _NAME_ label
1152 run;
NOTE: There were 6 observations read from the data set WORK.STAGE1.
NOTE: The data set WORK.RESHAPED has 6 observations and 5 variables.
NOTE: PROCEDURE TRANSPOSE used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds
Hi @Cruise
Here is another approach using proc report. The use of a format enables to display 'Group1', 'Group2', ... instead of 1, 2, ... without computing another variable.
proc transpose data=have out=have_tr;
var A B;
by Group Desc notsorted;
run;
proc format;
picture Group low-high = '9' (prefix='GROUP');
run;
proc report data=have_tr;
columns _name_ desc group, col1;
define _name_ / group '';
define desc / group order=data '';
define group / across '' f=Group.;
define col1 / analysis '';
run;
Output:
Best,
Here is the same approach using proc tabulate:
proc transpose data=have out=have_tr;
var A B;
by Group Desc notsorted;
run;
proc format;
picture Group low-high = '9' (prefix='GROUP');
run;
proc tabulate data=have_tr;
class _name_ desc group / order=data;
var col1;
table _name_=''*desc='', group=''*col1=''*sum=''*f=8.0;
format group group.;
run;
Output:
Best,
DATA HAVE;
INPUT GROUP Desc $ A B;
CARDS;
1 Min 1 4
1 P50 2 5
1 Max 3 6
2 Min 3 7
2 P50 4 2
2 Max 5 3
3 Min 7 5
3 P50 6 4
3 Max 8 6
;
proc sql noprint;
select distinct catt('have(where=(group=',group,') rename=(A=group',group,'))')
into : merge1 separated by ' '
from have;
select distinct catt('have(where=(group=',group,') rename=(B=group',group,'))')
into : merge2 separated by ' '
from have;
quit;
data temp1;
var='A';
merge &merge1 ;
drop group b;
run;
data temp2;
var='B';
merge &merge2 ;
drop group a;
run;
data want;
set temp1 temp2;
run;
Combine
proc format;
picture Group low-high = '9' (prefix='GROUP');
run;
proc tabulate data=have;
class group;
class desc / order=data;
var a b;
table
(a b) * desc=''
, group=' ' * sum='' * f=best6. ;
format group group.;
run;
Thanks all:
Now I'm trying to label the variables now. But as I've shown below my attempt to label a to Education is not taken up. Any suggestion on this?
Label a='Education'
proc format;
picture Group low-high = '9' (prefix='GROUP');
run;
proc tabulate data=have;
class group;
class desc / order=data;
var a b;
table
(a b) * desc=''
, group=' ' * sum='' * f=best6. ;
Label a='Education';
format group group.;
run;
Hi @Cruise
With the statement Label a='Education'; , we get the following output. What would the expected output look like?
Weird. The labels specified are not taken up. Maybe something specific to my data set? I attached my real data with two variables out of 20 more variables. Can you try the code using the data 'have' at your end I attached to this post? Please let me know if this works for you.
proc format;
picture Group low-high = '9' (prefix='GROUP');
run;
proc tabulate data=have;
class group;
class desc / order=data;
var HIGH_SCH_P OB_MEAS_RATE;
table (HIGH_SCH_P OB_MEAS_RATE)*desc=''
,group=' ' * sum='' * f=best5. ;
label HIGH_SCH_P='Education';
label OB_MEAS_RATE='Obesity';
format group group.;
run;
Hi @Cruise
This is very weird.
Here is what I get:
Can you check in your options if the label option is not disabled ?
option label;
Great 👍
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.