Hi,
I need to transpose almost 50 variables using proc transpose for which I kept statement (Var=_all_)
which is not working as per my expectation. It is because one variable PID from below data is repeated everytimes. can you please help
My expected output
pid parameter value
101 siteid 001
101 subjid 010
101 age 23
101 sex M
101 race asian
101 siteid 002
101 subjid 011
101 age 21
101 sex M
101 race cauca
data demo; input pid $ siteid $ subjid $ age sex $ race $; cards; 101 001 010 23 M asian 101 002 011 21 M cauca 102 003 011 34 F white 101 001 010 11 F black ; run; proc sort data=demo out=demo1; by pid;run; proc transpose data=demo1 out=demo2 (RENAME=(_NAME_=PARAMETER COL1=VALUE)); by pid; var _all_ ; run;
Also I am getting note as
NOTE: Numeric variables in the input data set will be converted to character in the output data set.
Like that?
data demo;
input pid $ siteid $ subjid $ age sex $ race $;
cards;
101 001 010 23 M asian
101 002 011 21 M cauca
102 003 011 34 F white
101 001 010 11 F black
;
run;
data demo;
set demo;
_x_+1;
run;
proc sort data=demo out=demo1;
by _x_ pid siteid;
run;
proc print data = demo1 ;
run;
proc transpose data=demo1(drop=pid) out=demo2 (RENAME=(_NAME_=PARAMETER COL1=VALUE)
where=(PARAMETER ne "_x_") drop=_x_);
by _x_ ;
var _all_ ;
run;
proc print data = demo2;
run;
Bart
I would not bother transposing the data set named DEMO. What is the benefit here of doing such a transpose? What can you do with the transposed result better than with the un-transposed result? Please explain how you would use the transposed data set.
@abraham1 wrote:
I want to display all parameters (variable) values in vertical format for analysis in a clinical trial based on unique PID.
Yes, I see that. Why? What is the benefit?
Like that?
data demo;
input pid $ siteid $ subjid $ age sex $ race $;
cards;
101 001 010 23 M asian
101 002 011 21 M cauca
102 003 011 34 F white
101 001 010 11 F black
;
run;
data demo;
set demo;
_x_+1;
run;
proc sort data=demo out=demo1;
by _x_ pid siteid;
run;
proc print data = demo1 ;
run;
proc transpose data=demo1(drop=pid) out=demo2 (RENAME=(_NAME_=PARAMETER COL1=VALUE)
where=(PARAMETER ne "_x_") drop=_x_);
by _x_ ;
var _all_ ;
run;
proc print data = demo2;
run;
Bart
The solution is working fine. Is it possible to add the PID column in the beginning so that each variable value will be populated like below
pid parameter value
101 siteid 001
101 subjid 010
101 age 23
101 sex M
101 race asian
remove:
(drop=pid)
Besides being hard to work with your expected output, placing character and what were numeric values into a single variable means that the numeric have to be converted to text which can lead to some issues depending on the actual values, especially if decimals and the actual needed result.
The desired output for multiple values of the same PID also means that Transpose isn't the tool as it would create value2 value3, etc, for each repetition of PID.
Ugly but does what is requested:
data demo; input pid $ siteid $ subjid $ age sex $ race $; cards; 101 001 010 23 M asian 101 002 011 21 M cauca 102 003 011 34 F white 101 001 010 11 F black ; run; data want; set demo; length parameter $ 10 value $ 5; Parameter='siteid'; value=siteid; output; Parameter='subjid'; value=subjid; output; Parameter='age'; value=put(age,f2. -L); output; Parameter='sex'; value=sex; output; Parameter='race'; value=race; output; keep pid parameter value; run;
If you do not have a set of key variables to use for the BY statement of PROC TRANSPOSE then you need to add another variable.
data demo;
input pid $ siteid $ subjid $ age sex $ race $;
cards;
101 001 010 23 M asian
101 002 011 21 M cauca
102 003 011 34 F white
101 001 010 11 F black
;
data step1;
row+1;
set demo;
run;
proc transpose data=step1 out=want(rename=(_name_=PARAM col1=VALUE)) ;
by row pid ;
var _all_;
run;
Results:
Obs row pid PARAM VALUE 1 1 101 row 1 2 1 101 pid 101 3 1 101 siteid 001 4 1 101 subjid 010 5 1 101 age 23 6 1 101 sex M 7 1 101 race asian 8 2 101 row 2 9 2 101 pid 101 10 2 101 siteid 002 11 2 101 subjid 011 12 2 101 age 21 13 2 101 sex M 14 2 101 race cauca 15 3 102 row 3 16 3 102 pid 102 17 3 102 siteid 003 18 3 102 subjid 011 19 3 102 age 34 20 3 102 sex F 21 3 102 race white 22 4 101 row 4 23 4 101 pid 101 24 4 101 siteid 001 25 4 101 subjid 010 26 4 101 age 11 27 4 101 sex F 28 4 101 race black
You can add a WHERE= dataset option on the output dataset to exclude the observations for the BY variables. Or add a post processing step to remove those observations.
Also noticed how the numeric values are right aligned. You could either add a post processing step to remove the leading spaces. Or transpose the numeric and character variables separately and create two value columns, one for the numeric values and one for the character values.
data final;
set want;
if upcase(param) in ('PID' 'ROW') then delete;
value = left(value);
run;
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.