hi,
In medical journal articles, it is very common to see a big long summary statistic table that usually occupies the space of table 1. Perhaps medical readers prefer to see how a patient characteristic changes from year to year horizontally. But you know long format data is a lot easier to generate. I have managed to organise my data in the following layout, which is in a long format. All the variables have been converted to character. The length, however, varies from variable to variable. To keep this example simple, I use serial variable names. They don't look that nice in my data.
year var1 var2 var3 var4 var5 var6..... var17
------------------------------------------------------------
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
------------------------------------------------------------
I would like to (1) reshape the long-format table to the layout below, (2) replace var1 var2... var17 with desired labels and (3) export the wide-format table to a RTF via PROC REPORT. As can be seen in the following mock table, I place years horizontally from second column to the last column. Each var become an item under the first column characteristic. I have no idea how this can be done in SAS. Most posts talk about transposing data from wide to long. Meanwhile I am also looking at the reshape2 package in R. Hope this post will benefit medical SAS programmers. Please make your solution easy to replicate.
-------------------------------------------------------------------------------
Characteristic 2004 2005 2006 2007 2008..... 2013
-------------------------------------------------------------------------------
var1
var2
var3
var4
var5
var6
...
var17
---------------------------------------------------------------------------------
I have exported my table to a CSV. I guess again it can't be uploaded here. Please download it at the bottom of the post at the URL below. My original data is in the spreadsheet 'have.' I have what I want in the spreadsheet 'want' Any advice would be highly appreciated.
Reshaping data to the table 1 in medical journal articles | Like-Minded Data Analysts
I think you need to include more details, I'm having a hard time seeing what you want.
That looks like a transpose, possibly a double one.
There are a lot of papers on generating table of characteristics from SAS datasets, search LexJansen.com for many examples and macros.
I think this is what you are looking for:
data have;
infile cards missover;
input year var1-var17;
cards;
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
;
proc transpose data=have out=want (rename=(_NAME_ = Characteristic));id year;
hi Mark,
does your method require that a PROC transpose is run for every variable? I think var= statement can take just one variable at a time. If so, there will be 17 proc transpose. Eventually 17 out= data set need to be vertically stacked into one table.
Are you able to run the code I provided and look at the output? If so let me know if this is what you want.
hi Mark,
Unfortunately your code is not working on my side. I don't see var= statement in your code. Would PROC TRANSPOSE work without the statement? Fortunately I've finally reshaped my data with a do loop and a merge. I think there is no id variable in my data. I just can't get PROC TRANSPOSE work with my data.
/*reshape data from wide to long */
%macro reshape_merge;
/*add all the value variables in this list*/
%let var_list = char_num_admi char_pct_male age_1 age_2 income_1 income_2 income_3
income_4 cost_1 cost_2 ccx los_1 los_2 died
char_hosp_1_pct char_hosp_2_pct char_hosp_3_pct ;
/*reshape data from wide to long for each year*/
%do year=2004 %to 2013;
data year_&year.;
set data_pro.keep_colon_summary_table_1_wide(where= (year= &year.) ) ;
array wide{*} $ &var_list. ;
do counter= 1 to dim(wide); /*create format which a counter corresponds to a patient characteristic*/
year&year.= wide(counter);
output;
end;
drop year &var_list. ;
run;
%end;
/*merge year2004-year2013*/
data data_pro.year2004_2013;
merge year_2004-year_2013;
by counter;
format counter colon_characteristic.; /*apply format to counter*/
rename year2013=Avg counter= Characteristic;
run;
%mend reshape_merge;
%reshape_merge;
Here is the output I see. I assume we are using different versions of SAS.
Characteristic | 2004 | 2005 | 2006 | 2007 | 2008 | 2009 | 2010 | 2011 | 2012 | 2013 |
var1 | . | . | . | . | . | . | . | . | . | . | |
var2 | . | . | . | . | . | . | . | . | . | . | |
var3 | . | . | . | . | . | . | . | . | . | . | |
var4 | . | . | . | . | . | . | . | . | . | . | |
var5 | . | . | . | . | . | . | . | . | . | . | |
var6 | . | . | . | . | . | . | . | . | . | . | |
var7 | . | . | . | . | . | . | . | . | . | . | |
var8 | . | . | . | . | . | . | . | . | . | . | |
var9 | . | . | . | . | . | . | . | . | . | . | |
var10 | . | . | . | . | . | . | . | . | . | . | |
var11 | . | . | . | . | . | . | . | . | . | . | |
var12 | . | . | . | . | . | . | . | . | . | . | |
var13 | . | . | . | . | . | . | . | . | . | . | |
var14 | . | . | . | . | . | . | . | . | . | . | |
var15 | . | . | . | . | . | . | . | . | . | . | |
var16 | . | . | . | . | . | . | . | . | . | . | |
var17 | . | . | . | . | . | . | . | . | . | . |
Would not two proc transpose work, first one to normalise the data, second to transpose up again:
data have;
year=2004; var1=1; var2=4; var3=6; output;
year=2005; var1=2; var2=8; var3=9; output;
run;
proc transpose data=have out=inter;
by year;
var var:;
run;
proc sort data=inter;
by _name_ year;
run;
proc transpose data=inter out=want;
by _name_;
var col1;
id year;
idlabel year;
run;
Mark didn't generate sample data for the var1-var17 which is why it was empty. However it should work on your actual data. You should post your actual data.
If you have a mix of char and numeric none of these may work the way you want.
data have;
array var(17) var1-var17;
do year=2004 to 2013;
do i=1 to 17;
var(i)=floor(rand('normal', 50, 3));
end;
output;
end;
drop i;
run;
proc transpose data=have out=long Prefix=Y;
id year;
run;
I suspect that you can use a Report procedure to do this.
Something in this approach if the data is already summarized, or possibly if it isn't just request the desired common statistics:
proc tabulate data=have;
class year;
var var1 - var10;
table (var1 - var10)* sum='', year='';
run;
The quotes in the ='' are to suppress a heading. If requesting multiple statistics then not such a good idea...
ONE proc transpose is good enough.
OR Try IML code which could save you lots of time.
data have; infile cards missover; input (year var1-var17) ($); cards; 2004 2005 2006 2007 2008 2009 2010 2011 2012 2013 ; run; proc iml; use have; v='var1':'var17'; read all var v into x[r=year c=Characteristic]; close; y=t(x); create want from y[r=Characteristic c=year]; append from y[r=Characteristic]; close; quit;
Xia Keshan
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.