BookmarkSubscribeRSS Feed
Chang
Quartz | Level 8

hi,

In medical journal articles, it is very common to see a big long summary statistic table that usually occupies the space of table 1. Perhaps medical readers prefer to see how a patient characteristic changes from year to year horizontally. But you know long format data is a lot easier to generate. I have managed to organise my data in the following layout, which is in a long format. All the variables have been converted to character. The length, however, varies from variable to variable. To keep this example simple, I use serial variable names. They don't look that nice in my data.

year   var1 var2 var3 var4 var5 var6..... var17

------------------------------------------------------------

2004

2005

2006

2007

2008

2009

2010

2011

2012

2013

------------------------------------------------------------

I would like to (1) reshape the long-format table to the layout below, (2) replace var1 var2... var17 with desired labels and (3) export the wide-format table to a RTF via PROC REPORT. As can be seen in the following mock table, I place years horizontally from second column to the last column. Each var become an item under the first column characteristic. I have no idea how this can be done in SAS. Most posts talk about transposing data from wide to long. Meanwhile I am also looking at the reshape2 package in R. Hope this post will benefit medical SAS programmers. Please make your solution easy to replicate.

-------------------------------------------------------------------------------

Characteristic  2004 2005 2006 2007 2008..... 2013

-------------------------------------------------------------------------------

var1

var2

var3

var4

var5

var6

...

var17

---------------------------------------------------------------------------------

I have exported my table to a CSV. I guess again it can't be uploaded here. Please download it at the bottom of the post at the URL below. My original data is in the spreadsheet 'have.' I have what I want in the spreadsheet 'want' Any advice would be highly appreciated.

Reshaping data to the table 1 in medical journal articles | Like-Minded Data Analysts

10 REPLIES 10
Reeza
Super User

I think you need to include more details, I'm having a hard time seeing what you want.

That looks like a transpose, possibly a double one. 

There are a lot of papers on generating table of characteristics from SAS datasets, search LexJansen.com for many examples and macros.

Steelers_In_DC
Barite | Level 11

I think this is what you are looking for:

data have;

infile cards missover;

input year var1-var17;

cards;

2004

2005

2006

2007

2008

2009

2010

2011

2012

2013

;

proc transpose data=have out=want (rename=(_NAME_ = Characteristic));id year;

Chang
Quartz | Level 8

hi Mark,

does your method require that a PROC transpose is run for every variable? I think var= statement can take just one variable at a time. If so, there will be 17 proc transpose. Eventually 17 out= data set need to be vertically stacked into one table.

Steelers_In_DC
Barite | Level 11

Are you able to run the code I provided and look at the output?  If so let me know if this is what you want.

Chang
Quartz | Level 8

hi Mark,

Unfortunately your code is not working on my side. I don't see var= statement in your code. Would PROC TRANSPOSE work without the statement? Fortunately I've finally reshaped my data with a do loop and a merge. I think there is no id variable in my data. I just can't get PROC TRANSPOSE work with my data.

/*reshape data from wide to long */

%macro reshape_merge;

/*add all the value variables in this list*/

    %let var_list = char_num_admi char_pct_male age_1 age_2 income_1 income_2 income_3

                            income_4 cost_1 cost_2 ccx los_1 los_2 died   

                            char_hosp_1_pct char_hosp_2_pct char_hosp_3_pct ;

/*reshape data from wide to long for each year*/

    %do year=2004 %to 2013;

        data year_&year.;

            set data_pro.keep_colon_summary_table_1_wide(where= (year= &year.) )    ;

            array wide{*} $ &var_list. ;

                do counter= 1 to dim(wide); /*create format which a counter corresponds to a patient characteristic*/

                    year&year.= wide(counter);

                    output;

                end;

            drop year &var_list. ;

        run;  

    %end;

/*merge year2004-year2013*/

    data data_pro.year2004_2013;

        merge year_2004-year_2013;

        by counter;

        format counter colon_characteristic.; /*apply format to counter*/

        rename year2013=Avg    counter= Characteristic;

    run;   

%mend reshape_merge;

%reshape_merge;

Steelers_In_DC
Barite | Level 11

Here is the output I see.  I assume we are using different versions of SAS.

Characteristic2004200520062007200820092010201120122013

var1        .   .   .   .   .   .   .   .   .   .
var2        .   .   .   .   .   .   .   .   .   .
var3        .   .   .   .   .   .   .   .   .   .
var4        .   .   .   .   .   .   .   .   .   .
var5        .   .   .   .   .   .   .   .   .   .
var6        .   .   .   .   .   .   .   .   .   .
var7        .   .   .   .   .   .   .   .   .   .
var8        .   .   .   .   .   .   .   .   .   .
var9        .   .   .   .   .   .   .   .   .   .
var10       .   .   .   .   .   .   .   .   .   .
var11       .   .   .   .   .   .   .   .   .   .
var12       .   .   .   .   .   .   .   .   .   .
var13       .   .   .   .   .   .   .   .   .   .
var14       .   .   .   .   .   .   .   .   .   .
var15       .   .   .   .   .   .   .   .   .   .
var16       .   .   .   .   .   .   .   .   .   .
var17       .   .   .   .   .   .   .   .   .   .
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Would not two proc transpose work, first one to normalise the data, second to transpose up again:

data have; 

  year=2004; var1=1; var2=4; var3=6; output;

  year=2005; var1=2; var2=8; var3=9; output;

run;

proc transpose data=have out=inter;

  by year;

  var var:;

run;

proc sort data=inter;

  by _name_ year;

run;

proc transpose data=inter out=want;

  by _name_;

  var col1;

  id year;

  idlabel year;

run;

Reeza
Super User

Mark didn't generate sample data for the var1-var17 which is why it was empty. However it should work on your actual data. You should post your actual data.

If you have a mix of char and numeric none of these may work the way you want.


data have;

array var(17) var1-var17;

do year=2004 to 2013;

do i=1 to 17;

var(i)=floor(rand('normal', 50, 3));

end;

output;

end;

drop i;

run;

proc transpose data=have out=long Prefix=Y;

id year;

run;

ballardw
Super User

I suspect that you can use a Report procedure to do this.

Something in this approach if the data is already summarized, or possibly if it isn't just request the desired common statistics:

proc tabulate data=have;

class year;

var var1 - var10;

table (var1 - var10)* sum='', year='';

run;

The quotes in the ='' are to suppress a heading. If requesting multiple statistics then not such a good idea...

Ksharp
Super User

ONE proc transpose is good enough.

OR Try IML code which could save you lots of time.

data have;
infile cards missover;
input (year var1-var17) ($);
cards;
2004
2005
2006
2007
2008
2009
2010
2011
2012
2013
;
run;
proc iml;
use have;
v='var1':'var17';
read all var v into x[r=year c=Characteristic];
close;
y=t(x);
create want from y[r=Characteristic c=year];
append from y[r=Characteristic];
close;
quit;


Xia Keshan

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1520 views
  • 0 likes
  • 6 in conversation