BookmarkSubscribeRSS Feed
anujmehta
Calcite | Level 5

I was hoping I could get some help with creating tables for publication in SAS. I am determining the frequency of several health conditions for two different cohorts (variable called surge). the proc freq code is

ods output crosstabfreqs=cmr;
proc freq data=***;
table (CMR_AIDS CMR_ALCOHOL CMR_ANEMDEF CMR_AUTOIMMUNE CMR_BLDLOSS CMR_CANCER_LYMPH CMR_CANCER_LEUK
CMR_CANCER_METS CMR_CANCER_NSITU CMR_CANCER_SOLID CMR_CBVD) * surge/nocum norow nopercent; run;

 

I ods output the results into a dataset and am in the process of trying to convert that dataset into a better table for viewing. 

 

The ods output dataset as it stands after dropping unnecessary columns looks something like this:

anujmehta_0-1718211790323.png

What i want is a table where each row is is one of the conditions (alcohol abuse, deficiency anemias, autoimmune conditions, etc.), the columns are surge 0 and 1, and the values are percent of column frequency. 

 

Anyone have any idea how to do this with proc transpose or another procedure?

 

7 REPLIES 7
ballardw
Super User

Could you provide an example of the input data set to proc freq that you used with 3 of the CMR variables and values of surge, maybe 15 or 20 observations in a data step? Small enough that you can calculate or estimate the body of the report table by hand. Paste the data step in a text box opened with the </> icon above the message window. Something like this:

data example
   input surge  CMR_AIDS CMR_ALCOHOL CMR_ANEMDEF;
datalines;
1 0 0 1
1 1 0 1
1 1 1 0
0 1 1 1
0 1 0 0
0 0 0 0
;

I guessed as to likely other values than 1 for the shown variables. If your data is different show it as such. I am assuming the CMR variables are numeric from the way they appeared in your picture. If this is not the case let us know.

 

I think you might want to use a different procedure, possibly Proc Tabulate instead of trying to reshape the proc freq output. Plus I think you may have removed some information that would make the transpose work.

data_null__
Jade | Level 19

With the name prefix CMR_ and since they are all the same TYPE you it is not too difficult.   Actually, you don't need to worry about data type with VVALUEX

data example;
   input surge  CMR_AIDS CMR_ALCOHOL:$1. CMR_ANEMDEF;
datalines;
1 0 0 1
1 1 0 1
1 1 1 0
0 1 1 1
0 1 0 0
0 0 0 0
;;;;

ods trace on;
ods select none;
proc freq;
   tables (CMR:)*surge;
   ods output crosstabfreqs=cmr;
   run;
ods trace off;
ods select all;

data want;
   length _variable_ $32 _value_ $8;
   set cmr; 
   where _type_ eq '11';
   _variable_ = scan(table,2,'* ');
   _value_ = left(vvaluex(_variable_));
   keep _variable_ _value_ surge percent RowPercent;
   run;
proc print;
   run;

Capture.PNG

anujmehta
Calcite | Level 5

Thanks so much. The output is not exactly what i want. The data is structured as is structured exactly as you indicated but with 9 million observations and a patient ID. 

 

data temp;
input patientid surge cmr_aids cmr_alcohol cmr_anemdef;
datalines;
1 0 0 1 0
2 0 0 1 1
3 0 0 0 1
4 0 1 1 1
5 0 1 1 1 
6 0 0 1 0
7 0 1 0 0 
8 0 1 0 1
9 1 0 0 0 
10 1 0 0 0 
11 1 1 1 1
12 1 0 1 0 
13 1 1 0 1
14 1 0 0 1
15 1 1 1 0
;
run;

i used proc freq with a table statement (cmr_aids cmr_alcohol cmr_anemdef)* surge; to calculate the percentage of each condition in the larger population. 

 

The table i want would be:

 

                            Surge=1               Surge=0

AIDS                    0.3                          0.3

Alcohol               4.5                          4.1

Anemia                8.2                          8.3

 

 

where the columns are surge 1 and surge 0, the rows are each condition and the values of the cells are the percentage of patients with each condition within each surge cohort similar to a Table 1 in publications.

 

Thanks!

PaigeMiller
Diamond | Level 26
proc transpose data=temp out=temp_t;
    by patientid surge;
    var cmr_aids--cmr_anemdef;
run;

proc report data=temp_t;
    columns _name_ surge,col1;
    define _name_/group " ";
    define surge/across "Surge";
    define col1/mean " " format=percent8.2;
run;

 

 

PROC REPORT is what you should think of first (or PROC TABULATE) to produce this type of report. You can compute the MEAN in PROC REPORT, and since your variables are always 0 or 1, the mean is equal to the percent. 

--
Paige Miller
Tom
Super User Tom
Super User

With BINARY variables just take the MEAN to find the PERCENT.

proc means data=have noprint ;
  class surge ;
  var cmr_: ;
  output out=want mean= ;
run;

Tom_0-1718221617844.png

 

If you want a TALL format then add a PROC TRANSPOSE step.

proc transpose data=want name=variable out=tall(rename=(col1=percent));
  by _type_ surge;
  var cmr_:;
run;

Tom_1-1718221643324.png

 

 

ballardw
Super User

@anujmehta wrote:

Thanks so much. The output is not exactly what i want. The data is structured as is structured exactly as you indicated but with 9 million observations and a patient ID. 

 

data temp;
input patientid surge cmr_aids cmr_alcohol cmr_anemdef;
datalines;
1 0 0 1 0
2 0 0 1 1
3 0 0 0 1
4 0 1 1 1
5 0 1 1 1 
6 0 0 1 0
7 0 1 0 0 
8 0 1 0 1
9 1 0 0 0 
10 1 0 0 0 
11 1 1 1 1
12 1 0 1 0 
13 1 1 0 1
14 1 0 0 1
15 1 1 1 0
;
run;

i used proc freq with a table statement (cmr_aids cmr_alcohol cmr_anemdef)* surge; to calculate the percentage of each condition in the larger population. 

 

The table i want would be:

 

                            Surge=1               Surge=0

AIDS                    0.3                          0.3

Alcohol               4.5                          4.1

Anemia                8.2                          8.3

 

 

where the columns are surge 1 and surge 0, the rows are each condition and the values of the cells are the percentage of patients with each condition within each surge cohort similar to a Table 1 in publications.

 

Thanks!


First "Table 1 in publications" is so topic/profession/area specific that it is next to meaningless to many of us.

Second is this :

                          Surge=1               Surge=0
AIDS                    0.3                          0.3
Alcohol               4.5                          4.1
Anemia                8.2                          8.3

Supposed to come from that example data set???? Using your previous proc freq code as a template I do not get any of those numbers from the provided example data.

This does however match proc freq output for the given example data for the _type_=11 and the CMR variable = 1.

proc tabulate data=temp;
   class surge;
   var cmr: ;
   table cmr: * mean=''*f=percent8.2,
         surge
   ;
run;

or if you don't like percent signs:

proc format library=work;
picture perc_no_sign (round)
0-high  = '009.99' (mult=10000 )
;
run;

proc tabulate data=temp;
   class surge;
   var cmr: ;
   table cmr: * mean=''*f=perc_no_sign.,
         surge
   ;
run;
PaigeMiller
Diamond | Level 26

Please see this code for modifying the output from ONEWAYFREQS into something more usable. PROC FREQ for oneway tables output to data set UPDATE: looks like @data_null__ has provided the same code.

 

However, as requested by @ballardw lets start at the beginning by you showing us the data that went into PROC FREQ. I'm guessing that if we start from the beginning, there are easier methods.

--
Paige Miller

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 358 views
  • 2 likes
  • 5 in conversation