I was hoping I could get some help with creating tables for publication in SAS. I am determining the frequency of several health conditions for two different cohorts (variable called surge). the proc freq code is
ods output crosstabfreqs=cmr;
proc freq data=***;
table (CMR_AIDS CMR_ALCOHOL CMR_ANEMDEF CMR_AUTOIMMUNE CMR_BLDLOSS CMR_CANCER_LYMPH CMR_CANCER_LEUK
CMR_CANCER_METS CMR_CANCER_NSITU CMR_CANCER_SOLID CMR_CBVD) * surge/nocum norow nopercent; run;
I ods output the results into a dataset and am in the process of trying to convert that dataset into a better table for viewing.
The ods output dataset as it stands after dropping unnecessary columns looks something like this:
What i want is a table where each row is is one of the conditions (alcohol abuse, deficiency anemias, autoimmune conditions, etc.), the columns are surge 0 and 1, and the values are percent of column frequency.
Anyone have any idea how to do this with proc transpose or another procedure?
Could you provide an example of the input data set to proc freq that you used with 3 of the CMR variables and values of surge, maybe 15 or 20 observations in a data step? Small enough that you can calculate or estimate the body of the report table by hand. Paste the data step in a text box opened with the </> icon above the message window. Something like this:
data example input surge CMR_AIDS CMR_ALCOHOL CMR_ANEMDEF; datalines; 1 0 0 1 1 1 0 1 1 1 1 0 0 1 1 1 0 1 0 0 0 0 0 0 ;
I guessed as to likely other values than 1 for the shown variables. If your data is different show it as such. I am assuming the CMR variables are numeric from the way they appeared in your picture. If this is not the case let us know.
I think you might want to use a different procedure, possibly Proc Tabulate instead of trying to reshape the proc freq output. Plus I think you may have removed some information that would make the transpose work.
With the name prefix CMR_ and since they are all the same TYPE you it is not too difficult. Actually, you don't need to worry about data type with VVALUEX
data example;
input surge CMR_AIDS CMR_ALCOHOL:$1. CMR_ANEMDEF;
datalines;
1 0 0 1
1 1 0 1
1 1 1 0
0 1 1 1
0 1 0 0
0 0 0 0
;;;;
ods trace on;
ods select none;
proc freq;
tables (CMR:)*surge;
ods output crosstabfreqs=cmr;
run;
ods trace off;
ods select all;
data want;
length _variable_ $32 _value_ $8;
set cmr;
where _type_ eq '11';
_variable_ = scan(table,2,'* ');
_value_ = left(vvaluex(_variable_));
keep _variable_ _value_ surge percent RowPercent;
run;
proc print;
run;
Thanks so much. The output is not exactly what i want. The data is structured as is structured exactly as you indicated but with 9 million observations and a patient ID.
data temp;
input patientid surge cmr_aids cmr_alcohol cmr_anemdef;
datalines;
1 0 0 1 0
2 0 0 1 1
3 0 0 0 1
4 0 1 1 1
5 0 1 1 1
6 0 0 1 0
7 0 1 0 0
8 0 1 0 1
9 1 0 0 0
10 1 0 0 0
11 1 1 1 1
12 1 0 1 0
13 1 1 0 1
14 1 0 0 1
15 1 1 1 0
;
run;
i used proc freq with a table statement (cmr_aids cmr_alcohol cmr_anemdef)* surge; to calculate the percentage of each condition in the larger population.
The table i want would be:
Surge=1 Surge=0
AIDS 0.3 0.3
Alcohol 4.5 4.1
Anemia 8.2 8.3
where the columns are surge 1 and surge 0, the rows are each condition and the values of the cells are the percentage of patients with each condition within each surge cohort similar to a Table 1 in publications.
Thanks!
proc transpose data=temp out=temp_t;
by patientid surge;
var cmr_aids--cmr_anemdef;
run;
proc report data=temp_t;
columns _name_ surge,col1;
define _name_/group " ";
define surge/across "Surge";
define col1/mean " " format=percent8.2;
run;
PROC REPORT is what you should think of first (or PROC TABULATE) to produce this type of report. You can compute the MEAN in PROC REPORT, and since your variables are always 0 or 1, the mean is equal to the percent.
With BINARY variables just take the MEAN to find the PERCENT.
proc means data=have noprint ;
class surge ;
var cmr_: ;
output out=want mean= ;
run;
If you want a TALL format then add a PROC TRANSPOSE step.
proc transpose data=want name=variable out=tall(rename=(col1=percent));
by _type_ surge;
var cmr_:;
run;
@anujmehta wrote:
Thanks so much. The output is not exactly what i want. The data is structured as is structured exactly as you indicated but with 9 million observations and a patient ID.
data temp; input patientid surge cmr_aids cmr_alcohol cmr_anemdef; datalines; 1 0 0 1 0 2 0 0 1 1 3 0 0 0 1 4 0 1 1 1 5 0 1 1 1 6 0 0 1 0 7 0 1 0 0 8 0 1 0 1 9 1 0 0 0 10 1 0 0 0 11 1 1 1 1 12 1 0 1 0 13 1 1 0 1 14 1 0 0 1 15 1 1 1 0 ; run;
i used proc freq with a table statement (cmr_aids cmr_alcohol cmr_anemdef)* surge; to calculate the percentage of each condition in the larger population.
The table i want would be:
Surge=1 Surge=0
AIDS 0.3 0.3
Alcohol 4.5 4.1
Anemia 8.2 8.3
where the columns are surge 1 and surge 0, the rows are each condition and the values of the cells are the percentage of patients with each condition within each surge cohort similar to a Table 1 in publications.
Thanks!
First "Table 1 in publications" is so topic/profession/area specific that it is next to meaningless to many of us.
Second is this :
Surge=1 Surge=0 AIDS 0.3 0.3 Alcohol 4.5 4.1 Anemia 8.2 8.3
Supposed to come from that example data set???? Using your previous proc freq code as a template I do not get any of those numbers from the provided example data.
This does however match proc freq output for the given example data for the _type_=11 and the CMR variable = 1.
proc tabulate data=temp; class surge; var cmr: ; table cmr: * mean=''*f=percent8.2, surge ; run;
or if you don't like percent signs:
proc format library=work; picture perc_no_sign (round) 0-high = '009.99' (mult=10000 ) ; run; proc tabulate data=temp; class surge; var cmr: ; table cmr: * mean=''*f=perc_no_sign., surge ; run;
Please see this code for modifying the output from ONEWAYFREQS into something more usable. PROC FREQ for oneway tables output to data set UPDATE: looks like @data_null__ has provided the same code.
However, as requested by @ballardw lets start at the beginning by you showing us the data that went into PROC FREQ. I'm guessing that if we start from the beginning, there are easier methods.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.