I want to run a frequency table that counts the number of observations with or without hypertension and displays them. My frequency table keeps coming out wrong and I am not sure how to fix it. Here is the code:
data bpx_j;
set libref.bpx_j;
AvgSystolic = MEAN(bpxsy1,bpxsy2,bpxsy3,bpxsy4);
AvgDiastolic = MEAN(bpxdi1,bpxdi2,bpxdi3,bpxdi4);
CountSystolic = N(bpxsy1,bpxsy2,bpxsy3,bpxsy4);
CountDiastolic = N(bpxdi1,bpxdi2,bpxdi3,bpxdi4);
run;
proc freq data = work.bpx_j;
tables CountSystolic CountDiastolic;
run;
proc means data = work.bpx_j;
var AvgSystolic AvgDiastolic;
run;
proc print data = work.bpx_j;
run;
/*Part 2*/
proc format;
value hypertensionD LOW -< 80 = 'No Hypertension'
80 <- HIGH = 'Hypertension';
run;
proc format;
value hypertensionS LOW -< 130 = 'No Hypertension'
130 <- HIGH = 'Hypertension';
run;
proc freq data = work.bpx_j;
tables AvgSystolic AvgDiastolic;
Format AvgSystolic hypertensionS. AvgDiastolic hypertensionD.;
run;
This is what the output looks like:
Describe exactly what is "wrong" with the output.
You have an issue with your formats. In a PROC FORMAT range, -< excludes the upper endpoint and <- excludes the lower endpoint, so the boundary value (80 or 130) falls in neither range of these formats and is displayed unformatted. Remove the < from the second range in each definition so that the boundary value is included in 'Hypertension':
proc format;
value hypertensionD LOW -< 80 = 'No Hypertension'
80 - HIGH = 'Hypertension';
run;
proc format;
value hypertensionS LOW -< 130 = 'No Hypertension'
130 - HIGH = 'Hypertension';
run;
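One way to see how a format treats a boundary value is to apply it with the PUT function to a few test values. The sketch below is only an illustration: the format names exclLow and inclLow, the data set boundary_check, and the test values are all made up for this example and are not part of the original program.

```sas
/* Hypothetical formats, used only to compare the two range styles */
proc format;
   /* 80 is excluded from both ranges, so it gets no label */
   value exclLow  low -< 80  = 'No Hypertension'
                  80 <- high = 'Hypertension';
   /* 80 is included in the second range */
   value inclLow  low -< 80  = 'No Hypertension'
                  80 - high  = 'Hypertension';
run;

/* Made-up test values just below, at, and just above the boundary */
data boundary_check;
   input value;
   length with_gap no_gap $15;
   with_gap = put(value, exclLow.);  /* label from the format with a gap at 80 */
   no_gap   = put(value, inclLow.);  /* label from the format without a gap */
   datalines;
79.9
80
80.1
;
run;

proc print data = boundary_check noobs;
run;
```

If the formats behave as described, the row for 80 should show the raw number in with_gap and a label in no_gap, which is the same kind of stray unformatted row that can appear in a PROC FREQ table.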
Here is the log:
NOTE: There were 8704 observations read from the data set LIBREF.BPX_J.
NOTE: The data set WORK.BPX_J has 8704 observations and 26 variables.
NOTE: DATA statement used (Total process time):
real time 0.11 seconds
cpu time 0.07 seconds
110 proc freq data = work.bpx_j;
111 tables CountSystolic CountDiastolic;
112 run;
NOTE: There were 8704 observations read from the data set WORK.BPX_J.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.06 seconds
cpu time 0.01 seconds
113 proc means data = work.bpx_j;
114 var AvgSystolic AvgDiastolic;
115 run;
NOTE: There were 8704 observations read from the data set WORK.BPX_J.
NOTE: PROCEDURE MEANS used (Total process time):
real time 0.06 seconds
cpu time 0.04 seconds
116 proc print data = work.bpx_j;
117 run;
NOTE: There were 8704 observations read from the data set WORK.BPX_J.
NOTE: PROCEDURE PRINT used (Total process time):
real time 13.07 seconds
cpu time 13.18 seconds
118 /*Part 2*/
119 proc format;
120 value hypertensionD LOW -< 80 = 'No Hypertension'
121 80 <- HIGH = 'Hypertension';
NOTE: Format HYPERTENSIOND is already on the library WORK.FORMATS.
NOTE: Format HYPERTENSIOND has been output.
122 run;
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
123 proc format;
124 value hypertensionS LOW -< 130 = 'No Hypertension'
125 130 <- HIGH = 'Hypertension';
NOTE: Format HYPERTENSIONS is already on the library WORK.FORMATS.
NOTE: Format HYPERTENSIONS has been output.
126 run;
NOTE: PROCEDURE FORMAT used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
127 proc freq data = work.bpx_j;
128 tables AvgSystolic AvgDiastolic;
129 Format AvgSystolic hypertensionS. AvgDiastolic hypertensionD.;
130 run;
NOTE: There were 8704 observations read from the data set WORK.BPX_J.
NOTE: PROCEDURE FREQ used (Total process time):
real time 0.08 seconds
cpu time 0.03 seconds
Thanks everyone! It finally worked after changing some things around.