Hello,
I'm trying to create separate data sets (for each quarter we gather data) that would output 1) the total # of uncontrolled patients with diabetes (numerator), 2) the total "well controlled" and "controlled" patients (these would be added together for our denominator), 3) the proportion of patients that are uncontrolled compared to controlled/well-controlled.
[uncntrolled/(well controlled +controlled)]*100
CODE:
MACRO fin(file);
DATA work.&file._dm4;
SET work.&file._dm3;
IF (a1c_result<= 8) THEN DM_control = 1;
IF (8< a1c_result <= 9) THEN DM_control = 2;
IF (a1c_result > 9) OR (a1c_result = .) THEN DM_control = 0;
run;
data work.&file._dm5;
set work.&file._dm4;
if dm_control=1 then well_cntrl=1; else well_cntrl=0;
if dm_control=2 then cntrl=1; else cntrl=0;
if dm_control=0 then uncntrl=1; else uncntrl=0;
RUN;
PROC SUMMARY data=work.&file._dm6;
class well_cntrl cntrl uncntrl;
output out=work.&file._summary sum=sum_wellcntrl sum_cntrl sum_unctrl;
final_freq= (uncntrl/(well_cntrl+cntrl)*100
run;
data=final.&file._sum_final;
set work.&file._summary;
controlled_DM= (sum_wellcntrl + sum_cntrl);
keep sum_uncntrl controlled_DM final_freq;
run;
%MEND fin;
%fin(Baseline);
%fin(Q1_2019);
%fin(Q2_2019);
%fin(Q3_2019);
%fin(Q4_2019);
%fin(Q1_2020);
%fin(Q2_2020);
What is your question, or what part do you need help with? Does your code work? If not, does it generate an error?
@RebeccaB_ wrote:
Hello,
I'm trying to create separate data sets (for each quarter we gather data) that would output 1) the total # of uncontrolled patients with diabetes (numerator), 2) the total "well controlled" and "controlled" patients (these would be added together for our denominator), 3) the proportion of patients that are uncontrolled compared to controlled/well-controlled.
[uncntrolled/(well controlled +controlled)]*100
CODE:
MACRO fin(file); DATA work.&file._dm4; SET work.&file._dm3; IF (a1c_result<= 8) THEN DM_control = 1; IF (8< a1c_result <= 9) THEN DM_control = 2; IF (a1c_result > 9) OR (a1c_result = .) THEN DM_control = 0; run; data work.&file._dm5; set work.&file._dm4; if dm_control=1 then well_cntrl=1; else well_cntrl=0; if dm_control=2 then cntrl=1; else cntrl=0; if dm_control=0 then uncntrl=1; else uncntrl=0; RUN; PROC SUMMARY data=work.&file._dm6; class well_cntrl cntrl uncntrl; output out=work.&file._summary sum=sum_wellcntrl sum_cntrl sum_unctrl; final_freq= (uncntrl/(well_cntrl+cntrl)*100 run; data=final.&file._sum_final; set work.&file._summary; controlled_DM= (sum_wellcntrl + sum_cntrl); keep sum_uncntrl controlled_DM final_freq; run; %MEND fin; %fin(Baseline); %fin(Q1_2019); %fin(Q2_2019); %fin(Q3_2019); %fin(Q4_2019); %fin(Q1_2020); %fin(Q2_2020);
What I'm trying to determine is how to create a final summary table for the Baseline, Q1_2019, Q2_2019, Q3_2019, Q4_2019, Q1_2020, Q2_2020. I've received some errors which is located below in my log.
NOTE: There were 38424 observations read from the data set WORK.BASELINE_DM2.
NOTE: The data set WORK.BASELINE_DM2 has 38424 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.11 seconds
cpu time 0.09 seconds
NOTE: There were 38424 observations read from the data set WORK.BASELINE_DM2.
NOTE: The data set WORK.BASELINE_DM3 has 7403 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
1004 %final(Q1_2019);
NOTE: There were 102559 observations read from the data set WORK.Q1_2019_F.
NOTE: The data set WORK.Q1_2019_DM has 11026 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.06 seconds
NOTE: There were 11026 observations read from the data set WORK.Q1_2019_DM.
NOTE: The data set WORK.Q1_2019_DM2 has 10934 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.00 seconds
NOTE: There were 10934 observations read from the data set WORK.Q1_2019_DM2.
NOTE: The data set WORK.Q1_2019_DM2 has 10934 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.04 seconds
cpu time 0.04 seconds
NOTE: There were 10934 observations read from the data set WORK.Q1_2019_DM2.
NOTE: The data set WORK.Q1_2019_DM3 has 5162 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
1005 %final(Q2_2019);
NOTE: There were 104559 observations read from the data set WORK.Q2_2019_F.
NOTE: The data set WORK.Q2_2019_DM has 11136 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
NOTE: There were 11136 observations read from the data set WORK.Q2_2019_DM.
NOTE: The data set WORK.Q2_2019_DM2 has 10977 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.04 seconds
cpu time 0.01 seconds
NOTE: There were 10977 observations read from the data set WORK.Q2_2019_DM2.
NOTE: The data set WORK.Q2_2019_DM2 has 10977 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
NOTE: There were 10977 observations read from the data set WORK.Q2_2019_DM2.
NOTE: The data set WORK.Q2_2019_DM3 has 5262 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
1006 %final(Q3_2019);
NOTE: There were 101742 observations read from the data set WORK.Q3_2019_F.
NOTE: The data set WORK.Q3_2019_DM has 10721 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.06 seconds
NOTE: There were 10721 observations read from the data set WORK.Q3_2019_DM.
NOTE: The data set WORK.Q3_2019_DM2 has 10577 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
NOTE: There were 10577 observations read from the data set WORK.Q3_2019_DM2.
NOTE: The data set WORK.Q3_2019_DM2 has 10577 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.04 seconds
cpu time 0.03 seconds
NOTE: There were 10577 observations read from the data set WORK.Q3_2019_DM2.
NOTE: The data set WORK.Q3_2019_DM3 has 5213 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
1007 %final(Q4_2019);
NOTE: There were 97979 observations read from the data set WORK.Q4_2019_F.
NOTE: The data set WORK.Q4_2019_DM has 10062 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.09 seconds
cpu time 0.06 seconds
NOTE: There were 10062 observations read from the data set WORK.Q4_2019_DM.
NOTE: The data set WORK.Q4_2019_DM2 has 9945 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
NOTE: There were 9945 observations read from the data set WORK.Q4_2019_DM2.
NOTE: The data set WORK.Q4_2019_DM2 has 9945 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
NOTE: There were 9945 observations read from the data set WORK.Q4_2019_DM2.
NOTE: The data set WORK.Q4_2019_DM3 has 5159 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.01 seconds
1008 %final(Q1_2020);
NOTE: There were 93023 observations read from the data set WORK.Q1_2020_F.
NOTE: The data set WORK.Q1_2020_DM has 9773 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.07 seconds
cpu time 0.07 seconds
NOTE: There were 9773 observations read from the data set WORK.Q1_2020_DM.
NOTE: The data set WORK.Q1_2020_DM2 has 9637 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.00 seconds
NOTE: There were 9637 observations read from the data set WORK.Q1_2020_DM2.
NOTE: The data set WORK.Q1_2020_DM2 has 9637 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
NOTE: There were 9637 observations read from the data set WORK.Q1_2020_DM2.
NOTE: The data set WORK.Q1_2020_DM3 has 5047 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
1009 %final(Q2_2020);
NOTE: There were 77667 observations read from the data set WORK.Q2_2020_F.
NOTE: The data set WORK.Q2_2020_DM has 9101 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.06 seconds
cpu time 0.06 seconds
NOTE: There were 9101 observations read from the data set WORK.Q2_2020_DM.
NOTE: The data set WORK.Q2_2020_DM2 has 8985 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
NOTE: There were 8985 observations read from the data set WORK.Q2_2020_DM2.
NOTE: The data set WORK.Q2_2020_DM2 has 8985 observations and 78 variables.
NOTE: PROCEDURE SORT used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds
NOTE: There were 8985 observations read from the data set WORK.Q2_2020_DM2.
NOTE: The data set WORK.Q2_2020_DM3 has 4389 observations and 78 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.01 seconds
1010
1011 /*CONTROL BY TIMEFRAME*/
1012 %MACRO fin(file);
1013
1014 DATA work.&file._dm4;
1015 SET work.&file._dm3;
1016 IF (a1c_result<= 8) THEN DM_control = 1;
1017 IF (8< a1c_result <= 9) THEN DM_control = 2;
1018 IF (a1c_result > 9) OR (a1c_result = .) THEN DM_control = 0;
1019 run;
1020 data work.&file._dm5;
1021 set work.&file._dm4;
1022 if dm_control=1 then well_cntrl=1; else well_cntrl=0;
1023 if dm_control=2 then cntrl=1; else cntrl=0;
1024 if dm_control=0 then uncntrl=1; else uncntrl=0;
1025 RUN;
1026
1027 Data work.&file._dm6;
1028 set work.&file._dm5;
1029 if well_cntrl=1 then DM_cat= "well controlled";
1030 if cntrl=1 then DM_cat= "controlled";
1031 if uncntrl=1 then DM_cat= "uncontrolled";
1032 run;
1033
1034 PROC SUMMARY data=work.&file._dm6;
1035 class well_cntrl cntrl uncntrl;
1036 output out=work.&file._summary sum=sum_wellcntrl sum_cntrl sum_unctrl;
1037 final_freq= (uncntrl/(well_cntrl+cntrl)*100
1038 run;
1039
1040 data=final.&file._sum_final;
1041 set work.&file._summary;
1042 controlled_DM= (sum_wellcntrl + sum_cntrl);
1043 keep sum_uncntrl controlled_DM final_freq;
1044 run;
1045
1046
1047 /*data work.&file._dm7;
1048 set work.&file._dm6;
1049 sum(uncntrl/ (well_cntrl + cntrl))*100;
1050 run;*/
1051
1052 /*talk to vicki about SQL code
1053 PROC SQL;_
1054 CREATE TABLE final.&file. AS
1055 SELECT uncntrl, Total, (uncntrl/ Total * 100) AS NQF0018
1056 FROM (
1057 SELECT SUM(uncntrl) AS uncontrolled, COUNT(patient_ID) AS Total
1058 FROM work.&file._dm5
1059 )
1060 ;
1061 QUIT;
1062 RUN;*/
1063
1064 %MEND fin;
1065
1066 %fin(Baseline);
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
4384:58 4384:107 4384:153 4384:173
NOTE: Invalid numeric data, a1c_result='HEALTH_MAINTENANCE_COMPLETED' , at line 4384 column 58.
You cannot nest calculations within the output the way you were trying with PROC FREQ, that isn't valid SAS code and should generate an error.
Issues I see so far with your code:
These are issues with the code, but there could also be issues with the logic and that I can't know without more details.'
And you really should use a different macro variable name rather than FILE. It's not going to complain but you can run into weird issues when using a Reserved Key Word, which I did indicate a prior post.
PROC SUMMARY data=work.&file._dm6;
....
final_freq= (uncntrl/(well_cntrl+cntrl)*100
...
run;
Instead, move it to your next data step at the top of your code.
data final.&file._sum_final;
set work.&file._summary;
final_freq= (uncntrl/(sum(well_cntrl, cntrl)))*100
controlled_DM= sum(sum_wellcntrl, sum_cntrl);
keep sum_uncntrl controlled_DM final_freq;
run;
Personally, I'd also be changing the structure of the names of these files to the following:
data final.final_sum_&file.;
This way, if you want to combine them together in the future it's very easy, as SAS allows you to reference data sets with a prefix shortcut, but not a suffix one.
I'm not sure why you're not just using a single PROC FREQ with a format here either - this could all be simplified using that approach, but you'd have to provide sample data and expected output if you wanted a better suggestion.
Good Luck.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.