data thcics.question8;
input $ admitting_diag princ_diag_code oth_diag_code1 oth_diag_code2 ;
datalines;
v350 25001 4567
25001 v4001 v4001
v3560 5670 418
2502 v4000
v4000 25032
e560 4352 v3201
518 5670
25029 A1054 6530
;
run;
data thcics.question8;
set thcics.thcicsamp;
ID = _N_;
DIABETES = 0;
CHF = 0;
keep ID DIABETES CHF discharge thcic_id length_of_stay provider_name Admitting_Diag Princ_Diag_Code Oth_Diag_Code2-Oth_Diag_Code10;
array A (*)$5 Admitting_Diag Princ_Diag_Code Oth_Diag_Code2-Oth_Diag_Code10;
do i = 1 to dim(A);
diagnosis = A(i);
IF diagnosis = '39891' OR diagnosis = '40201' OR diagnosis = '40211' OR
diagnosis = '40291' OR diagnosis = '40401' OR diagnosis = '40403' OR
diagnosis = '40411' OR diagnosis = '40413' OR diagnosis = '40491' OR
diagnosis = '40493' OR SUBSTR(diagnosis,1,4)='4251' OR
SUBSTR(diagnosis,1,4)='4254' OR SUBSTR(diagnosis,1,4)='4255' OR
SUBSTR(diagnosis,1,4)='4259' OR SUBSTR(diagnosis,1,4)='4280' OR
SUBSTR(diagnosis,1,4)='4281' OR SUBSTR(diagnosis,1,4)='4289'
then CHF=1;
else CHF = 0;
label chf = 'chf';
IF substr (diagnosis,1,3) IN: ('250') then diabetes = 1;
else diabetes = 0;
label diabetes = 'diabetes';
end;
end;
run;
Hello everyone,
please I have a challenge, I want to code some observations in 10 of my variables ( diagnosis variables containing disease types in codes) to chronic heart failure (chf) and diabetes and then get the number of patients with chf and diabetes. my log shows no error but i oberved that all my chf and diabetes variable was 0 even if the variables had the code for them. my variables from oth_diag_code1-10 was also blank. I will appreciate if you can please help me fish out what is wrong with my syntax. Thanks a lot, as always
ID | THCIC_ID | admitting_diag | princ_diag_code | oth_diag_code_1 | oth_diag_code_2 | oth_diag_code_3 | oth_diag_code_10 | diabetes | chf |
1 | 1001 | v0291 | 25010 | 25021 | 1 | 1 | |||
2 | 1001 | 40291 | 40291 | 39891 | 4255 | 0 | 1 | ||
3 | 1002 | 39891 | 40493 | 25053 | 1 | 1 | |||
4 | 1003 | 25001 | 25002 | 1 | 1 |
i oberved that all my chf and diabetes variable was 0 even if the variables had the code for them. my variables from oth_diag_code1-10 was also blank. I will appreciate if you can please help me fish out what is wrong with my syntax.
This is because the last step in the IF-THEN-ELSE construct else chf=0 and else diabetes=0 the way you wrote it sets variables CHF and DIABETES to zero. By removing those lines, you should get what you want. Also please note the proper indentation of your code makes everything easier to understand and debug and to see if you have the proper matching DO with END (which you don't have)
data thcics.question8;
set thcics.thcicsamp;
ID = _N_;
DIABETES = 0;
CHF = 0;
keep ID DIABETES CHF discharge thcic_id length_of_stay
provider_name Admitting_Diag Princ_Diag_Code Oth_Diag_Code2-Oth_Diag_Code10;
array A (*)$5 Admitting_Diag Princ_Diag_Code Oth_Diag_Code2-Oth_Diag_Code10;
do i = 1 to dim(A);
IF a(i)= '39891' OR a(i)= '40201' OR a(i)= '40211' OR
a(i)= '40291' OR a(i)= '40401' OR a(i)= '40403' OR
a(i)= '40411' OR a(i)= '40413' OR a(i)= '40491' OR
a(i)= '40493' OR substr(a(i),1,4)='4251' OR
SUBSTR(a(i),1,4)='4254' OR SUBSTR(a(i),1,4)='4255' OR
SUBSTR(a(i),1,4)='4259' OR SUBSTR(a(i),1,4)='4280' OR
SUBSTR(a(i),1,4)='4281' OR SUBSTR(a(i),1,4)='4289'
then CHF=1;
IF substr (a(i),1,3) IN: ('250') then diabetes = 1;
end;
end; /* THERE IS NO DO FOR THIS END, THIS SHOULD CAUSE AN ERROR */
run;
I also question the purpose and logic behind these statements:
label diabetes = 'diabetes';
label chf='chf';
Usually (in fact ALWAYS), labels are meant to make the output readable. There's really no reason to assign a label of 'chf' to chf, this enhances nothing. You might want to use proper capitalization in your label, that would make things more readable. Or if CHF is an acronym (I don't really know what it means, it may be that in your industry everyone knows what CHF is and so no need to spell out the words of the acronym, but I will give an example anyway)
label diabetes='Diabetes'; /* Note proper capitalization */
label chf='Chicago Hostile Ferengi'; /* OPTIONAL: Spell out the acronym */
i oberved that all my chf and diabetes variable was 0 even if the variables had the code for them. my variables from oth_diag_code1-10 was also blank. I will appreciate if you can please help me fish out what is wrong with my syntax.
This is because the last step in the IF-THEN-ELSE construct else chf=0 and else diabetes=0 the way you wrote it sets variables CHF and DIABETES to zero. By removing those lines, you should get what you want. Also please note the proper indentation of your code makes everything easier to understand and debug and to see if you have the proper matching DO with END (which you don't have)
data thcics.question8;
set thcics.thcicsamp;
ID = _N_;
DIABETES = 0;
CHF = 0;
keep ID DIABETES CHF discharge thcic_id length_of_stay
provider_name Admitting_Diag Princ_Diag_Code Oth_Diag_Code2-Oth_Diag_Code10;
array A (*)$5 Admitting_Diag Princ_Diag_Code Oth_Diag_Code2-Oth_Diag_Code10;
do i = 1 to dim(A);
IF a(i)= '39891' OR a(i)= '40201' OR a(i)= '40211' OR
a(i)= '40291' OR a(i)= '40401' OR a(i)= '40403' OR
a(i)= '40411' OR a(i)= '40413' OR a(i)= '40491' OR
a(i)= '40493' OR substr(a(i),1,4)='4251' OR
SUBSTR(a(i),1,4)='4254' OR SUBSTR(a(i),1,4)='4255' OR
SUBSTR(a(i),1,4)='4259' OR SUBSTR(a(i),1,4)='4280' OR
SUBSTR(a(i),1,4)='4281' OR SUBSTR(a(i),1,4)='4289'
then CHF=1;
IF substr (a(i),1,3) IN: ('250') then diabetes = 1;
end;
end; /* THERE IS NO DO FOR THIS END, THIS SHOULD CAUSE AN ERROR */
run;
I also question the purpose and logic behind these statements:
label diabetes = 'diabetes';
label chf='chf';
Usually (in fact ALWAYS), labels are meant to make the output readable. There's really no reason to assign a label of 'chf' to chf, this enhances nothing. You might want to use proper capitalization in your label, that would make things more readable. Or if CHF is an acronym (I don't really know what it means, it may be that in your industry everyone knows what CHF is and so no need to spell out the words of the acronym, but I will give an example anyway)
label diabetes='Diabetes'; /* Note proper capitalization */
label chf='Chicago Hostile Ferengi'; /* OPTIONAL: Spell out the acronym */
@PaigeMiller Be glad that you have not had reason to learn about (or worse experience) congestive heart failure.
Thank you so much @PaigeMiller and @Tom, I understand it better now. I had even intialized a value of zero to the variables. Thank you for the tips on labelling and identation. The second "run;" is an error, sorry about that.
Hi @PaigeMiller and everyone, please, is it possible to use array to get the top most frequent diagnosis (i.e the top 5 frequent observations in my array elements)? thanks
brilliant! thanks a lot. It worked
data thcics.question4;
set thcics.thcicsamp;
ID = _N_;
keep ID thcic_id length_of_stay provider_name Admitting_Diag Princ_Diag_Code Oth_Diag_Code_1-Oth_Diag_Code_8;
run;
proc transpose data = thcics.question4 out = thcics.question4a name = alldiag;
by id;
var Admitting_Diag Princ_Diag_Code Oth_Diag_Code_1-Oth_Diag_Code_8;
run;
proc freq data = thcics.question4a;
tables col1 / out = thcics.freq4a;
run;
proc sort data = thcics.freq4a;
by descending count;
run;
proc print data = thcics.freq4b (firstobs=2 obs=6);
run;
The way you structured your loop is only testing the last element in the array.
So this code:
x=0;
do index=1 to dim(array);
if array[index]='xxx' then x=1;
else x=0;
end;
Is the same as:
if array[dim(array)]='xxx' then x=1;
else x=0;
So remove the ELSE statement(s).
do i = 1 to dim(A);
if a[i] in: ('39891' '40201' '40211' '40291' '40401' '40403'
'40411' '40413' '40491' '40493'
'4251' '4254' '4255' '4259' '4280' '4281' '4289'
)
then CHF=1
;
if a[i] in: ('250') then diabetes = 1;
end;
For what its worth, you can replace this code:
IF a(i)= '39891' OR a(i)= '40201' OR a(i)= '40211' OR a(i)= '40291' OR a(i)= '40401' OR a(i)= '40403' OR a(i)= '40411' OR a(i)= '40413' OR a(i)= '40491' OR a(i)= '40493' OR substr(a(i),1,4)='4251' OR SUBSTR(a(i),1,4)='4254' OR SUBSTR(a(i),1,4)='4255' OR SUBSTR(a(i),1,4)='4259' OR SUBSTR(a(i),1,4)='4280' OR SUBSTR(a(i),1,4)='4281' OR SUBSTR(a(i),1,4)='4289' then CHF=1;
With:
if a(i) in ('39891' '40201' '40211' '40291' '40401' '40403' '40411' '40413' '40491' '40493') or substr(a(i),1,4) in ('4251' '4254' '4255' '4259' '4280' '4281' '4289') then CHF=1;
The IN operator will compare a variable or single expression to the list of values in the () and return a true/false result if the value matches any of them.
May save some time typing code.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.