My research involves analysing health diagnoses by ethnicity.
The diagnoses are coded using a classification code, and I am attempting to create diagnostic groups with IF-THEN statements.
The code works for most of the groups. The exception is one group whose classification codes, unlike those of the other diagnostic groups, contain letters, e.g.
other=0;
if clincode='293.89'
or clincode='310.1'
or clincode='293.9'
or clincode='316'
or clincode='3321'
or clincode='33392'
or clincode='3337'
or clincode='33399'
or clincode='33382'
or clincode='3331'
or clincode='33390'
or clincode='9952'
or clincode='v619'
or clincode='v6120'
or clincode='v6110'
or clincode='v618'
or clincode='v6181'
or clincode='v6121'
or clincode='v6112'
or clincode='v6283'
or clincode='v1581'
or clincode='v652'
or clincode='v7101'
or clincode='v7102'
or clincode='v6289'
or clincode='7809'
or clincode='v6282'
or clincode='v623'
or clincode='v622'
or clincode='31382'
or clincode='v6289'
or clincode='v624'
or clincode='3009'
or clincode='v7109'
or clincode='7999' then other=1;
All the diagnostic groups whose classification codes contain only numerals work. Is SAS unable to read codes that contain letters, as opposed to codes with just numerals? If so, is there a way I can get it to recognize the mixed character and numeric values?
SAS has no trouble recognizing characters. But it does require an exact match. For example, is it possible you need to use an uppercase "V" in your statement? "v619" is different from "V619".
If your data has a mix of upper and lower case, you might want to use upcase(clincode) in your statement, so you don't need to list the lower case versions.
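As a minimal sketch of that suggestion, the comparison below upper-cases the variable before testing it, so only upper-case literals need to be listed. The dataset names work.have and work.want are placeholders, and only three of the codes from the question are shown.

```sas
data work.want;
    set work.have;   /* placeholder name for the input dataset */
    other = 0;
    /* upcase() makes the comparison case-insensitive: 'v619' and 'V619' both match */
    if upcase(clincode) in ('V619', 'V6120', 'V6110') then other = 1;
run;
```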
I recommend using an informat instead of that very long, hard-to-maintain IF construct.
Example:
proc format;
/* UPCASE upper-cases each input value before matching, so the listed values must be upper case */
invalue cc_marker (upcase)
'V6282'
, 'V623'
, 'V622'
, '31382'
, 'V6289'
, 'V624'
, '3009'
, 'V7109' = 1
other = 0
;
run;
data work.have;
length clincode $ 6;
input clincode;
other = input(clincode, cc_marker.);
datalines;
v3434
V6289
v624
v8888
3009
v7109
;
run;
I agree, a format is the way to go especially if OP can get the target codes into a dataset. Additionally, OP should ensure the source format matches the target format. In particular, as @Astounding mentioned, convert to upcase and also strip "." from the codes if there are no "."s in the target data.
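A rough sketch of that normalization step, assuming the target codes are stored in upper case without periods. The names work.normalized and clincode_norm are made up for illustration.

```sas
data work.normalized;
    set work.have;   /* placeholder name for the input dataset */
    length clincode_norm $ 6;
    /* upcase() removes case differences; compress(..., '.') drops every period */
    /* e.g. '293.89' becomes '29389' and 'v619' becomes 'V619' */
    clincode_norm = compress(upcase(clincode), '.');
run;
```

The normalized variable can then be compared against the target list, or passed to the informat, in place of the raw clincode.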
Yes, I thought that there had to be an easier way to do this.
Thanks for the tip.
Will most definitely be using it in the future.
Do those codes you provide come in a usable format, i.e. do you have them, or can you get them, in a dataset? If so, a simple merge of that data with your data would simplify your code. Alternatively, you could create a format from the codes and apply it to the data to get the same result.
One other thing: you do not need a long chain of OR conditions. You can simplify to:
if clincode in ('293.89','310.1',...) then other=1;
You could even do it in one step:
other=ifn(clincode in ('293.89','310.1',...),1,0);
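If the codes can be put in a dataset, the merge mentioned above might look something like this sketch. It uses a PROC SQL left join rather than a DATA step MERGE, and it assumes a hypothetical lookup table work.target_codes with one row per code in a character variable named code, stored in upper case without periods.

```sas
proc sql;
    create table work.flagged as
    select h.*,
           /* a comparison evaluates to 1 or 0 in SAS, so this yields the flag directly */
           (t.code is not null) as other
    from work.have as h
    left join work.target_codes as t
        /* normalize the source code the same way the lookup table is stored */
        on compress(upcase(h.clincode), '.') = t.code;
quit;
```

This relies on each code appearing only once in work.target_codes; duplicate codes there would duplicate the matching rows in the output.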