Hi,
I want to create an indicator variable for my patients. If the patient's ICD10 code matches one of 150 codes, then this variable will be 1.
How can I make the comparison without writing a long series of IF ... THEN equality checks? Typing all the codes by hand in quotes would take a long time!
I have the codes in a CSV file. I wish I could compare patients' codes against a SAS dataset.
Thanks,
I know this is a topic that comes up regularly. Please search the SAS Community for ICD codes.
Here's a good paper on the topic:
https://www.sas.com/content/dam/SAS/support/en/sas-global-forum-proceedings/2019/3117-2019.pdf
I have the codes in a CSV file. I wish I could compare patients' codes against a SAS dataset.
You can, but in my experience the criteria are usually not exact matches. If they are exact matches, and you have a single diagnosis column in your main data set, you can use a SQL filter:
proc sql;
   create table want as
   select *
   from have
   where icd10 in (select icd10 from csvList);
quit;
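The filter above keeps only the matching rows and assumes csvList already exists. If the goal is a 0/1 indicator on every row, one possible sketch is below. The file path is hypothetical, and it assumes the CSV has a header row with a single column named icd10 and that both data sets store the codes in the same format (same case, with or without the decimal point).

```sas
/* Hypothetical path; point this at your own CSV file. */
proc import datafile="/path/to/icd_codes.csv"
            out=csvList
            dbms=csv
            replace;
   getnames=yes;       /* first row holds the column name, assumed to be icd10 */
   guessingrows=max;   /* scan all rows before deciding the column type and length */
run;

proc sql;
   create table want as
   select a.*,
          (b.icd10 is not null) as flag   /* 1 if the code is in the list, else 0 */
   from have as a
        left join
        (select distinct icd10 from csvList) as b
        on a.icd10 = b.icd10;
quit;
```

The SELECT DISTINCT in the subquery is there so that a code listed twice in the CSV does not duplicate patient rows in the join.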
One way, if the list to search is fixed:
data example;
   input x $;
   array v (4) $ 8 _temporary_ ("abc","cdf","pdq","rst");
   found = (whichc(x, of v(*)) > 0);
datalines;
abc
ABC
cdf
pdq
PDQ
;
A basic example does not need all 150 or so values to show the idea.
The key above is the WHICHC function, which searches the values that follow its first argument for a match with that first argument and returns the position of the first match, or 0 if there is none. In this case a temporary array holds the values, which makes it fairly easy to change the list without changing any other code.
The comparison is case sensitive, which is why ABC and PDQ have 0 for the Found variable.
If you have not used arrays before: V is the name of the array and cannot also be the name of a variable. The number in (4) says there will be 4 elements. The $ says the values are character, and the 8 says they have a maximum length of 8 characters. The keyword _temporary_ means the array elements are not variables, so nothing from the array is written to the output data set. The values to search are in the second set of parentheses.
The "of v(*)" tells SAS to pass all of the values in the array V to the function; it works with any function that accepts a list of variables.
One minor advantage of using WHICHC is that you can return a value that shows the position of the FIRST match, or its value, if desired.
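As a rough sketch of both points (case sensitivity and returning the position), the example can be extended as below. The variable names pos and matched are made up for illustration, and the list is stored in upper case so that UPCASE on the input makes the comparison case insensitive.

```sas
data example2;
   input x $;
   /* List stored in upper case so it can be compared with upcase(x). */
   array v (4) $ 8 _temporary_ ("ABC","CDF","PDQ","RST");
   length matched $ 8;
   pos = whichc(upcase(x), of v(*));   /* position of the first match, 0 if none */
   found = (pos > 0);
   if found then matched = v(pos);     /* the list value that matched */
datalines;
abc
ABC
pdq
xyz
;
```

Here abc, ABC and pdq should all be flagged, and xyz should not.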
Let's say you have a dataset, call it MY_ICDS, with one variable (ICDcode) and 150 observations, one for each ICD code of interest.
And there is also your main dataset HAVE with multiple variables, including ICDcode.
Then you can use a hash object as follows:
data want;
   set have;
   if _n_=1 then do;
      declare hash h (dataset:'MY_ICDS');
      h.definekey('icdcode');
      h.definedone();
   end;
   if h.find()=0 then flag=1;
   else flag=0;
run;
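A slightly more compact variant of the same idea, offered only as a sketch: the CHECK method tests whether the key is in the hash without retrieving any data, and a comparison in SAS evaluates to 1 or 0, so the flag can be assigned in one statement. It assumes, as above, that ICDcode has the same type and format in HAVE and MY_ICDS.

```sas
data want;
   set have;
   if _n_ = 1 then do;
      /* Load the list of codes into memory once, on the first iteration. */
      declare hash h (dataset: 'MY_ICDS');
      h.defineKey('icdcode');
      h.defineDone();
   end;
   /* check() returns 0 when the current ICDcode is found in the hash. */
   flag = (h.check() = 0);
run;
```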