DATA Step, Macro, Functions and more

match find

Posts: 71

match find

Hi, I am working on some drug data set and need to specify patients using some drugs. The problem is we have some drug name but the data set contains not exactly same but similar name. The drug name in dataset may contain more word/letter than the name we have in hand. I just used exact match to identify patients using these drugs, so the result looks not correct. My code looks like the following. I just listed two names here, but actually I have many drug names.

%macro drugs;

%do i = 1 %to 10;

data drug;

     set drug;



if brandname in ('ADOXA', 'VECTRIN')

     or genname in ('DOXYCYCLINE HYCLATE', 'QUINACRINE') 

     then drug_1=1;

if brandname in ('CIMZIA', 'SIMPONI ARIA')

     or genname in ('ANAKINRA', 'GOLIMUMAB')  

     then drug_2=1;





Too few patients were identified by the above code. I am thinking to use proc sql and like 'pattern' or =* operator, but not sure if it is a good way to do and also not sure if like or =* can be used with more than one drug names to do search. I am also concerning if sas can do macro and proc sql together. Could you help me write some sample code? Thanks so much!

Contributor dkb
Posts: 54

Re: match find

Posted in reply to michellel

Could you provide some examples of names which are not identical but which you want to be recognised as a match, please?

Super User
Posts: 13,500

Re: match find

Posted in reply to michellel

Do you have any issues with the case when matching? Currently you are searching for 'ADOXA'. Do you have values like 'Adoxa' in you your data? If so first step I would try would be:

If strip(upcase(brandname)) in ('ADOXA', 'VECTRIN')

or strip(upcase(genname)) in ('DOXYCYCLINE HYCLATE', 'QUINACRINE') 

     then drug_1=1;

The strip is included in case any of your entered values have leading or trailing blanks. ' ADOXA' would not match 'ADOXA'.

Extra words or letters may mean a more complicated search and you'll likely lose the IN operator and have to test for each of your values separately.

If index(upcase(brandname),'ADOXA') > 0 then drug1_1=1; /* then ADOXA is in the string somewhere*/


Find(brandname, 'ADOXA','it') > 0 then drug1_1=1;

I strongly recommend getting something to work before looking at macros. Your existing shell macro processes the same data set 10 time exactly the same way. And using the:

data data;

     set data;

construct in a macro is just asking for a case where you have an error and can't determine where it happened.

Posts: 71

Re: match find

Thanks so much ballardw! It works!

Super User
Posts: 10,766

Re: match find

Posted in reply to michellel

Either like or =* can't be used to multiple matched. Why not try Hash Table or PRX .

Ask a Question
Discussion stats
  • 4 replies
  • 1 like
  • 4 in conversation