DATA Step, Macro, Functions and more

Contains - multiple values selection

Reply
New Contributor
Posts: 2

Contains - multiple values selection

Hello, i am a verry new and verry basic user, and i want to know if it possible to have different vallues in a where function following a contains cmd, ex: 

 

proc sort data=work.l2b1;
where F='DE' and A contains 'ciproflox' or
F='FR' and A contains 'OFLOX';
by J;

 

How i could put more values/prod names after contains, smth like this: contains 'cipro' 'ator' 'Ome' 'etc', 

 I am trying to extract from a datat set only the lines that have country of incidence FR ( F=country) and at least one from the product list.

 

Like i have mentioned i am verry new with SAS and i dont have a programmer background, i dont even know if contains can do what i want.

 

THX.

Trusted Advisor
Posts: 1,837

Re: Contains - multiple values selection

Posted in reply to Ionut_Baciu

If all of your coditions are a combination of F and A variables you can use:

 

where upcase(cat(F,A)) in ("DECIPROFLOX" , "FROFLOX" , ... );

New Contributor
Posts: 2

Re: Contains - multiple values selection

It will be easier for me to understand if its possible to have a string of values following a contains: 

contains A B C D E, and how exactly i writhe them so i wont get an error, because ive tried :

Contains ("A", "B", "C", "D")  and doesent work.

Super User
Posts: 10,280

Re: Contains - multiple values selection

Posted in reply to Ionut_Baciu

Put your condition values into a dataset, and you can use contains or similar in a where condition of a SQL join.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
New Contributor
Posts: 3

Re: Contains - multiple values selection

Posted in reply to KurtBremser

Unfortunately i dont know how to do that, 0 experience with sql.

 

So if someone can tell me if is possible to have more then 1 observation after contains/like, 

example of what im looking for: like ('a', 'b', 'c'), btw i've tried this and is not working for me.

 My code is smth like :

 

proc sort data=work.l2b1;
where F='DE' and A like ('%ciproflox%'; '%atorv%','%ola%');
by J;
run;

 

if i use where F='DE' and A like '%ciproflox%'; it works, but i cant get at least 2 obs afther like, dont know how to group them

Trusted Advisor
Posts: 1,837

Re: Contains - multiple values selection

Posted in reply to Ionut_Baciu

As much as I know there is no function like what you want.

but you can do it using a macro program.

 

define the macro outside and before the step to run:

%macro check_for(varF=, varA=);
        %let n=%sysfunc(countw(&varA));
%let varF = %upcase(&varF);
%let varA = %upcase(&varA); if F = "&varF" and %do i=1 %to &n; /* A contains... */
findw(upcase(A) , "%scan(&varA,&i)")
/* index(upcase(A) , "%scan(&varA,&i)") */ %if &i < &n %then %str( or ); %end;
then output; %mend check_for;

  If you want the check for a substring, not a whole word, then replace the FINDW function by INDEX function.

      

creating a test data and using the macro:

 

data test;
    F="DE"; A="any ciproflox med"; output;
    F="DE"; A="any non oflox med"; output;
    F="FR"; A="OFLOX or BFLOX or CFLOX"; output;
    F="FR"; A="eflox or dflox oe Fflox"; output;
;
run;

data result;
 set test;
   %check_for(varF=DE, varA=cipro);
   %check_for(varF=FR, varA=oflox fflox);
run;

Run both functions  - with FINDW and with INDEX and compare results.

 

 

 

New Contributor
Posts: 3

Re: Contains - multiple values selection

Tank yiu verry much, i have managed to aplly your macro to my dataset, but im still having some problems:

1 one is that when A have more then 1 observation (Ex: A=ciprofloxcin atorvastatin) even if F=DE (F=country, DE=Germany), i still get results from outside DE (ex IT, FR, countries were the second product "atorvastatine" is present).

the code is:

 

data result;
set l2b1;
%check_for(varF=DE, varA=ciprofloxacin atorvastatin);
run;

proc print data=work.result;
run;

 

2. i would like to be abble to search for a part of the word not exactly the word Ex; %ator not atorvastatin, so to get any obs that start with ator.

 

3. I dident understand the comment 

 /* A contains... */ was that put there to explain that 
findw(upcase(A) , "%scan(&varA,&i)")        or    /* index(upcase(A) , "%scan(&varA,&i)") */  was the cointain function i was looking for.

 Thank you.

Trusted Advisor
Posts: 1,837

Re: Contains - multiple values selection

Posted in reply to Baciu_Ionut

I'll start with your third note - the comment means that next two lines are are supposed to do what you want.

As there is no function CONTAIN (nor CONTAINS) I can use each of the two other functions:

FINDW - search for a whole word limitted by spaces.

INDEX - search for a substring.

 

for example:

  A = "ciproflox".

  result1 = findw(A , "cipro")  will be 0 (means not found). 

  result2 = index(A, "cipro")   will be 1 (the position where the substring starts).

 

It seems that you need the INDEX variation of the macro program.

I have added ( ) to fix the problem of getting non DE observations.

%macro check_for(varF=, varA=);
        %let n=%sysfunc(countw(&varA));        
%let varF = %upcase(&varF);
%let varA = %upcase(&varA); if F = "&varF" and ( %do i=1 %to &n; index(upcase(A) , "%scan(&varA,&i)") %if &i < &n %then %str( or ); %end; ) then output; %mend check_for;

I suggest that you add options mprint; in your code and look at the log to see what have this macro program done:

options mprint;
data result;
  set l2b1;
       %check_for(varF=DE, varA=ciprofloxacin atorvastatin);
run;
options nomprint;

proc print data=work.result;
run;
New Contributor
Posts: 3

Re: Contains - multiple values selection

Sry for my late reply, I wanted to tank you and to say i've managed to make the macro work for my dataset. Tank you all for the support. Have a good and blessed day.

Trusted Advisor
Posts: 1,837

Re: Contains - multiple values selection

Posted in reply to Baciu_Ionut

Please assign the correct answer as solution so that others have it also.

Super User
Super User
Posts: 8,127

Re: Contains - multiple values selection

Posted in reply to Ionut_Baciu

There is no simple function/operator for that type of test.

I would just start by combining multiple tests.

proc sort data=work.l2b1  out=want;
  where (F='DE'
         and (lowcase(A) contains 'ciproflox'
           or lowcase(A) contains 'cipro'
             )
        )
   or (F='FR' and A contains 'OFLOX')
  ;
  by J;
run;

If you are familiar with regular expressions you might be able to use one or more of the prx.... functions to do your search.

Ask a Question
Discussion stats
  • 10 replies
  • 355 views
  • 1 like
  • 5 in conversation