BookmarkSubscribeRSS Feed
stellapersis7
Obsidian | Level 7

Hi all,

I have used  the following where-contains statement in my program, but this is dropping other IDs which does not have GENERIC_DRUG_NAME. How can I use this same and also retain the other IDs.

Here is my code:

%MACRO HW2_RX(dsname);

DATA SGA_TREATMENT;

SET hw2J.&dsname (KEEP= PERSON_ID GENERIC_DRUG_NAME);

WHERE (GENERIC_DRUG_NAME CONTAINS "ARIPIPRAZOLE")OR (GENERIC_DRUG_NAME CONTAINS "ASENAPINE")OR (GENERIC_DRUG_NAME CONTAINS "BREXPIPRAZOLE") OR (GENERIC_DRUG_NAME CONTAINS "CARIPRAZINE") OR 

(GENERIC_DRUG_NAME CONTAINS "CLOZAPINE") OR (GENERIC_DRUG_NAME CONTAINS "ILOPERIDONE") OR (GENERIC_DRUG_NAME CONTAINS "LURASIDONE") OR (GENERIC_DRUG_NAME CONTAINS "OLANZAPINE") OR 

(GENERIC_DRUG_NAME CONTAINS "PALIPERIDONE") OR (GENERIC_DRUG_NAME CONTAINS "PIMAVANSERIN") OR (GENERIC_DRUG_NAME CONTAINS "QUETIAPINE") OR (GENERIC_DRUG_NAME CONTAINS "RISPERIDONE") OR (GENERIC_DRUG_NAME CONTAINS "ZIPRASIDONE");

RUN;

Thank you

10 REPLIES 10
stellapersis7
Obsidian | Level 7

no, I can drop them. 

Thank you

Amir
PROC Star

Hi,

 

Thanks for sharing the code, but I'm not sure I understand your requirement. The code indicates you only want observations that contain the quoted strings ("ARIPIPRAZOLE", "ASENAPINE", "BREXPIPRAZOLE", etc.) in the variable GENERIC_DRUG_NAME. If there are other observations that you also require then have you tried adding the conditions for them to your where clause?

 

By default SAS performs case sensitive checks when comparing strings, so if your data is not in uppercase as per your code, then this will need to be fixed.

 

It would help if you supplied some example input data (ideally in the form of a data step) and what the corresponding output data should look like, with your reasons why you expect the data that is not appearing in your results.

 

 

Thanks & kind regards,

Amir.

stellapersis7
Obsidian | Level 7

Hi,

I did case sensitive check and it its the same with upper and lower case.

I wrote the code I mentioned using where-contains, but now because of the research question I need individuals who does not fall into those categories (("ARIPIPRAZOLE", "ASENAPINE", "BREXPIPRAZOLE", etc.).

 

I now want all individuals who has ("ARIPIPRAZOLE", "ASENAPINE", "BREXPIPRAZOLE", etc.). in  GENERIC_DRUG_NAME, to be named as drug=1 and who do not have, to be named drug=2. But its not working with the if then statement.So I am trying to use "contains" in if then statement.

Tom
Super User Tom
Super User

You cannot use CONTAINS in a IF statement.  That is an SQL concept and so only works in WHERE statement.

565  data test;
566   set sashelp.class;
567   if name contains 'a' then put name=;
              --------
              388
              76
ERROR 388-185: Expecting an arithmetic operator.

ERROR 76-322: Syntax error, statement will be ignored.

568  run;

Use INDEX or INDEXW or FIND or FINDW depending on what you are doing.

data want;
  set have;
   if index(upcase(generic_name),'ARIPRAZOLE') then drug=1;
   else if index(upcase(generic_name), ....
   else if index....
   else drug=2;
run;
Amir
PROC Star

A further thought is if you want to see all the records for a person_id if any observation for that same person_id contains any of the quoted strings in GENERIC_DRUG_NAME, then you can try something similar to the below. The code reads the built in sashelp.cars data set and brings back all observations for a make if any of the model values contain "X" or "Z". Just substitute make and model with your variables names, sashelp.cars with your input data set, and change the contains conditions to what you want.

 

proc sql noprint;
  create table
    want
  as
  select
     make
    ,model
  from
    sashelp.cars
  where
    make in (select distinct
               make
             from
               sashelp.cars
             where
                  upcase(model) contains 'X'
               or upcase(model) contains 'Z'
            )
  ;
quit;

 

 

 

Thanks & kind regards,

Amir.

Sajid01
Meteorite | Level 14

Hello @stellapersis7 
Your code as posted above will only list Person_id for records in which the where condition is true.
In case you want to have all records and also be able to identify the records that contain the required drug, then one approach is to create  a flag variable whose value will be 1 (integer one) in case there is a match. The data step code which you have posted would need to be replaces with something like this. Modify to suit your needs.

proc sql;
create table SGA_TREATMENT as
SELECT  PERSON_ID, GENERIC_DRUG_NAME , 
CASE 
WHEN UPCAE(GENERIC_DRUG_NAME) in ("ARIPIPRAZOLE"."ASENAPINE", "BREXPIPRAZOLE", "CARIPRAZINE"), "CLOZAPINE", "ILOPERIDONE" , "LURASIDONE",  "OLANZAPINE", "PALIPERIDONE", "PIMAVANSERIN", "QUETIAPINE",  "RISPERIDONE", "ZIPRASIDONE")
THEN 1
END AS FLAG
FROM hw2J.&dsname;
quit;

Note : I haven't tested the code. Don't have your source.
Another approach would be to create a dataset withe the names of the above drugs and create a full outer join.

ballardw
Super User

Silly question: Why is your subject line "Contains in if statement"?

You do not show an IF statement or condition anywhere in the example code?

 

Where and If are different statements. Where applies to the incoming data vector and If to  the working data vector after the data is read. There are also restrictions as to things like "contains" and "like" with Where but not available with If.

mkeintz
PROC Star

If your data are sorted by PERSON_ID and you want to keep all obs for any person_id that has at least one qualifying obs, then you can (1) change your SET to a MERGE, (2) change the WHERE statement to a dataset-name-option, and (3) add a BY statement and a subsetting IF statement:

 

data want;
  merge hw2J.&dsname
   (KEEP= PERSON_ID GENERIC_DRUG_NAME
    where=( (GENERIC_DRUG_NAME CONTAINS "ARIPIPRAZOLE")
           OR GENERIC_DRUG_NAME CONTAINS "ASENAPINE")
           OR (GENERIC_DRUG_NAME CONTAINS "BREXPIPRAZOLE") 
           OR (GENERIC_DRUG_NAME CONTAINS "CARIPRAZINE") 
           OR (GENERIC_DRUG_NAME CONTAINS "CLOZAPINE") 
           OR (GENERIC_DRUG_NAME CONTAINS "ILOPERIDONE") 
           OR (GENERIC_DRUG_NAME CONTAINS "LURASIDONE") 
           OR (GENERIC_DRUG_NAME CONTAINS "OLANZAPINE") 
           OR (GENERIC_DRUG_NAME CONTAINS "PALIPERIDONE") 
           OR (GENERIC_DRUG_NAME CONTAINS "PIMAVANSERIN") 
           OR (GENERIC_DRUG_NAME CONTAINS "QUETIAPINE") 
           OR (GENERIC_DRUG_NAME CONTAINS "RISPERIDONE") 
           OR (GENERIC_DRUG_NAME CONTAINS "ZIPRASIDONE")
         )
       in=flag)
   hw2J.&dsname (KEEP= PERSON_ID GENERIC_DRUG_NAME)  ;
  by person_id;
  if flag=1;
run;
--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Patrick
Opal | Level 21

Below one coding option for testing of existence of a list of substrings in a string.

data have;
  infile datalines truncover dsd;
  input GENERIC_DRUG_NAME $100.;
  person_id+1;
  datalines;
ARIPIPRAZOLE
PIMAVANSERIN
xxxPIMAVANSERIN
xxxAAAxxx
xxxQUETIAPINExxx
 
PIMAVANSERINxxx
;

data want(drop=_:);
  set have (keep= person_id generic_drug_name);
  array drug{3} $15 _temporary_ ('ARIPIPRAZOLE','PIMAVANSERIN','QUETIAPINE');
  do _i=1 to dim(drug);
    if find(generic_drug_name,drug[_i],'it') then 
      do;
        output;
        leave;
      end;
  end;
run;

proc print data=want;
run;

 If you've got a table with the drug names you want to select then you could also generate the array statement dynamically. Example below:

Spoiler
data have;
  infile datalines truncover dsd;
  input GENERIC_DRUG_NAME $100.;
  person_id+1;
  datalines;
ARIPIPRAZOLE
PIMAVANSERIN
xxxPIMAVANSERIN
xxxAAAxxx
xxxQUETIAPINExxx
 
PIMAVANSERINxxx
;

data drugs;
  infile datalines truncover dsd;
  input drug $20.;
  datalines;
ARIPIPRAZOLE
ASENAPINE
BREXPIPRAZOLE
CARIPRAZINE
CLOZAPINE
ILOPERIDONE
LURASIDONE
OLANZAPINE
PALIPERIDONE
PIMAVANSERIN
QUETIAPINE
RISPERIDONE
ZIPRASIDONE
;

proc sql noprint;
  select 
    cats("'",drug,"'"), 
    count(*),
    max(length(drug))
    into
      :drug_list separated by ',',
      :n_drugs trimmed,
      :max_len trimmed
  from drugs
  ;
quit;

data want(drop=_:);
  set have (keep= person_id generic_drug_name);
  array drug{&n_drugs} $&max_len _temporary_ (&drug_list);
  do _i=1 to dim(drug);
    if find(generic_drug_name,drug[_i],'it') then 
      do;
        output;
        leave;
      end;
  end;
run;

proc print data=want;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 1451 views
  • 4 likes
  • 8 in conversation