Hi all,
I have used the following where-contains statement in my program, but this is dropping other IDs which does not have GENERIC_DRUG_NAME. How can I use this same and also retain the other IDs.
Here is my code:
%MACRO HW2_RX(dsname);
DATA SGA_TREATMENT;
SET hw2J.&dsname (KEEP= PERSON_ID GENERIC_DRUG_NAME);
WHERE (GENERIC_DRUG_NAME CONTAINS "ARIPIPRAZOLE")OR (GENERIC_DRUG_NAME CONTAINS "ASENAPINE")OR (GENERIC_DRUG_NAME CONTAINS "BREXPIPRAZOLE") OR (GENERIC_DRUG_NAME CONTAINS "CARIPRAZINE") OR
(GENERIC_DRUG_NAME CONTAINS "CLOZAPINE") OR (GENERIC_DRUG_NAME CONTAINS "ILOPERIDONE") OR (GENERIC_DRUG_NAME CONTAINS "LURASIDONE") OR (GENERIC_DRUG_NAME CONTAINS "OLANZAPINE") OR
(GENERIC_DRUG_NAME CONTAINS "PALIPERIDONE") OR (GENERIC_DRUG_NAME CONTAINS "PIMAVANSERIN") OR (GENERIC_DRUG_NAME CONTAINS "QUETIAPINE") OR (GENERIC_DRUG_NAME CONTAINS "RISPERIDONE") OR (GENERIC_DRUG_NAME CONTAINS "ZIPRASIDONE");
RUN;
Thank you
Do you want to also keep observations where generic_drug_name is missing/empty?
no, I can drop them.
Thank you
Hi,
Thanks for sharing the code, but I'm not sure I understand your requirement. The code indicates you only want observations that contain the quoted strings ("ARIPIPRAZOLE", "ASENAPINE", "BREXPIPRAZOLE", etc.) in the variable GENERIC_DRUG_NAME. If there are other observations that you also require then have you tried adding the conditions for them to your where clause?
By default SAS performs case sensitive checks when comparing strings, so if your data is not in uppercase as per your code, then this will need to be fixed.
It would help if you supplied some example input data (ideally in the form of a data step) and what the corresponding output data should look like, with your reasons why you expect the data that is not appearing in your results.
Thanks & kind regards,
Amir.
Hi,
I did case sensitive check and it its the same with upper and lower case.
I wrote the code I mentioned using where-contains, but now because of the research question I need individuals who does not fall into those categories (("ARIPIPRAZOLE", "ASENAPINE", "BREXPIPRAZOLE", etc.).
I now want all individuals who has ("ARIPIPRAZOLE", "ASENAPINE", "BREXPIPRAZOLE", etc.). in GENERIC_DRUG_NAME, to be named as drug=1 and who do not have, to be named drug=2. But its not working with the if then statement.So I am trying to use "contains" in if then statement.
You cannot use CONTAINS in a IF statement. That is an SQL concept and so only works in WHERE statement.
565 data test; 566 set sashelp.class; 567 if name contains 'a' then put name=; -------- 388 76 ERROR 388-185: Expecting an arithmetic operator. ERROR 76-322: Syntax error, statement will be ignored. 568 run;
Use INDEX or INDEXW or FIND or FINDW depending on what you are doing.
data want;
set have;
if index(upcase(generic_name),'ARIPRAZOLE') then drug=1;
else if index(upcase(generic_name), ....
else if index....
else drug=2;
run;
A further thought is if you want to see all the records for a person_id if any observation for that same person_id contains any of the quoted strings in GENERIC_DRUG_NAME, then you can try something similar to the below. The code reads the built in sashelp.cars data set and brings back all observations for a make if any of the model values contain "X" or "Z". Just substitute make and model with your variables names, sashelp.cars with your input data set, and change the contains conditions to what you want.
proc sql noprint;
create table
want
as
select
make
,model
from
sashelp.cars
where
make in (select distinct
make
from
sashelp.cars
where
upcase(model) contains 'X'
or upcase(model) contains 'Z'
)
;
quit;
Thanks & kind regards,
Amir.
Hello @stellapersis7
Your code as posted above will only list Person_id for records in which the where condition is true.
In case you want to have all records and also be able to identify the records that contain the required drug, then one approach is to create a flag variable whose value will be 1 (integer one) in case there is a match. The data step code which you have posted would need to be replaces with something like this. Modify to suit your needs.
proc sql;
create table SGA_TREATMENT as
SELECT PERSON_ID, GENERIC_DRUG_NAME ,
CASE
WHEN UPCAE(GENERIC_DRUG_NAME) in ("ARIPIPRAZOLE"."ASENAPINE", "BREXPIPRAZOLE", "CARIPRAZINE"), "CLOZAPINE", "ILOPERIDONE" , "LURASIDONE", "OLANZAPINE", "PALIPERIDONE", "PIMAVANSERIN", "QUETIAPINE", "RISPERIDONE", "ZIPRASIDONE")
THEN 1
END AS FLAG
FROM hw2J.&dsname;
quit;
Note : I haven't tested the code. Don't have your source.
Another approach would be to create a dataset withe the names of the above drugs and create a full outer join.
Silly question: Why is your subject line "Contains in if statement"?
You do not show an IF statement or condition anywhere in the example code?
Where and If are different statements. Where applies to the incoming data vector and If to the working data vector after the data is read. There are also restrictions as to things like "contains" and "like" with Where but not available with If.
If your data are sorted by PERSON_ID and you want to keep all obs for any person_id that has at least one qualifying obs, then you can (1) change your SET to a MERGE, (2) change the WHERE statement to a dataset-name-option, and (3) add a BY statement and a subsetting IF statement:
data want;
merge hw2J.&dsname
(KEEP= PERSON_ID GENERIC_DRUG_NAME
where=( (GENERIC_DRUG_NAME CONTAINS "ARIPIPRAZOLE")
OR GENERIC_DRUG_NAME CONTAINS "ASENAPINE")
OR (GENERIC_DRUG_NAME CONTAINS "BREXPIPRAZOLE")
OR (GENERIC_DRUG_NAME CONTAINS "CARIPRAZINE")
OR (GENERIC_DRUG_NAME CONTAINS "CLOZAPINE")
OR (GENERIC_DRUG_NAME CONTAINS "ILOPERIDONE")
OR (GENERIC_DRUG_NAME CONTAINS "LURASIDONE")
OR (GENERIC_DRUG_NAME CONTAINS "OLANZAPINE")
OR (GENERIC_DRUG_NAME CONTAINS "PALIPERIDONE")
OR (GENERIC_DRUG_NAME CONTAINS "PIMAVANSERIN")
OR (GENERIC_DRUG_NAME CONTAINS "QUETIAPINE")
OR (GENERIC_DRUG_NAME CONTAINS "RISPERIDONE")
OR (GENERIC_DRUG_NAME CONTAINS "ZIPRASIDONE")
)
in=flag)
hw2J.&dsname (KEEP= PERSON_ID GENERIC_DRUG_NAME) ;
by person_id;
if flag=1;
run;
Below one coding option for testing of existence of a list of substrings in a string.
data have;
infile datalines truncover dsd;
input GENERIC_DRUG_NAME $100.;
person_id+1;
datalines;
ARIPIPRAZOLE
PIMAVANSERIN
xxxPIMAVANSERIN
xxxAAAxxx
xxxQUETIAPINExxx
PIMAVANSERINxxx
;
data want(drop=_:);
set have (keep= person_id generic_drug_name);
array drug{3} $15 _temporary_ ('ARIPIPRAZOLE','PIMAVANSERIN','QUETIAPINE');
do _i=1 to dim(drug);
if find(generic_drug_name,drug[_i],'it') then
do;
output;
leave;
end;
end;
run;
proc print data=want;
run;
If you've got a table with the drug names you want to select then you could also generate the array statement dynamically. Example below:
data have;
infile datalines truncover dsd;
input GENERIC_DRUG_NAME $100.;
person_id+1;
datalines;
ARIPIPRAZOLE
PIMAVANSERIN
xxxPIMAVANSERIN
xxxAAAxxx
xxxQUETIAPINExxx
PIMAVANSERINxxx
;
data drugs;
infile datalines truncover dsd;
input drug $20.;
datalines;
ARIPIPRAZOLE
ASENAPINE
BREXPIPRAZOLE
CARIPRAZINE
CLOZAPINE
ILOPERIDONE
LURASIDONE
OLANZAPINE
PALIPERIDONE
PIMAVANSERIN
QUETIAPINE
RISPERIDONE
ZIPRASIDONE
;
proc sql noprint;
select
cats("'",drug,"'"),
count(*),
max(length(drug))
into
:drug_list separated by ',',
:n_drugs trimmed,
:max_len trimmed
from drugs
;
quit;
data want(drop=_:);
set have (keep= person_id generic_drug_name);
array drug{&n_drugs} $&max_len _temporary_ (&drug_list);
do _i=1 to dim(drug);
if find(generic_drug_name,drug[_i],'it') then
do;
output;
leave;
end;
end;
run;
proc print data=want;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.