BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
newsas007
Quartz | Level 8

Hi all,

I had asked a similar question a while ago but I am running into some issues. 

Dataset A contains information on the gene type and list of prescribed drugs associated with it. For example, the prescribed drugs for Gene type ‘a’ are as follows: Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab.  

 

The dataset B contains information on what the patient got prescribed and the dates associated with it. For example, ID#1 got Panitumumab, Binimetinib, and Neo-Cetuximab on different dates.

 

Therefore, in my merged dataset, I would like to get a row for Panitumimab and a row for Neo-Cetuximab. Further, in case the patient got a drug that is not listed in the drug_list I want it to be missing. For example, ID #6 prescribed drug Idelalisib does not match the drug_list for the specific gene ‘b’. Similar scenario for and ID#7

Here are the datasets and the code I had started using.

 

data A;
informat id 3. date_a mmddyy10. Gene_type $3. Drug_list $200.;
input id date_a Gene_type Drug_list;
format id 3. date_a mmddyy10. Gene_Type$3. Drug_list $200.;
cards;
1 1/2/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab
1 2/10/2019 b Cetuximab,Xanitumumab
2 3/3/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib
3 1/2/2018 d Alpelisib,Everolimus,Temsirolimus
4 2/2/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp
4 2/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus
5 3/3/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab
6 1/2/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib
6 2/2/2019 b Cetuximab,Xanitumumab
7 2/10/2019 d Alpelisib,Everolimus,Temsirolimus
7 3/10/2019 b Cetuximab,Xanitumumab
;;;
proc sort; by id; run;

data B;
informat id 3. date_b mmddyy10. Drug $20.;
input id date_b Drug;
format id 3. date_b mmddyy10. Drug $200.;
cards;
1 2/01/2018 Panitumumab
1 3/02/2019 Binimetinib
1 3/28/2019 Neo-Cetuximab
2 3/23/2019 Trametinib
2 3/25/2019 Temsirolimus
3 2/01/2018 Alpelisib
5 3/21/2019 Dacomitinib
4 3/01/2019 Everolimus
4 4/01/2019 Cobimetinib
4 5/01/2019 Ado-trastuzumab
6 1/17/2018 Idelalisib
7 3/02/2019 Temsirolimus
7 4/01/2019 Binimetinib
;;;

Data Want:

ID Date_a Gene_type Drug_list Drug Date_b
1 1/2/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 2/1/2018
1 1/2/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 3/28/2019
1 2/10/2019 b Cetuximab,Xanitumumab    
2 3/3/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 3/23/2019
3 1/2/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 2/1/2018
4 2/2/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb Ado-trastuzumab 5/1/2019
4 2/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 3/1/2019
5 3/3/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 3/21/2019
6 1/2/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 1/17/2018
6 2/2/2019 b Cetuximab,Xanitumumab    
7 2/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 3/2/2019
7 3/10/2019 b Cetuximab,Xanitumumab    

 

Proc sql;
Create table c2 as select a.*, b.Drug,b.Date_b
from a a join b b on a.id=b.id
and findw( a.drug_list,strip(b.drug),',') > 0 and (b.date_b);
Quit;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20
proc sql;
  select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
  from GENES left join DRUGS 
    on GENES.ID=DRUGS.ID
   and findw( GENES.DRUG_LIST, DRUGS.DRUG, ',- ', 't') 
 order by GENES.ID, GENE_TYPE ;
quit;

ID DATE_A GENE_TYPE DRUG_LIST DRUG DATE_B
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 03/28/2019
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 02/01/2018
1 02/10/2019 b Cetuximab,Xanitumumab   .
2 03/03/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 03/23/2019
3 01/02/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 02/01/2018
4 02/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 03/01/2019
4 02/02/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp Ado-trastuzumab 05/01/2019
5 03/03/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 03/21/2019
6 02/02/2019 b Cetuximab,Xanitumumab   .
6 01/02/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 01/17/2018
7 03/10/2019 b Cetuximab,Xanitumumab   .
7 02/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 03/02/2019

 

View solution in original post

9 REPLIES 9
ChrisNZ
Tourmaline | Level 20

so Neo-Cetuximab matches with Cetuximab?

newsas007
Quartz | Level 8
@ChrisNZ: they are unique and different. For example #ID 1 only got Neo-Cetuximab.
ChrisNZ
Tourmaline | Level 20

> They are unique and different.

OK just replace

findw( GENES.DRUG_LIST, prxchange('s/\A(Ado|Neo)-//',1,DRUGS.DRUG), ',- ', 't') 

with

findw( GENES.DRUG_LIST, DRUGS.DRUG), ', ', 't') 

then, and report. 

ChrisNZ
Tourmaline | Level 20

1. Please format your code and your data

2. Something like this is what you need. Please alter to suit:

proc sql;
  create table WANT as 
  select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
  from GENES join DRUGS 
    on GENES.ID=DRUGS.ID
   and findw( GENES.DRUG_LIST, prxchange('s/\A(Ado|Neo)-//',1,DRUGS.DRUG), ',- ', 't') 
 order by GENES.ID, GENE_TYPE ;
quit;

ID DATE_A GENE_TYPE DRUG_LIST DRUG DATE_B
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 03/28/2019
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 02/01/2018
1 02/10/2019 b Cetuximab,Xanitumumab Neo-Cetuximab 03/28/2019
2 03/03/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 03/23/2019
3 01/02/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 02/01/2018
4 02/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 03/01/2019
4 02/02/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp Ado-trastuzumab 05/01/2019
5 03/03/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 03/21/2019
6 01/02/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 01/17/2018
7 02/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 03/02/2019

 

newsas007
Quartz | Level 8

Thanks for this:
I used this code and got the output for ID#1 shows Neo-Cetuximab merging with Cetuximab. It should be a empty cell for drug and date_b because #ID1 was never given Cetuximab.
Basically, each drug name is unique. Neo-Cetuximab is different from Cetuximab and so does any drug with Ado- or any other prefix.I know it's probably a subtle change in the code. I am still trying..thanks again
proc sql;
create table WANT as
select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
from GENES left join DRUGS
on GENES.ID=DRUGS.ID
and findw( GENES.DRUG_LIST, prxchange('s/\A(Ado|Neo)-//',1,DRUGS.DRUG), ',- ', 't')
order by GENES.ID, GENE_TYPE ;
quit;


ChrisNZ
Tourmaline | Level 20

Please format your code and data.

ChrisNZ
Tourmaline | Level 20
proc sql;
  select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
  from GENES left join DRUGS 
    on GENES.ID=DRUGS.ID
   and findw( GENES.DRUG_LIST, DRUGS.DRUG, ',- ', 't') 
 order by GENES.ID, GENE_TYPE ;
quit;

ID DATE_A GENE_TYPE DRUG_LIST DRUG DATE_B
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 03/28/2019
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 02/01/2018
1 02/10/2019 b Cetuximab,Xanitumumab   .
2 03/03/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 03/23/2019
3 01/02/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 02/01/2018
4 02/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 03/01/2019
4 02/02/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp Ado-trastuzumab 05/01/2019
5 03/03/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 03/21/2019
6 02/02/2019 b Cetuximab,Xanitumumab   .
6 01/02/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 01/17/2018
7 03/10/2019 b Cetuximab,Xanitumumab   .
7 02/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 03/02/2019

 

newsas007
Quartz | Level 8
@ChrisNZ: Thank you so much for this! This works just fine. I am still learning subtle things 🙂
Thanks again.
ChrisNZ
Tourmaline | Level 20

Maybe remove the dash as a separator in the findw() function.

And don't forget to format and reread your question if you want to be helped.

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 979 views
  • 1 like
  • 2 in conversation