BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
newsas007
Quartz | Level 8

Hi all,

I had asked a similar question a while ago but I am running into some issues. 

Dataset A contains information on the gene type and list of prescribed drugs associated with it. For example, the prescribed drugs for Gene type ‘a’ are as follows: Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab.  

 

The dataset B contains information on what the patient got prescribed and the dates associated with it. For example, ID#1 got Panitumumab, Binimetinib, and Neo-Cetuximab on different dates.

 

Therefore, in my merged dataset, I would like to get a row for Panitumimab and a row for Neo-Cetuximab. Further, in case the patient got a drug that is not listed in the drug_list I want it to be missing. For example, ID #6 prescribed drug Idelalisib does not match the drug_list for the specific gene ‘b’. Similar scenario for and ID#7

Here are the datasets and the code I had started using.

 

data A;
informat id 3. date_a mmddyy10. Gene_type $3. Drug_list $200.;
input id date_a Gene_type Drug_list;
format id 3. date_a mmddyy10. Gene_Type$3. Drug_list $200.;
cards;
1 1/2/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab
1 2/10/2019 b Cetuximab,Xanitumumab
2 3/3/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib
3 1/2/2018 d Alpelisib,Everolimus,Temsirolimus
4 2/2/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp
4 2/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus
5 3/3/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab
6 1/2/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib
6 2/2/2019 b Cetuximab,Xanitumumab
7 2/10/2019 d Alpelisib,Everolimus,Temsirolimus
7 3/10/2019 b Cetuximab,Xanitumumab
;;;
proc sort; by id; run;

data B;
informat id 3. date_b mmddyy10. Drug $20.;
input id date_b Drug;
format id 3. date_b mmddyy10. Drug $200.;
cards;
1 2/01/2018 Panitumumab
1 3/02/2019 Binimetinib
1 3/28/2019 Neo-Cetuximab
2 3/23/2019 Trametinib
2 3/25/2019 Temsirolimus
3 2/01/2018 Alpelisib
5 3/21/2019 Dacomitinib
4 3/01/2019 Everolimus
4 4/01/2019 Cobimetinib
4 5/01/2019 Ado-trastuzumab
6 1/17/2018 Idelalisib
7 3/02/2019 Temsirolimus
7 4/01/2019 Binimetinib
;;;

Data Want:

ID Date_a Gene_type Drug_list Drug Date_b
1 1/2/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 2/1/2018
1 1/2/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 3/28/2019
1 2/10/2019 b Cetuximab,Xanitumumab    
2 3/3/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 3/23/2019
3 1/2/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 2/1/2018
4 2/2/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb Ado-trastuzumab 5/1/2019
4 2/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 3/1/2019
5 3/3/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 3/21/2019
6 1/2/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 1/17/2018
6 2/2/2019 b Cetuximab,Xanitumumab    
7 2/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 3/2/2019
7 3/10/2019 b Cetuximab,Xanitumumab    

 

Proc sql;
Create table c2 as select a.*, b.Drug,b.Date_b
from a a join b b on a.id=b.id
and findw( a.drug_list,strip(b.drug),',') > 0 and (b.date_b);
Quit;

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
ChrisNZ
Tourmaline | Level 20
proc sql;
  select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
  from GENES left join DRUGS 
    on GENES.ID=DRUGS.ID
   and findw( GENES.DRUG_LIST, DRUGS.DRUG, ',- ', 't') 
 order by GENES.ID, GENE_TYPE ;
quit;

ID DATE_A GENE_TYPE DRUG_LIST DRUG DATE_B
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 03/28/2019
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 02/01/2018
1 02/10/2019 b Cetuximab,Xanitumumab   .
2 03/03/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 03/23/2019
3 01/02/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 02/01/2018
4 02/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 03/01/2019
4 02/02/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp Ado-trastuzumab 05/01/2019
5 03/03/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 03/21/2019
6 02/02/2019 b Cetuximab,Xanitumumab   .
6 01/02/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 01/17/2018
7 03/10/2019 b Cetuximab,Xanitumumab   .
7 02/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 03/02/2019

 

View solution in original post

9 REPLIES 9
ChrisNZ
Tourmaline | Level 20

so Neo-Cetuximab matches with Cetuximab?

newsas007
Quartz | Level 8
@ChrisNZ: they are unique and different. For example #ID 1 only got Neo-Cetuximab.
ChrisNZ
Tourmaline | Level 20

> They are unique and different.

OK just replace

findw( GENES.DRUG_LIST, prxchange('s/\A(Ado|Neo)-//',1,DRUGS.DRUG), ',- ', 't') 

with

findw( GENES.DRUG_LIST, DRUGS.DRUG), ', ', 't') 

then, and report. 

ChrisNZ
Tourmaline | Level 20

1. Please format your code and your data

2. Something like this is what you need. Please alter to suit:

proc sql;
  create table WANT as 
  select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
  from GENES join DRUGS 
    on GENES.ID=DRUGS.ID
   and findw( GENES.DRUG_LIST, prxchange('s/\A(Ado|Neo)-//',1,DRUGS.DRUG), ',- ', 't') 
 order by GENES.ID, GENE_TYPE ;
quit;

ID DATE_A GENE_TYPE DRUG_LIST DRUG DATE_B
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 03/28/2019
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 02/01/2018
1 02/10/2019 b Cetuximab,Xanitumumab Neo-Cetuximab 03/28/2019
2 03/03/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 03/23/2019
3 01/02/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 02/01/2018
4 02/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 03/01/2019
4 02/02/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp Ado-trastuzumab 05/01/2019
5 03/03/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 03/21/2019
6 01/02/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 01/17/2018
7 02/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 03/02/2019

 

newsas007
Quartz | Level 8

Thanks for this:
I used this code and got the output for ID#1 shows Neo-Cetuximab merging with Cetuximab. It should be a empty cell for drug and date_b because #ID1 was never given Cetuximab.
Basically, each drug name is unique. Neo-Cetuximab is different from Cetuximab and so does any drug with Ado- or any other prefix.I know it's probably a subtle change in the code. I am still trying..thanks again
proc sql;
create table WANT as
select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
from GENES left join DRUGS
on GENES.ID=DRUGS.ID
and findw( GENES.DRUG_LIST, prxchange('s/\A(Ado|Neo)-//',1,DRUGS.DRUG), ',- ', 't')
order by GENES.ID, GENE_TYPE ;
quit;


ChrisNZ
Tourmaline | Level 20

Please format your code and data.

ChrisNZ
Tourmaline | Level 20
proc sql;
  select GENES.*, DRUGS.DRUG, DRUGS.DATE_B
  from GENES left join DRUGS 
    on GENES.ID=DRUGS.ID
   and findw( GENES.DRUG_LIST, DRUGS.DRUG, ',- ', 't') 
 order by GENES.ID, GENE_TYPE ;
quit;

ID DATE_A GENE_TYPE DRUG_LIST DRUG DATE_B
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Neo-Cetuximab 03/28/2019
1 01/02/2018 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Panitumumab 02/01/2018
1 02/10/2019 b Cetuximab,Xanitumumab   .
2 03/03/2019 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Trametinib 03/23/2019
3 01/02/2018 d Alpelisib,Everolimus,Temsirolimus Alpelisib 02/01/2018
4 02/10/2019 e Bosutinib,Dasatinib,Everolimus,Temsirolimus Everolimus 03/01/2019
4 02/02/2019 f Ado-trastuzumab,Afatinib,Dacomitinib,Lapatinib,Neratinib,Pertuzumab,Trastuzumab,Trastuzumab-anns,Trastuzumab-dkst,Trastuzumab-dttb,Trastuzumab-pkrb,Trastuzumab-qyyp Ado-trastuzumab 05/01/2019
5 03/03/2019 a Afatinib,Neo-Cetuximab,Dacomitinib,Erlotinib,Gefitinib,Lapatinib,Necitumumab,Osimertinib,Panitumumab Dacomitinib 03/21/2019
6 02/02/2019 b Cetuximab,Xanitumumab   .
6 01/02/2018 c Binimetinib,Cetuximab,Cobimetinib,Idelalisib,Panitumumab,Trametinib Idelalisib 01/17/2018
7 03/10/2019 b Cetuximab,Xanitumumab   .
7 02/10/2019 d Alpelisib,Everolimus,Temsirolimus Temsirolimus 03/02/2019

 

newsas007
Quartz | Level 8
@ChrisNZ: Thank you so much for this! This works just fine. I am still learning subtle things 🙂
Thanks again.
ChrisNZ
Tourmaline | Level 20

Maybe remove the dash as a separator in the findw() function.

And don't forget to format and reread your question if you want to be helped.

 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 1336 views
  • 1 like
  • 2 in conversation