BookmarkSubscribeRSS Feed
simonqin
Fluorite | Level 6

Hi SAS pro

I am doing a research on polypharmacy and want to re-structure the dataset to see the start and end of single drug, and the start and end of two drugs, and the start and end of three drugs.

 

The example of original

dataset is like this :

IDdrugstartendSCORE 
1A06/01/202011/15/20202
1B08/01/202009/09/20201
1C06/01/202007/01/20203
1D10/01/202012/31/20204
2A01/15/202003/04/20202
2C01/27/202003/09/20203
3A03/01/202004/01/20202
3C03/26/202004/05/20203
3D03/28/202005/05/20204
4A04/04/202006/07/20202
4C04/12/202007/01/20203

 

I want to change the dataset as below:

 

IDdrugsstart endscore 
1A+C06/01/202007/01/20202+3=5
1A07/02/202007/31/20202
1A+B08/01/202009/09/20202+1=3
1A09/10/202009/30/20202
1A+D10/01/202011/15/20202+4=6
1D11/16/202012/31/20204
2A01/15/202001/26/20202
2A+C01/27/202003/04/20202+3=5
2C03/05/202003/09/20203

 

I know this analysis is comprehensive.. I am much appreciate if you could help me to sort out this analysis.

1 REPLY 1
s_lassen
Meteorite | Level 14

You will have a much better chance of someone answering if you post your example data as a datastep, e.g.

data have;                                
  length id 8 drug $1 start end score 8;  
  informat start end mmddyy10.;           
  format start end date9.;                
  input ID drug start end SCORE;          
cards;                                    
1 A 06/01/2020 11/15/2020 2               
1 B 08/01/2020 09/09/2020 1               
1 C 06/01/2020 07/01/2020 3               
1 D 10/01/2020 12/31/2020 4               
2 A 01/15/2020 03/04/2020 2               
2 C 01/27/2020 03/09/2020 3               
3 A 03/01/2020 04/01/2020 2               
3 C 03/26/2020 04/05/2020 3               
3 D 03/28/2020 05/05/2020 4               
4 A 04/04/2020 06/07/2020 2               
4 C 04/12/2020 07/01/2020 3               
;run;

I think the easiest way around this is to create a "long" dataset, with one entry for each drug, each day it was active:

data long;              
  set have;             
  do date=start to end; 
    output;             
    end;                
  drop start end;       
  format date date9.;   
run;

Then sort that by ID and date, and calculate combination and score for each date:

proc sort data=long;              
  by id date drug;                
run;                              
data long2;                       
  length drugs $10;               
  total=0;                        
  do until(last.date);            
    set long;                     
    by id date;                   
    call catx('+',drugs,drug);    
    total+score;                  
    end;                          
  drop drug score;                
  rename total=score;             
run;

And finally, convert back to intervals:

data want;                 
  do until(last.drugs);    
    set long2;             
    by id drugs notsorted; 
    if first.drugs then    
      start=date;          
    end;                   
  format start date9.;     
  rename date=end;         
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 1 reply
  • 299 views
  • 2 likes
  • 2 in conversation