BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GraceStehlin98
Calcite | Level 5

the following drug data ;

the goal  is to count days  overlap of at least two drugs taken and list drug names per patient id(PID). i am looking forward the output data should look like the following.

 

PID drug_name drug_cnt overlapping_flag start_date end_date count_days
A ARIPIPRAZOLE 1 0 1/1/15 2/1/15 31
A ARIPIPRAZOLE_DESIPRAMINE 2 1 2/1/15 2/20/15 19
A ARIPIPRAZOLE_DESIPRAMINE_QUETIAPINE 3 1 2/20/15 3/1/15 9
A DESIPRAMINE_QUETIAPINE 2 1 2/20/15 4/1/15 40
A QUETIAPINE 1 0 4/10/15 5/1/15 21

 

 

 

the data  source as the below:

PID drug med_taken_begin_dt med_taken_finish_dt MD_taken_by_days
1 HALOPERIDOL DECANOATE 11Dec2013 15Dec2017 1465
1 LITHIUM CARBONATE 23Jan2014 07Feb2018 1476
1 CLOZAPINE 03Mar2014 08Feb2018 1438
2 SERTRALINE 25Feb2014 30Jun2014 125
2 LITHIUM CARBONATE 27Feb2014 20Jan2017 1058
2 QUETIAPINE 03Mar2014 14Feb2017 1079
2 HALOPERIDOL 03Apr2014 08Sep2016 889
2 TRAZODONE 01May2014 20Jan2017 995
2 BUPROPION HCL 10Jul2014 31Jul2014 21
2 BUPROPION HCL 31Jul2014 09Sep2014 40
2 DULOXETINE 11Sep2014 17Oct2014 36
2 ARIPIPRAZOLE 23Apr2015 25Jun2015 63
2 LITHIUM CARBONATE 26Apr2015 01May2015 5
2 LORAZEPAM 27Apr2015 27Apr2015 0
2 FLUOXETINE 15Oct2015 09Nov2015 25
2 QUETIAPINE 14Apr2016 15Sep2016 154
2 SEROQUEL 13Jun2016 20Jun2016 7
2 OLANZAPINE 08Sep2016 24Mar2017 197
2 RISPERIDONE 01Feb2017 19Mar2017 46
2 CLOZAPINE 01Mar2017 31Mar2017 30
3 DESIPRAMINE 11Nov2013 22Sep2014 315
3 QUETIAPINE 26Nov2013 26Jan2017 1157
3 ARIPIPRAZOLE 10Jan2014 26Jan2017 1112
3 VENLAFAXINE 28Mar2014 26Jan2017 1035

 

 

Thanks lots!

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

Some questions:

Do you have your data is SAS yet?

Are your existing date variables SAS date values or character?

Do you have a maximum number of "overlaps" you want to consider? Since your example start end periods cover 4 or 5 years there could be a great many drugs at the same time.

Your apparent rule is to combine the drug names by inserting an underscore between the overlaps. Is that the actual case?

 

Also your example output is for a PID not included in your input so it is a tad difficult to check.

 

But this may provide enough to get you started. NOTE: the date values must be SAS date values.

data have;
   infile datalines dlm='|';
   informat PID $5. drug $25. med_taken_begin_dt med_taken_finish_dt date9. MD_taken_by_days best5. ;
   format med_taken_begin_dt med_taken_finish_dt date9.;
   input PID drug med_taken_begin_dt med_taken_finish_dt MD_taken_by_days;
datalines;
1|HALOPERIDOL DECANOATE|11Dec2013|15Dec2017|1465|
1|LITHIUM CARBONATE|23Jan2014|07Feb2018|1476|
1|CLOZAPINE|03Mar2014|08Feb2018|1438|
2|SERTRALINE|25Feb2014|30Jun2014|125|
2|LITHIUM CARBONATE|27Feb2014|20Jan2017|1058|
2|QUETIAPINE|03Mar2014|14Feb2017|1079|
2|HALOPERIDOL|03Apr2014|08Sep2016|889|
2|TRAZODONE|01May2014|20Jan2017|995|
2|BUPROPION HCL|10Jul2014|31Jul2014|21|
2|BUPROPION HCL|31Jul2014|09Sep2014|40|
2|DULOXETINE|11Sep2014|17Oct2014|36|
2|ARIPIPRAZOLE|23Apr2015|25Jun2015|63|
2|LITHIUM CARBONATE|26Apr2015|01May2015|5|
2|LORAZEPAM|27Apr2015|27Apr2015|0|
2|FLUOXETINE|15Oct2015|09Nov2015|25|
2|QUETIAPINE|14Apr2016|15Sep2016|154|
2|SEROQUEL|13Jun2016|20Jun2016|7|
2|OLANZAPINE|08Sep2016|24Mar2017|197|
2|RISPERIDONE|01Feb2017|19Mar2017|46|
2|CLOZAPINE|01Mar2017|31Mar2017|30|
3|DESIPRAMINE|11Nov2013|22Sep2014|315|
3|QUETIAPINE|26Nov2013|26Jan2017|1157|
3|ARIPIPRAZOLE|10Jan2014|26Jan2017|1112|
3|VENLAFAXINE|28Mar2014|26Jan2017|1035|
;
run;

data temp;
   set have;
   do takedt = med_taken_begin_dt to  med_taken_finish_dt;
      output;
   end;
   format takedt date9.;
   drop med_taken_begin_dt med_taken_finish_dt  MD_taken_by_days;
run;
proc sort data=temp;
  by pid takedt;
run;
proc transpose data=temp out=temptrans (drop=_name_);
   by pid takedt;
   var drug;
run;

data combineddrug;
   set temptrans;
   combineddrug= catx('_',of col:);
   drop col: ;
run;

proc summary data=combineddrug nway;
  class pid combineddrug;
  var takedt;
  output out= combinesummary (drop=_:) min= start_date max=end_date;
run;

data want;
   set combinesummary;
   drug_cnt = countw(combineddrug,'_');
   overlapping_flag = (drug_cnt > 1);
   count_days = end_date - start_date +1;
   format start_date end_date mmddyy6.;
run;

   

The most problematic issue is with the data step that creates combineddrug. You may well need to set a length statement to accomodate the combined lengths of all of the names, making sure to allow for the inserted _ . The default as used is set to 200 characters. if that isn't enough then add something like Length combineddrug $ 500; Keep increasing the number until everything fits.

 

View solution in original post

3 REPLIES 3
ballardw
Super User

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

 

Some questions:

Do you have your data is SAS yet?

Are your existing date variables SAS date values or character?

Do you have a maximum number of "overlaps" you want to consider? Since your example start end periods cover 4 or 5 years there could be a great many drugs at the same time.

Your apparent rule is to combine the drug names by inserting an underscore between the overlaps. Is that the actual case?

 

Also your example output is for a PID not included in your input so it is a tad difficult to check.

 

But this may provide enough to get you started. NOTE: the date values must be SAS date values.

data have;
   infile datalines dlm='|';
   informat PID $5. drug $25. med_taken_begin_dt med_taken_finish_dt date9. MD_taken_by_days best5. ;
   format med_taken_begin_dt med_taken_finish_dt date9.;
   input PID drug med_taken_begin_dt med_taken_finish_dt MD_taken_by_days;
datalines;
1|HALOPERIDOL DECANOATE|11Dec2013|15Dec2017|1465|
1|LITHIUM CARBONATE|23Jan2014|07Feb2018|1476|
1|CLOZAPINE|03Mar2014|08Feb2018|1438|
2|SERTRALINE|25Feb2014|30Jun2014|125|
2|LITHIUM CARBONATE|27Feb2014|20Jan2017|1058|
2|QUETIAPINE|03Mar2014|14Feb2017|1079|
2|HALOPERIDOL|03Apr2014|08Sep2016|889|
2|TRAZODONE|01May2014|20Jan2017|995|
2|BUPROPION HCL|10Jul2014|31Jul2014|21|
2|BUPROPION HCL|31Jul2014|09Sep2014|40|
2|DULOXETINE|11Sep2014|17Oct2014|36|
2|ARIPIPRAZOLE|23Apr2015|25Jun2015|63|
2|LITHIUM CARBONATE|26Apr2015|01May2015|5|
2|LORAZEPAM|27Apr2015|27Apr2015|0|
2|FLUOXETINE|15Oct2015|09Nov2015|25|
2|QUETIAPINE|14Apr2016|15Sep2016|154|
2|SEROQUEL|13Jun2016|20Jun2016|7|
2|OLANZAPINE|08Sep2016|24Mar2017|197|
2|RISPERIDONE|01Feb2017|19Mar2017|46|
2|CLOZAPINE|01Mar2017|31Mar2017|30|
3|DESIPRAMINE|11Nov2013|22Sep2014|315|
3|QUETIAPINE|26Nov2013|26Jan2017|1157|
3|ARIPIPRAZOLE|10Jan2014|26Jan2017|1112|
3|VENLAFAXINE|28Mar2014|26Jan2017|1035|
;
run;

data temp;
   set have;
   do takedt = med_taken_begin_dt to  med_taken_finish_dt;
      output;
   end;
   format takedt date9.;
   drop med_taken_begin_dt med_taken_finish_dt  MD_taken_by_days;
run;
proc sort data=temp;
  by pid takedt;
run;
proc transpose data=temp out=temptrans (drop=_name_);
   by pid takedt;
   var drug;
run;

data combineddrug;
   set temptrans;
   combineddrug= catx('_',of col:);
   drop col: ;
run;

proc summary data=combineddrug nway;
  class pid combineddrug;
  var takedt;
  output out= combinesummary (drop=_:) min= start_date max=end_date;
run;

data want;
   set combinesummary;
   drug_cnt = countw(combineddrug,'_');
   overlapping_flag = (drug_cnt > 1);
   count_days = end_date - start_date +1;
   format start_date end_date mmddyy6.;
run;

   

The most problematic issue is with the data step that creates combineddrug. You may well need to set a length statement to accomodate the combined lengths of all of the names, making sure to allow for the inserted _ . The default as used is set to 200 characters. if that isn't enough then add something like Length combineddrug $ 500; Keep increasing the number until everything fits.

 

GraceStehlin98
Calcite | Level 5

@      ballardw  thanks a lot!!!!!!!!!!!!

 

That is exact what i want!

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 2651 views
  • 1 like
  • 3 in conversation