BookmarkSubscribeRSS Feed
Nina4
Obsidian | Level 7

Hello 

I have a dataset with patients that have different therapies. In the therapies there are multiple medications. It all is in a long format. 

 

It is in this format: 

 

data your_dataset;
input PatientID TherapyID Drug $;
datalines;
1 1 DrugA
1 1 DrugB
1 1 DrugC
1 2 DrugA
1 2 DrugB
1 2 DrugC
1 3 DrugB
1 3 DrugC
1 4 DrugA
1 4 DrugC
1 4 DrugB
1 4 DrugD
2 5 DrugA
2 5 DrugD
2 7 DrugA
2 7 DrugC
3 6 DrugB
3 6 DrugE
;

 

 

I would like to identify, which drugs are newly started or finished within a patient, when they have a new therapy. And flag them accordingly.

 

Solution example: 

1 1 DrugA baseline

1 1 DrugB baseline

1 1 DrugC baseline

1 2 DrugA last_time

1 2 DrugB no_change

1 2 DrugC no_change

1 3 DrugB no_change

1 3 DrugC no_change

1 4 DrugA restart

1 4 DrugC no_change

1 4 DrugB no_change

1 4 DrugD new_start

2 5 DrugA baseline

2 5 DrugD last_time

2 7 DrugA no_change

2 7 DrugC new_start

3 6 DrugB baseline

3 6 DrugE baseline

 

They could also have variables added with a binary value to it. 

 

So far, I have used this code, but there are multiple errors in it.

 

data drug_changes;
  set your_dataset;
  by PatientID;
  retain PrevPatientID PrevTherapyID FirstTherapyDrugs;
  
  if first.PatientID then do;
    PrevPatientID = .;
    PrevTherapyID = .;
    FirstTherapyDrugs = '';
  end;
  
  if PatientID ne PrevPatientID then do;
    PrevTherapyID = .;
    FirstTherapyDrugs = '';
  end;
  
  if TherapyID ne PrevTherapyID then do;
    if Drug ne '' then do;
      if PrevTherapyID ne . then do;
        action = 'Finished';
        output;
      end;
      if TherapyID = 1 then do; /* Check if it's the first therapy */
        FirstTherapyDrugs = catx('|', FirstTherapyDrugs, Drug); /* Concatenate drugs */
        action = 'First Start';
        output;
      end;
      else do;
        action = 'Started';
        output;
      end;
    end;
  end;
  
  PrevPatientID = PatientID;
  PrevTherapyID = TherapyID;
  
  if last.PatientID then do;
    if FirstTherapyDrugs ne '' then do;
      do i = 1 to countw(FirstTherapyDrugs, '|');
        Drug = scan(FirstTherapyDrugs, i, '|');
        action = 'First Start';
        output;
      end;
    end;
    if Drug ne '' then do;
      if TherapyID = 1 then do; /* Check if it's the first therapy */
        FirstTherapyDrugs = catx('|', FirstTherapyDrugs, Drug); /* Concatenate drugs */
        action = 'First Start';
        output;
      end;
      else do;
        action = 'Started';
        output;
      end;
    end;
  end;
  
  drop PrevPatientID PrevTherapyID FirstTherapyDrugs i;
run;

proc print data=drug_changes;
run;

Thank you so much for your help in advance. 

Nina

5 REPLIES 5
ballardw
Super User

In terms of data what is "new"? How do we recognize that? And the same for "finish"? Rules are a good idea, otherwise we are guessing and that's seldom a good idea.

 

Question about "last time". If it is "restarted" is it really a "last"?

mkeintz
PROC Star

What if a drug appears only once?   Will you establish a new status  (say 'ONE TIME')?  

 

And what if that one time is for therapyid=1 (which I presume is how you identify status 'BASELINE')?

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Nina4
Obsidian | Level 7

Great questions! 

These were my rules: 

Baseline: drugs of the therapy 1 of a patient

last_time: if a drug is stopped afterwards

restart: a drug that has been taken before, but that had a break

new_start: start of a drug, that has never been used before by that patient

no_change: nothing changed to the therapy before

 

But you are right: 

add

one_time: if a drug appears only once

paused: if a drug has a break and is then restarted again

 

I hope, this helps.

mkeintz
PROC Star

@Nina4 wrote:

Great questions! 

These were my rules: 

Baseline: drugs of the therapy 1 of a patient

last_time: if a drug is stopped afterwards

restart: a drug that has been taken before, but that had a break

new_start: start of a drug, that has never been used before by that patient

no_change: nothing changed to the therapy before

 

But you are right: 

add

one_time: if a drug appears only once

paused: if a drug has a break and is then restarted again

 

I hope, this helps.


Yes, it helps but

 Why is   1/4/druga characterized as "last time" instead of "paused", given there is a subsequent 1/4/druga?

 What if the "one time" is at baseline?   Which of those two status values do you want?

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Nina4
Obsidian | Level 7

You are right, with the new value paused and one_time the dataset should like like the following: 

 

1 1 DrugA baseline

1 1 DrugB baseline

1 1 DrugC baseline

1 2 DrugA paused

1 2 DrugB no_change

1 2 DrugC no_change

1 3 DrugB no_change

1 3 DrugC no_change

1 4 DrugA restart

1 4 DrugC no_change

1 4 DrugB no_change

1 4 DrugD new_start

2 5 DrugA baseline

2 5 DrugD last_time

2 7 DrugA no_change

2 7 DrugC new_start

3 6 DrugB baseline

3 6 DrugE baseline

 

If one_time is at baseline, there should be one_time and also all other values should overrule baseline.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 514 views
  • 0 likes
  • 3 in conversation