BookmarkSubscribeRSS Feed
Nina4
Obsidian | Level 7

Hello 

I have a dataset with patients that have different therapies. In the therapies there are multiple medications. It all is in a long format. 

 

It is in this format: 

 

data your_dataset;
input PatientID TherapyID Drug $;
datalines;
1 1 DrugA
1 1 DrugB
1 1 DrugC
1 2 DrugA
1 2 DrugB
1 2 DrugC
1 3 DrugB
1 3 DrugC
1 4 DrugA
1 4 DrugC
1 4 DrugB
1 4 DrugD
2 5 DrugA
2 5 DrugD
2 7 DrugA
2 7 DrugC
3 6 DrugB
3 6 DrugE
;

 

 

I would like to identify, which drugs are newly started or finished within a patient, when they have a new therapy. And flag them accordingly.

 

Solution example: 

1 1 DrugA baseline

1 1 DrugB baseline

1 1 DrugC baseline

1 2 DrugA last_time

1 2 DrugB no_change

1 2 DrugC no_change

1 3 DrugB no_change

1 3 DrugC no_change

1 4 DrugA restart

1 4 DrugC no_change

1 4 DrugB no_change

1 4 DrugD new_start

2 5 DrugA baseline

2 5 DrugD last_time

2 7 DrugA no_change

2 7 DrugC new_start

3 6 DrugB baseline

3 6 DrugE baseline

 

They could also have variables added with a binary value to it. 

 

So far, I have used this code, but there are multiple errors in it.

 

data drug_changes;
  set your_dataset;
  by PatientID;
  retain PrevPatientID PrevTherapyID FirstTherapyDrugs;
  
  if first.PatientID then do;
    PrevPatientID = .;
    PrevTherapyID = .;
    FirstTherapyDrugs = '';
  end;
  
  if PatientID ne PrevPatientID then do;
    PrevTherapyID = .;
    FirstTherapyDrugs = '';
  end;
  
  if TherapyID ne PrevTherapyID then do;
    if Drug ne '' then do;
      if PrevTherapyID ne . then do;
        action = 'Finished';
        output;
      end;
      if TherapyID = 1 then do; /* Check if it's the first therapy */
        FirstTherapyDrugs = catx('|', FirstTherapyDrugs, Drug); /* Concatenate drugs */
        action = 'First Start';
        output;
      end;
      else do;
        action = 'Started';
        output;
      end;
    end;
  end;
  
  PrevPatientID = PatientID;
  PrevTherapyID = TherapyID;
  
  if last.PatientID then do;
    if FirstTherapyDrugs ne '' then do;
      do i = 1 to countw(FirstTherapyDrugs, '|');
        Drug = scan(FirstTherapyDrugs, i, '|');
        action = 'First Start';
        output;
      end;
    end;
    if Drug ne '' then do;
      if TherapyID = 1 then do; /* Check if it's the first therapy */
        FirstTherapyDrugs = catx('|', FirstTherapyDrugs, Drug); /* Concatenate drugs */
        action = 'First Start';
        output;
      end;
      else do;
        action = 'Started';
        output;
      end;
    end;
  end;
  
  drop PrevPatientID PrevTherapyID FirstTherapyDrugs i;
run;

proc print data=drug_changes;
run;

Thank you so much for your help in advance. 

Nina

5 REPLIES 5
ballardw
Super User

In terms of data what is "new"? How do we recognize that? And the same for "finish"? Rules are a good idea, otherwise we are guessing and that's seldom a good idea.

 

Question about "last time". If it is "restarted" is it really a "last"?

mkeintz
PROC Star

What if a drug appears only once?   Will you establish a new status  (say 'ONE TIME')?  

 

And what if that one time is for therapyid=1 (which I presume is how you identify status 'BASELINE')?

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Nina4
Obsidian | Level 7

Great questions! 

These were my rules: 

Baseline: drugs of the therapy 1 of a patient

last_time: if a drug is stopped afterwards

restart: a drug that has been taken before, but that had a break

new_start: start of a drug, that has never been used before by that patient

no_change: nothing changed to the therapy before

 

But you are right: 

add

one_time: if a drug appears only once

paused: if a drug has a break and is then restarted again

 

I hope, this helps.

mkeintz
PROC Star

@Nina4 wrote:

Great questions! 

These were my rules: 

Baseline: drugs of the therapy 1 of a patient

last_time: if a drug is stopped afterwards

restart: a drug that has been taken before, but that had a break

new_start: start of a drug, that has never been used before by that patient

no_change: nothing changed to the therapy before

 

But you are right: 

add

one_time: if a drug appears only once

paused: if a drug has a break and is then restarted again

 

I hope, this helps.


Yes, it helps but

 Why is   1/4/druga characterized as "last time" instead of "paused", given there is a subsequent 1/4/druga?

 What if the "one time" is at baseline?   Which of those two status values do you want?

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Nina4
Obsidian | Level 7

You are right, with the new value paused and one_time the dataset should like like the following: 

 

1 1 DrugA baseline

1 1 DrugB baseline

1 1 DrugC baseline

1 2 DrugA paused

1 2 DrugB no_change

1 2 DrugC no_change

1 3 DrugB no_change

1 3 DrugC no_change

1 4 DrugA restart

1 4 DrugC no_change

1 4 DrugB no_change

1 4 DrugD new_start

2 5 DrugA baseline

2 5 DrugD last_time

2 7 DrugA no_change

2 7 DrugC new_start

3 6 DrugB baseline

3 6 DrugE baseline

 

If one_time is at baseline, there should be one_time and also all other values should overrule baseline.

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 851 views
  • 0 likes
  • 3 in conversation