DATA Step, Macro, Functions and more

Proc sql selecting most recent date depending on a second variable

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Proc sql selecting most recent date depending on a second variable

[ Edited ]

Hello, 

I am working with pharmaceutical prescription records and I am trying to create an output that includes only the most recent prescription for each drug for each person. For example:

 

John smith     adderall     12/28/17

John smith     adderall     12/21/17

John smith     oxycodone 11/14/17

jane doe        adderall       2/14/18

jane doe        oxycodone  1/10/18

jane doe        adderall       12/21/17

 

I want the resulting table to look like this:

 

John smith     adderall     12/28/17

John smith     oxycodone 11/14/17

jane doe        adderall       2/14/18

jane doe        oxycodone  1/10/18

 

Right now I am using the code: 

 

proc sort data = master ;
by Recipient_Last_Name Recipient_First_Name Date_Dispensed_n descending Drug_Name ;
run ;

proc sql ;
create table final as
select *
from blaine.master
group by Recipient_Last_Name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit ;

 

This gives me the most recent date for each last name, but I need the most recent date for EACH DRUG associated with a person. 

I.e. the code above gives me this:

 

John smith     adderall     12/28/17

jane doe        adderall       2/14/18

 

(need to somehow get the most recent prescriptions of other drugs in there)

 

any help would be appreciated!


Accepted Solutions
Solution
‎03-14-2018 02:35 PM
Super User
Posts: 2,045

Re: Proc sql selecting most recent date depending on a second variable

slight correction in your group by:

data have;
input Recipient_First_Name $ Recipient_Last_Name $ drug_name $ Date_Dispensed_n :mmddyy10.;
format Date_Dispensed_n mmddyy10.;
datalines;
John smith     adderall     12/28/17
John smith     adderall     12/21/17
John smith     oxycodone 11/14/17
jane doe        adderall       2/14/18
jane doe        oxycodone  1/10/18
jane doe        adderall       12/21/17
;

proc sql;
create table want as
select *
from have
group by Recipient_Last_Name,drug_name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit;

View solution in original post


All Replies
Super User
Posts: 23,950

Re: Proc sql selecting most recent date depending on a second variable

Do you have to use SQL?

A data step does this very easily.

 

proc sort data=have;
by id drug date;
run;

data want;
set have;
by ID drug date;
if last.drug;
run;

@v2murthy wrote:

Hello, 

I am working with pharmaceutical prescription records and I am trying to create an output that includes only the most recent prescription for each drug for each person. For example:

 

John smith     adderall     12/28/17

John smith     adderall     12/21/17

John smith     oxycodone 11/14/17

jane doe        adderall       2/14/18

jane doe        oxycodone  1/10/18

jane doe        adderall       12/21/17

 

I want the resulting table to look like this:

 

John smith     adderall     12/28/17

John smith     oxycodone 11/14/17

jane doe        adderall       2/14/18

jane doe        oxycodone  1/10/18

 

Right now I am using the code: 

 

proc sort data = blaine.master ;
by Recipient_Last_Name Recipient_First_Name Date_Dispensed_n descending Drug_Name ;
run ;

proc sql ;
create table final as
select *
from blaine.master
group by Recipient_Last_Name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit ;

 

This gives me the most recent date for each last name, but I need the most recent date for EACH DRUG associated with a person. 

I.e. the code above gives me this:

 

John smith     adderall     12/28/17

jane doe        adderall       2/14/18

 

(need to somehow get the most recent prescriptions of other drugs in there)

 

any help would be appreciated!


 

New Contributor
Posts: 4

Re: Proc sql selecting most recent date depending on a second variable

Thanks for the suggestion. This would be much easier to use next time around. Ended up going with another solution since I had the proc sql code written already. 

Solution
‎03-14-2018 02:35 PM
Super User
Posts: 2,045

Re: Proc sql selecting most recent date depending on a second variable

slight correction in your group by:

data have;
input Recipient_First_Name $ Recipient_Last_Name $ drug_name $ Date_Dispensed_n :mmddyy10.;
format Date_Dispensed_n mmddyy10.;
datalines;
John smith     adderall     12/28/17
John smith     adderall     12/21/17
John smith     oxycodone 11/14/17
jane doe        adderall       2/14/18
jane doe        oxycodone  1/10/18
jane doe        adderall       12/21/17
;

proc sql;
create table want as
select *
from have
group by Recipient_Last_Name,drug_name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit;
New Contributor
Posts: 4

Re: Proc sql selecting most recent date depending on a second variable

Posted in reply to novinosrin

Thank you! I didn't realize you could add a comma to include more than one "group by" variable. 

Respected Advisor
Posts: 3,251

Re: Proc sql selecting most recent date depending on a second variable

proc sort data=have;
    by name drug date;
run;
data want;
    set have;
    by name drug;
    if last.drug then output;
run;
--
Paige Miller
Super User
Posts: 23,950

Re: Proc sql selecting most recent date depending on a second variable

Your edits do not change the question or answer significantly except for the explicit variable names. The approach is the same. 

Also, we don't necessarily see edits so it's best to post as a response or note that you've edited your question. 

 


@v2murthy wrote:

Hello, 

I am working with pharmaceutical prescription records and I am trying to create an output that includes only the most recent prescription for each drug for each person. For example:

 

John smith     adderall     12/28/17

John smith     adderall     12/21/17

John smith     oxycodone 11/14/17

jane doe        adderall       2/14/18

jane doe        oxycodone  1/10/18

jane doe        adderall       12/21/17

 

I want the resulting table to look like this:

 

John smith     adderall     12/28/17

John smith     oxycodone 11/14/17

jane doe        adderall       2/14/18

jane doe        oxycodone  1/10/18

 

Right now I am using the code: 

 

proc sort data = master ;
by Recipient_Last_Name Recipient_First_Name Date_Dispensed_n descending Drug_Name ;
run ;

proc sql ;
create table final as
select *
from blaine.master
group by Recipient_Last_Name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit ;

 

This gives me the most recent date for each last name, but I need the most recent date for EACH DRUG associated with a person. 

I.e. the code above gives me this:

 

John smith     adderall     12/28/17

jane doe        adderall       2/14/18

 

(need to somehow get the most recent prescriptions of other drugs in there)

 

any help would be appreciated!


 

New Contributor
Posts: 4

Re: Proc sql selecting most recent date depending on a second variable

Hi - just edited to remove part of the dataset name for confidentiality purposes. Thanks again.

Super User
Posts: 13,886

Re: Proc sql selecting most recent date depending on a second variable

And another approach if I understand the question:

 

proc summary data =master nway;

   class Recipient_Last_Name Recipient_First_Name Drug_Name ;

   var  Date_Dispensed_n;

   output out=want (drop=_type_ _freq_) max=;

run;

 

which assumes that the date variable is an actual SAS date value.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 344 views
  • 6 likes
  • 5 in conversation