- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I am working with pharmaceutical prescription records and I am trying to create an output that includes only the most recent prescription for each drug for each person. For example:
John smith adderall 12/28/17
John smith adderall 12/21/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
jane doe adderall 12/21/17
I want the resulting table to look like this:
John smith adderall 12/28/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
Right now I am using the code:
proc sort data = master ;
by Recipient_Last_Name Recipient_First_Name Date_Dispensed_n descending Drug_Name ;
run ;
proc sql ;
create table final as
select *
from blaine.master
group by Recipient_Last_Name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit ;
This gives me the most recent date for each last name, but I need the most recent date for EACH DRUG associated with a person.
I.e. the code above gives me this:
John smith adderall 12/28/17
jane doe adderall 2/14/18
(need to somehow get the most recent prescriptions of other drugs in there)
any help would be appreciated!
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
slight correction in your group by:
data have;
input Recipient_First_Name $ Recipient_Last_Name $ drug_name $ Date_Dispensed_n :mmddyy10.;
format Date_Dispensed_n mmddyy10.;
datalines;
John smith adderall 12/28/17
John smith adderall 12/21/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
jane doe adderall 12/21/17
;
proc sql;
create table want as
select *
from have
group by Recipient_Last_Name,drug_name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do you have to use SQL?
A data step does this very easily.
proc sort data=have;
by id drug date;
run;
data want;
set have;
by ID drug date;
if last.drug;
run;
@v2murthy wrote:
Hello,
I am working with pharmaceutical prescription records and I am trying to create an output that includes only the most recent prescription for each drug for each person. For example:
John smith adderall 12/28/17
John smith adderall 12/21/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
jane doe adderall 12/21/17
I want the resulting table to look like this:
John smith adderall 12/28/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
Right now I am using the code:
proc sort data = blaine.master ;
by Recipient_Last_Name Recipient_First_Name Date_Dispensed_n descending Drug_Name ;
run ;proc sql ;
create table final as
select *
from blaine.master
group by Recipient_Last_Name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit ;
This gives me the most recent date for each last name, but I need the most recent date for EACH DRUG associated with a person.
I.e. the code above gives me this:
John smith adderall 12/28/17
jane doe adderall 2/14/18
(need to somehow get the most recent prescriptions of other drugs in there)
any help would be appreciated!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thanks for the suggestion. This would be much easier to use next time around. Ended up going with another solution since I had the proc sql code written already.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
slight correction in your group by:
data have;
input Recipient_First_Name $ Recipient_Last_Name $ drug_name $ Date_Dispensed_n :mmddyy10.;
format Date_Dispensed_n mmddyy10.;
datalines;
John smith adderall 12/28/17
John smith adderall 12/21/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
jane doe adderall 12/21/17
;
proc sql;
create table want as
select *
from have
group by Recipient_Last_Name,drug_name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you! I didn't realize you could add a comma to include more than one "group by" variable.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
proc sort data=have;
by name drug date;
run;
data want;
set have;
by name drug;
if last.drug then output;
run;
Paige Miller
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Your edits do not change the question or answer significantly except for the explicit variable names. The approach is the same.
Also, we don't necessarily see edits so it's best to post as a response or note that you've edited your question.
@v2murthy wrote:
Hello,
I am working with pharmaceutical prescription records and I am trying to create an output that includes only the most recent prescription for each drug for each person. For example:
John smith adderall 12/28/17
John smith adderall 12/21/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
jane doe adderall 12/21/17
I want the resulting table to look like this:
John smith adderall 12/28/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
Right now I am using the code:
proc sort data = master ;
by Recipient_Last_Name Recipient_First_Name Date_Dispensed_n descending Drug_Name ;
run ;proc sql ;
create table final as
select *
from blaine.master
group by Recipient_Last_Name
having Date_Dispensed_n = max(Date_Dispensed_n) ;
quit ;
This gives me the most recent date for each last name, but I need the most recent date for EACH DRUG associated with a person.
I.e. the code above gives me this:
John smith adderall 12/28/17
jane doe adderall 2/14/18
(need to somehow get the most recent prescriptions of other drugs in there)
any help would be appreciated!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi - just edited to remove part of the dataset name for confidentiality purposes. Thanks again.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And another approach if I understand the question:
proc summary data =master nway;
class Recipient_Last_Name Recipient_First_Name Drug_Name ;
var Date_Dispensed_n;
output out=want (drop=_type_ _freq_) max=;
run;
which assumes that the date variable is an actual SAS date value.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
data master;
retain name fname lname drug date;
input fname $ lname $ drug: $9. date: mmddyy8.;
format date mmddyy8.;
name=cat(fname,lname,' ');
cards;
John smith adderall 12/28/17
John smith adderall 12/21/17
John smith oxycodone 11/14/17
jane doe adderall 2/14/18
jane doe oxycodone 1/10/18
jane doe adderall 12/21/17
;
run;
proc sort data=master;
by name drug descending date;
run;
data master_result;
set master;
by name drug descending date;
if first.drug then output;
run;