10-21-2016 10:47 PM - last edited on 10-21-2016 11:47 PM by Reeza
I am currently working with claims dataset. I need to get patient level claims dataset. My dataset looks like:
ClaimID Patient ID Rxflag Filldate Filldatethru
1 5 1 08/09/10 09/08/10
2 5 1 10/08/10 11/09/10
3 5 0 10/08/10 12/13/10
4 6 1 03/09/11 05/10/11
5 6 0 06/10/12 07/13/11
I need to create variables named start_date and end_date based on unique patient ID to capture the date of first and last fill. I tried using something like this:
data PATIENT_LEVEL_EVER_FLAGS (keep=pt_id drug_ever); set FLAGGED_CLAIMS; by pt_id; label drug_ever="Ever have drug code of interest"; if first.pt_id then do; drug_ever=0; end; if rx_flag=1 then drug_ever=1; if drug_ever=1 then startdate=filldate; if last.pt_id then output; if last.pt_id then enddate=filldatethru; run;
I am confused with the concept. Could someone please explain me the basics behind this? Any suggestions on how to proceed will be of great help! Thanks.
10-21-2016 11:47 PM
A better option might be SQL. This should be fairly close:
proc sql; create table patients as select patientid ,count(*) as patientvisits ,min(filldate) as firstfilldate format = date9. ,max(filldatethru) as lastfilldatethru format = date9. from flagged_claims where rx_flag = 1 group by patientid ; quit;
10-22-2016 12:09 AM
Or, if you want to keep patients without any rx_flag=1 records
proc sql; create table patients as select patientId ,count(*) as patientVisits ,min(case when rx_flag then filldate else . end) as firstfilldate format = date9. ,max(case when rx_flag then filldatethru else . end) as lastfilldatethru format = date9. from flagged_claims group by patientId ; quit;
10-22-2016 12:17 AM
Or, if you prefer a data step (should be more efficient)
data patients; do until(last.patientid); set flagged_claims; by patientId; if rx_flag then do; firstFillDate = min(firstFillDate, fillDate); lastFillDateThru = max(lastFillDateThru, fillDateThru); end; end; keep patientId firstFillDate lastFillDateThru; format firstFillDate lastFillDateThru yymmdd10.; run;
10-22-2016 12:49 AM
Thank you for providing the SQL method for keeping all patient-level observations. It worked perfectly. (I now have start date for every patient who was prescribed the drug and no start date for every patient who was not prescribed the drug).
Thanks a lot!
10-22-2016 12:45 AM
The SQL method worked great. It just keeps the observation with RX_flag=1(because of the Where clause). But, I needed to keep all the observations (those without Rx_flag=1 also), hence I modified your method using suggestions made by PGStats.
Thanks a ton!
10-21-2016 11:48 PM
Based on the sample dataset abouve what would you expect as output?
And what are you having issues with in your code?
Hint: Look into RETAIN to keep variable value across rows.
10-22-2016 12:00 AM
Thanks for your inputs. I expect that the output will have all the variables given in the sample and will create two new variables 'startdate' and 'enddate'. Startdate represents when the first prescription was filled(is equal to filldate). Enddate is when the prescription was stopped(is equal to filldatethru). So, I need to keep start and end dates for each unique patient who was prescribed a drug (whose rx_flag=1). However, when I try running the code which I have mentioned, it gives me missing values for startdate even for rx_flag=1. So, I am not sure how I should be specifying that condition. Should it be
if rx_flag=1 then startdate=filldate?
I will look into the RETAIN statement that you mentioned. Thanks again!
10-22-2016 04:06 PM
If I understand your question correctly, you want startdate = fill date and enddate = fillthrudate when rx_flag = 1 otherwise startdate = filldate.
Try the solution below:
data patientrx; set filldate ; by patient_id; if rx_flag = 1 then do ; startdate = filldate; enddate = filldatethru; end; else; do; startdate = filldate; end; format startdate enddate mmddyy9.; run;