Hi,
I am working on the prescription dataset from a health organization. the dataset contains the following variables in the snapshot below. I want to count a number prescription per patient based on the following criteria,
Thank you in advance!!
data thesis.presc2;
set thesis.presc;
by patient;
if first.patient then lastdate=.;
else lastdate = lag(deliv_date2);
datediff = deliv_date2-lastdate;
run;
proc sql;
CREATE table thesis.presc3 as
select patient, reimburs_cat, source, atc, AdminRoute, N_presc, count(N_presc) as Co_Pres, Center, deliv_date2, DDD,datediff
From thesis.presc2
where datediff >= 7
group by patient ;
quit;
Try something like this:
data temp;
set have; by patient atc notsorted;
lagDate = lag(date);
if first.atc then seq + 1;
else if intck("day", lagDate, date) > 7 then seq +1;
drop lagdate;
run;
data want;
do until(last.seq);
set temp; by patient atc seq notsorted;
newDDD = sum(newDDD, DDD);
end;
drop seq DDD;
rename newDDD=DDD;
run;
Note that I've edited your post to make your question more description. "help" is fairly generic and makes it hard to search for similar answers in the future.
Please post your data as text, especially if you would like tested code as an answer. Currently, to work with your code or come up with a solution I'd have to type it out or mock up my own. Instructions on how include your data as a data step are here:
It also helps if you post what you would expect as output from your input data so that we can test solutions.
Good Luck.
@Yilikal_Tesfaye wrote:
Hi,
I am working on the prescription dataset from a health organization. the dataset contains the following variables in the snapshot below. I want to count a number prescription per patient based on the following criteria,
- If the deliv_date2 is less than seven and the same ATC code for the individual patient then set a number of prescription is equal to one and sum DDD otherwise count the number of prescription.
Thank you in advance!!
data thesis.presc2; set thesis.presc; by patient; if first.patient then lastdate=.; else lastdate = lag(deliv_date2); datediff = deliv_date2-lastdate; run; proc sql; CREATE table thesis.presc3 as select patient, reimburs_cat, source, atc, AdminRoute, N_presc, count(N_presc) as Co_Pres, Center, deliv_date2, DDD,datediff From thesis.presc2 where datediff >= 7 group by patient ; quit;
Your use of the lag() function is incorrect:
if first.patient then lastdate=.;
else lastdate = lag(deliv_date2);
This will cause a wrong entry for lastdate in the second observation for a patient, as you will get the last deliv_date2 of the previous patient.
Never use lag() conditionally (unless you have a very specific reason for that skip).
Do it like that:
lastdate = lag(deliv_date2);
if first.patient then lastdate = .;
PS if you attach a text file, also supply the code that reads it into SAS, so we get an exact replica of your dataset.
@Yilikal_Tesfaye wrote:
Hi,
I am working on the prescription dataset from a health organization. the dataset contains the following variables in the snapshot below. I want to count a number prescription per patient based on the following criteria,
- If the deliv_date2 is less than seven and the same ATC code for the individual patient then set a number of prescription is equal to one and sum DDD otherwise count the number of prescription.
Thank you in advance!!
Incomplete specification for the highlighted bit. I think many of us will assume less than 7 days given that it is a date but what specific other date(s) is it to be compared with?
You may be interested to know there is a function DIF that works like LAG to do the difference of the current value from the previous. So you can use:
datediff = dif( deliv_date2);
Same note about conditional use from @Kurt_Bremser about LAG applies.
Dear Sir/ Madam,
data thesis.presc2;
set thesis.presc;
by patient;
lastdate = lag(deliv_date2);
LAG_DDD = LAG(DDD);
if first.patient then lastdate = .;
datediff = deliv_date2-lastdate;
run;
data thesis.presc21;
set thesis.presc2;
by patient ;
If atc = lag(atc) and datediff < 7 then
Sum_DDD = sum(LAG_DDD,DDD);
run;
Try something like this:
data temp;
set have; by patient atc notsorted;
lagDate = lag(date);
if first.atc then seq + 1;
else if intck("day", lagDate, date) > 7 then seq +1;
drop lagdate;
run;
data want;
do until(last.seq);
set temp; by patient atc seq notsorted;
newDDD = sum(newDDD, DDD);
end;
drop seq DDD;
rename newDDD=DDD;
run;
I merged the last two posts into the older thread concerning the same issue, as there were already several answers given.
To repeat myself (what I usually hat to do): if you supply data in text form, also include the code that reads that text into a SAS dataset.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.