Hello all,
i hope you can help me.
I have a dataset containing information of prescription filled by an individual. An individual can have several observations depending on the number of prescriptions.
I need to find out:
1. How many drugs do one individual use during the whole time period
2. How many drugs do one individual use during one month
3. What is the length of use
data have:
id prescriptiondate drug
1 01012013 A
2 01042018 B
2 04052018 C
3 10102016 A
3 11102016 B
3 12122016 A
output want:
1.
number of drugs used number of individuals
1 X
2 X
3 X
4 X
2. Same as one, just within one month
3. mean, median, p25, p25, std
data have; input id $ prescriptiondate :ddmmyy10. drug $; format prescriptiondate ddmmyy10.; datalines; 1 01012013 A 1 12032013 A 1 13032013 B 2 01042018 B 2 04052018 C 2 04062018 C 2 05072018 C 2 03082018 C 3 10102016 A 3 11102016 B 3 12122016 A ; proc sql; create table want as select *, count(distinct drug) as count, (select count(drug) from have where id=a.id and prescriptiondate between intnx('month',a.prescriptiondate,-1,'s') and a.prescriptiondate) as count_one_month , range(a.prescriptiondate) as duration from have as a group by id order by 1,2; quit;
Best is to provide example data in the from of a data step that we can run.
Second, you should provide an example of the result given that example data. Reason: it is not clear whether you mean "unique drugs" only count a specific drug one time for each interval or each occurrence. Example:
Suppose that you have ( assumes that your date is in Month/day/year order) and you have two records like:
1 01012013 A
1 01152013 A
So does this person count 1 or 2 for prescription A in Jan 2013?
Third, you need tell us if your dates are in month/day/year or day/month/year order. Minor but we have a 50/50 chance of guessing wrong because none of your values tell us and the approach to report by month will make that difference critical.
Fourth your example data should demonstrate something that counts more than 1 per month to make sure we see what you are intending.
Fifth, do need a report that people read or data set for further manipulation as your final output?
Start: Here is an example data set using data step. Add some more records to make the example a bit more complex and a second to show the result. If you paste code into a text box opened using the </> icon above the message box it is clearer as to the intent and the text box will maintain code format such as indenting that is typically removed by the message window making it a tad harder to follow code. If your dates are day/month/year use DDMMYY10 in the code below. If your data does not have dates as actual dates that will be needed to apply your "month" logic.
data have; input id $ prescriptiondate :mmddyy10. drug $; format prescriptiondate mmddyy10.; datalines; 1 01012013 A 2 01042018 B 2 04052018 C 3 10102016 A 3 11102016 B 3 12122016 A ;
Hint: Run Proc Contents on your existing data set and actually show us the type and current format assigned to your PrescriptionDate variable. If it is not an actual SAS date value that will need to be addressed.
Hi Ballard,
I am sorry for the missing information, in the following, i will try to answer my best:
I need to count unique drugs, meaning that if a person have a prescription several times for the same drug, it should be counted as "one".
I am sorry about the date format. It is a SAS date in the format of day/month/year
I have added some more records to the dataset and included a "want" dataset. The struggle is though, that i am not sure how it will look like in a dataset. I just know, that i need the results to be a table like presented in the initial question, and for the count within one month, the result must be the maximum count during one month, regardless of period, meaning that if a participant uses several medications for three months, but only uses two within one of the months, then the result should reflect two.
I hope it makes sense.
data have; input id $ prescriptiondate :ddmmyy10. drug $; format prescriptiondate ddmmyy10.; datalines; 1 01012013 A
1 12032013 A
1 13032013 B 2 01042018 B 2 04052018 C
2 04062018 C
2 05072018 C
2 03082018 C 3 10102016 A 3 11102016 B 3 12122016 A
data want;
input id $ prescriptiondate :ddmmyy10. drug $ count 1. count_one_month 1. duration 5.;
format prescriptiondate ddmmyy10.;
datalines;
1 01012013 A 2 1 71
1 12032013 A 2 1 71
1 13032013 B 2 2 71
2 01042018 B. 2 1 124
2 04052018 C 2 1 124
2 04062018 C 2 2 124
2 05072018 C 2 1 124
2 03082018 C 2 2 124
3 10102016 A 2 1 64
3 11102016 B 2 2 64
3 12122016 A 2 1 64
data have; input id $ prescriptiondate :ddmmyy10. drug $; format prescriptiondate ddmmyy10.; datalines; 1 01012013 A 1 12032013 A 1 13032013 B 2 01042018 B 2 04052018 C 2 04062018 C 2 05072018 C 2 03082018 C 3 10102016 A 3 11102016 B 3 12122016 A ; proc sql; create table want as select *, count(distinct drug) as count, (select count(drug) from have where id=a.id and prescriptiondate between intnx('month',a.prescriptiondate,-1,'s') and a.prescriptiondate) as count_one_month , range(a.prescriptiondate) as duration from have as a group by id order by 1,2; quit;
If you want pick up the max value between count and count_one_month.
proc sql; create table want as select *, count(distinct drug) as count, (select count(drug) from have where id=a.id and prescriptiondate between intnx('month',a.prescriptiondate,-1,'s') and a.prescriptiondate) as count_one_month , max(calculated count,calculated count_one_month) as max_count, range(a.prescriptiondate) as duration from have as a group by id order by 1,2; quit;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.