04-26-2016 07:44 AM - edited 04-26-2016 07:46 AM
I have a dataset which shows the number of packs of a drug administered to different patients over a series of months. Here is an example of the structure of the dataset:
data have; input ID _201510 _201511 _201512 _201601 _201602 _201603; cards; 3 1 1 . 2 1 1 8 1 . 1 1 2 2 17 . . 1 . . 1 ; run;
A patient receives a 30% discount on the first 4 packs, no discount on the 5th and 6th pack, and a 50% discount on any packs 7 and above.
Packs Discount Interval
1 – 4 30% 1
5 – 6 0 2
7 and above 50% 3
New data arrives every month, and at the end of each quarter, I do a follow-up to calculate the number of packs per interval that have been administered to a patient during the last three months. Specifically, I need help writing a program that achieves the following:
data want; input ID _201510 _201511 _201512 _201601 _201602 _201603 PackInt1 PackInt2 PackInt3; cards; 3 1 1 . 2 1 1 2 2 0 8 1 . 1 1 2 2 2 2 1 17 . . 1 . . 1 1 0 0 ; run;
Notice that, for patient ID 3, the value of PackInt1 is 2 rather than 4. The reason is that the first two packs have already been dealt with in the previous follow-up (2015Q3). Similarly, patient ID 17 has a value of 1 for PackInt1, because the first pack they received was also dealt with in the previous follow-up.
Any ideas on how I can achieve the above? I need a code that is flexible enough to handle the addition of a new column every month.
04-26-2016 09:24 AM
Not sure what this interval you speak of is, or why your first record has packint1=2, packint2=2, and packint3=., when there appear to be a count of six in the given data?
Whilst you can use arrays to do this, e.g:
data want; set have; total=sum(of _:): run;
However, and you will see this in a Lot of posts here, I first question why you have "data" as column headings. It makes your life so much harder, how many columns are there for instance, what are they called, what are the new ones etc. You will find you life much easier working with a normalised data structure. The reason, well firstly your data structure doesn't change each time, only the number of rows, so yuor code can be more generic. As you are dealing with drugs, you should be using CDISC standards anyways, and you will see they use normalised structures - and there are very good reasons for it.
Remember the data you use to program with does NOT need to be the same as you import/export from your process.
ID YM NUM_PACKS
data have; input ID _201510 _201511 _201512 _201601 _201602 _201603; cards; 3 1 1 . 2 1 1 8 1 . 1 1 2 2 17 . . 1 . . 1 ; run; proc transpose data=have out=t_have; by id; var _:; run; data want; set have; by id; retain packint:; if first.id then do; packint1=0; packint2=0; packint3=0; end; /* Do logical code here to add col1 onto packintX */ run;
04-26-2016 10:42 AM
04-26-2016 09:37 AM
04-26-2016 11:22 AM