BookmarkSubscribeRSS Feed
udden2903
Obsidian | Level 7

I have a dataset which shows the number of packs of a drug administered to different patients over a series of months. Here is an example of the structure of the dataset:

 

data have;

input ID _201510 _201511 _201512 _201601 _201602 _201603;

cards;

3 1 1 . 2 1 1

8 1 . 1 1 2 2

17 . . 1 . . 1

;

run;

 

 

A patient receives a 30% discount on the first 4 packs, no discount on the 5th and 6th pack, and a 50% discount on any packs 7 and above.

 

Packs                             Discount    Interval

1 – 4                               30%            1

5 – 6                               0                  2

7 and above                 50%            3

 

New data arrives every month, and at the end of each quarter, I do a follow-up to calculate the number of packs per interval that have been administered to a patient during the last three months. Specifically, I need help writing a program that achieves the following:

 

data want;

input ID _201510 _201511 _201512 _201601 _201602 _201603 PackInt1 PackInt2 PackInt3;

cards;

3 1 1 . 2 1 1 2 2 0

8 1 . 1 1 2 2 2 2 1

17 . . 1 . . 1 1 0 0

;

run;

 

 

Notice that, for patient ID 3, the value of PackInt1 is 2 rather than 4. The reason is that the first two packs have already been dealt with in the previous follow-up (2015Q3). Similarly, patient ID 17 has a value of 1 for PackInt1, because the first pack they received was also dealt with in the previous follow-up.

 

Any ideas on how I can achieve the above? I need a code that is flexible enough to handle the addition of a new column every month.

4 REPLIES 4
RW9
Diamond | Level 26 RW9
Diamond | Level 26

Hi,

 

Not sure what this interval you speak of is, or why your first record has packint1=2, packint2=2, and packint3=., when there appear to be a count of six in the given data?

 

Whilst you can use arrays to do this, e.g:

data want; 
  set have;
  total=sum(of _:):
run;
  

However, and you will see this in a Lot of posts here, I first question why you have "data" as column headings.  It makes your life so much harder, how many columns are there for instance, what are they called, what are the new ones etc.  You will find you life much easier working with a normalised data structure.  The reason, well firstly your data structure doesn't change each time, only the number of rows, so yuor code can be more generic.  As you are dealing with drugs, you should be using CDISC standards anyways, and you will see they use normalised structures - and there are very good reasons for it.

 

Remember the data you use to program with does NOT need to be the same as you import/export from your process.

ID   YM    NUM_PACKS

...

For example:

data have;
  input ID _201510 _201511 _201512 _201601 _201602 _201603;
cards;
3 1 1 . 2 1 1
8 1 . 1 1 2 2
17 . . 1 . . 1
;
run;
proc transpose data=have out=t_have;
  by id;
  var _:;
run;
data want;
  set have;
  by id;
  retain packint:;
  if first.id then do;
    packint1=0;
    packint2=0;
    packint3=0;
  end;
  /* Do logical code here to add col1 onto packintX */ 
run;
udden2903
Obsidian | Level 7
I will change the column headers to month1, month2 etc. to make the dataset easier to deal with. I explain in the last part of my post why PackInt1 is 2. The reason is that the first two packs were administered in the previous quarter (i.e. 2015Q4). Those packs have already been handled in the previous quarter and I don't want to see them in the PackInt1 column when I'm following up on the 2016Q1 data.
pearsoninst
Pyrite | Level 9
I really dont understand what is the question..

data have;
input ID _201510 _201511 _201512 _201601 _201602 _201603;
if ID < 4 then
PackInt1 = '30% or 4 packs';
if ID = 5 or ID = 6 then
PackInt1 = 'no Dis';
if ID >= 7 then
PackInt1 = ' 50% DIS';

cards;
3 1 1 . 2 1 1
8 1 . 1 1 2 2
17 . . 1 . . 1
;
run;
udden2903
Obsidian | Level 7
The question is how to identify the number of packs in each interval that have been administered to a patient in the current quarter. If a patient had already received four packs in the previous quarter, by definition that means that the patient cannot have received any more packs from interval 1 in the current quarter. That is why the value of PackInt1 should be 0 for the same patient, even though historically they have received four packs.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1209 views
  • 0 likes
  • 3 in conversation