FrankReynolds
Calcite | Level 5

Hi there, I am interested in modifying my dataset in a way that I believe needs a RETAIN statement, but it is not yielding what I want. I have a series of observations that each represent a doctor visit. One of the variables is a patient ID number (patient_ID). If the same patient ID number occurs in multiple observations, then the same person has had multiple visits to the doctor. There is also a variable that identifies the date the visit occurred (date). A third variable (evening) is a (1,0) flag, where 1 indicates a visit at or after 5:30.

I've sorted the observations by ascending patient ID number and then ascending visit date. I am interested only in the visit that occurs after 5:30, and any visits for that patient thereafter. So if the first two visits occur at 3:00 and 4:00, and the 3rd, 4th and 5th at 5:45, 2:00 and 12:00 respectively, I am only interested in the 3rd, 4th and 5th visits. I want to "mark" these visits by creating a new variable that =1 for these observations (=0 for the non-relevant observations). I'm thinking a RETAIN statement is the way to go, but it doesn't seem to be working:

data new;
  set old;
  by patient_id date;
  retain indicator;
  if first.patient_id and evening=1 then do;
    indicator = 1 ;
  end;
  else if evening = 1 then do;
    indicator2=1;
  end;
run;

I would think this code would carry down, or RETAIN, the 1 to the next data line. Instead, it seems to reset to 0. The only observations where indicator=1 are those where evening=1, even though all the subsequent visits for that patient should be marked as 1 too.

Hope it is clear, let me know if it ain't!

TM

gergely_batho
SAS Employee

In your code, indicator becomes 1 only if the first visit of a patient is an evening visit. Indicator2 becomes 1 if there is an evening visit that is not the first one. But indicator2 is not retained!

What about this:

data new;
  set old;
  by patient_id date;
  retain indicator;
  if first.patient_id then do;
    indicator = 0; /* resetting indicator at every group beginning */
  end;
  if evening=1 then do;
    indicator=1;
  end;
run;

FrankReynolds
Calcite | Level 5

Hi Gergely,

Thanks for the response. There is a typo in the code above - there should only be an 'indicator' variable, not 'indicator2'...Whenever I try to edit it, it bugs out so I'm going to let it be.

Let me try out your approach and see what happens...


FrankReynolds
Calcite | Level 5

It doesn't work. It sets indicator=0 for the first observation, and every subsequent observation to 1, whether evening=1 or not. I'm guessing it is because of these lines:

  if first.patient_id then do;
    indicator = 0; /* resetting indicator at every group beginning */

It is indeed making indicator=0 if it is the first. Therefore, if it is not the first, it =1 by default.

sbb
Lapis Lazuli | Level 10

Recommend adding PUTLOG '>DIAG-nnn' / _all_;   statements at various points in the DATA step to reveal just what you are getting condition-wise with not only your flag variables but also your BY variable processing.  That will help with desk-checking your DATA step flow.
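
As a rough sketch of what such diagnostics might look like on the step from this thread: the '>DIAG-001' and '>DIAG-002' labels below are arbitrary markers chosen for illustration, and the dataset and variable names are the ones posted earlier. Because _ALL_ also writes the automatic variables, FIRST.patient_id, LAST.patient_id and _N_ appear in the log next to the flag.

```sas
data new;
  set old;
  by patient_id date;
  retain indicator;

  /* State right after the observation is read, before any flag logic */
  putlog '>DIAG-001 after SET: ' _all_;

  if first.patient_id then indicator = 0;
  if evening = 1 then indicator = 1;

  /* State just before the observation is written to NEW */
  putlog '>DIAG-002 end of iteration: ' _all_;
run;
```

If INDICATOR already has a non-missing value on the DIAG-001 line for _N_=1, that value came from OLD and not from this step.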

Tom
Super User

There is nothing in your code that could set INDICATOR=0.  If you are seeing records where INDICATOR=0 then it must have been on your input dataset.

You need to either drop that old INDICATOR variable or use a new name for the new variable.

You cannot "RETAIN" a variable that is on the input data set because every time the SET statement executes it will read the value from the data set and not carry forward the value from the previous iteration of the data step.
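
A minimal sketch of the "drop the old variable" option, assuming OLD really does carry a stale INDICATOR column (the thread has not confirmed that at this point). The sample rows are illustrative, and the DROP= data set option is one of several ways to discard the column as the data is read.

```sas
/* Hypothetical input that already contains an INDICATOR column */
data old;
  input patient_id date :mmddyy10. evening indicator;
  format date yymmdd10.;
datalines;
1000 1.24.10 0 0
1000 2.20.10 0 0
1000 3.30.10 0 0
1000 5.06.10 1 1
1000 6.11.10 0 0
;

/* DROP= discards the stale column on input, so the INDICATOR
   assigned below is a new variable that RETAIN can carry forward */
data new;
  set old(drop=indicator);
  by patient_id date;
  retain indicator;
  if first.patient_id then indicator = 0;
  if evening = 1 then indicator = 1;
run;
```

Without the DROP= option, each execution of SET would overwrite INDICATOR with the value stored in OLD, which is the behaviour described above.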

FrankReynolds
Calcite | Level 5

if first.patient_id and evening=1 then do;
  indicator = 1 ;
end;

Wouldn't this make any first observation where evening is not equal to 0 have indicator=0, or at least '.'?

'Indicator' was not in the input data set, I believe I created it with this new dataset??

Tom
Super User

"I am interested only in the visit that occurs after 5:30, and any visits for that patient thereafter."


You need to reset the flag when you start a new patient.


data new;
  set old;
  by patient_id date;
  retain indicator;
  if first.patient_id then indicator=0;
  if evening = 1 then indicator=1;
run;

FrankReynolds
Calcite | Level 5

I understand your point, and it seems logical to me, but for some reason the indicator only =1 for the specific observation where evening=1. FYI, when I write "any visits for that patient thereafter", I don't mean only subsequent visits after 5:30. These visits could be at any time and date, as long as they occur after that specific evening=1 observation. Let me give you an example of my output when I use your code, for the first patient's set of visits:

PATIENT_ID    DATE       EVENING    INDICATOR
1000          1.24.10    0          0
1000          2.20.10    0          0
1000          3.30.10    0          0
1000          5.06.10    1          1
1000          6.11.10    0          0

The problem is that the last observation's indicator variable should =1, because it is a visit on a date that took place after the evening visit date. It should read:

PATIENT_ID    DATE       EVENING    INDICATOR
1000          1.24.10    0          0
1000          2.20.10    0          0
1000          3.30.10    0          0
1000          5.06.10    1          1
1000          6.11.10    0          1

Tom
Super User | Accepted Solution

Again the ONLY way that happens with the code posted is if the INDICATOR variable is ALREADY defined in the input dataset OLD.

Here is your data.

data old;
  input patient_id date evening ;
  informat date mmddyy10. ;
  format date yymmdd10.;
cards;
1000                    1.24.10        0                       0
1000                    2.20.10        0                       0
1000                    3.30.10        0                       0
1000                    5.06.10        1                       1
1000                    6.11.10        0                       0
;;;;

data new ;
  set old ;
  by patient_id date ;
  retain indicator ;
  if first.patient_id then indicator=0;
  if evening=1 then indicator=1;
  put (_all_) (=);
run;

patient_id=1000 date=2010-01-24 evening=0 indicator=0
patient_id=1000 date=2010-02-20 evening=0 indicator=0
patient_id=1000 date=2010-03-30 evening=0 indicator=0
patient_id=1000 date=2010-05-06 evening=1 indicator=1
patient_id=1000 date=2010-06-11 evening=0 indicator=1
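
One possible way to check whether INDICATOR is already defined in the input dataset is to list its variables before running the step. This is only a sketch: it assumes the dataset is WORK.OLD, which the thread does not state, so the library name may need adjusting.

```sas
/* List the variables stored in OLD; an INDICATOR entry in the
   variable list means the column already exists on input */
proc contents data=old;
run;

/* Alternative: query the dictionary table for just that column
   (assumes OLD lives in the WORK library) */
proc sql;
  select name, type, length
    from dictionary.columns
    where libname = 'WORK'
      and memname = 'OLD'
      and upcase(name) = 'INDICATOR';
quit;
```

If the query returns a row, the RETAIN in the DATA step is being overridden by the value read from OLD on every iteration.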

FrankReynolds
Calcite | Level 5

Right...I'm starting to think it is something funky with the old dataset too. I've found a way around it for now, but I will go back and try again...More curious than anything.

Thanks Tom.

FrankReynolds
Calcite | Level 5

That worked! Thank you...
