BookmarkSubscribeRSS Feed
Newtrix
Fluorite | Level 6

This code is supposed to count number of doctor visits per patient, which it does the first time it's executed. But in subsequent executions, the number for the first visit clears in the FIRST... THEN DO loop but continues to accrue for the subsequent visits. How do I get the program to accumulate these visits within each execution, but then clear them for the next time I run it? Thanks in advance ...

 

DATA bp_list;

SET bp_list END=last_obs;
BY subject;
RETAIN visit;
RETAIN subject_count 0;

IF FIRST.subject THEN DO;
visit=0;
subject_count=subject_count+1;
END;
IF sbp NE . THEN visit=visit+1;

IF _N_=1 THEN PUT @1 'ID'
@5 'DoB'
@15 'Age'
@23'Sex'
@30'Visit'
@40 'SBP'
@50'DBP' /;

IF FIRST.subject THEN PUT @1 subject @5 dob MMDDYY8. @15 age 3.0 @25 gender @;
IF sbp NE . THEN PUT @30 visit @40 sbp @50 dbp;
IF LAST.subject THEN PUT @3 'Subject ' subject ' has ' visit ' visits.' /;
IF last_obs THEN PUT @1 'The list includes ' subject_count ' subjects.';
RUNScreen Shot 2020-02-27 at 6.24.24 AM.png

10 REPLIES 10
mkeintz
PROC Star

Please provide some input data, in the form of a DATA step, and another DATA step with the desired output data, and a description of what unexpected data values you are generating with your program.

 

I presume by "first time it's executed" you mean the first subject, yes?   Frankly it seems to me that your program represents the objective you describe, which is why I ask to see data.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Newtrix
Fluorite | Level 6

Thanks for responding. The issue isn't with the first time I run it - it works for that. But the next time I run it, the visit number clears for the first visit for each subject, but continues to accumulate for the second, third and fourth subjects. The screenshot at the bottom is from the second time it was run: first visit is fine, but the second visit in each case becomes the third, the third the fourth, and so on. 

 

The data was merged from two files, several variables dropped, etc. - it's a lot to go thru if you don't need it. I can still provide if necessary but does the above clarify the issue?

 

andreas_lds
Jade | Level 19

The problem could be caused by the first two lines of your program:

data bp_list;
  set bp_list end=last_obs;

Overwriting a dataset is rarely a good idea, if you have to execute a step multiple times in one session.

Kurt_Bremser
Super User

This:

DATA bp_list;
SET bp_list END=last_obs;

is where your problem comes from.

NEVER (unless you are a very experienced hand who knows what you're doing) overwrite the input dataset in the same step.

Do this instead:

DATA bp_list_new;
SET bp_list END=last_obs;

and see how the data is changed subtly.

 

Kurt_Bremser
Super User

Oh yes, and you absolutely need to show us your original dataset (before you run this step). Post example data in a data  step with datalines (this is absolutely necessary!), use the "little running man" to post code.

Newtrix
Fluorite | Level 6

I do appreciate your help. This is the answer, I think - that I need to rename the dataset each time. It does require additional tweaking, though. When I renamed the set and ran it twice, I came up with the same issue. Is it because in the SET statement, I'm using the same bp_list dataset with the uncleared "visit" values?

 

Here's the entire, original code:

LIBNAME today '/folders/myfolders';

DATA demog;

INFORMAT dob MMDDYY10.;

FORMAT dob MMDDYY10.;

INPUT subject gender $ dob;

DATALINES;

1   F    03/03/1941

2   F    04/09/1945

3   M    07/04/1939

4   F    04/19/1955

5   M    10/06/1942

6   M    08/08/1938

7   F    01/16/1942

8   M    06/23/1938

9   F    05/15/1950

10  F    11/06/1934

11  M    10/04/1938

12  M    02/16/1955

13  M    02/15/1947

14  F    04/11/1937

15  F    04/07/1948

;

 

DATA bp;

INFORMAT visit_date MMDDYY10.;

FORMAT visit_date MMDDYY10.;

 

RETAIN subject;

INPUT temp 1 @;

IF temp NE .THEN INPUT @1 subject @;

INPUT @5 visit_date @20 sbp @30 dbp;

DROP temp;

 

DATALINES;

1   08/13/2002          134       105

    08/19/2002          136       112

    08/29/2002          134       114

2   09/20/2002          103        91

    09/28/2002          105        82

4   10/07/2002          146       113

    10/17/2002          142       118

5   08/31/2002          104        73

    09/10/2002          111        77

3   07/01/2002          148       106

    07/05/2002          146       107

;

RUN;

 

PROC SORT DATA=demog;

BY subject;

RUN;

 

PROC SORT DATA=bp;

BY subject;

RUN;

 

DATA all;

MERGE demog bp;

BY subject;

RUN;

 

DATA today.bp;

SET all (RENAME=(visit_date=dov));

           FORMAT dob MMDDYY10. age 4.1;

IF sbp NE .;

RUN;

 

DATA bp_list;

SET today.bp;

age=(dov-dob)/365.25;

PUT @1 subject @5 dob @20 age 4.1 @30 gender @35 sbp @45 dbp;

RUN;

 

PROC SORT DATA=today.bp OUT=bp_list;

BY subject dov;

RUN;

 

DATA bp_list;

SET bp_list END=last_obs;

 

BY subject;

 

RETAIN visit;

RETAIN subject_count 0;

 

age=(dov-dob)/365.25;

 

IF FIRST.subjectTHEN DO;

visit=0;

subject_count=subject_count+1;

END;

 

IF sbp NE .THEN visit=visit+1;

 

IF _N_=1THEN PUT@1'ID'

@5'DoB'

@15 'Age'

@23'Sex'

@30'Visit'

@40'SBP'

@50 'DBP' /;

 

IF FIRST.subjectTHEN PUT @1 subject @5 dob MMDDYY8. @15 age 3.0 @25 gender @;

 

IF sbp NE .THEN PUT @30 visit @40 sbp @50 dbp;

 

IF LAST.subjectTHEN PUT @3 'Subject ' subject ' has ' visit ' visits.' /;

 

IF last_obsTHEN PUT @1 'The list includes ' subject_count ' subjects.';

RUN;

Patrick
Opal | Level 21

Taking your code here what you're potentially looking for.

Below should at least give you some ideas like how to use the yrdif() function to calculate the age (also does it right for leap years).

DATA demog;
  INFORMAT dob MMDDYY10.;
  FORMAT dob MMDDYY10.;
  INPUT subject gender $ dob;
  DATALINES;
1   F    03/03/1941
2   F    04/09/1945
3   M    07/04/1939
4   F    04/19/1955
5   M    10/06/1942
6   M    08/08/1938
7   F    01/16/1942
8   M    06/23/1938
9   F    05/15/1950
10  F    11/06/1934
11  M    10/04/1938
12  M    02/16/1955
13  M    02/15/1947
14  F    04/11/1937
15  F    04/07/1948
;

DATA bp;
  INFORMAT visit_date MMDDYY10.;
  FORMAT visit_date MMDDYY10.;
  RETAIN subject;
  INPUT temp 1 @;

  IF temp NE . THEN INPUT @1 subject @;
  INPUT @5 visit_date @20 sbp @30 dbp;
  DROP temp;
  DATALINES;
1   08/13/2002          134       105
    08/19/2002          136       112
    08/29/2002          134       114
2   09/20/2002          103        91
    09/28/2002          105        82
4   10/07/2002          146       113
    10/17/2002          142       118
5   08/31/2002          104        73
    09/10/2002          111        77
3   07/01/2002          148       106
    07/05/2002          146       107
;

PROC SORT DATA=demog;
  BY subject;
RUN;

PROC SORT DATA=bp;
  BY subject visit_date;
RUN;

DATA _null_;
  file print;
  MERGE demog bp(in=inbp rename=(visit_date=dov)) end=last_obs;
  BY subject;
  if inbp;

  FORMAT dob MMDDYY10. age 4.1;
  age=yrdif(dob, dov, 'age');
  IF FIRST.subject THEN
    DO;
      visit=0;
      subject_count+1;
    END;
  visit+1;

  IF _N_=1 THEN PUT @1 'ID'  @5 'DoB'  @15 'Age'  @23 'Sex'  @ /;
  IF FIRST.subject THEN PUT @1 subject @5 dob MMDDYY8. @15 age 3.0 @25 gender @;
  IF LAST.subject THEN PUT @30 'Subject ' subject ' has ' visit ' visits.' /;
  IF last_obs THEN PUT @1 'The list includes ' subject_count ' subjects.';
RUN;

 

Newtrix
Fluorite | Level 6

Thanks! What caused my code not to clear? Also, I'm not getting the last line of code executed in my output file. I appreciate your help.

Tom
Super User Tom
Super User

The RETAIN statement just says that you don't want the data step to reset the variable to missing when it starts the next iteration of the data step.  You do not need to use a RETAIN statement for a variable that is being sourced from an input dataset.  All such variables are automatically retained.  But in general you never noticed because the retained value is immediately overwritten when the next observation is read from the source dataset.

 

What  others have tried to tell you is that by using the output of the first run as the input for the second run you are no longer going to get the same behavior for your "retained" variable.

 

The solution might be as simple as dropping the variable.  There is even an option you can change, DKRICOND , that you can change so that you could use the DROP= dataset option on your input dataset even if you are not sure if the variable you want to drop is there or not. Then you could run the same code against the same dataset name and not have it fail when it tries to drop the variable.

 

Here is a simple example that shows the issue.  Notice how in STEP3 the values of COUNTER are always one more than the value in STEP2 (except for the first observation per BY group).

 

But that the result in STEP4  when you don't allow the COUNTER variable to come in from STEP2 the counter is created the same it did when STEP2 was made.

proc sort data=sashelp.class out=step1 ;
  by sex;
run;

data step2;
  set step1 ;
  by sex ;
  if first.sex then counter=0;
  counter+1;
run;

data step3;
 set step2;
 by sex;
 if first.sex then counter=0;
 counter+1;
run;

proc compare data=step2 compare=step3;
run;

data step4;
  set step2(drop=counter);
  by sex;
 if first.sex then counter=0;
 counter+1;
run;

proc compare data=step2 compare=step4;
run;

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 10 replies
  • 1250 views
  • 0 likes
  • 6 in conversation