SAS Programming

SWEETSAS · Posted 10-17-2020 06:13 PM

I have dataset by visit in which I am trying to use the observations for the previous record for the current visit. Below are example data (HAVE) for three subid. When correctly done it should look like data WANT. That is, the records for visit=0 is carried over to visit=1; the record for visit=1 is carried over to visit=2, etc. Because the data for visit=0 will be missing when carried over to the visit=1, the data for visit=0 will retain its value and it is also used for visit=1. The last record within each subjid will vanish.

This is needed for all variables in the dataset. The LAG function is not giving the correct output as the LAG function is not pushing the observations by subjid.

data have;
input subjid $ vis var1 var2;
datalines;
001-001 0 4 5
001-001 1 3 6
001-001 2 2 9
001-001 3 5 7

001-002 0 5 7
001-002 1 8 9
001-002 2 6 2
001-002 3 9 4

001-003 0 2 7
001-003 1 6 9
001-003 2 5 2
;

data want;
input subjid $ vis var1 var2 var1new var2new;
datalines;
001-001 0 4 5 4 5
001-001 1 3 6 4 5
001-001 2 2 9 3 6
001-001 3 5 7 2 9

001-002 0 5 7 5 7
001-002 1 8 9 5 7
001-002 2 6 2 8 9
001-002 3 9 4 8 2

001-003 0 2 7 2 7
001-003 1 6 9 2 7
001-003 2 5 2 6 9
;
run;

PaigeMiller · Posted 10-17-2020 06:47 PM

/* UNTESTED CODE */
data want;
    set have;
    by subjid;
    prev_var1=lag(var1);
    prev_var2=lag(var2);
    if first.subjid then do;
        var1new=var1;
        var2new=var2;
    end;
    else do;
        var1new=prev_var1;
        var2new=prev_var2;
    end;
    drop prev_:;
run;

As a comment, I think a better naming scheme is varnew1 and varnew2 or newvar1 and newvar2. When you use this naming scheme, you can easily refer to all of them via a list or via varnew: or newvar: and if you leave the naming scheme as you stated it, then you have to type out the variable names, you can't use a list or the colon to refer to them.

--
Paige Miller

SWEETSAS · Posted 10-19-2020 09:23 AM

Thanks a million @PaigeMiller.

mkeintz · Posted 10-17-2020 11:33 PM

This is a good example for using the lag function embedded in an IFN function:

data want;
  set have;
  by subjid;
  var1new=ifn(first.subjid,var1,lag(var1));
  var2new=ifn(first.subjid,var2,lag(var2));
run;

Because the lag function is inside an IFN function, it is always evaluated, even though it is not always selected based on the first argument of the ifn. That is a good thing, since you want the queue of values underlying the lag to always be synchronized with each new observation - i.e. the beginning of each new id get the lagged value from the end of the prior id. But the IFN will ignore that instance.

BTW, I think you have an erroneous value for var1new in the 8th record. You have 8, but I think it should be 6.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

SWEETSAS · Posted 10-19-2020 09:24 AM

Many thanks @mkeintz

SAS Programming

Using Previous Observations for Current Visits by Group

Re: Using Previous Observations for Current Visits by Group

Re: Using Previous Observations for Current Visits by Group

Re: Using Previous Observations for Current Visits by Group

Re: Using Previous Observations for Current Visits by Group

Sas code to compare previous day files using import.

Flag all observations on the first visit date

Adding prefix to the visit based on previous or upcoming visits

subtracting number by previous value by group

Conditional statements and grouping observations

Follow Us

What is...

SAS Programming

Our biggest data and AI event of the year.

SAS Training: Just a Click Away

Follow Us

What is...