BookmarkSubscribeRSS Feed
SWEETSAS
Obsidian | Level 7

I have dataset  by visit in which I am trying to use the observations for the previous record for the current visit. Below are example  data (HAVE) for three subid. When correctly done it should look like data WANT. That is, the records for visit=0 is carried over to visit=1; the record for visit=1 is carried over to visit=2, etc. Because the data for  visit=0 will be missing when carried over to the visit=1, the data for visit=0 will retain its value and it is also used for visit=1. The last record within each subjid will vanish. 

 

This is needed for all variables in the dataset. The LAG function is not giving the correct output as the LAG function is not pushing the observations by subjid.

 

data have;
input subjid $ vis var1 var2;
datalines;
001-001 0 4 5
001-001 1 3 6
001-001 2 2 9
001-001 3 5 7

001-002 0 5 7
001-002 1 8 9
001-002 2 6 2
001-002 3 9 4

001-003 0 2 7
001-003 1 6 9
001-003 2 5 2
;

 

data want;
input subjid $ vis var1 var2 var1new var2new;
datalines;
001-001 0 4 5 4 5
001-001 1 3 6 4 5
001-001 2 2 9 3 6
001-001 3 5 7 2 9

001-002 0 5 7 5 7
001-002 1 8 9 5 7
001-002 2 6 2 8 9
001-002 3 9 4 8 2

001-003 0 2 7 2 7
001-003 1 6 9 2 7
001-003 2 5 2 6 9
;
run;

4 REPLIES 4
PaigeMiller
Diamond | Level 26
/* UNTESTED CODE */
data want;
    set have;
    by subjid;
    prev_var1=lag(var1);
    prev_var2=lag(var2);
    if first.subjid then do;
        var1new=var1;
        var2new=var2;
    end;
    else do;
        var1new=prev_var1;
        var2new=prev_var2;
    end;
    drop prev_:;
run;

As a comment, I think a better naming scheme is varnew1 and varnew2 or newvar1 and newvar2. When you use this naming scheme, you can easily refer to all of them via a list or via varnew: or newvar: and if you leave the naming scheme as you stated it, then you have to type out the variable names, you can't use a list or the colon to refer to them.

--
Paige Miller
mkeintz
PROC Star

This is a good example for using the lag function embedded in an IFN function:

 

data want;
  set have;
  by subjid;
  var1new=ifn(first.subjid,var1,lag(var1));
  var2new=ifn(first.subjid,var2,lag(var2));
run;

Because the lag function is inside an IFN function, it is always evaluated, even though it is not always selected based on the first argument of the ifn.   That is a good thing, since you want the queue of values underlying the lag to always be synchronized with each new observation - i.e. the beginning of each new id get the lagged value from the end of the prior id.  But the IFN will ignore that instance.

 

BTW, I think you have an erroneous value for var1new in the 8th record.  You have 8, but I think it should be 6.

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 1865 views
  • 0 likes
  • 3 in conversation