BookmarkSubscribeRSS Feed
112211
Obsidian | Level 7


data svn;
input id sv vt ;
informat sv vt date9.;
cards ;
101 01jan2020 31jan2020
101 15feb2020 28feb2020
101 11mar2020 31mar2020
102 01apr2020 30apr2020
102 18may2020 31may2020
102 22sep2020 30sep2020
;

 

From using above data i want below mentioned output :

 

ID      SV                          VT                   lag                         tot

101      01jan2020        31jan2020          .       

101       15feb2020      28feb2020       31JAN2020         15

101       11mar2020       31mar2020      28FEB2020        12

102       01apr2020      30apr2020          .                             .

102       18may2020     31may2020     30APR2020            18

102       22sep2020      30sep2020      31MAY2020            114

2 REPLIES 2
mkeintz
PROC Star

There's a good argument to be made that the LAG function is misleadingly named.  Folks familiar with spreadsheets, for example, just think of the LAG function as a way to look back 1 row (or 2 rows for a 2-period lag, etc.).  So you might naturally expect that the syntax below, using an IF ... THEN ...  ELSE construct.

  if first.id then y=.;
  else y=lag(x);

 

This works for all other functions, but will yield unexpected results for the lag function (and the related dif function).  You will discover that the second obs for all id's but the first id will get a Y value taken from the last obs of the preceding ID, rather than from the first obs of the same ID.

 

That's because the unfortunately named LAG function is not a lookback - it is a FIFO queue update (a queue of length 1 in this case).  Think of the queue as a retained value (or retained values for lag2+).  For most purposes, including this task, that queue of retained value(s) has to be updated for every observation even if the value in the queue is not going to be used.  So many folks use this 4-statement structure:

 

 

lx = lag(x);
if first.id) then y=.;
else y=lx;
drop lx;

 

 

But you can instead do the above in one statement using the IFN (or IFC) function, as in

 

 

y=ifn(first.id,.,lag(x));

 

 

This works because the IFN always evaluates both the second and third arguments, regardless of whether the first argument is true or false.  The lag-generated queue is therefore always updated, as is needed in this case.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 389 views
  • 0 likes
  • 3 in conversation