BookmarkSubscribeRSS Feed
RandyStan
Fluorite | Level 6

Dear All:

 

My data is as follows:

 

 

Date                                       ID                  Lag_ID             

21FEB2014                            1                   .

21FEB2014                            1                   1

21FEB2014                             1                  1                   

21FEB2014                           1                     1

21FEB2014                           2                     1

21FEB2014                            3                    2

21FEB2014                           3                    3

21FEB2014                           3                   3

 

What I want is the first observation of lag_ID for each ID be '.'

so In need

 

Date                                       ID                  Lag_ID             

21FEB2014                            1                   .

21FEB2014                            1                   1

21FEB2014                             1                  1                   

21FEB2014                           1                     1

21FEB2014                           2                     .

21FEB2014                            3                    .

21FEB2014                           3                    3

21FEB2014                           3                   3

 

i wrote 

data want; set have;

by ID;

if first.ID then Lag_ID = . ;

run;

Am I making a mistake?

  Randy

4 REPLIES 4
PaigeMiller
Diamond | Level 26

What happens if you run that code? Does it do what you want?

--
Paige Miller
mkeintz
PROC Star

Whenever you use a lag function it is updating a queue - i.e. a stack of values in memory, that is retained from observation to observation.  Often the queue is only 1 value in length  (i.e. lag(x)).  But it could be 200 in length (lag200(x)).

 

So ordinarily putting a lag function in a if condition then y=lag(x) will only update the queue when the condition is true.  I.e. it is NOT a look back like in excel.

 

So this would be bad:

 

data want;
  set have;
  by id;
  if not(first.id) then lag_id=lag(id);
run;

because it would contaminate the second instance each id with the value of the preceding id.  Why?  Because the queue was not updated when the first.id case was read.  So a common approach is:

 

 

 

data want;
  set have;
  by id;
  lag_id=lag(id);
  if first.id then lag_id=.;
run;

 

That is ALWAYS UPDATE THE QUEUE, but revise the lag_id variable when necessary.

 

Another approach is to use the IFN function wrapped around a LAG function:

 

data want;
  set have;
  by id;
  lag_id=ifn(first.id,.,lag(id));
run;

The reason this works is because, regardless of whether the first argument is true, both of the alternatives in arguments 2 and 3 are calculated.  Then one of those alternatives is returned by the IFN function, depending upon the first argument.  The benefit in this case is that the lag queue is always updated, but only selectively returned by ifn.

 

 

    lag_id=ifn(first.id,

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
mkeintz
PROC Star

And I shouldn't even have bothered with the lag discourse.  In your case:

 

data want;
  set have;
  by id;
  if first.id=0 then lag_id=id;
run;

which leaves lag_id as missing for the beginning of each id.

 

 

This begs the question - since you already have the first.id dummy variable available, why do you need the lag_id variable?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
VDD
Ammonite | Level 13 VDD
Ammonite | Level 13

@RandyStanhas

What I want is the first observation of lag_ID for each ID be '.'

so In need

 

Date                                       ID                  Lag_ID             

21FEB2014                            1                   .

21FEB2014                            1                   1

21FEB2014                             1                  1                   

21FEB2014                           1                     1

21FEB2014                           2                     .

21FEB2014                            3                    .

21FEB2014                           3                    3

21FEB2014                           3                   3

 

based on the incoming data @RandyStan has the second to the last record fails your request in your sample output data.  it should reset Lag_id reset . based on the request.

 

data have;
input Date  date9.  ID  Lag_ID;
cards; 
21FEB2014                            1                   .
21FEB2014                            1                   1
21FEB2014                             1                  1  
21FEB2014                           1                     1
21FEB2014                           2                     1
21FEB2014                            3                    2
21FEB2014                           3                    3
21FEB2014                           3                   3
;
proc sort data=have;
	by id lag_id date;
run;
data want;
	set have;
	by id lag_id;
	if first.lag_id then lag_id = .;
run;

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 669 views
  • 0 likes
  • 4 in conversation