Hi I am trying to get an output which is similar to creating lags but lag functions doesn't give me desired output. Please suggest how to get the output like shown below.
Mainly I want to create a new variable "l_no" which looks back to last year value from "no" variable by id and yr variable (if no value available for last year then missing).
data sin_x;
input id yr no;
datalines;
1 2015 41
1 2014 2
2 2012 3
2 2011 4
3 2010 7
3 2009 0
3 2008 4
4 2007 0
4 2006 7
;
Desired output:
id yr no l_no
1 2015 41 2
1 2014 2 .
2 2012 3 4
2 2011 4 .
3 2010 7 0
3 2009 0 4
3 2008 4 .
4 2007 0 7
4 2006 7 .
Appreciate your help.
data sin_x;
input id yr no;
datalines;
1 2015 41
1 2014 2
2 2012 3
2 2011 4
3 2010 7
3 2009 0
3 2008 4
4 2007 0
4 2006 7
;
run;
data want;
if _n_=1 then do;
if 0 then set sin_x(rename=(yr=_yr no=i_no));
declare hash h(dataset:'sin_x(rename=(yr=_yr no=i_no))');
h.definekey('id','_yr');
h.definedata('i_no');
h.definedone();
end;
call missing(of _all_);
set sin_x;
_yr=yr-1;
rc=h.find();
drop _: rc;
run;
Go for a left join:
proc sql;
create table desired as
select a.id, a.yr, b.no as l_no
from sin_x as a left join sin_x as b on a.id=b.id and a.yr=b.yr+1
order by id, yr desc;
select * from desired;
quit;
If you like a data step program here you go:
The maximum group size is set to 3. This can be changed to the actual max number. It can be programatically found.
data sin_x;
input id yr no;
datalines;
1 2015 41
1 2014 2
2 2012 3
2 2011 4
3 2010 7
3 2009 0
3 2008 4
4 2007 0
4 2006 7
;
run;
%let groupMax = 3;
data want;
array k[&groupMax] _temporary_;
do i = 1 by 1 until(last.id);
set sin_x;
by id;
k[i] = no;
end;
do j = 1 to dim(k) -1;
l_no = k[j + 1];
output;
end;
call missing(of k[*]);
drop i j;
run;
proc print data = want;
run;
Sort data by ID year.
Use Lag, pay attention to id
resort by id descending year
data sin_x;
input id yr no;
datalines;
1 2015 41
1 2014 2
2 2012 3
2 2011 4
3 2010 7
3 2009 0
3 2008 4
4 2007 0
4 2006 7
;
run;
data want;
if _n_=1 then do;
if 0 then set sin_x(rename=(yr=_yr no=i_no));
declare hash h(dataset:'sin_x(rename=(yr=_yr no=i_no))');
h.definekey('id','_yr');
h.definedata('i_no');
h.definedone();
end;
call missing(of _all_);
set sin_x;
_yr=yr-1;
rc=h.find();
drop _: rc;
run;
LAG does exactly what you want. You just have your data sorted incorrectly. Also remember to call LAG() on every observation.
proc sort data=have; by id yr; run;
data want ;
set have ;
by id yr ;
lag_no=lag(no);
if not first.id then l_no=lag_no;
run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.