<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Choosing a lagged value of a variable each time it &amp;quot;updates&amp;quot; in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Choosing-a-lagged-value-of-a-variable-each-time-it-quot-updates/m-p/726079#M225612</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/303550"&gt;@elbarto&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I have a dataset, an excerpt of it is given below:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA have;
id	year	output
;
DATALINES;
1001	1987	.
1001	1988	.
1001	1989	.
1001	1990	5
1001	1991	.
1001	1992	.
1001	1993	.
1001	1994	34
1001	1995	22
1001	1996	33
1001	1997	15
1001	1998	.
1001	1999	.
1001	2000	.
1001	2001	23
1001	2002	.
1001	2003	45
1001	2004	23
1001	2005	12
1001	2006	.
1001	2007	.
1001	2008	.
1001	2009	.
1001	2010	2
1001	2011	.
1001	2012	.
1001	2013	56
1001	2014	.
1001	2015	.
1001	2016	.
1001	2017	.
1001	2018	23
1001	2019	.
1002	1987	34
1002	1988	.
1002	1989	12
1002	1990	13
1002	1991	55
1002	1992	32
1002	1993	.
1002	1994	.
1002	1995	54
1002	1996	64
1002	1997	.
1002	1998	.
1002	1999	23
1002	2000	.
1002	2001	.
1002	2002	.
1002	2003	.
1002	2004	64
1002	2005	12
1002	2006	.
1002	2007	.
1002	2008	.
1002	2009	.
1002	2010	3
1002	2011	.
1002	2012	.
1002	2013	.
1002	2014	.
1002	2015	.
1002	2016	.
1002	2017	4
1002	2018	.
1002	2019	12
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Each id always has year going from 1987 to 2019 (no gaps). I want to "lag" the output variable to produce the following dataset&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA want;
id	year	output	output_prev;
DATALINES;
1001	1987	.	.
1001	1988	.	.
1001	1989	.	.
1001	1990	5	.
1001	1991	.	5
1001	1992	.	5
1001	1993	.	5
1001	1994	34	5
1001	1995	22	34
1001	1996	33	22
1001	1997	15	33
1001	1998	.	15
1001	1999	.	15
1001	2000	.	15
1001	2001	23	15
1001	2002	.	23
1001	2003	45	23
1001	2004	23	45
1001	2005	12	23
1001	2006	.	12
1001	2007	.	12
1001	2008	.	12
1001	2009	.	12
1001	2010	2	12
1001	2011	.	2
1001	2012	.	2
1001	2013	56	2
1001	2014	.	56
1001	2015	.	56
1001	2016	.	56
1001	2017	.	56
1001	2018	23	56
1001	2019	.	23
1002	1987	34	.
1002	1988	.	34
1002	1989	12	34
1002	1990	13	12
1002	1991	55	13
1002	1992	32	55
1002	1993	.	32
1002	1994	.	32
1002	1995	54	32
1002	1996	64	54
1002	1997	.	64
1002	1998	.	64
1002	1999	23	64
1002	2000	.	23
1002	2001	.	23
1002	2002	.	23
1002	2003	.	23
1002	2004	64	23
1002	2005	12	64
1002	2006	.	12
1002	2007	.	12
1002	2008	.	12
1002	2009	.	12
1002	2010	3	12
1002	2011	.	3
1002	2012	.	3
1002	2013	.	3
1002	2014	.	3
1002	2015	.	3
1002	2016	.	3
1002	2017	4	3
1002	2018	.	4
1002	2019	12	4
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;It seems simple but I can't seem to do it. I want to lag the output variable so that each year for a given id always takes the "latest" value of output. As an example, for id=1001, year=1991, output_prev=5 because that was the latest value of output (last updated in year 1990). This continues until we reach year 1995, where output_prev changes to 34 because the latest value of output is 34 (last updated in year 1994).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Please make sure that your example data step code runs. You are missing an INPUT statement.&lt;/P&gt;
&lt;P&gt;For this data you actually want a RETAIN to hold the value across iterations of the data step and know when to write the output record to the data. (BTW naming a variable OUTPUT makes this somewhat confusing as Output is a programming statement as well and could lead to all sorts of miscommunication).&lt;/P&gt;
&lt;P&gt;This seems to create your "want" from the example.&lt;/P&gt;
&lt;PRE&gt;DATA have;
input id	year	output
;
DATALINES;
1001	1987	.
1001	1988	.
1001	1989	.
1001	1990	5
1001	1991	.
1001	1992	.
1001	1993	.
1001	1994	34
1001	1995	22
1001	1996	33
1001	1997	15
1001	1998	.
1001	1999	.
1001	2000	.
1001	2001	23
1001	2002	.
1001	2003	45
1001	2004	23
1001	2005	12
1001	2006	.
1001	2007	.
1001	2008	.
1001	2009	.
1001	2010	2
1001	2011	.
1001	2012	.
1001	2013	56
1001	2014	.
1001	2015	.
1001	2016	.
1001	2017	.
1001	2018	23
1001	2019	.
1002	1987	34
1002	1988	.
1002	1989	12
1002	1990	13
1002	1991	55
1002	1992	32
1002	1993	.
1002	1994	.
1002	1995	54
1002	1996	64
1002	1997	.
1002	1998	.
1002	1999	23
1002	2000	.
1002	2001	.
1002	2002	.
1002	2003	.
1002	2004	64
1002	2005	12
1002	2006	.
1002	2007	.
1002	2008	.
1002	2009	.
1002	2010	3
1002	2011	.
1002	2012	.
1002	2013	.
1002	2014	.
1002	2015	.
1002	2016	.
1002	2017	4
1002	2018	.
1002	2019	12
;

/* assumes sorted by ID, if not*/
proc sort data=have;
   by id year;
run;

data want;
   set have;
   retain out_prev;
   by id;
   if first.id then call missing(out_prev);
   output;
   if not missing(output) then out_prev=output;
run;
&lt;/PRE&gt;
&lt;P&gt;Retain keeps values of variables across the data step boundary.&lt;/P&gt;
&lt;P&gt;Placing the OUTPUT statement to write to the data before assigning the "new" value for output (see what I meant about confusion) means the the record is written with the last value of output_prev.&lt;/P&gt;</description>
    <pubDate>Sun, 14 Mar 2021 06:20:28 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2021-03-14T06:20:28Z</dc:date>
    <item>
      <title>Choosing a lagged value of a variable each time it "updates"</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Choosing-a-lagged-value-of-a-variable-each-time-it-quot-updates/m-p/726076#M225609</link>
      <description>&lt;P&gt;I have a dataset, an excerpt of it is given below:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA have;
id	year	output
;
DATALINES;
1001	1987	.
1001	1988	.
1001	1989	.
1001	1990	5
1001	1991	.
1001	1992	.
1001	1993	.
1001	1994	34
1001	1995	22
1001	1996	33
1001	1997	15
1001	1998	.
1001	1999	.
1001	2000	.
1001	2001	23
1001	2002	.
1001	2003	45
1001	2004	23
1001	2005	12
1001	2006	.
1001	2007	.
1001	2008	.
1001	2009	.
1001	2010	2
1001	2011	.
1001	2012	.
1001	2013	56
1001	2014	.
1001	2015	.
1001	2016	.
1001	2017	.
1001	2018	23
1001	2019	.
1002	1987	34
1002	1988	.
1002	1989	12
1002	1990	13
1002	1991	55
1002	1992	32
1002	1993	.
1002	1994	.
1002	1995	54
1002	1996	64
1002	1997	.
1002	1998	.
1002	1999	23
1002	2000	.
1002	2001	.
1002	2002	.
1002	2003	.
1002	2004	64
1002	2005	12
1002	2006	.
1002	2007	.
1002	2008	.
1002	2009	.
1002	2010	3
1002	2011	.
1002	2012	.
1002	2013	.
1002	2014	.
1002	2015	.
1002	2016	.
1002	2017	4
1002	2018	.
1002	2019	12
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Each id always has year going from 1987 to 2019 (no gaps). I want to "lag" the output variable to produce the following dataset&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA want;
id	year	output	output_prev;
DATALINES;
1001	1987	.	.
1001	1988	.	.
1001	1989	.	.
1001	1990	5	.
1001	1991	.	5
1001	1992	.	5
1001	1993	.	5
1001	1994	34	5
1001	1995	22	34
1001	1996	33	22
1001	1997	15	33
1001	1998	.	15
1001	1999	.	15
1001	2000	.	15
1001	2001	23	15
1001	2002	.	23
1001	2003	45	23
1001	2004	23	45
1001	2005	12	23
1001	2006	.	12
1001	2007	.	12
1001	2008	.	12
1001	2009	.	12
1001	2010	2	12
1001	2011	.	2
1001	2012	.	2
1001	2013	56	2
1001	2014	.	56
1001	2015	.	56
1001	2016	.	56
1001	2017	.	56
1001	2018	23	56
1001	2019	.	23
1002	1987	34	.
1002	1988	.	34
1002	1989	12	34
1002	1990	13	12
1002	1991	55	13
1002	1992	32	55
1002	1993	.	32
1002	1994	.	32
1002	1995	54	32
1002	1996	64	54
1002	1997	.	64
1002	1998	.	64
1002	1999	23	64
1002	2000	.	23
1002	2001	.	23
1002	2002	.	23
1002	2003	.	23
1002	2004	64	23
1002	2005	12	64
1002	2006	.	12
1002	2007	.	12
1002	2008	.	12
1002	2009	.	12
1002	2010	3	12
1002	2011	.	3
1002	2012	.	3
1002	2013	.	3
1002	2014	.	3
1002	2015	.	3
1002	2016	.	3
1002	2017	4	3
1002	2018	.	4
1002	2019	12	4
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;It seems simple but I can't seem to do it. I want to lag the output variable so that each year for a given id always takes the "latest" value of output. As an example, for id=1001, year=1991, output_prev=5 because that was the latest value of output (last updated in year 1990). This continues until we reach year 1995, where output_prev changes to 34 because the latest value of output is 34 (last updated in year 1994).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 14 Mar 2021 05:59:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Choosing-a-lagged-value-of-a-variable-each-time-it-quot-updates/m-p/726076#M225609</guid>
      <dc:creator>elbarto</dc:creator>
      <dc:date>2021-03-14T05:59:04Z</dc:date>
    </item>
    <item>
      <title>Re: Choosing a lagged value of a variable each time it "updates"</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Choosing-a-lagged-value-of-a-variable-each-time-it-quot-updates/m-p/726079#M225612</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/303550"&gt;@elbarto&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I have a dataset, an excerpt of it is given below:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA have;
id	year	output
;
DATALINES;
1001	1987	.
1001	1988	.
1001	1989	.
1001	1990	5
1001	1991	.
1001	1992	.
1001	1993	.
1001	1994	34
1001	1995	22
1001	1996	33
1001	1997	15
1001	1998	.
1001	1999	.
1001	2000	.
1001	2001	23
1001	2002	.
1001	2003	45
1001	2004	23
1001	2005	12
1001	2006	.
1001	2007	.
1001	2008	.
1001	2009	.
1001	2010	2
1001	2011	.
1001	2012	.
1001	2013	56
1001	2014	.
1001	2015	.
1001	2016	.
1001	2017	.
1001	2018	23
1001	2019	.
1002	1987	34
1002	1988	.
1002	1989	12
1002	1990	13
1002	1991	55
1002	1992	32
1002	1993	.
1002	1994	.
1002	1995	54
1002	1996	64
1002	1997	.
1002	1998	.
1002	1999	23
1002	2000	.
1002	2001	.
1002	2002	.
1002	2003	.
1002	2004	64
1002	2005	12
1002	2006	.
1002	2007	.
1002	2008	.
1002	2009	.
1002	2010	3
1002	2011	.
1002	2012	.
1002	2013	.
1002	2014	.
1002	2015	.
1002	2016	.
1002	2017	4
1002	2018	.
1002	2019	12
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Each id always has year going from 1987 to 2019 (no gaps). I want to "lag" the output variable to produce the following dataset&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;DATA want;
id	year	output	output_prev;
DATALINES;
1001	1987	.	.
1001	1988	.	.
1001	1989	.	.
1001	1990	5	.
1001	1991	.	5
1001	1992	.	5
1001	1993	.	5
1001	1994	34	5
1001	1995	22	34
1001	1996	33	22
1001	1997	15	33
1001	1998	.	15
1001	1999	.	15
1001	2000	.	15
1001	2001	23	15
1001	2002	.	23
1001	2003	45	23
1001	2004	23	45
1001	2005	12	23
1001	2006	.	12
1001	2007	.	12
1001	2008	.	12
1001	2009	.	12
1001	2010	2	12
1001	2011	.	2
1001	2012	.	2
1001	2013	56	2
1001	2014	.	56
1001	2015	.	56
1001	2016	.	56
1001	2017	.	56
1001	2018	23	56
1001	2019	.	23
1002	1987	34	.
1002	1988	.	34
1002	1989	12	34
1002	1990	13	12
1002	1991	55	13
1002	1992	32	55
1002	1993	.	32
1002	1994	.	32
1002	1995	54	32
1002	1996	64	54
1002	1997	.	64
1002	1998	.	64
1002	1999	23	64
1002	2000	.	23
1002	2001	.	23
1002	2002	.	23
1002	2003	.	23
1002	2004	64	23
1002	2005	12	64
1002	2006	.	12
1002	2007	.	12
1002	2008	.	12
1002	2009	.	12
1002	2010	3	12
1002	2011	.	3
1002	2012	.	3
1002	2013	.	3
1002	2014	.	3
1002	2015	.	3
1002	2016	.	3
1002	2017	4	3
1002	2018	.	4
1002	2019	12	4
;
RUN;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;It seems simple but I can't seem to do it. I want to lag the output variable so that each year for a given id always takes the "latest" value of output. As an example, for id=1001, year=1991, output_prev=5 because that was the latest value of output (last updated in year 1990). This continues until we reach year 1995, where output_prev changes to 34 because the latest value of output is 34 (last updated in year 1994).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Please make sure that your example data step code runs. You are missing an INPUT statement.&lt;/P&gt;
&lt;P&gt;For this data you actually want a RETAIN to hold the value across iterations of the data step and know when to write the output record to the data. (BTW naming a variable OUTPUT makes this somewhat confusing as Output is a programming statement as well and could lead to all sorts of miscommunication).&lt;/P&gt;
&lt;P&gt;This seems to create your "want" from the example.&lt;/P&gt;
&lt;PRE&gt;DATA have;
input id	year	output
;
DATALINES;
1001	1987	.
1001	1988	.
1001	1989	.
1001	1990	5
1001	1991	.
1001	1992	.
1001	1993	.
1001	1994	34
1001	1995	22
1001	1996	33
1001	1997	15
1001	1998	.
1001	1999	.
1001	2000	.
1001	2001	23
1001	2002	.
1001	2003	45
1001	2004	23
1001	2005	12
1001	2006	.
1001	2007	.
1001	2008	.
1001	2009	.
1001	2010	2
1001	2011	.
1001	2012	.
1001	2013	56
1001	2014	.
1001	2015	.
1001	2016	.
1001	2017	.
1001	2018	23
1001	2019	.
1002	1987	34
1002	1988	.
1002	1989	12
1002	1990	13
1002	1991	55
1002	1992	32
1002	1993	.
1002	1994	.
1002	1995	54
1002	1996	64
1002	1997	.
1002	1998	.
1002	1999	23
1002	2000	.
1002	2001	.
1002	2002	.
1002	2003	.
1002	2004	64
1002	2005	12
1002	2006	.
1002	2007	.
1002	2008	.
1002	2009	.
1002	2010	3
1002	2011	.
1002	2012	.
1002	2013	.
1002	2014	.
1002	2015	.
1002	2016	.
1002	2017	4
1002	2018	.
1002	2019	12
;

/* assumes sorted by ID, if not*/
proc sort data=have;
   by id year;
run;

data want;
   set have;
   retain out_prev;
   by id;
   if first.id then call missing(out_prev);
   output;
   if not missing(output) then out_prev=output;
run;
&lt;/PRE&gt;
&lt;P&gt;Retain keeps values of variables across the data step boundary.&lt;/P&gt;
&lt;P&gt;Placing the OUTPUT statement to write to the data before assigning the "new" value for output (see what I meant about confusion) means the the record is written with the last value of output_prev.&lt;/P&gt;</description>
      <pubDate>Sun, 14 Mar 2021 06:20:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Choosing-a-lagged-value-of-a-variable-each-time-it-quot-updates/m-p/726079#M225612</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-03-14T06:20:28Z</dc:date>
    </item>
  </channel>
</rss>

