calculating lag

Accepted Solution Solved
Reply
Regular Contributor
Posts: 183
Accepted Solution

calculating lag

Following is the sample of my dataset. I need to calculate lag of "asset" for each company_id. Then years 2003 and 2013 will have to be dropped. How can I do that?

year   company_id       asset

2003AN80685710863109000
2004AN80685710862997000
2005AN80685710863496000
2006AN80685710862998873
2007AN80685710863169033
2008AN80685710863692000
2009AN80685710864642000
2010AN80685710865006000
2011AN80685710864837000
2012AN80685710866312000
2013AN80685710868478000
2003ANN6748L102726679
2004ANN6748L102729661
2005ANN6748L102765606
2006ANN6748L102725151
2007ANN6748L102728439
2008ANN6748L102718375
2009ANN6748L102717421
2010ANN6748L102727269
2011ANN6748L102762001
2012ANN6748L102739696
2013ANN6748L102739492
2003BMG0129K1045.
2004BMG0129K10450
2005BMG0129K1045741997
2006BMG0129K104558002
2007BMG0129K104513546
2008BMG0129K104580947
2009BMG0129K1045142666
2010BMG0129K1045239957
2011BMG0129K1045295522
2012BMG0129K1045618217
2013BMG0129K1045777386

Accepted Solutions
Solution
‎10-26-2014 09:44 AM
Respected Advisor
Posts: 3,775

Re: calculating lag

I like this version a bit better.

data difasset2;
   set asset;
   by company_id;
   array _v
  •   ca   cl   stdebt;
  •    array _d[3,3] dca1 dcl1 dstdept1 dca2 dcl2 dstdept2 dca3 dcl3 dstdept3;
       do i = 1 to dim(_v);
          _d[1,i] = dif1(_v);
          _d[2,i] = dif2(_v);
          _d[3,i] = dif3(_v);
          end;
      
    if first.company_id then c=0;
       c +
    1;
      
    do i = c to dim(_v);
          do j = 1 to dim(_v);
             call missing(of _d[i,j]);
             end;
         
    end;
      
    drop i c j;
       run;

    View solution in original post


    All Replies
    Super Contributor
    Posts: 490

    Re: calculating lag

    what do you mean by lag.

    Regular Contributor
    Posts: 183

    Re: calculating lag

    Say, these are asset values of the companies at time t. I want to have asset values at t-1. This means, in year 2004, I want the asset value of year 2003.

    Super Contributor
    Posts: 490

    Re: calculating lag

    proc sort data=have;

    by company_id year;

    run;

    data want;

    set have;

    by company_id;

    retain x .;

    if first.company_id then x=.;

    Else if not last.company_id then output;

    x=asset;

    run;

    Regular Contributor
    Posts: 183

    Re: calculating lag

    Thanks. Your code works.

    Super Contributor
    Posts: 251

    Re: calculating lag

    Assuming that the company_id is unique then sort the data set by it.

    proc sort data = have;

    by company_id;

    run;

    There is a SAS LAG() function available. The following code just saves the previous ASSET and writes it on the next observation.

    The empty format statement retains the order of the variables.

    data want;

    format year company_id asset _lag;

    retain prev_asset _lag;

    drop prev_asset;

       do until(last.company_id);

          set have;

          by company_id;

          if first.company_id then do; prev_asset = asset; _lag = .; end;

          else prev_asset = asset;

          output;

          _lag = prev_asset;

       end;

    run;

    Regular Contributor
    Posts: 183

    Re: calculating lag

    Thanks. Your code works as well. But I also need to drop the observations of year 2003 and 2013.

    Respected Advisor
    Posts: 3,775

    Re: calculating lag

    I'm not sure what you mean by 2003 and 2013 will have to be dropped.  I learnt this from support.sas.com.

    data asset;
       infile cards dsd;
      
    input year company_id:$12. asset;
       cards;
    2003,AN8068571086,3109000
    2004,AN8068571086,2997000
    2005,AN8068571086,3496000
    2006,AN8068571086,2998873
    2007,AN8068571086,3169033
    2008,AN8068571086,3692000
    2009,AN8068571086,4642000
    2010,AN8068571086,5006000
    2011,AN8068571086,4837000
    2012,AN8068571086,6312000
    2013,AN8068571086,8478000
    2003,ANN6748L1027,26679
    2004,ANN6748L1027,29661
    2005,ANN6748L1027,65606
    2006,ANN6748L1027,25151
    2007,ANN6748L1027,28439
    2008,ANN6748L1027,18375
    2009,ANN6748L1027,17421
    2010,ANN6748L1027,27269
    2011,ANN6748L1027,62001
    2012,ANN6748L1027,39696
    2013,ANN6748L1027,39492
    2003,BMG0129K1045,.
    2004,BMG0129K1045,0
    2005,BMG0129K1045,741997
    2006,BMG0129K1045,58002
    2007,BMG0129K1045,13546
    2008,BMG0129K1045,80947
    2009,BMG0129K1045,142666
    2010,BMG0129K1045,239957
    2011,BMG0129K1045,295522
    2012,BMG0129K1045,618217
    2013,BMG0129K1045,777386
    ;;;;
       run;
    data lagasset;
       set asset;
       by company_id;
       array lasset[3];
       lasset[1] = lag1(asset);
       lasset[2] = lag2(asset);
       lasset[3] = lag3(asset);
       if first.company_id then c=0;
       c+
    1;
      
    do i = c to dim(lasset);
          call missing(lasset);
          end;
      
    drop i c;
       run;
    proc print;
      
    run;
    10-25-2014 6-18-04 AM.png     
    Regular Contributor
    Posts: 183

    Re: calculating lag

    Thanks. I also need to drop the observations of year 2003 and 2013. So will I just write the two following lines after "drop i c"?

       if year = 2003 then delete;

       if year = 2013 then delete;

    Regular Contributor
    Posts: 183

    Re: calculating lag

    I also need to calculate the change in asset i.e. delta_asset = asset - lasset1.

    If I want to perform these operations (i.e. calculating lag then calculating the change) for more than one variable (say, asset is one variable, debt is another variable etc.) then what will be the code?

    Respected Advisor
    Posts: 3,775

    Re: calculating lag

    This "asset - lasset1" is calculated by DIF function similar to lag but difference.  I don't know what the rest of your question is saying.  Show some code data etc.

    Regular Contributor
    Posts: 183

    Re: calculating lag

    Ok, the following is my sample dataset. ca is current asset, cl is current liability and stdebt is short term debt. So I need to calculate the lca1, lca2, lca3 (which we already did before). Now need to calculate the difference between ca and lca1 .... (which is the change in ca between year 2004 and year 2003). In this way, need to same for cl and stdebt as well. In final output, observations for year 2003 and 2013 will have to be dropped.

    year   companyISIN     ca            cl             stdebt

    2003AN80685710861036900067950001411000
    2004AN806857108670600004701000716000
    2005AN806857108685540005515000796000
    2006AN8068571086918600064550001321000
    2007AN80685710861105500075050001672000
    2008AN80685710861289400081240001597000
    2009AN80685710861365000072590001125000
    2010AN806857108618098000108650002595000
    2011AN806857108620539000105380001377000
    2012AN806857108624156000123680002121000
    2013AN806857108626225000135250002783000
    2003ANN6748L1027117555420878852
    2004ANN6748L1027119830334497468
    2005ANN6748L10271753556605212915
    2006ANN6748L1027177348462682586
    2007ANN6748L1027192332502158252
    2008ANN6748L1027197199538463759
    2009ANN6748L1027209047614613868
    2010ANN6748L1027232885623948450
    2011ANN6748L102731234313541814527
    2012ANN6748L10272580575214812
    2013ANN6748L102720401848645.
    2003BMG0129K1045...
    2004BMG0129K1045793557460
    2005BMG0129K1045745097553813496757
    2006BMG0129K104565682391042185702
    2007BMG0129K1045185031534434945035
    2008BMG0129K104584108799848136738
    2009BMG0129K1045145607868804170089
    2010BMG0129K10452417721110606302223
    2011BMG0129K10452991681080563247228
    2012BMG0129K10456238421079469281611
    2013BMG0129K1045832675602781365325
    Respected Advisor
    Posts: 3,775

    Re: calculating lag

    Do you need DIF1 only or do you need 3 or more like I showed you for LAG1 LAG2 and LAG3.

    How much of the documentation on this subject have you read?

    Regular Contributor
    Posts: 183

    Re: calculating lag

    Sometimes I need DIF1 only, and sometimes I need all three. So it's better to do for all three.

    I can do the DIF1 for one variable (according to your previous code). But now there are three variables (ca, cl and stdebt).

    Respected Advisor
    Posts: 3,775

    Re: calculating lag

    It gets a bit fiddly with multiple lags and variables.  How much of the documentation on this subject have you read?

    data asset;
       infile cards expandtabs firstobs=2;
      
    input year company_id:$12. ca cl stdebt;
       cards;
    year   companyISIN     ca            cl             stdebt
    2003  AN8068571086   10369000 6795000  1411000
    2004  AN8068571086   7060000  4701000  716000
    2005  AN8068571086   8554000  5515000  796000
    2006  AN8068571086   9186000  6455000  1321000
    2007  AN8068571086   11055000 7505000  1672000
    2008  AN8068571086   12894000 8124000  1597000
    2009  AN8068571086   13650000 7259000  1125000
    2010  AN8068571086   18098000 10865000 2595000
    2011  AN8068571086   20539000 10538000 1377000
    2012  AN8068571086   24156000 12368000 2121000
    2013  AN8068571086   26225000 13525000 2783000
    2003  ANN6748L1027   117555   42087 8852
    2004  ANN6748L1027   119830   33449 7468
    2005  ANN6748L1027   175355   66052 12915
    2006  ANN6748L1027   177348   46268 2586
    2007  ANN6748L1027   192332   50215 8252
    2008  ANN6748L1027   197199   53846 3759
    2009  ANN6748L1027   209047   61461 3868
    2010  ANN6748L1027   232885   62394 8450
    2011  ANN6748L1027   312343   135418   14527
    2012  ANN6748L1027   258057   52148 12
    2013  ANN6748L1027   204018   48645 .
    2003  BMG0129K1045   .  .  .
    2004  BMG0129K1045   7935  5746  0
    2005  BMG0129K1045   745097   553813   496757
    2006  BMG0129K1045   65682 391042   185702
    2007  BMG0129K1045   18503 1534434  945035
    2008  BMG0129K1045   84108 799848   136738
    2009  BMG0129K1045   145607   868804   170089
    2010  BMG0129K1045   241772   1110606  302223
    2011  BMG0129K1045   299168   1080563  247228
    2012  BMG0129K1045   623842   1079469  281611
    2013  BMG0129K1045   832675   602781   365325
    ;;;;
       run;
    data difasset;
       set asset;
       by company_id;
       array _v
  •   ca   cl   stdebt;
  •    array dca[3];
       array dcl[3];
       array dstdept[3];
       array _d1
  • dca1 dcl1 dstdept1;
  •    array _d2
  • dca2 dcl2 dstdept2;
  •    array _d3
  • dca3 dcl3 dstdept3;
  •    do i = 1 to dim(_v);
          _d1 = dif1(_v);
          _d2 = dif2(_v);
          _d3 = dif3(_v);
         
    end;
      
    if first.company_id then c=0;
       c +
    1;
      
    do i = c to dim(_v);
          call missing(of dca dcl dstdept);
          end;
      
    drop i c;
       run;
    proc print;
      
    run;
         
    ☑ This topic is SOLVED.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 19 replies
    • 665 views
    • 6 likes
    • 5 in conversation