DATA Step, Macro, Functions and more

Creating a dataset with avg. standard deviation with rolling time frames

Accepted Solution Solved
Reply
New Contributor
Posts: 4
Accepted Solution

Creating a dataset with avg. standard deviation with rolling time frames

Hello!

I'm new to SAS and I love it! Hopefully I can explain my question clearly.

I'm working with a data set which contains daily security returns (and a variable to mark the dates of the returns). I need to create a data set in the same format with the average standard deviation of the previous five days of returns, including the current day.

I found code at http://support.sas.com/kb/41/380.html on calculating a rolling standard deviation using an array. My university's statistician helped me to adapt the code:

data sub(keep = B);

   set returns;

run;

%let roll_num = 5;

data sub1;

set sub;

   array summed[&roll_num] _temporary_;

   if E = &roll_num then E = 1;

   else E + 1;

   summed = B;

/*start summing once &roll_num values have been read from the data set*/

if _N_ >= &roll_num then do;

    roll_std = std(of summed

  • );
  • end;

    format roll_std comma10.4;

    id = _n_;

    run;

    proc print data = sub1 (obs=10);

    run;

    This produces the following results:

    B
    Eroll_stdid
    0.00551.1
    -0.00982.2
    0.00573.3
    0.00294.4
    0.006950.00695
    0.025210.01256
    0.008120.00887
    0.008130.00868
    0.036140.01329
    0.005050.013510


    This does some of what I want, but it is structured to perform the calculations on one variable only, as the KEEP=option [data sub(keep = B);] drops everything but the second variable in my data set. I think the intent was to use DO loops to perform this calculation on all the variables, but I do not know how to do this or to capture the output that I want (the roll_std variable). This would be one way to complete this problem.

    Alternatively, it may be easier to keep all the variables in the sub data set and expand this code to the entire data set. I tried removing the KEEP= option, but this gives me the following error after the last DATA step: "ERROR: Array subscript out of range" at line xx column y.

    Thanks in advance for any help with this!


    Accepted Solutions
    Solution
    ‎04-09-2015 11:00 AM
    SAS Employee bcs
    SAS Employee
    Posts: 1

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Posted in reply to Douglas26

    Hope this provides the result you are looking for.  Modified the code at http://support.sas.com/kb/41/380.html by adding a macro to support multiple columns.  Code to create the input dataset has also been included so you can see the entire process.  After running this program see the log file for code written by the macro.

    options mprint;

    /* create monthly sales data from January 2008 to December 2010 */

    data work.test;

       do mo_period = '01jan2008'd to '31dec2010'd;

          mo_period = intnx('month', mo_period, 0, 'END');

          /* create 3 colums, one for each value to be evaluated. */

          do eval_column = 'col_1', 'col_2', 'col_3';

             sales = round(ranuni(1234567) * 1000, .01);

             output;

          end;

       end;

       format mo_period monyy7. sales comma10.2;

    run;

    /* confirm output is as expected. */

    proc print data=work.test(obs=20);

    run;

    /* convert data to across format. */

    proc transpose data=work.test out=work.test1(drop=_name_);

       var sales;

       by mo_period;

       id eval_column;

    run;

    /* confirm output is as expected. */

    proc print data=work.test1(obs=10);

    run;

    /*specify the number of periods in the rolling sum and average*/

    %let roll_num = 12;

    /*specify the number of columns to process */

    %let columns = 3;

    %macro tst;

       data work.new;

          set work.test1;

          /*create array with specific number of elements*/

          /*passing _temporary_ arrays as arguments to functions is not supported until SAS 9.2*/

          /*if the array is a _TEMPORARY_ array, the elements are automatically retained*/

          /* for the first observation only, create an array for each column to be evaluated.*/

          if (_n_ = 1) then

             do;

                %do i = 1 %to &columns;

                   array summed_&i [&roll_num] _temporary_;

                   format roll_sum_&i roll_avg_&i comma10.2;

                %end;

             end;

          /*alternatively, a non-temporary array can be used but must be retained:*/

          /*array summed[&roll_num];*/

          /*retain summed;*/

          /*E represents the element of the array to assign a sales value*/

          /*Increment it by one unless it is equal to &roll_num, at which point*/

          /*start over and assign it a value of 1. This causes the oldest period to*/

          /*be replaced by the newest period once &roll_num periods have been read.*/

          if (E = &roll_num) then

             E = 1;

          else

             E + 1;

          %do i = 1 %to &columns;

             /*assign value to proper element of the array*/

             summed_&i = col_&i;

             /*start summing once &roll_num values have been read from the data set*/

             if (_N_ >= &roll_num) then

                do;

                   roll_sum_&i = sum(of summed_&i

  • );
  •                roll_avg_&i = mean(of summed_&i

  • );
  •                roll_std_&i = std(of summed_&i

  • );
  •             end;

          %end;

       run;

    %mend tst;

    %tst;

    options ls=160 ps=30;

    proc print;

    run;

    View solution in original post


    All Replies
    Solution
    ‎04-09-2015 11:00 AM
    SAS Employee bcs
    SAS Employee
    Posts: 1

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Posted in reply to Douglas26

    Hope this provides the result you are looking for.  Modified the code at http://support.sas.com/kb/41/380.html by adding a macro to support multiple columns.  Code to create the input dataset has also been included so you can see the entire process.  After running this program see the log file for code written by the macro.

    options mprint;

    /* create monthly sales data from January 2008 to December 2010 */

    data work.test;

       do mo_period = '01jan2008'd to '31dec2010'd;

          mo_period = intnx('month', mo_period, 0, 'END');

          /* create 3 colums, one for each value to be evaluated. */

          do eval_column = 'col_1', 'col_2', 'col_3';

             sales = round(ranuni(1234567) * 1000, .01);

             output;

          end;

       end;

       format mo_period monyy7. sales comma10.2;

    run;

    /* confirm output is as expected. */

    proc print data=work.test(obs=20);

    run;

    /* convert data to across format. */

    proc transpose data=work.test out=work.test1(drop=_name_);

       var sales;

       by mo_period;

       id eval_column;

    run;

    /* confirm output is as expected. */

    proc print data=work.test1(obs=10);

    run;

    /*specify the number of periods in the rolling sum and average*/

    %let roll_num = 12;

    /*specify the number of columns to process */

    %let columns = 3;

    %macro tst;

       data work.new;

          set work.test1;

          /*create array with specific number of elements*/

          /*passing _temporary_ arrays as arguments to functions is not supported until SAS 9.2*/

          /*if the array is a _TEMPORARY_ array, the elements are automatically retained*/

          /* for the first observation only, create an array for each column to be evaluated.*/

          if (_n_ = 1) then

             do;

                %do i = 1 %to &columns;

                   array summed_&i [&roll_num] _temporary_;

                   format roll_sum_&i roll_avg_&i comma10.2;

                %end;

             end;

          /*alternatively, a non-temporary array can be used but must be retained:*/

          /*array summed[&roll_num];*/

          /*retain summed;*/

          /*E represents the element of the array to assign a sales value*/

          /*Increment it by one unless it is equal to &roll_num, at which point*/

          /*start over and assign it a value of 1. This causes the oldest period to*/

          /*be replaced by the newest period once &roll_num periods have been read.*/

          if (E = &roll_num) then

             E = 1;

          else

             E + 1;

          %do i = 1 %to &columns;

             /*assign value to proper element of the array*/

             summed_&i = col_&i;

             /*start summing once &roll_num values have been read from the data set*/

             if (_N_ >= &roll_num) then

                do;

                   roll_sum_&i = sum(of summed_&i

  • );
  •                roll_avg_&i = mean(of summed_&i

  • );
  •                roll_std_&i = std(of summed_&i

  • );
  •             end;

          %end;

       run;

    %mend tst;

    %tst;

    options ls=160 ps=30;

    proc print;

    run;

    New Contributor
    Posts: 4

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Thanks bcs,

    This was what I was looking for. I had to rename my variables to match col_1, col_2, etc., but once I did that, it worked great.

    Thanks!

    Super User
    Posts: 19,772

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Posted in reply to Douglas26

    Do you have a license for SAS/ETS? If so, look into proc expand which is much easier.

    New Contributor
    Posts: 4

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Reeza,

    I do have a SAS license. I looked into proc expand and I can see what you mean. In the future, I'll have to use it instead of this method.

    Thanks for the helpful suggestion.

    Super User
    Posts: 10,023

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Posted in reply to Douglas26

    What does your data look like , and what does your output look like ?

    New Contributor
    Posts: 4

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Hi Xia,

    My data has a variable with dates, as well as several variables with returns (like column B in my original post). I've now renamed these variables "col_1, col_2, etc." so they work with the macro bcs wrote.

    The output has the same data but also adds variables like roll_std_1, roll_std_2, etc. to match the input variables. This data looks like the roll_std column in my original post.

    Does this answer your question?

    Super User
    Posts: 10,023

    Re: Creating a dataset with avg. standard deviation with rolling time frames

    Posted in reply to Douglas26

    OK. But I would like to use SQL.

    🔒 This topic is solved and locked.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 7 replies
    • 567 views
    • 5 likes
    • 4 in conversation