BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cemccoy
Calcite | Level 5

Hello,

I am working with stock market price data. My columns are date, company_number (co_num), and price (prc).

I need to compute the standard deviation of price for each company across a one, two, three, four, and five year range.

I can do the single year using the following in Base SAS 9.4:

Proc sql;

Create table new as

Select co_num, year(date) as year, std(prc) as std_1_year

From old

Group by co_num, year(date);

Quit;

How do I do the additional years? Thank you for the assistance!

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Not sure if this is what you need, but one way to start the conversation:

libname test "\\xxxxx\temp";

data _o_1;

     set test.old;

     year=year(date);

     drop date;

run;

proc sort data=_o_1 out=_o_2 (keep=co_num year) nodupkey;

     by co_num year;

run;

proc sql;

     create table want as

           select *, (select std(prc) from _o_1 where co_num=a.co_num and year=a.year) as ste_1_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-1 <= year<=a.year) as ste_2_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-2 <= year<=a.year) as ste_3_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-3 <= year<=a.year) as ste_4_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-4 <= year<=a.year) as ste_5_year

           from _o_2 a

     ;

quit;

View solution in original post

9 REPLIES 9
Reeza
Super User

Do you have SAS ETS licences?

If so look into proc expand.

cemccoy
Calcite | Level 5

Reeza,

Thank you for the response. I indeed have access to the expand procedure. I am not sure how to use it in this case. I don't really want to collapse my existing time series. I want to be able to group by more than one year. The problem I encounter is that my multi-year periods overlap. For example 2001 is part of the two year sequence of 2000-2001 and 2001-2002.

stat_sas
Ammonite | Level 13

Please provide some sample data. In your example how this reflects standard deviation based on single year only?

cemccoy
Calcite | Level 5

I added some sample data to the original post.

cemccoy
Calcite | Level 5

I also usually add a unique statement after the select to get just one line for company, year, and standard deviation.

Reeza
Super User

Here's a way using a multi label format and proc means. More code and probably a way to simplify but for starting out it will probably work:

EDIT: Needed to change the loop boundaries from 1 to 5 to 0 to 4 to get proper year counts!

*Get all the years in your dataset;

proc sql;

  create table years as

  select distinct year(date) as year, max(calculated year) as max_year

  from old;

quit;

*Create a format that does the grouping you want;

data year_fmt;

length hlo $2. label $25.;

  set years;

  fmtname='year_fmt';

  hlo='SM';

  start=year;

  do i=0 to 4;

  end=year+i;

  label=catx("-", put(start, 4. -l), put(end, 4. -l));

  if end<=max_year then output;

  end;

  drop i year;

run;

proc format cntlin=year_fmt;

run;

*Add year variable into data set;

data old2;

  set old;

  year=year(date);

run;

*calculate STD for all groups at once;

proc means data=old2 noprint;

class co_num year/mlf;

format year year_fmt.;

ways 2;

var prc;

output out=want std(prc)=std_want;

run;

Haikuo
Onyx | Level 15

Not sure if this is what you need, but one way to start the conversation:

libname test "\\xxxxx\temp";

data _o_1;

     set test.old;

     year=year(date);

     drop date;

run;

proc sort data=_o_1 out=_o_2 (keep=co_num year) nodupkey;

     by co_num year;

run;

proc sql;

     create table want as

           select *, (select std(prc) from _o_1 where co_num=a.co_num and year=a.year) as ste_1_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-1 <= year<=a.year) as ste_2_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-2 <= year<=a.year) as ste_3_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-3 <= year<=a.year) as ste_4_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-4 <= year<=a.year) as ste_5_year

           from _o_2 a

     ;

quit;

cemccoy
Calcite | Level 5

Thank you, this does it. I had to add the following statement:

data want;

set want;

if ste_2_year=ste_1_year then ste_2_year=.;

if ste_3_year=ste_2_year then ste_3_year=.;

if ste_4_year=ste_3_year then ste_4_year=.;

if ste_5_year=ste_4_year then ste_5_year=.;

if ste_1_year=. then ste_2_year=.;

if ste_2_year=. then ste_3_year=.;

if ste_3_year=. then ste_4_year=.;

if ste_4_year=. then ste_5_year=.;

run;

This keeps data that doesn't have the required length of times from returning identical values for 1 year and 2 year sd's.

Haikuo
Onyx | Level 15

In that case, you can also add some logic and have it done in the step of Proc SQL:

proc sql;

  create table want as

  select *, (select std(prc) from _o_1 where co_num=a.co_num and year=a.year) as ste_1_year

  ,(select case when count (distinct year) >1 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-1 <= year<=a.year) as ste_2_year

  ,(select case when count (distinct year) >2 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-2 <= year<=a.year) as ste_3_year

  ,(select case when count (distinct year) >3 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-3 <= year<=a.year) as ste_4_year

  ,(select case when count (distinct year) >4 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-4 <= year<=a.year) as ste_5_year

  from _o_2 a

  ;

quit;

Good Luck,

Haikuo

BTW: Your data step used for clean up is not robust enough. What if you do have identical Stds across multiple years?

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 3528 views
  • 1 like
  • 4 in conversation