BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cemccoy
Calcite | Level 5

Hello,

I am working with stock market price data. My columns are date, company_number (co_num), and price (prc).

I need to compute the standard deviation of price for each company across a one, two, three, four, and five year range.

I can do the single year using the following in Base SAS 9.4:

Proc sql;

Create table new as

Select co_num, year(date) as year, std(prc) as std_1_year

From old

Group by co_num, year(date);

Quit;

How do I do the additional years? Thank you for the assistance!

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

Not sure if this is what you need, but one way to start the conversation:

libname test "\\xxxxx\temp";

data _o_1;

     set test.old;

     year=year(date);

     drop date;

run;

proc sort data=_o_1 out=_o_2 (keep=co_num year) nodupkey;

     by co_num year;

run;

proc sql;

     create table want as

           select *, (select std(prc) from _o_1 where co_num=a.co_num and year=a.year) as ste_1_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-1 <= year<=a.year) as ste_2_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-2 <= year<=a.year) as ste_3_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-3 <= year<=a.year) as ste_4_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-4 <= year<=a.year) as ste_5_year

           from _o_2 a

     ;

quit;

View solution in original post

9 REPLIES 9
Reeza
Super User

Do you have SAS ETS licences?

If so look into proc expand.

cemccoy
Calcite | Level 5

Reeza,

Thank you for the response. I indeed have access to the expand procedure. I am not sure how to use it in this case. I don't really want to collapse my existing time series. I want to be able to group by more than one year. The problem I encounter is that my multi-year periods overlap. For example 2001 is part of the two year sequence of 2000-2001 and 2001-2002.

stat_sas
Ammonite | Level 13

Please provide some sample data. In your example how this reflects standard deviation based on single year only?

cemccoy
Calcite | Level 5

I added some sample data to the original post.

cemccoy
Calcite | Level 5

I also usually add a unique statement after the select to get just one line for company, year, and standard deviation.

Reeza
Super User

Here's a way using a multi label format and proc means. More code and probably a way to simplify but for starting out it will probably work:

EDIT: Needed to change the loop boundaries from 1 to 5 to 0 to 4 to get proper year counts!

*Get all the years in your dataset;

proc sql;

  create table years as

  select distinct year(date) as year, max(calculated year) as max_year

  from old;

quit;

*Create a format that does the grouping you want;

data year_fmt;

length hlo $2. label $25.;

  set years;

  fmtname='year_fmt';

  hlo='SM';

  start=year;

  do i=0 to 4;

  end=year+i;

  label=catx("-", put(start, 4. -l), put(end, 4. -l));

  if end<=max_year then output;

  end;

  drop i year;

run;

proc format cntlin=year_fmt;

run;

*Add year variable into data set;

data old2;

  set old;

  year=year(date);

run;

*calculate STD for all groups at once;

proc means data=old2 noprint;

class co_num year/mlf;

format year year_fmt.;

ways 2;

var prc;

output out=want std(prc)=std_want;

run;

Haikuo
Onyx | Level 15

Not sure if this is what you need, but one way to start the conversation:

libname test "\\xxxxx\temp";

data _o_1;

     set test.old;

     year=year(date);

     drop date;

run;

proc sort data=_o_1 out=_o_2 (keep=co_num year) nodupkey;

     by co_num year;

run;

proc sql;

     create table want as

           select *, (select std(prc) from _o_1 where co_num=a.co_num and year=a.year) as ste_1_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-1 <= year<=a.year) as ste_2_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-2 <= year<=a.year) as ste_3_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-3 <= year<=a.year) as ste_4_year

                ,(select std(prc) from _o_1 where co_num=a.co_num and a.year-4 <= year<=a.year) as ste_5_year

           from _o_2 a

     ;

quit;

cemccoy
Calcite | Level 5

Thank you, this does it. I had to add the following statement:

data want;

set want;

if ste_2_year=ste_1_year then ste_2_year=.;

if ste_3_year=ste_2_year then ste_3_year=.;

if ste_4_year=ste_3_year then ste_4_year=.;

if ste_5_year=ste_4_year then ste_5_year=.;

if ste_1_year=. then ste_2_year=.;

if ste_2_year=. then ste_3_year=.;

if ste_3_year=. then ste_4_year=.;

if ste_4_year=. then ste_5_year=.;

run;

This keeps data that doesn't have the required length of times from returning identical values for 1 year and 2 year sd's.

Haikuo
Onyx | Level 15

In that case, you can also add some logic and have it done in the step of Proc SQL:

proc sql;

  create table want as

  select *, (select std(prc) from _o_1 where co_num=a.co_num and year=a.year) as ste_1_year

  ,(select case when count (distinct year) >1 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-1 <= year<=a.year) as ste_2_year

  ,(select case when count (distinct year) >2 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-2 <= year<=a.year) as ste_3_year

  ,(select case when count (distinct year) >3 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-3 <= year<=a.year) as ste_4_year

  ,(select case when count (distinct year) >4 then std(prc) else . end from _o_1 where co_num=a.co_num and a.year-4 <= year<=a.year) as ste_5_year

  from _o_2 a

  ;

quit;

Good Luck,

Haikuo

BTW: Your data step used for clean up is not robust enough. What if you do have identical Stds across multiple years?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 2018 views
  • 1 like
  • 4 in conversation