BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daradanye
Obsidian | Level 7

Hi all,

 

I need SAS to calculate rolling standard deviation and number of observation it has used. Instead of controlling for number of observations I need to use (I know I can use expand in this case), I want to use all observations prior that year.  Since my dataset is very large, proc sql may be very slow.  I will appreciate it very much if someone can help me with this!

 

 

This is the data I have

firmyearroa
119990.05
120000.048
120010.01
120020.02
219990.024
220000.07
220010.065

 

This is the data I want:

firmyearroastdcount
119990.05  
120000.0480.0014142
120010.010.0225393
120020.020.0200674
219990.024  
220000.070.0325272
220010.0650.0252393

 

Many thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
1) You need 10 temporary array.
2)
data HAVE;
infile cards expandtabs;
input FIRM YEAR ROA bm;
cards;
1	1999	0.05	3.2
1	2000	0.048	 .
1	2001	 .	4.4
1	2002	0.02	2.5
2	1999	0.024	 .
2	2000	 .	2
2	2001	0.065	3
;
run;

data want;
 set have;
 by firm;
 array x1{999} _temporary_;
 array x2{999} _temporary_;

 if first.firm then do;n=0;call missing(of x1{*} x2{*]);end;
 n+1;
 x1{n}=roa;
 x2{n}=bm;
 count_roa=n(of x1{*});
 std_roa=std(of x1{*});
 count_bm=n(of x2{*});
 std_bm=std(of x2{*});
drop n;
run;




View solution in original post

6 REPLIES 6
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
input FIRM YEAR ROA;
cards;
1 1999 0.05
1 2000 0.048
1 2001 0.01
1 2002 0.02
2 1999 0.024
2 2000 0.07
2 2001 0.065
run;
proc format ;
 value prevyr (multilabel) 1999      ='Pre1999'
                           1999-2000 ='Pre2000'
                           1999-2001 ='Pre2001'
                           1999-2002 ='Pre2002'
                           1999-2003 ='Pre2003'
                           1999-2004 ='Pre2004';
run;
proc summary nway; 
  class FIRM ; 
  class YEAR / mlf; 
  var ROA; 
  output out=WANT std=;
  format YEAR prevyr.;
run;

 

FIRM YEAR _TYPE_ _FREQ_ ROA
1 Pre1999 3 1 .
1 Pre2000 3 2 0.001414
1 Pre2001 3 3 0.022539
1 Pre2002 3 4 0.020067
1 Pre2003 3 4 0.020067
1 Pre2004 3 4 0.020067
2 Pre1999 3 1 .
2 Pre2000 3 2 0.032527
2 Pre2001 3 3 0.025239
2 Pre2002 3 3 0.025239
2 Pre2003 3 3 0.025239
2 Pre2004 3 3 0.025239

 

PGStats
Opal | Level 21

DIY method

data want;
count = 0;
s1 = 0;
s2 = 0;
do until(last.firm);
    set have; by firm;
    if not missing(roa) then do;
        count = count + 1;
        s1 = s1 + roa;
        s2 = s2 + roa*roa;
        if count > 1 then std = sqrt((count*s2-s1*s1)/(count*(count-1)));
        end;
    output;
    end;
drop s1 s2;
run;
PG
s_lassen
Meteorite | Level 14

It can be done using a datastep, e.g.:

data want;
  do until(last.firm);
    set have;
    by firm year;
	_sum+roa;
	_sqsum+roa*roa;
	count+not missing(roa);
	if last.year then do;
	  if count<2 then
	    std=.;
	  else std=sqrt((count*_sqsum-_sum*_sum)/(count*(count-1)));
	  end;
	  output;
	end;
  call missing (count,_sum,_sqsum);
  drop _s:;
run;
Ksharp
Super User
data HAVE;
input FIRM YEAR ROA;
cards;
1 1999 0.05
1 2000 0.048
1 2001 0.01
1 2002 0.02
2 1999 0.024
2 2000 0.07
2 2001 0.065
;
run;

data want;
 set have;
 by firm;
 array x{999} _temporary_;
 if first.firm then do;n=0;call missing(of x{*});end;
 n+1;
 x{n}=roa;
 std=std(of x{*});
 
run;

proc print;run;
 
 
daradanye
Obsidian | Level 7

Hi Ksharp,

 

Thanks so much !  But I still have two more questions:

 

(1)  What if I have more than one variables (say about 10 variables) that needs to calculate std?

 

(2) What if there are missing values and I only want to calculate the number of observations that have been used to calculate standard deviation? 

 

The data I have:

 

firmyearroabm
119990.053.2
120000.048 
12001 4.4
120020.022.5
219990.024 
22000 2
220010.065

3

 

 

The data I want:

firmyearroastdroacountroabmstdbmcountbm
119990.05  3.2 1
120000.0480.00142  1
12001 0.001424.40.848532
120020.020.016832.50.96093
219990.024     
22000  12 2
220010.0650.029230.707113

 

 

Many Thanks!

Dara

Ksharp
Super User
1) You need 10 temporary array.
2)
data HAVE;
infile cards expandtabs;
input FIRM YEAR ROA bm;
cards;
1	1999	0.05	3.2
1	2000	0.048	 .
1	2001	 .	4.4
1	2002	0.02	2.5
2	1999	0.024	 .
2	2000	 .	2
2	2001	0.065	3
;
run;

data want;
 set have;
 by firm;
 array x1{999} _temporary_;
 array x2{999} _temporary_;

 if first.firm then do;n=0;call missing(of x1{*} x2{*]);end;
 n+1;
 x1{n}=roa;
 x2{n}=bm;
 count_roa=n(of x1{*});
 std_roa=std(of x1{*});
 count_bm=n(of x2{*});
 std_bm=std(of x2{*});
drop n;
run;




sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1424 views
  • 3 likes
  • 5 in conversation