BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daradanye
Obsidian | Level 7

Hi all,

 

I need SAS to calculate rolling standard deviation and number of observation it has used. Instead of controlling for number of observations I need to use (I know I can use expand in this case), I want to use all observations prior that year.  Since my dataset is very large, proc sql may be very slow.  I will appreciate it very much if someone can help me with this!

 

 

This is the data I have

firmyearroa
119990.05
120000.048
120010.01
120020.02
219990.024
220000.07
220010.065

 

This is the data I want:

firmyearroastdcount
119990.05  
120000.0480.0014142
120010.010.0225393
120020.020.0200674
219990.024  
220000.070.0325272
220010.0650.0252393

 

Many thanks!

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
1) You need 10 temporary array.
2)
data HAVE;
infile cards expandtabs;
input FIRM YEAR ROA bm;
cards;
1	1999	0.05	3.2
1	2000	0.048	 .
1	2001	 .	4.4
1	2002	0.02	2.5
2	1999	0.024	 .
2	2000	 .	2
2	2001	0.065	3
;
run;

data want;
 set have;
 by firm;
 array x1{999} _temporary_;
 array x2{999} _temporary_;

 if first.firm then do;n=0;call missing(of x1{*} x2{*]);end;
 n+1;
 x1{n}=roa;
 x2{n}=bm;
 count_roa=n(of x1{*});
 std_roa=std(of x1{*});
 count_bm=n(of x2{*});
 std_bm=std(of x2{*});
drop n;
run;




View solution in original post

6 REPLIES 6
ChrisNZ
Tourmaline | Level 20

Like this?

data HAVE;
input FIRM YEAR ROA;
cards;
1 1999 0.05
1 2000 0.048
1 2001 0.01
1 2002 0.02
2 1999 0.024
2 2000 0.07
2 2001 0.065
run;
proc format ;
 value prevyr (multilabel) 1999      ='Pre1999'
                           1999-2000 ='Pre2000'
                           1999-2001 ='Pre2001'
                           1999-2002 ='Pre2002'
                           1999-2003 ='Pre2003'
                           1999-2004 ='Pre2004';
run;
proc summary nway; 
  class FIRM ; 
  class YEAR / mlf; 
  var ROA; 
  output out=WANT std=;
  format YEAR prevyr.;
run;

 

FIRM YEAR _TYPE_ _FREQ_ ROA
1 Pre1999 3 1 .
1 Pre2000 3 2 0.001414
1 Pre2001 3 3 0.022539
1 Pre2002 3 4 0.020067
1 Pre2003 3 4 0.020067
1 Pre2004 3 4 0.020067
2 Pre1999 3 1 .
2 Pre2000 3 2 0.032527
2 Pre2001 3 3 0.025239
2 Pre2002 3 3 0.025239
2 Pre2003 3 3 0.025239
2 Pre2004 3 3 0.025239

 

PGStats
Opal | Level 21

DIY method

data want;
count = 0;
s1 = 0;
s2 = 0;
do until(last.firm);
    set have; by firm;
    if not missing(roa) then do;
        count = count + 1;
        s1 = s1 + roa;
        s2 = s2 + roa*roa;
        if count > 1 then std = sqrt((count*s2-s1*s1)/(count*(count-1)));
        end;
    output;
    end;
drop s1 s2;
run;
PG
s_lassen
Meteorite | Level 14

It can be done using a datastep, e.g.:

data want;
  do until(last.firm);
    set have;
    by firm year;
	_sum+roa;
	_sqsum+roa*roa;
	count+not missing(roa);
	if last.year then do;
	  if count<2 then
	    std=.;
	  else std=sqrt((count*_sqsum-_sum*_sum)/(count*(count-1)));
	  end;
	  output;
	end;
  call missing (count,_sum,_sqsum);
  drop _s:;
run;
Ksharp
Super User
data HAVE;
input FIRM YEAR ROA;
cards;
1 1999 0.05
1 2000 0.048
1 2001 0.01
1 2002 0.02
2 1999 0.024
2 2000 0.07
2 2001 0.065
;
run;

data want;
 set have;
 by firm;
 array x{999} _temporary_;
 if first.firm then do;n=0;call missing(of x{*});end;
 n+1;
 x{n}=roa;
 std=std(of x{*});
 
run;

proc print;run;
 
 
daradanye
Obsidian | Level 7

Hi Ksharp,

 

Thanks so much !  But I still have two more questions:

 

(1)  What if I have more than one variables (say about 10 variables) that needs to calculate std?

 

(2) What if there are missing values and I only want to calculate the number of observations that have been used to calculate standard deviation? 

 

The data I have:

 

firmyearroabm
119990.053.2
120000.048 
12001 4.4
120020.022.5
219990.024 
22000 2
220010.065

3

 

 

The data I want:

firmyearroastdroacountroabmstdbmcountbm
119990.05  3.2 1
120000.0480.00142  1
12001 0.001424.40.848532
120020.020.016832.50.96093
219990.024     
22000  12 2
220010.0650.029230.707113

 

 

Many Thanks!

Dara

Ksharp
Super User
1) You need 10 temporary array.
2)
data HAVE;
infile cards expandtabs;
input FIRM YEAR ROA bm;
cards;
1	1999	0.05	3.2
1	2000	0.048	 .
1	2001	 .	4.4
1	2002	0.02	2.5
2	1999	0.024	 .
2	2000	 .	2
2	2001	0.065	3
;
run;

data want;
 set have;
 by firm;
 array x1{999} _temporary_;
 array x2{999} _temporary_;

 if first.firm then do;n=0;call missing(of x1{*} x2{*]);end;
 n+1;
 x1{n}=roa;
 x2{n}=bm;
 count_roa=n(of x1{*});
 std_roa=std(of x1{*});
 count_bm=n(of x2{*});
 std_bm=std(of x2{*});
drop n;
run;




How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 2411 views
  • 3 likes
  • 5 in conversation