Hello,
I have a dataset containing firms' monthly returns for ten years. I need to measure the semi variance of each firm in each year. To do that I need to calculate the standard deviation of those monthly returns that are less than mean of the returns in a given year for each year and for each firm. My data sets is similar to the following:
firm date returns
1 20100131 0.2
1 20100228 -0.1
. . .
. . .
. . .
1 20171230 0.3
2 20100131 0.1
2 20100228 0.04
. . .
. . .
. . .
2 20171230 -0.02
.
.
.
Could you please help me how to do that as I am a new SAS user. I am using SAS 9.4. I really appreciate it.
Aliz
Here is an approach:
First, some test data:
data have;
do firm=1 to 10;
do year=2009 to 2019;
do month=1 to 12;
date=mdy(month,1,year);
returns=round(ranuni(414)*2-0.6,0.01);
output;
end;
end;
end;
format date yymmdd10.;
keep firm date returns;
run;
Then calculate the year:
data have_y;
set have;
year=year(date);
run;
For the final step, the data must sorted by firm and year:
data want;
array year_returns(12) 8 _temporary_;
do _N_=1 by 1 until(last.year);
set have_y;
by firm year;
year_returns(_N_)=returns;
end;
avg_return=mean(of year_returns(*));
do _N_=1 to dim(year_returns);
if year_returns(_N_)>=avg_return then
call missing(year_returns(_N_));
end;
semi_variance=var(of year_returns(*));
keep firm year avg_return semi_variance;
run;
The idea is very simple: put the returns in an array, calculate the average. Blank out (call missing) those that are average or above. Calculate the variance of the rest.
Here is an approach:
First, some test data:
data have;
do firm=1 to 10;
do year=2009 to 2019;
do month=1 to 12;
date=mdy(month,1,year);
returns=round(ranuni(414)*2-0.6,0.01);
output;
end;
end;
end;
format date yymmdd10.;
keep firm date returns;
run;
Then calculate the year:
data have_y;
set have;
year=year(date);
run;
For the final step, the data must sorted by firm and year:
data want;
array year_returns(12) 8 _temporary_;
do _N_=1 by 1 until(last.year);
set have_y;
by firm year;
year_returns(_N_)=returns;
end;
avg_return=mean(of year_returns(*));
do _N_=1 to dim(year_returns);
if year_returns(_N_)>=avg_return then
call missing(year_returns(_N_));
end;
semi_variance=var(of year_returns(*));
keep firm year avg_return semi_variance;
run;
The idea is very simple: put the returns in an array, calculate the average. Blank out (call missing) those that are average or above. Calculate the variance of the rest.
Hello, Thank you very much for your help.
I have a question. I am actually not sure what the first part of your code indicate. May I ask you to elaborate a bit more on it? my real data set has around 200,000 firms (more or less in different years). How should I adjust the code to take it into account?
Best,
Aliz
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.