Hello,
I have been looking at this post: https://communities.sas.com/t5/General-SAS-Programming/Multiplying-observations-values-in-row-1-by-v...
I was unable to post a reply so I have created this new question. I want to do something similar to what the above question posted i.e. multiply values of first observation with second observation and so forth by year or date. The solution given in that question does not seem to work.
The solution code being:
proc sql;
create table want as
select year, returns, exp(sum(log(returns))) as newcol
from have
group by year;
quit;
I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.
If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code.
Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?
Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me.
@Adubhai wrote:
Hello,
I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.
If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code.
Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?
I do not get the result you report for 0.944444444 * 0.9901960781 * 0.9900990101 * 1). I get 0.9259259254, both via a data step, and via the SQL code:
If you are not getting the same results, then either the NEWCOL is being modified subsequent to the create table statement in your code, or you have other observations in the calculation. I think it is the latter, because you report using
group by year
but you should probably be using
group by yearwk
or more likely in your case
group by cusip,yearwk
You can check on my supposition by adding a variable (n_returns) to the select clause:
select year, returns, exp(sum(log(returns))) as newcol, n(returns) as n_returns
The new variable n_returns will report the number of non-missing returns for each cusip*yearwk group.
Can you show what code is giving you the output in weeklyret1? That would be helpful to see what's going on.
Here's my code and I get the expected result.
data have;
input id return relreturn;
datalines;
1 . .
1 -0.055556 0.9444444444
1 -0.009804 0.9901960781
1 -0.009901 0.9900990101
1 0.0
;
run;
proc sql;
create table want as
select id, exp(sum(log(relreturn))) as product format=decimal10.10
from have
group by id
;quit;
-unison
Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.