cancel
Showing results for 
Search instead for 
Did you mean: 

multiplying observations in a column by group

SOLVED
Adubhai
Obsidian | Level 7
Solved!

multiplying observations in a column by group

Message contains a hyperlink Message contains an attachment

Hello,

 

I have been looking at this post: https://communities.sas.com/t5/General-SAS-Programming/Multiplying-observations-values-in-row-1-by-v...

 

I was unable to post a reply so I have created this new question. I want to do something similar to what the above question posted i.e. multiply values of first observation with second observation and so forth by year or date. The solution given in that question does not seem to work. 

 

The solution code being: 

proc sql;

create table want as

select year, returns, exp(sum(log(returns))) as newcol

from have

group by year;

quit;

 

I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.

 

If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code. 

 

Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?

1 ACCEPTED SOLUTION

Accepted Solutions
Adubhai
Obsidian | Level 7
Solution

Re: multiplying observations in a column by group

Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me. 

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

Re: multiplying observations in a column by group


@Adubhai wrote:

Hello,

I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.

 

If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code. 

 

Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?


 

I do not get the result you report for   0.944444444 * 0.9901960781 * 0.9900990101 * 1).  I get  0.9259259254, both via a data step, and via the SQL code:

 

If you are not getting the same results, then either the NEWCOL is being modified subsequent to the create table statement in your code, or you have other observations in the calculation.  I think it is the latter, because you report using

 

 

group by year

but you should probably be using

 

 

group by yearwk

or more likely in your case

 

 

group by cusip,yearwk

 

 

 

 

You can check on my supposition by adding a variable (n_returns) to the select clause:

 

select year, returns, exp(sum(log(returns))) as newcol, n(returns) as  n_returns

The new variable n_returns will report the number of non-missing returns for each cusip*yearwk group.

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
unison
Lapis Lazuli | Level 10

Re: multiplying observations in a column by group

Can you show what code is giving you the output in weeklyret1? That would be helpful to see what's going on.

 

Here's my code and I get the expected result.

data have;
input id return relreturn;
datalines;
1 . .
1 -0.055556  0.9444444444
1 -0.009804  0.9901960781
1 -0.009901  0.9900990101
1  0.0
;
run;

proc sql;
create table want as
select id, exp(sum(log(relreturn))) as product format=decimal10.10
from have
group by id
;quit;

-unison

-unison
Adubhai
Obsidian | Level 7
Solution

Re: multiplying observations in a column by group

Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me.