Solved: Re: multiplying observations in a column by group

Adubhai · Posted 11-27-2019 08:51 PM

Hello,

I have been looking at this post: https://communities.sas.com/t5/General-SAS-Programming/Multiplying-observations-values-in-row-1-by-v...

I was unable to post a reply so I have created this new question. I want to do something similar to what the above question posted i.e. multiply values of first observation with second observation and so forth by year or date. The solution given in that question does not seem to work.

The solution code being:

proc sql;

create table want as

select year, returns, exp(sum(log(returns))) as newcol

from have

group by year;

quit;

I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.

If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code.

Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?

Adubhai · Posted 12-02-2019 02:36 AM

Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me.

View solution in original post

mkeintz · Posted 11-28-2019 12:15 AM

@Adubhai wrote:

Hello,

I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.

If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code.

Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?

I do not get the result you report for 0.944444444 * 0.9901960781 * 0.9900990101 * 1). I get 0.9259259254, both via a data step, and via the SQL code:

If you are not getting the same results, then either the NEWCOL is being modified subsequent to the create table statement in your code, or you have other observations in the calculation. I think it is the latter, because you report using

group by year

but you should probably be using

group by yearwk

or more likely in your case

group by cusip,yearwk

You can check on my supposition by adding a variable (n_returns) to the select clause:

select year, returns, exp(sum(log(returns))) as newcol, n(returns) as  n_returns

The new variable n_returns will report the number of non-missing returns for each cusip*yearwk group.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------

unison · Posted 11-28-2019 12:22 AM

Can you show what code is giving you the output in weeklyret1? That would be helpful to see what's going on.

Here's my code and I get the expected result.

data have;
input id return relreturn;
datalines;
1 . .
1 -0.055556  0.9444444444
1 -0.009804  0.9901960781
1 -0.009901  0.9900990101
1  0.0
;
run;

proc sql;
create table want as
select id, exp(sum(log(relreturn))) as product format=decimal10.10
from have
group by id
;quit;

-unison

Adubhai · Posted 12-02-2019 02:36 AM

Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me.

multiplying observations in a column by group

Re: multiplying observations in a column by group

Re: multiplying observations in a column by group

Re: multiplying observations in a column by group

Re: multiplying observations in a column by group

SAS Innovate 2026 Registration is Open

SAS Training: Just a Click Away