BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Adubhai
Obsidian | Level 7

Hello,

 

I have been looking at this post: https://communities.sas.com/t5/General-SAS-Programming/Multiplying-observations-values-in-row-1-by-v...

 

I was unable to post a reply so I have created this new question. I want to do something similar to what the above question posted i.e. multiply values of first observation with second observation and so forth by year or date. The solution given in that question does not seem to work. 

 

The solution code being: 

proc sql;

create table want as

select year, returns, exp(sum(log(returns))) as newcol

from have

group by year;

quit;

 

I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.

 

If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code. 

 

Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?

1 ACCEPTED SOLUTION

Accepted Solutions
Adubhai
Obsidian | Level 7

Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me. 

View solution in original post

3 REPLIES 3
mkeintz
PROC Star

@Adubhai wrote:

Hello,

I manually calculated for one group in my dataset and saw that the result of the product of the observations is not the same as the result given by the sql code. However when I calculated exp(sum(log(returns))) manually and it matches with the result give by the code. This means that exp(sum(log(returns))) does not calculate the product of the observations in a column.

 

If you take a look at the attached screenshot. The dataset in the top/background (weeklyret1) is the result of sql code and the dataset in the front/bottom (weeklyret) is the "have". I have calculated the product of (0.944444444X0.9901960781X0.9900990101X1) for cusip-00002130, fyear-1990, week-33 and the result is 0.92675926037. But the sql code gave the result -0.090601481. But when I calculated exp((log0.944444444)+(log0.9901960781)+(log0.9900990101)+(log1)) then the result matches with that of the sql code. 

 

Can someone shed some light in this? And possibly give me a correct way of multiplying observations of a column by group?


 

I do not get the result you report for   0.944444444 * 0.9901960781 * 0.9900990101 * 1).  I get  0.9259259254, both via a data step, and via the SQL code:

 

If you are not getting the same results, then either the NEWCOL is being modified subsequent to the create table statement in your code, or you have other observations in the calculation.  I think it is the latter, because you report using

 

 

group by year

but you should probably be using

 

 

group by yearwk

or more likely in your case

 

 

group by cusip,yearwk

 

 

 

 

You can check on my supposition by adding a variable (n_returns) to the select clause:

 

select year, returns, exp(sum(log(returns))) as newcol, n(returns) as  n_returns

The new variable n_returns will report the number of non-missing returns for each cusip*yearwk group.

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
unison
Lapis Lazuli | Level 10

Can you show what code is giving you the output in weeklyret1? That would be helpful to see what's going on.

 

Here's my code and I get the expected result.

data have;
input id return relreturn;
datalines;
1 . .
1 -0.055556  0.9444444444
1 -0.009804  0.9901960781
1 -0.009901  0.9900990101
1  0.0
;
run;

proc sql;
create table want as
select id, exp(sum(log(relreturn))) as product format=decimal10.10
from have
group by id
;quit;

-unison

-unison
Adubhai
Obsidian | Level 7

Thanks for the suggestions. Actually I solved the problem another way. I took the arithmetic mean and raised it to the power of the frequency of each group. This does not give the accurate result of the product of observations but it gives very close results and that will work fine with me. 

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 3602 views
  • 0 likes
  • 3 in conversation