- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I have a dataset with 2 different groups. Below is a sample and a PROC SQL for what calculation I need to preform:
DATA Have;
INPUT Grp1 $ Grp2 $ Scenario Value;
DATALINES;
A X 1 1.3
A X 2 0.8
A X 3 1
A X 4 0.9
A Y 1 0.7
A Y 2 1.2
A Y 3 0.8
A Y 4 0.9
A Z 1 0.7
A Z 2 0.8
A Z 3 1.1
A Z 4 0.7
B X 1 1
B X 2 0.7
B X 3 1.2
B X 4 1.3
B Y 1 1
B Y 2 0.9
B Y 3 0.7
B Y 4 1.1
;RUN;
PROC SQL;
CREATE TABLE Want AS
SELECT Grp1,
Scenario,
(PRODUCT(Value) - 1 ) - ( SUM(Value) - COUNT(Grp2) ) AS Result
FROM Have
GROUP BY Grp1, Scenario;
QUIT;
I need to perform the 'Result' calculation in the 'Want' table. However, there doesn't appear to be a PRODUCT function? Am I missing something really obvious here?
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
From people smarter than me:
http://www.sascommunity.org/wiki/Computing_Products
And I would probably drop down to a data step solution - or IML if you're so inclined. I usually do data step because I can then add in multiple types of calculations at once.
DATA Have; INPUT Grp1 $ Grp2 $ Scenario Value; DATALINES; A X 1 1.3 A X 2 0.8 A X 3 1 A X 4 0.9 A Y 1 0.7 A Y 2 1.2 A Y 3 0.8 A Y 4 0.9 A Z 1 0.7 A Z 2 0.8 A Z 3 1.1 A Z 4 0.7 B X 1 1 B X 2 0.7 B X 3 1.2 B X 4 1.3 B Y 1 1 B Y 2 0.9 B Y 3 0.7 B Y 4 1.1 ;RUN; proc sort data=have; by grp1 Scenario; run; data want; set have; by grp1 Scenario; retain running_Total running_product counter; if first.Scenario then do; result=0; running_Total=0; running_product=0; counter=0; end; running_total=running_total + value; running_product = running_product*value; if not missing(Value) then counter+1; if last.Scenario then do; result = (running_product - 1) - (running_total - counter); output; end; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
What's a PRODUCT() function?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Same thing as a SUM function, but using multiplication as the operator instead of addition
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Assuming this is the same definition you're using, you can use the same workaround:
http://michaeljswart.com/2011/03/the-aggregate-function-product/
EXP(SUM(LOG(field)))
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
I saw that workaround before posting, but it doesn't handle 0's or missing values very well. I wasn't sure if maybe SAS had addressed this since that workaround was posted. I must say I'm somewhat surprised that SAS doesn't have a PRODUCT function. If it can add N numbers together with a SUM function, why can't it multiply N numbers together with a PRODUCT function??
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
@SASaholic629 wrote:
I saw that workaround before posting, but it doesn't handle 0's or missing values very well. I wasn't sure if maybe SAS had addressed this since that workaround was posted. I must say I'm somewhat surprised that SAS doesn't have a PRODUCT function. If it can add N numbers together with a SUM function, why can't it multiply N numbers together with a PRODUCT function??
And in 15 years I've never come across a need for that particular calculation either in SQL or SAS...there are multiple other ways to calculate things as well. If you explain what you're trying to do there's likely another option. For example, PROC MEANS will do a lot of summary statistics and then there are other PROCs that will calculate time series type data.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
When asking questions on here, I find it best to keep it as simple as possible. I thought simply writing out the formula would be enough in this case, without going into too much detail. But maybe a longer explanation might help?
- I have about 500 sets of 20,000 random scenarios (unratioed losses)
- I need to calculate ratioed losses on those unratioed losses by using liabilities from a separate table
- That liability table has approximately 100 observations
- Before asking my question, I had already brought the two tables together (the 'Value' field is from the Liability table)
- Now I need to do the equation where I essentially combine those 500 sets of random losses down to 100 sets of random losses that correspond to my liabilities (this calculates the deltas between loss sets)
- There are more pieces involved after this step to get to the ratioed losses, but getting the deltas seems to be the trickiest part
Not sure if that helps or not. I appreciate you taking the time to try understand my situation better.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
From people smarter than me:
http://www.sascommunity.org/wiki/Computing_Products
And I would probably drop down to a data step solution - or IML if you're so inclined. I usually do data step because I can then add in multiple types of calculations at once.
DATA Have; INPUT Grp1 $ Grp2 $ Scenario Value; DATALINES; A X 1 1.3 A X 2 0.8 A X 3 1 A X 4 0.9 A Y 1 0.7 A Y 2 1.2 A Y 3 0.8 A Y 4 0.9 A Z 1 0.7 A Z 2 0.8 A Z 3 1.1 A Z 4 0.7 B X 1 1 B X 2 0.7 B X 3 1.2 B X 4 1.3 B Y 1 1 B Y 2 0.9 B Y 3 0.7 B Y 4 1.1 ;RUN; proc sort data=have; by grp1 Scenario; run; data want; set have; by grp1 Scenario; retain running_Total running_product counter; if first.Scenario then do; result=0; running_Total=0; running_product=0; counter=0; end; running_total=running_total + value; running_product = running_product*value; if not missing(Value) then counter+1; if last.Scenario then do; result = (running_product - 1) - (running_total - counter); output; end; run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
This works really well. Thank you!
Only minor change is you have to set running_product to 1 instead of 0. Otherwise it'll always be 0