Hi All,
I have a SAS dataset that produces a report for total count based on the year, product and type. Now I want to calculate the ratio from this report. One way I can do is output the SAS report, and do all the ratio calculation in the Excel; however, I'm wondering if there is a way that this ratio calculation can be done in SAS too.
Here is the report:
1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |||
year | product | type | count | count | count | count | count | count | count | count | count | count | count | count |
2014 | aa | t | 560 | 560 | 687 | 560 | 560 | 560 | 315 | 560 | 245 | 555 | 567 | 566 |
w | 785 | 432 | 450 | 222 | 450 | 333 | 450 | 131 | 321 | 435 | 450 | 454 | ||
x | 160 | 160 | 765 | 160 | 160 | 160 | 676 | 160 | 141 | 325 | 675 | 160 | ||
y | 360 | 360 | 360 | 360 | 360 | 360 | 657 | 121 | 360 | 321 | 360 | 360 | ||
z | 570 | 570 | 493 | 570 | 688 | 976 | 570 | 570 | 153 | 643 | 543 | 848 | ||
bb | x | 160 | 245 | 542 | 160 | 160 | 222 | 334 | 777 | 578 | 545 | 344 | 666 | |
y | 432 | 340 | 360 | 532 | 567 | 360 | 567 | 776 | 454 | 325 | 780 | 455 | ||
cc | p | 555 | 457 | 675 | 988 | 777 | 578 | 324 | 865 | 160 | 643 | 547 | 444 |
And the results I want to get is (column ' roduct' is not in the final output, I list it here to explain which product we need in final result):
month | ||||||||||||||
product: | Catg | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | |
in aa | t / w | 0.71 | 1.30 | 1.53 | 2.52 | 1.24 | 1.68 | 0.70 | 4.27 | 0.76 | 1.28 | 1.26 | 1.25 | |
in aa | w / y | 2.18 | 1.20 | 1.25 | 0.62 | 1.25 | 0.93 | 0.68 | 1.08 | 0.89 | 1.36 | 1.25 | 1.26 | |
in aa | x / y | 0.44 | 0.44 | 2.13 | 0.44 | 0.44 | 0.44 | 1.03 | 1.32 | 0.39 | 1.01 | 1.88 | 0.44 | |
in bb | x / y | 0.37 | 0.72 | 1.51 | 0.30 | 0.28 | 0.62 | 0.59 | 1.00 | 1.27 | 1.68 | 0.44 | 1.46 | |
w in aa y in bb | w / y | 1.82 | 1.27 | 1.25 | 0.42 | 0.79 | 0.93 | 0.79 | 0.17 | 0.71 | 1.34 | 0.58 | 1.00 | |
x in bb p in cc | x / p | 0.29 | 0.54 | 0.80 | 0.16 | 0.21 | 0.38 | 1.03 | 0.90 | 3.61 | 0.85 | 0.63 | 1.50 |
As you see both product 'aa' and 'bb' have type of 'x' and 'y'. I tried transpose type to calculate ratio, but when I transpose back, I lost 2nd 'x/y' which should have product 'bb'. Also the last ratio pick 'x' in product 'aa', not 'bb'.
If we can do this calculation step in SAS, that'll be great. otherwise, we have to manually calculate in the Excel.
I'm attaching the report above, and program here.
Thanks in advance for your help.
Adding one more question:
I also need to calculate rolling 12 month average of the ratio. My real data is 1+ year. For example,
When I run December 2014, the average ratio will be Jan 2014 to Dec 2014.
When I run January 2015, the average ratio will be Feb 2014 to Jan 2015.
Thanks again.
Those look like hard coded rules.
Not a great solution, and I'd still consider transposing, but an option would be to create a dataset that lists the product and desired numerator/denominator and 'id' for each row. Then merge the dataset with itself by product, numerator and denominator. This will give you the data side by side and you can calculate your ratios using two arrays.
So create a data set like this with your rules, you may need to add year depending on your criteria.
PROD_NUM NUM PROD_DEN DEN
aa t aa w
aa w aa y
...
aa w bb y
bb x cc p
Then merge the data set with your have dataset, twice once for the numerator data and once for denominator. You'll have to rename the variables. Then calculate your ratio.
proc sql;
create table want as
select a.*, b.month1 as num_month1,
...
b.month12 as num_month12,
c.month1 as den_month1,
...
c.month12 as den_month12
/*a.month1/b.month1 as ratio1*/ /*get merge right first*/
from rules as a
left join have as b
on a.prod_num=b.product
and a.num=b.type
left join have as c
on a.prod_den=c.product
and a.den=c.type;
quit;
Once you have your data set up, create two arrays and calculate your ratios. Or do it in the step above if you want to elimate a step.
Those look like hard coded rules.
Not a great solution, and I'd still consider transposing, but an option would be to create a dataset that lists the product and desired numerator/denominator and 'id' for each row. Then merge the dataset with itself by product, numerator and denominator. This will give you the data side by side and you can calculate your ratios using two arrays.
So create a data set like this with your rules, you may need to add year depending on your criteria.
PROD_NUM NUM PROD_DEN DEN
aa t aa w
aa w aa y
...
aa w bb y
bb x cc p
Then merge the data set with your have dataset, twice once for the numerator data and once for denominator. You'll have to rename the variables. Then calculate your ratio.
proc sql;
create table want as
select a.*, b.month1 as num_month1,
...
b.month12 as num_month12,
c.month1 as den_month1,
...
c.month12 as den_month12
/*a.month1/b.month1 as ratio1*/ /*get merge right first*/
from rules as a
left join have as b
on a.prod_num=b.product
and a.num=b.type
left join have as c
on a.prod_den=c.product
and a.den=c.type;
quit;
Once you have your data set up, create two arrays and calculate your ratios. Or do it in the step above if you want to elimate a step.
Hi Reeza,
I tried your method, it works very well. Thanks for your quick response.
I add one more question. I'd really appreciate if you know the answer.
Posting questions here again:
Adding one more question:
I also need to calculate rolling 12 month average of the ratio. My real data is 1+ year. For example,
When I run December 2014, the average ratio will be Jan 2014 to Dec 2014.
When I run January 2015, the average ratio will be Feb 2014 to Jan 2015.
Thanks
In general, once your original question has been answered you should mark the question answered and start a new one.
Moving stats are better calculated using a SAS data step and an array. If you've transposed and use PGStats solution then the lag() function is an option.
Here's a reference on calculating moving stats, originally provided by PGStats. It may be shorter...the code below is untested.
data want;
set have;
array numRolling(0:11) _temporary_;
array denRolling(0:11) _temporary_;
array num(24) num1-num24;
array den(24) den1-den24;
array ratioRolling(12:24);
do i=1 to dim(num);
numRolling{mod(i,12)} = num(i);
denRolling{mod(i,12)} = den(i);
if i>=12 then do;
ratioRolling(i)=sum(of numRolling(*))/sum(of denRolling(*));
end;
run;
Thanks Reeza. Just marked.
I will test your method. If I'll have any questions, I'll start a new post.
SAS can do that for sure
data test;
array m month1-month12;
input year product $ type $ month1-month12;
do month = 1 to 12;
count = m{month};
output;
end;
keep year month product type count;
datalines;
2014 aa t 560 560 687 560 560 560 315 560 245 555 567 566
2014 aa w 785 432 450 222 450 333 450 131 321 435 450 454
2014 aa x 160 160 765 160 160 160 676 160 141 325 675 160
2014 aa y 360 360 360 360 360 360 657 121 360 321 360 360
2014 aa z 570 570 493 570 688 976 570 570 153 643 543 848
2014 bb x 160 245 542 160 160 222 334 777 578 545 344 666
2014 bb y 432 340 360 532 567 360 567 776 454 325 780 455
2014 cc p 555 457 675 988 777 578 324 865 160 643 547 444
;
data test_ratio;
input catg &$;
datalines;
t / w
w / y
x / y
x / y
w / y
x / p
;
proc sql;
create table ratio0 as
select a.year, a.month,
catx(" / ", a.product, b.product) as products length=12,
catx(" / ", a.type, b.type) as catg length=8,
a.count/b.count as ratio format=best6.
from test as a inner join test as b
on a.year=b.year and a.month=b.month
where calculated catg in (select catg from test_ratio)
order by year, products, catg, month;
quit;
proc transpose data=ratio0 out=ratio(drop=_name_) prefix=ratio;
by year products catg;
var ratio;
id month;
run;
options linesize=120;
proc print data=ratio noobs; run;
Hi PGStats,
Your method works only if I need the all the combinations.
For example, in your result below, I don't need row 5, 6, and 7.
Thanks
year | products | catg | ratio1 | ratio2 | ratio3 | ratio4 | ratio5 | ratio6 | ratio7 | ratio8 | ratio9 | ratio10 | ratio11 | ratio12 |
2014 | aa / aa | t / w | 0.7134 | 1.2963 | 1.5267 | 2.5225 | 1.2444 | 1.6817 | 0.7 | 4.2748 | 0.7632 | 1.2759 | 1.26 | 1.2467 |
2014 | aa / aa | w / y | 2.1806 | 1.2 | 1.25 | 0.6167 | 1.25 | 0.925 | 0.6849 | 1.0826 | 0.8917 | 1.3551 | 1.25 | 1.2611 |
2014 | aa / aa | x / y | 0.4444 | 0.4444 | 2.125 | 0.4444 | 0.4444 | 0.4444 | 1.0289 | 1.3223 | 0.3917 | 1.0125 | 1.875 | 0.4444 |
2014 | aa / bb | w / y | 1.8171 | 1.2706 | 1.25 | 0.4173 | 0.7937 | 0.925 | 0.7937 | 0.1688 | 0.707 | 1.3385 | 0.5769 | 0.9978 |
2014 | aa / bb | x / y | 0.3704 | 0.4706 | 2.125 | 0.3008 | 0.2822 | 0.4444 | 1.1922 | 0.2062 | 0.3106 | 1 | 0.8654 | 0.3516 |
2014 | aa / cc | x / p | 0.2883 | 0.3501 | 1.1333 | 0.1619 | 0.2059 | 0.2768 | 2.0864 | 0.185 | 0.8813 | 0.5054 | 1.234 | 0.3604 |
2014 | bb / aa | x / y | 0.4444 | 0.6806 | 1.5056 | 0.4444 | 0.4444 | 0.6167 | 0.5084 | 6.4215 | 1.6056 | 1.6978 | 0.9556 | 1.85 |
2014 | bb / bb | x / y | 0.3704 | 0.7206 | 1.5056 | 0.3008 | 0.2822 | 0.6167 | 0.5891 | 1.0013 | 1.2731 | 1.6769 | 0.441 | 1.4637 |
2014 | bb / cc | x / p | 0.2883 | 0.5361 | 0.803 | 0.1619 | 0.2059 | 0.3841 | 1.0309 | 0.8983 | 3.6125 | 0.8476 | 0.6289 | 1.5 |
Simply adapt:
data test;
array m month1-month12;
input year product $ type $ month1-month12;
do month = 1 to 12;
count = m{month};
output;
end;
keep year month product type count;
datalines;
2014 aa t 560 560 687 560 560 560 315 560 245 555 567 566
2014 aa w 785 432 450 222 450 333 450 131 321 435 450 454
2014 aa x 160 160 765 160 160 160 676 160 141 325 675 160
2014 aa y 360 360 360 360 360 360 657 121 360 321 360 360
2014 aa z 570 570 493 570 688 976 570 570 153 643 543 848
2014 bb x 160 245 542 160 160 222 334 777 578 545 344 666
2014 bb y 432 340 360 532 567 360 567 776 454 325 780 455
2014 cc p 555 457 675 988 777 578 324 865 160 643 547 444
;
data wantedRatios;
length ratioId $20;
input (p1 p2 t1 t2) ($);
ratioId = catx("@", p1, p2, t1, t2);
keep ratioId;
datalines;
aa aa t w
aa aa w y
aa aa x y
bb bb x y
aa bb w y
bb cc x p
;
proc sql;
create table ratio0 as
select a.year, a.month,
catx(" / ", a.product, b.product) as products length=12,
catx(" / ", a.type, b.type) as catg length=8,
a.count/b.count as ratio format=best6.
from test as a inner join test as b
on a.year=b.year and a.month=b.month
where catx("@", a.product, b.product, a.type, b.type) in
(select ratioId from wantedRatios)
order by year, products, catg, month;
quit;
proc transpose data=ratio0 out=ratio(drop=_name_) prefix=ratio;
by year products catg;
var ratio;
id month;
run;
options linesize=120;
proc print data=ratio noobs; run;
Thanks PGStats, this is the result that I was expecting. Thank you so much.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.