Hi everyone,
I am trying to calculate the average with the Column . I was doing it in excel and want this to be converted to SAS programming .
On example below ,
There are each month , the data feeds and I am trying to calculate the average with the Fifth load .
Avg_1 :
Calculate average difference between the first loaded data vs fifth time loaded value for Mon1
Calculate average difference between the first loaded data vs fifth time loaded value for Mon2
Avg_2 :
Calculate average difference between the second time loaded data vs fifth time loaded value for Mon1
Calculate average difference between the second time loaded data vs fifth time loaded value for Mon2
Avg_3 :
Calculate average difference between the third time loaded data vs fifth time loaded value for Mon1
Calculate average difference between the third time loaded data vs fifth time loaded value for Mon2
Avg_4 :
Calculate average difference between the fourth time loaded data vs fifth time loaded value for Mon1
Calculate average difference between the fourth time loaded data vs fifth time loaded value for Mon2
then once it reaches the fifth time load, it has to exit , further no need of calc averages.
Can someone help me to get in SAS loop to done ?
HAVE | ||
S.No | Mon1 | Mon2 |
10 | 1815 | 1456 |
9 | 1770 | 1428 |
8 | 1773 | 1432 |
7 | 1770 | 1429 |
6 | 1747 | 1412 |
5 | 1773 | 1419 |
4 | 2136 | 1427 |
3 | 2289 | 1798 |
2 | 1943 | 1875 |
1 | 1520 | |
OUTPUT | Mon1 | Mon2 |
Avg_1 | -11% | -7% |
(1-(1747/1943)) | (1-(1419/1520)) | |
Avg_2 | -31% | -33% |
(1-(1747/2289)) | (1-(1419/1875)) | |
Avg_3 | -22% | -27% |
(1-(1747/2136)) | (1-(1419/1798)) | |
Avg_4 | -1% | -1% |
(1-(1747/1773)) | (1-(1419/1427)) |
Please let me know if this needs further explanation . Thank you !
What is a "fifth time loaded" and exactly how to do we tell what it is from the values in the data set?
It may help to show the results in terms of NEW variables added to the data instead of attempting to replace them as the replacement has a very good chance of messing with what you are calculating for some values of "xth time loaded".
Are there ever more or fewer than 10 rows of data? Does the actual value of the S. No have any role in the calculations?
Thanks for the response !
What is a "fifth time loaded" and exactly how to do we tell what it is from the values in the data set?
> Incase of Mon1, the fifth value is 1747 which is fifth from the bottom to top
> Incase of Mon2, the fifth value is 1419 which is fifth from the bottom to top
> It would change for the next month , to a row up again. It follows the sequence from bottom to top on the fifth entered value .
It may help to show the results in terms of NEW variables added to the data instead of attempting to replace them as the replacement has a very good chance of messing with what you are calculating for some values of "xth time loaded".
> yeh even as a separate output dataset which have calc average from 5 vs 4, 5 vs 3 , 5 vs 2 and 5 vs 1 would be fine
Are there ever more or fewer than 10 rows of data? Does the actual value of the S. No have any role in the calculations?
Its not always 10 rows of data, in the mon1 , there is only 9 rows has the value
I have added the S.No here , you can add, or remove any intermediate column if we want to .
I have done the shared output calculation in excel basically , for some of the modelling techniques, I am trying to do the math from SAS
The idea of doing something in Excel, and then trying to do the same thing in SAS, usually is a difficult one. Excel and SAS work fundamentally differently, and so the layout you have in Excel (which works well in Excel) is a pretty poor layout for doing things in SAS.
You will just have to get used to this, for some things doing it in SAS the same way as you did it in Excel is going to be difficult.
It's a lot easier if you could turn the data on its side in SAS, or even leave it in columns but reverse the order so that S. No = 1 is at the top of the data set and not at the bottom.
So, do you want to go that way in SAS?
yes PaigeMiller. I need not to maintain same order as I showed. If this needs to be rearrange and get the calc average is also works for me
UNTESTED CODE because I can't write code to work with your screen capture of the data
proc sort data=have;
by sno;
run;
data want;
set have;
if not missing then count1+1;
if not missing then count2+1;
mon1_1=lag(mon1);
mon1_2=lag2(mon1);
mon1_3=lag3(mon1);
mon1_4=lag4(mon1);
mon2_1=lag(mon2);
mon2_2=lag2(mon2);
mon2_3=lag3(mon2);
mon2_4=lag4(mon2);
if count1=5 then do;
mon1_avg1=1-(mon1/mon1_4);
mon2_avg2=1-(mon1/mon1_3);
mon1_avg3=1-(mon1/mon1_2);
mon1_avg4=1-(mon1/mon1_1);
end;
if count2=5 then do;
mon2_avg1=1-(mon2/mon2_4);
/* I'm lazy, you type the rest */
end;
run;
I challenge you to think of a better way to design this problem and design this analysis in the future, that will work better within SAS. Excel thinking doesn't help when you are working in SAS.
Here is a solution, but maybe not the best solution. However, it fulfills your request.
data want;
length stat $5;
set have;
array mon[*] mon:;
array m[7,2] _temporary_;
* correct for the misalignment of data;
if not missing(mon1) then do;
i1 + 1;
m[i1,1] = mon1;
end;
if not missing(mon2) then do;
i2 + 1;
m[i2,2] = mon2;
end;
* after collecting 5 full months data for each month, compute changes;
if min(i1, i2) = 5 then do;
do i = 1 to 4;
do j = 1 to 2;
stat = cats('avg_',i);
mon[j] = 1 - m[i,j]/m[5,j];
end;
output;
end;
stop;
end;
format mon: percentn8.0;
keep stat mon:;
run;
Result:
stat Mon1 Mon2
avg_1 -11% -7%
avg_2 -31% -32%
avg_3 -22% -27%
avg_4 -1% -1%
Here's the prep steps to get the data in the right order.
data have;
infile cards dlm='09'x dsd;
input SNo Mon1 Mon2;
cards;
10 1815 1456
9 1770 1428
8 1773 1432
7 1770 1429
6 1747 1412
5 1773 1419
4 2136 1427
3 2289 1798
2 1943 1875
1 . 1520
;
run;
proc sort;
by SNo;
run;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.