Hello,
I am using a dataset with 12 numeric variables. I need to use a total of one variable, x12, to calculate every observation's percentage of the whole, and I would like to do this within the DATA step.
DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
x13 = (x12 / total of x12) * 100;
DATALINES;
... ... ... ... ... ... .. .. .. 200
... ... .. .. .. .. .. .. .. .. 300
;
PROC PRINT DATA = example;
SUM(x12)
SUM(x13)
RUN;
Now, I need to obtain the sum of all x12 values and use it in the calculation of a new variable, x13 which is a relative percentage. The formula to calculate x13 is as follows:
x13 = (current x12 value / total sum of all x12 values) * 100
Can anyone help me with obtaining the total sum of all x12 values in the DATA step?
DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
array x x1-x12;
array pct pct1-pct12;
rowsum=sum(of x1-x12);
do i=1 to 12;
pct(i) = (x(i) / rowsum) * 100;
end;
datalines;
;
run;
Hello @PaigeMiller ,
Thank you for your swift response. I think you misunderstood my question. I need to obtain the sum of every value in the column x12, not each row.
SQL is faster.
proc sql;
create table want as
select *, sum(x12) as total, x12 / calculated total as x13
from example;
quit;
Otherwise, you need to:
If you really want a data step, see this example on how to merge in averages (SUM) would be the total and you can modify it as needed.
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
@JethrowCatch wrote:
Hello,
I am using a dataset with 12 numeric variables. I need to use a total of one variable, x12, to calculate every observation's percentage of the whole, and I would like to do this within the DATA step.
DATA example; INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12; x13 = (x12 / total of x12) * 100; DATALINES; ... ... ... ... ... ... .. .. .. 200 ... ... .. .. .. .. .. .. .. .. 300 ; PROC PRINT DATA = example; SUM(x12) SUM(x13) RUN;
Now, I need to obtain the sum of all x12 values and use it in the calculation of a new variable, x13 which is a relative percentage. The formula to calculate x13 is as follows:
x13 = (current x12 value / total sum of all x12 values) * 100
Can anyone help me with obtaining the total sum of all x12 values in the DATA step?
Hello @Reeza ,
Thank you for your response. We have not learned about PROC SQL in school yet. I do not want to seem as if I am cheating by using functions I haven't learned yet.
I have given your 3 step method a go, but I am having some trouble understanding how to obtain the total value before calculating the percentage.
DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
RETAIN TOTAL 0;
TOTAL = sum(TOTAL, x12);
DATALINES;
.....
;
DATA PCT;
MERGE example; BY x1;
x13 = (x12/TOTAL) * 100;
RUN;
PROC PRINT DATA = PCT;
RUN:
My problem is that the value of TOTAL increases for each observation. I need for TOTAL to be a variable that has one value, which is the value it takes on after adding the x12 value of the last observation to itself.
In other words, what I would like to do is to use a specific index of TOTAL in every x13 calculation. However, indexing with curly or straight brackets does not work.
Essentially, the x13 calculation should look as follows:
x13 = ( x12 / TOTAL[30]) * 100;
30 is my number of observations in the example dataset.
Would anyone happen to know how to index observations in SAS?
Yes, I have. I now obtain the total that I need in the variable TOTALCT. However, the output that I get under the OUTPUT DATA tab is a table showing type, frequency, and total.
The output that I need is a table showing all the original variables x1-x12, with x13 added on. Therefore, I need to use PROC PRINT, but at this moment, nothing shows up under the results tab.
PROC MEANS DATA = EXAMPLE NOPRINT;
OUTPUT OUT = example SUM(x12) = TOTALCT;
run;
DATA PCT;
MERGE example; BY x1;
x13 = (x12 / TOTALCT) * 100;
RUN;
PROC PRINT DATA = PCT;
RUN;
Your merge is not correct. Please review the example to see my use of the double SET statement instead.
FYI - The NOPRINT option is specified on the PROC MEANS which suppresses any results. If you'd like to see them, remove the NOPRINT option.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.