BookmarkSubscribeRSS Feed
JethrowCatch
Fluorite | Level 6

Hello,

 

I am using a dataset with 12 numeric variables. I need to use a total of one variable, x12, to calculate every observation's percentage of the whole, and I would like to do this within the DATA step.

 

DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
x13 = (x12 / total of x12) * 100;

DATALINES;
... ... ... ... ... ... .. .. .. 200
... ... .. .. .. .. .. ..  .. .. 300

;
PROC PRINT DATA = example;
                        SUM(x12)
                        SUM(x13)
RUN;

Now, I need to obtain the sum of all x12 values and use it in the calculation of a new variable, x13 which is a relative percentage. The formula to calculate x13 is as follows:

 

x13 = (current x12 value / total sum of all x12 values) * 100

 

Can anyone help me with obtaining the total sum of all x12 values in the DATA step?

 

 

 

 

8 REPLIES 8
PaigeMiller
Diamond | Level 26
DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
array x x1-x12;
array pct pct1-pct12;
rowsum=sum(of x1-x12);
do i=1 to 12;
    pct(i) = (x(i) / rowsum) * 100;
end;
datalines;

;
run;
--
Paige Miller
JethrowCatch
Fluorite | Level 6

Hello @PaigeMiller ,

 

Thank you for your swift response. I think you misunderstood my question. I need to obtain the sum of every value in the column x12, not each row.

Reeza
Super User

SQL is faster.

 

proc sql;
create table want as
select *, sum(x12) as total,  x12 / calculated total as x13
from example;
quit;

Otherwise, you need to:

  1. Create a total variable
  2. Merge it in to your data set
  3. Do the calculation. 

If you really want a data step, see this example on how to merge in averages (SUM) would be the total and you can modify it as needed.

https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas

 


@JethrowCatch wrote:

Hello,

 

I am using a dataset with 12 numeric variables. I need to use a total of one variable, x12, to calculate every observation's percentage of the whole, and I would like to do this within the DATA step.

 

DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
x13 = (x12 / total of x12) * 100;

DATALINES;
... ... ... ... ... ... .. .. .. 200
... ... .. .. .. .. .. ..  .. .. 300

;
PROC PRINT DATA = example;
                        SUM(x12)
                        SUM(x13)
RUN;

Now, I need to obtain the sum of all x12 values and use it in the calculation of a new variable, x13 which is a relative percentage. The formula to calculate x13 is as follows:

 

x13 = (current x12 value / total sum of all x12 values) * 100

 

Can anyone help me with obtaining the total sum of all x12 values in the DATA step?

 

 

 

 


 

JethrowCatch
Fluorite | Level 6

Hello @Reeza ,

 

Thank you for your response. We have not learned about PROC SQL in school yet. I do not want to seem as if I am cheating by using functions I haven't learned yet.

 

I have given your 3 step method a go, but I am having some trouble understanding how to obtain the total value before calculating the percentage.

DATA example;
INPUT x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12;
RETAIN TOTAL 0;
TOTAL = sum(TOTAL, x12);

DATALINES;
.....
;

DATA PCT;  			
MERGE  example; BY x1;
x13 =  (x12/TOTAL) * 100;
RUN;	

PROC PRINT DATA = PCT;
RUN:

My problem is that the value of TOTAL increases for each observation. I need for TOTAL to be a variable that has one value, which is the value it takes on after adding the x12 value of the last observation to itself.

JethrowCatch
Fluorite | Level 6

In other words, what I would like to do is to use a specific index of TOTAL in every x13 calculation. However, indexing with curly or straight brackets does not work.

 

Essentially, the x13 calculation should look as follows:

 

x13 = ( x12 / TOTAL[30]) * 100;

 

30 is my number of observations in the example dataset.

 

Would anyone happen to know how to index observations in SAS?

Reeza
Super User
Did you work through the example at the link?
Note the PROC MEANS step that calculated the average, you would modify that to calculate the sum.
JethrowCatch
Fluorite | Level 6

Yes, I have. I now obtain the total that I need in the variable TOTALCT. However, the output that I get under the OUTPUT DATA tab is a table showing type, frequency, and total.

 

The output that I need is a table showing all the original variables x1-x12, with x13 added on. Therefore, I need to use PROC PRINT, but at this moment, nothing shows up under the results tab.

 

PROC MEANS DATA = EXAMPLE NOPRINT;
    OUTPUT OUT = example SUM(x12) = TOTALCT;
run;
	  
DATA PCT;
MERGE example; BY x1;
x13 = (x12 / TOTALCT) * 100;
RUN;

PROC PRINT DATA = PCT;
RUN;
Reeza
Super User

Your merge is not correct. Please review the example to see my use of the double SET statement instead.

FYI - The NOPRINT option is specified on the PROC MEANS which suppresses any results. If you'd like to see them, remove the NOPRINT option.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 8 replies
  • 1429 views
  • 0 likes
  • 3 in conversation