As easy as sounds. Seems like a proc sql, but I would also be interested in see data step option.
data meta;
input A;
datalines;
12
8
14
25
8
16
;
Results
A B
12 83
8 83
14 83
25 83
8 83
16 83
This is my way too long code so far...
PROC MEANS NOPRINT DATA=meta;
VAR A;
OUTPUT OUT=summarydata SUM(A) = sigma_A;
DATA meta_summary;
IF _N_=1 THEN SET summarydata;
SET meta;
drop _type_ _freq_;
run;
proc print data=meta_summary;
run;
Hi H,
On the post you want to edit, click on the "..." in the upper right side and select "edit post/reply."
Anna
the approach works astounding, but I don't know if it will work for me. In that I will have many more columns to sum and other data steps within this one.
some reason I thought you could insert the sum back into the dataset within the proc means statement I used. It would use an "in" I believe.
Any body familiar with that approach?
PROC SQL solution:
proc sql;
create table B as
select *, sum(a)
from have;
quit;
I don't think you can add the obs back in with proc means, there may be a way with proc summary though, but I'm unfamiliar with that procedure.
Reeza,
How do you name the new variable in proc sql? It currently gets named "_TEMG001".
Say I want to call it Sigma_A.
Thanks.
I am using the following to sum two different variables (i.e., A and C):
proc sql;
create table B as
select *, sum(A)as Sigma_A,
sum(C)as Sigma_C
from meta;
drop D;
quit;
But now I would like to sum them based on a group variable in the dataset called replicate. There are 100 replicate groups (i.e., 1-100) all with 300 observations. I would like to execute the about code but have the sums be for the replicates, and inserted into the new dataset as before.
Any help would be appreciated - I am currently having difficulties getting the "group by" to work with the multiple sums in the step.
I am now using the following, which works out better for my needs (you can see I am looking at 4 variable sums):
proc means data=summies;
class replicate;
var A B C D;
output out=LR_test sum(A)= A_sum
sum(B)= B_sum
sum(C)= C_sum
sum(D)= D_sum;
run;
However, the generated dataset has an extra row for the totals. So it has a replicate = "." with the totals. Plus the 100 other rows with the sums. Is there a way to get rid of this extra within the above data step?
H,
Almost all of what you ask for is relatively easy to program. But you will need to set a fixed target, not a moving target. Many replicates? No problem. But a different problem. Name the new fields with "Sigma_"? No problem. But a different problem.
One thing you will have to think through is the length of the new variable names. With an original name like "A", there's no problem creating "Sigma_A". But what if the original variable name were 30 characters long? Now there's no room to put "Sigma_" in front.
Anyway, spell out a final form to the problem, and the solution won't be that difficult.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.