Hi All,
I have the following sample dataset:
COL1 COL2 COL3
A X 100
A Y 250
C Z 300
What I want to achieve is the following:
When Col 1 = A and Col 2 = X or Y then add the numbers in COL3 (result should be 350 in this case)
I tried SUM but this did not work. Any tips?
Not sure if this is what you want though:
data have;
input COL1$ COL2$ COL3;
cards;
A X 100
A Y 250
C Z 300
;
data want;
do until (last.col1);
set have;
by col1 notsorted;
if col1='A' and col2 in ('X', 'Y') then total+col3;
else total=col3;
end;
do until (last.col1);
set have;
by col1 notsorted;
output;
end;
run;
proc print;run;
Haikuo
Oops, if have to use SQL, for the same results:
data have;
input COL1$ COL2$ COL3;
cards;
A X 100
A Y 250
C Z 300
;
proc sql;
select * from
(select *, sum(col3) as total from have group by col1)
where col1='A' and col2 in ('X', 'Y')
union all
select *, col3 as total from have where not (col1='A' and col2 in ('X', 'Y'))
;
quit;
Haikuo
Thanks, if I have text in column 1 how would this code need to be modified? It does not seem to work for anything other than ABCD....
Example if COL1 was:
Textile
Textile
Chore
Works fine for me:
data have;
input COL1:$8. COL2$ COL3;
cards;
Textile X 100
Textile Y 250
Chore Z 300
;
proc sql;
select * from
(select *, sum(col3) as total from have group by col1)
where col1='Textile' and col2 in ('X', 'Y')
union all
select *, col3 as total from have where not (col1='Textile' and col2 in ('X', 'Y'))
;
quit;
Haikuo
I agree with Haikuo, your fully intended result is not obvious from you explaination. Here is a solution using proc freq for weighted crossfreqs.
data foo;
input (col1 col2) ($) col3;
cards;
A X 100
A Y 250
C Z 300
;
run;
proc freq data=foo;
tables col1*col2 /nopercent nocum nocol norow;
weight col3;
run;
col1 col2
Frequency|X |Y |Z | Total
---------+--------+--------+--------+
A | 100 | 250 | 0 | 350
---------+--------+--------+--------+
C | 0 | 0 | 300 | 300
---------+--------+--------+--------+
Total 100 250 300 650
What you want to do is use the proc means statement.
data have;
input COL1$ COL2$ COL3;
cards;
A X 100
A Y 250
C Z 300
;
run;
proc sort data=have; by COL1; run;
proc means data=have noprint;
by COL1;
output sum=COL3 out=want (drop = _type_ _freq_);
run;
proc print data=want;
run;
This yields the following, which is close to what you asked for:
obs COL1 COL3
1 A 350
2 B 300
If you want it to be exactly what you asked for (below), you can just merge "want" with "have".
obs COL1 COL2 COL3
1 A X 350
2 | A | Y | 350 |
2 B Z 300
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.