Hi All,
I have the following sample dataset:
COL1 COL2 COL3
A X 100
A Y 250
C Z 300
What I want to achieve is the following:
When Col 1 = A and Col 2 = X or Y then add the numbers in COL3 (result should be 350 in this case)
I tried SUM but this did not work. Any tips?
Not sure if this is what you want though:
data have;
input COL1$ COL2$ COL3;
cards;
A X 100
A Y 250
C Z 300
;
data want;
do until (last.col1);
set have;
by col1 notsorted;
if col1='A' and col2 in ('X', 'Y') then total+col3;
else total=col3;
end;
do until (last.col1);
set have;
by col1 notsorted;
output;
end;
run;
proc print;run;
Haikuo
Oops, if have to use SQL, for the same results:
data have;
input COL1$ COL2$ COL3;
cards;
A X 100
A Y 250
C Z 300
;
proc sql;
select * from
(select *, sum(col3) as total from have group by col1)
where col1='A' and col2 in ('X', 'Y')
union all
select *, col3 as total from have where not (col1='A' and col2 in ('X', 'Y'))
;
quit;
Haikuo
Thanks, if I have text in column 1 how would this code need to be modified? It does not seem to work for anything other than ABCD....
Example if COL1 was:
Textile
Textile
Chore
Works fine for me:
data have;
input COL1:$8. COL2$ COL3;
cards;
Textile X 100
Textile Y 250
Chore Z 300
;
proc sql;
select * from
(select *, sum(col3) as total from have group by col1)
where col1='Textile' and col2 in ('X', 'Y')
union all
select *, col3 as total from have where not (col1='Textile' and col2 in ('X', 'Y'))
;
quit;
Haikuo
I agree with Haikuo, your fully intended result is not obvious from you explaination. Here is a solution using proc freq for weighted crossfreqs.
data foo;
input (col1 col2) ($) col3;
cards;
A X 100
A Y 250
C Z 300
;
run;
proc freq data=foo;
tables col1*col2 /nopercent nocum nocol norow;
weight col3;
run;
col1 col2
Frequency|X |Y |Z | Total
---------+--------+--------+--------+
A | 100 | 250 | 0 | 350
---------+--------+--------+--------+
C | 0 | 0 | 300 | 300
---------+--------+--------+--------+
Total 100 250 300 650
What you want to do is use the proc means statement.
data have;
input COL1$ COL2$ COL3;
cards;
A X 100
A Y 250
C Z 300
;
run;
proc sort data=have; by COL1; run;
proc means data=have noprint;
by COL1;
output sum=COL3 out=want (drop = _type_ _freq_);
run;
proc print data=want;
run;
This yields the following, which is close to what you asked for:
obs COL1 COL3
1 A 350
2 B 300
If you want it to be exactly what you asked for (below), you can just merge "want" with "have".
obs COL1 COL2 COL3
1 A X 350
2 | A | Y | 350 |
2 B Z 300
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.