Hi,
I have this table
DATE CF C P
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
and I wrote this program
proc sql ;
create table result as
select DATE, sum(c) as TOTALCC,
(SELECT SUM(C)
FROM data
WHERE P=1 ) AS IPCC
FROM data
group by date;
quit;
to get this result
DATE C P
31-Oct-18 60 30
30-Nov-18 60 30
31-Dec-18 60 30
but instead of 30 under column P, I get 90 as it sums all months.
Where is the mistake?
Thanks.
You need to join the subselect on date also:
data have;
infile cards truncover;
input date :date9. cf $ c p;
format date date9.;
cards;
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
;
run;
proc sql;
create table want as
select
date,
sum(c) as totalcc,
(
select sum(c)
from have b
where P = 1 and a.date = b.date
) as ipcc
from have a
group by date;
quit;
proc print data=want noobs;
run;
Result:
date totalcc ipcc 31OCT2018 60 30 30NOV2018 60 30 31DEC2018 60 30
you do not have group by for sum in subquery. so it is doing total for all . try with group by date for subquery. for your code as you do not need a subquery. you can so sum in your query for C.
Hi,
the IPCC is sum(c) when P=1.
I have already tried to include a Group by date in my subquery but it's not working.
you do not need a subquery at all
You need to join the subselect on date also:
data have;
infile cards truncover;
input date :date9. cf $ c p;
format date date9.;
cards;
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
;
run;
proc sql;
create table want as
select
date,
sum(c) as totalcc,
(
select sum(c)
from have b
where P = 1 and a.date = b.date
) as ipcc
from have a
group by date;
quit;
proc print data=want noobs;
run;
Result:
date totalcc ipcc 31OCT2018 60 30 30NOV2018 60 30 31DEC2018 60 30
sorry, i did not see and you need to join back to main query and @Kurt_Bremser answer is perfect. put you still do not need a subquery for this
@kiranv_ wrote:
sorry, i did not see and you need to join back to main query and @Kurt_Bremser answer is perfect. put you still do not need a subquery for this
Absolutely. One just puts a case into the summary function:
proc sql;
create table want as
select
date,
sum(c) as totalcc,
sum(case when p = 1 then c else 0 end) as ipcc
from have
group by date;
quit;
or uses a data step, as have is already sorted:
data want;
set have;
by date;
if first.date
then do;
totalcc = 0;
ipcc = 0;
end;
totalcc + c;
if p then ipcc + c;
if last.date;
keep date totalcc ipcc;
run;
Work on the subquery first, alone. Once you're satisfied with those results, put that code into your full program.
Hi @cmemtsa I agree with @kiranv_ 's thought process of not requiring a subquery in this case. However his demo didn't get the required result.
Consider the below,since your p seems binomial var, I'd use boolean expressions.
data have;
infile cards truncover;
input date :date9. cf $ c p;
format date date9.;
cards;
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
;
run;
proc sql;
create table want as
select date,sum(c) as totalcc ,sum(p*c) as ipcc
from have
group by date;
quit;
proc print noobs;run;
Result:
date totalcc ipcc 31OCT2018 60 30 30NOV2018 60 30 31DEC2018 60 30
If you really want to master boolean expressions across various scenarios that includes (dates,char,num ints, floats , and all expressions), the one exclusive super star of boolean/binomial expressions whose posts are of super value is @SASKiwi . The privilege of reading SASkiwi's posts is priceless. Just my $0.2 🙂
you are right and I have edited my answer
Hi @kiranv_ Thank you. However, just a humble request. It is difficult to follow what you have edited as I think you had a your solution that you deleted rather than an edit, which is fine but hard to guage unless one checks their email what the previous post was.
Taking your latest edit on that very post "for your code as you do not need a subquery. you can so sum in your query for C."
I am afraid that response in italics is still not accruate coz there is no real logical explanation there.
Let me illistrate my reasoning:
1. That statement only portrays your earlier response where your code was
proc sql ; create table result as select DATE, sum(c) as TOTALCC, SUM(C) as IPCC FROM have group by date; quit;
This will of course produce incorrect results.
2. Secondly, the catch here as OP pointed out the IPCC is sum(c) when P=1.
This is where a conditional logic kicks in. We could assume value of P is (1 or . ) or the value of P is (1 or some other number). Either way, it's not a direct summation with a group by. Of course to address this Kurt demonstrated using case when, which is right.
3. I approach the conditonal logic with a boolean expression taking advantage of true positive/true negative with equality/inquality operation.
So something that will handle the conditional construct comfortably can be i. case when ii. ifn and iii. bolean
So, here is my slight variant on to my previous solution
proc sql;
create table want as
select date,sum(c) as totalcc ,sum((p=1)*c) as ipcc
from have
group by date;
quit;
Now, intuitive reasoning case when vs Boolean. Case when is used reset the value of the variable and then compute sum whereas Boolean doesn't reset as it is mere math. Nonetheless, I don't think that causes any performance impact. The point of the whole discussion, if we state a sub-query isn't required then that has to be supported with reasoning and illustration so that viewers will clearly understand. Thank you!
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.