BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cmemtsa
Quartz | Level 8

Hi,

 

I have this table

 

 

DATE                          CF                            C                              P

31-Oct-18                A1                           10                           1

31-Oct-18                A2                           20                           1

31-Oct-18                A3                           30                          

30-Nov-18               A1                           10                           1

30-Nov-18               A2                           20                           1

30-Nov-18               A3                           30                          

31-Dec-18               A1                           10                           1

31-Dec-18               A2                           20                           1

31-Dec-18               A3                           30                          

 

and I wrote this program

proc sql ;

create table result as

select DATE, sum(c) as TOTALCC,

(SELECT SUM(C)

FROM data

WHERE P=1 ) AS IPCC

FROM data

group by date;

quit;

 

to get this result

DATE                          C                              P

31-Oct-18                60                           30

30-Nov-18               60                           30

31-Dec-18               60                           30

 

but instead of 30 under column P, I  get 90 as it sums all months.

 

Where is the mistake?

 

Thanks.

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

You need to join the subselect on date also:

data have;
infile cards truncover;
input date :date9. cf $ c p;
format date date9.;
cards;
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
;
run;

proc sql;
create table want as
select
  date,
  sum(c) as totalcc,
  (
    select sum(c)
    from have b
    where P = 1 and a.date = b.date
  ) as ipcc
from have a
group by date;
quit;

proc print data=want noobs;
run;

Result:

     date    totalcc    ipcc

31OCT2018       60       30 
30NOV2018       60       30 
31DEC2018       60       30 

View solution in original post

10 REPLIES 10
kiranv_
Rhodochrosite | Level 12

you do not have group by for sum in subquery. so it is doing total for all . try with group by date for subquery. for your code as you do not need a subquery. you can so sum in your query for C.

cmemtsa
Quartz | Level 8

Hi,

 

the IPCC is sum(c) when  P=1.

 

I have already tried to include a Group by date in my subquery but it's not working.

kiranv_
Rhodochrosite | Level 12

you do not need a subquery at all

Kurt_Bremser
Super User

You need to join the subselect on date also:

data have;
infile cards truncover;
input date :date9. cf $ c p;
format date date9.;
cards;
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
;
run;

proc sql;
create table want as
select
  date,
  sum(c) as totalcc,
  (
    select sum(c)
    from have b
    where P = 1 and a.date = b.date
  ) as ipcc
from have a
group by date;
quit;

proc print data=want noobs;
run;

Result:

     date    totalcc    ipcc

31OCT2018       60       30 
30NOV2018       60       30 
31DEC2018       60       30 
kiranv_
Rhodochrosite | Level 12

sorry, i did not see and you need to join back to main query and @Kurt_Bremser  answer is perfect. put you still do not need a subquery for this

Kurt_Bremser
Super User

@kiranv_ wrote:

sorry, i did not see and you need to join back to main query and @Kurt_Bremser  answer is perfect. put you still do not need a subquery for this


Absolutely. One just puts a case into the summary function:

proc sql;
create table want as
select
  date,
  sum(c) as totalcc,
  sum(case when p = 1 then c else 0 end) as ipcc
from have
group by date;
quit;

or uses a data step, as have is already sorted:

data want;
set have;
by date;
if first.date
then do;
  totalcc = 0;
  ipcc = 0;
end;
totalcc + c;
if p then ipcc + c;
if last.date;
keep date totalcc ipcc;
run;

 

tomrvincent
Rhodochrosite | Level 12

Work on the subquery first, alone.  Once you're satisfied with those results, put that code into your full program.

novinosrin
Tourmaline | Level 20

Hi @cmemtsa   I agree with @kiranv_  's thought process of not requiring a subquery in this case. However his demo didn't get the required result. 

 

Consider the below,since your p seems binomial var, I'd use boolean expressions. 

 

data have;
infile cards truncover;
input date :date9. cf $ c p;
format date date9.;
cards;
31-Oct-18 A1 10 1
31-Oct-18 A2 20 1
31-Oct-18 A3 30
30-Nov-18 A1 10 1
30-Nov-18 A2 20 1
30-Nov-18 A3 30
31-Dec-18 A1 10 1
31-Dec-18 A2 20 1
31-Dec-18 A3 30
;
run;

proc sql;
create table want as
select date,sum(c) as totalcc ,sum(p*c) as ipcc
from have
group by date;
quit;

proc print noobs;run;

Result:

date totalcc ipcc 
31OCT2018 60 30 
30NOV2018 60 30 
31DEC2018 60 30 

If you really want to master boolean expressions across various scenarios that includes (dates,char,num ints, floats , and all expressions), the one exclusive super star of boolean/binomial expressions whose posts are of super value is @SASKiwi . The privilege of reading SASkiwi's posts is priceless. Just my $0.2 🙂

kiranv_
Rhodochrosite | Level 12

you are right  and I have edited my answer

novinosrin
Tourmaline | Level 20

Hi @kiranv_   Thank you. However, just a humble request. It is difficult to follow what you have edited as I think you had a your solution that you deleted rather than an edit, which is fine but hard to guage unless one checks their email what the previous post was. 

 

Taking your latest edit on that very post "for your code as you do not need a subquery. you can so sum in your query for C."

 

I am afraid that response in italics  is still not accruate coz there is no real logical explanation there. 

Let me illistrate  my reasoning:

1. That statement only portrays your earlier response where your code was

proc sql ;

create table result as

select DATE, sum(c) as TOTALCC, SUM(C) as IPCC

FROM have
group by date;

quit;

This will of course produce incorrect results. 

 

2. Secondly, the catch here as OP pointed out the IPCC is sum(c) when  P=1.

This is where a conditional logic kicks in. We could assume value of P is (1 or . )  or the value of P is (1 or some other number). Either way, it's not a direct summation with a group by. Of course to address this Kurt demonstrated using case when, which is right.

 

3. I approach the conditonal logic with a boolean expression taking advantage of true positive/true negative  with equality/inquality operation. 

So something that will handle the conditional construct comfortably can be i. case when  ii. ifn and iii. bolean

 

So, here is my slight variant on to my previous solution

 


proc sql;
create table want as
select date,sum(c) as totalcc ,sum((p=1)*c) as ipcc
from have
group by date;
quit;

Now, intuitive reasoning case when vs Boolean. Case when is used reset the value of the variable and then compute sum whereas Boolean doesn't reset as it is mere math. Nonetheless, I don't think that causes any performance impact. The point of the whole discussion, if we state a sub-query isn't required then that has to be supported with reasoning and illustration so that viewers will clearly understand. Thank you!

 

 

 

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 10 replies
  • 5074 views
  • 3 likes
  • 5 in conversation