BookmarkSubscribeRSS Feed
yudhishtirb
Calcite | Level 5

Hello Team,

 

Greetings

 

I have one existing program which has statements like below

 

proc sql;
     create table ABC.masterward as
     select A, B, C, D, E, G
     from db.Master
    order by A, B;
quit;

 

data ABC.masterward;
     set ABC.masterward(keep = A B C D E G);
      end;
run;


data ABC.lastjoin;
     merge ABC.ptp ABC.masterward(in = a where = (open eq 1)) ABC.smscalls ABC.agrpayment;
     by A B;
     if a;
run;



proc sql;
     create table ABC.summary as
     select A, B, sum(C) as OpenMatters, sum(D) as TotalBookValue, sum(E) as RPC
    from ABC.lastjoin
    group by A, G;
quit;;

proc sql;
     create table ABC.summary_final as
     A, B, OpenMatters, TotalBookValue, RPC, G
     order by A, G;
quit;

 

Output of program gives us 45 rows as per expectation. Now client requested to add 4 more columns, but as soon we trying to add columns in MasterWard dataset, output changes and gives us 1000+ rows.

 

Yes, I agree it is because we have used SUM operation in "ABC.summary". But is there any way by which I can add required columns without affecting original output ?

 

Thanks In Advance

6 REPLIES 6
RahulG
Barite | Level 11

Your group by incorrect. It should be A, B instead of A,G

 

proc sql;
     create table ABC.summary as
     select A, B, sum(C) as OpenMatters, sum(D) as TotalBookValue, sum(E) as RPC
    from ABC.lastjoin
    group by A, G;
quit;;

yudhishtirb
Calcite | Level 5

Hi There,

 

Greetings

 

That is just program template. Please have a look into actual below program

 

proc sql;
     create table ABC.masterward as
     select b.szAccountNumber , b.wlevel , b.curOriginalCapital, b.dtfileclosed, b.dtFileFrozen, b.curSettlement, b.szMagicColumn6 as  Ward_Description,
            b.wclientcode, b.sztelhome, b.sztelwork, b.sztelother
     from db.Master b
     where wClientCode = -10023
     order by Ward_Description,szAccountNumber;
quit;

 

data ABC.masterward;
     set Emfuleni.masterward(keep = szAccountNumber wlevel curOriginalCapital curSettlement Ward_Description rpc ValueofRPC open Balance_Split has_valid_phone has_cell);
     if has_valid_phone=1 or has_cell = 1 then do;
     Telephone = 1;
     ValueofTele = curOriginalCapital;
     end;
     else Telephone = 0;
run;

 

data ABC.lastjoin;
     merge ABC.ptp Emfuleni.masterward(in = a where = (open eq 1)) ABC.smscalls ABC.agrpayment;
     by Ward_Description szAccountNumber;
     if a;
run;

 

 

proc sql;
     create table ABC.summary as
     select Ward_Description, Balance_Split, sum(open) as OpenMatters, sum(curOriginalCapital) as TotalBookValue, sum(RPC) as RPC, sum(ValueofRPC) as ValueofRPC,
            sum(NumberofPTP) as NumberofPTPs, sum(NumberofPTP_Distinct) as NumberofPTP_Distinct, sum(curPTPAmount) as ValueofPTPs,
            sum(Telephone) as HasValidTelephone, sum(ValueofTele) as ValueofTele, sum(outcalls) as outcalls,
            sum(incalls) as incalls, sum(sms) as sms, sum(NumberofPayers) as NumberofPayers,
            sum(NumberofPay) as NumberofPayments, sum(curPayment) as ValueofPayments,  
            mean(curPayment) as AveragePaymentDebtor
    from ABC.lastjoin
    group by Ward_Description, Balance_Split;
quit;;

proc sql;
     create table ABC.summary_final as
     select Ward_Description, Balance_Split, OpenMatters, TotalBookValue, RPC, ValueofRPC, NumberofPTPs, NumberofPTP_Distinct, ValueofPTPs, ValueofPTPs/NumberofPTPs as AveragePTP,
            HasValidTelephone, ValueofTele, HasValidTelephone/OpenMatters as Pct_Valid_Numbers format = percent8.2,  outcalls, incalls, sms, NumberofPayers, NumberofPayers/OpenMatters as Pct_Payers format = percent8.2,
            NumberofPayments, ValueofPayments, NumberofPayers/NumberofPTP_Distinct as Pct_NrofPayers_NrofPTP_Distinct format = percent8.2,
            NumberofPayments/NumberofPTPs as Pct_NrofPayments_NrofPTPs format = percent8.2 ,AveragePaymentDebtor  

     from Emfuleni.summary
     order by Ward_Description, Balance_Split;
quit;

 

 

In this program I need to add 4 more columns, but as i add new columns into MasterWard output gets changed. Can you please provide your vaulable thought on this ?

 

Thanks In Advance

Kurt_Bremser
Super User

ABC.Masterward is created but never used. Recheck your program for this and possible other typos.

 

At which step do you actually experience the increase in observations?

Kurt_Bremser
Super User

If you add columns in a SQL SELECT that contains summary functions and a GROUP BY, all new columns must either be summarized or be in the GROUP BY.

Otherwise you will get all records of the FROM.

yudhishtirb
Calcite | Level 5

HI There,

 

Greetings

 

Thanks for feedback. I have changed my program now as below


proc sql;
     create table ABC.summary as
     select Ward_Description, Balance_Split, sum(open) as OpenMatters, sum(curOriginalCapital) as TotalBookValue, sum(RPC) as RPC, sum(ValueofRPC) as ValueofRPC,
            sum(NumberofPTP) as NumberofPTPs, sum(NumberofPTP_Distinct) as NumberofPTP_Distinct, sum(curPTPAmount) as ValueofPTPs,
            sum(Telephone) as HasValidTelephone, sum(ValueofTele) as ValueofTele, sum(outcalls) as outcalls,
            sum(incalls) as incalls, sum(sms) as sms, sum(NumberofPayers) as NumberofPayers,
            sum(NumberofPay) as NumberofPayments, sum(curPayment) as ValueofPayments,  
            mean(curPayment) as AveragePaymentDebtor
    from ABC.lastjoin
    group by Ward_Description, Balance_Split;
quit;;

 

proc sql;
     create table ABC.summary_Modified as
     select Ward_Description, Balance_Split, sum(open) as OpenMatters, sum(curOriginalCapital) as TotalBookValue, sum(RPC) as RPC, sum(ValueofRPC) as ValueofRPC,
            sum(NumberofPTP) as NumberofPTPs, sum(NumberofPTP_Distinct) as NumberofPTP_Distinct, sum(curPTPAmount) as ValueofPTPs,
            sum(Telephone) as HasValidTelephone, sum(ValueofTele) as ValueofTele, sum(outcalls) as outcalls,
            sum(incalls) as incalls, sum(sms) as sms, sum(NumberofPayers) as NumberofPayers,
            sum(NumberofPay) as NumberofPayments, sum(curPayment) as ValueofPayments,  
            mean(curPayment) as AveragePaymentDebtor, Account_Type, Exclusion_Reason
    from ABC.lastjoin
    group by Ward_Description, Balance_Split, Account_Type, Exclusion_Reason;
quit;;


proc sql;
     create table ABC.summary_final as
     select  Ward_Description, Balance_Split, OpenMatters, TotalBookValue, RPC, ValueofRPC, NumberofPTPs, NumberofPTP_Distinct, ValueofPTPs, ValueofPTPs/NumberofPTPs as AveragePTP,
            HasValidTelephone, ValueofTele, HasValidTelephone/OpenMatters as Pct_Valid_Numbers format = percent8.2,  outcalls, incalls, sms, NumberofPayers, NumberofPayers/OpenMatters as Pct_Payers format = percent8.2,
            NumberofPayments, ValueofPayments, NumberofPayers/NumberofPTP_Distinct as Pct_NrofPayers_NrofPTP_Distinct format = percent8.2,
            NumberofPayments/NumberofPTPs as Pct_NrofPayments_NrofPTPs format = percent8.2 ,AveragePaymentDebtor

     from ABC.summary
     order by Ward_Description, Balance_Split;
quit;

proc sql;
     create table ABC.summary_Final_Modified as
     select  Ward_Description, Balance_Split, OpenMatters, TotalBookValue, RPC, ValueofRPC, NumberofPTPs, NumberofPTP_Distinct, ValueofPTPs, ValueofPTPs/NumberofPTPs as AveragePTP,
            HasValidTelephone, ValueofTele, HasValidTelephone/OpenMatters as Pct_Valid_Numbers format = percent8.2,  outcalls, incalls, sms, NumberofPayers, NumberofPayers/OpenMatters as Pct_Payers format = percent8.2,
            NumberofPayments, ValueofPayments, NumberofPayers/NumberofPTP_Distinct as Pct_NrofPayers_NrofPTP_Distinct format = percent8.2,
            NumberofPayments/NumberofPTPs as Pct_NrofPayments_NrofPTPs format = percent8.2 ,AveragePaymentDebtor, Account_Type, Exclusion_Reason

     from ABC.summary_Modified 
     order by Ward_Description, Balance_Split;
quit;

 

In above statements Summary_modfied has 2 extra columns Account_Type, Exclusion_Reason. By adding this output reduced but it is still not matching with original one.

 

Please assist us. Thanks In Advance

Kurt_Bremser
Super User

Well, this is quite natural. Adding more group variables results in finer granularity, causing more records to be written.

See this as a short example:

data have;
input var1 var2 var3 var4 var5;
cards;
1 1 1 1 2
1 1 1 2 2
1 1 2 1 3
1 1 2 2 3
1 2 1 1 1
2 1 1 1 2
2 2 1 1 2
;
run;

proc sql;
create table want1 as
select var1, var2, sum(var5) as var5
from have
group by var1, var2
;
create table want2 as
select var1, var2, var3, var4, sum(var5) as var5
from have
group by var1, var2, var3, var4
;
quit;

title "2 group vars";
proc print data=want1 noobs;
run;
title "4 group vars";
proc print data=want2 noobs;
run;

This results in the following outputs:

            2 group vars             

        var1    var2    var5

          1       1      10 
          1       2       1 
          2       1       2 
          2       2       2 

            4 group vars             

var1    var2    var3    var4    var5

  1       1       1       1       2 
  1       1       1       2       2 
  1       1       2       1       3 
  1       1       2       2       3 
  1       2       1       1       1 
  2       1       1       1       2 
  2       2       1       1       2 

Since multiple values of var3 and var4 are present in the first two groups built with var1 and var2, those are split into extra lines.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1782 views
  • 0 likes
  • 3 in conversation