Solved: Sum of Sum

JediApprentice · Posted 03-25-2016 05:47 PM

I have three separate proc sqls. In the first one I am selecting (sum(Rx)) as Sum_of_Rx. In the second proc sql, I select (sum(Sum_of_Rx)) as Sum_of_Sum_of_Rx. And in the third proc sql, I select (sum(Sum_of_Sum_of_Rx)) as Sum_of_sum_of_sum_of_Rx. If the value of Rx is 1, what will each of these sum variables be? Obviously the Sum_of_Rx would be 1, but what about the rest? Note: I didn't write this code. I'm trying to understand why the person who wrote it would do this.

Reeza · Posted 03-25-2016 06:53 PM

Love the naming conventions.

The first sum in the data step doesn't do anything, it only renames the variable and is pretty useless as shown.

The second sum creates an aggregate across the group by variables

the third creates another aggregate across a different set of group by variables, after joining with a different table.

for example, total by county for first SQL and then by state.

If if its set to 1 initially it could be used as a record counter to determine how many records are in each group by component.

View solution in original post

Reeza · Posted 03-25-2016 06:13 PM

It depends on the data, and possible the group by statement.

Please post some some sample data and the full code.

JediApprentice · Posted 03-25-2016 06:30 PM

Sum(Rx) is first defined in this datastep (earlier on Rx is set to value of 1)

data EGTASK.Data_Put_In_Access;
  set work.Calculated_Data2;
  format AffiliationID $6. ApprovedDispensingFeeAmount DOLLAR16.2 ApprovedPriceType $10. CarrierID $9. ClaimOriginationFlag $1.
		 ClaimStatus $1. CostTypeCode $10. DISPENSEASWRITTENDAWPRODUCTSELE $1.
		 ExtendedSupplyNetworkIndicator $1. FilledMonth $7. FilledQuarter $6.
		 GPINumber $14. MailOrderIndicator $1. MultiSourceCode $1. PlanDrugStatus $1.
		 PriceScheduleDefinition $10. RxNetworkID $6. RxNetworkID_CHAIN $6.
         SERVICEPROVIDERID $15. SuperPharmacyNetworkID $6. PHARMACYZIPCODE $10.
		 SUM_of_ApprovedIngredientCost DOLLAR16.2 SUM_of_ApprDispFeeAmount DOLLAR16.2 SUM_of_QUANTITYDISPENSED 13.3 ZIP $10.;
  RxNetworkID_CHAIN = RxNetworkID;
  SUM_of_ApprovedIngredientCost = sum(ApprovedIngredientCost);
  SUM_of_ApprDispFeeAmount = sum(ApprovedDispensingFeeAmount);
  SUM_of_AWP = sum(AWP);
  SUM_of_RX = sum(RX);
  SUM_of_QUANTITYDISPENSED = sum(QUANTITYDISPENSED);
  ZIP = PHARMACYZIPCODE;
  by AffiliationID ApprovedPriceType BRAND_GENERIC CarrierID ClaimOriginationFlag 
     ClaimStatus CLIENT CostTypeCode DISPENSEASWRITTENDAWPRODUCTSELE 
     ExtendedSupplyNetworkIndicator FilledMonth FilledQuarter FinalPlanCode GPINumber 
     MailOrderIndicator MultiSourceCode PlanDrugStatus PriceScheduleDefinition RxNetworkID 
     SERVICEPROVIDERID SuperPharmacyNetworkID PHARMACYZIPCODE Specialty_Flag EXCLUSIONS 
     IVLIndicator;
run;

And here is the first proc sql where sum of the sum takes place:

PROC SQL;
 CREATE TABLE EGTASK.Data_Put_In_Access AS 
 SELECT t1.AccountID,
        t1.AffiliationID,
        t1.ApprovedIngredientCost, 
        t1.ApprovedPriceType,
        t1.ApprovedDispensingFeeAmount,
        t1.AverageWholesalePriceUnitCost,
        t1.AWP, 
        t1.BRAND_GENERIC, 
        t1.CarrierID, 
        t1.ClaimOriginationFlag, 
        t1.ClaimStatus, 
        t1.CLIENT,
        t1.COMPOUNDCODE, 
        t1.CostTypeCode, 
        t1.DISPENSEASWRITTENDAWPRODUCTSELE,
        t1.EXCLUSIONS,
        t1.ExtendedSupplyNetworkIndicator, 
        t1.FilledMonth, 
        t1.FilledQuarter, 
        t1.FinalPlanCode, 
        t1.GPINumber,
        t1.GroupPlanCode,
        t1.MailOrderIndicator,
        t1.MemberID, 
        t1.MultiSourceCode,
        t1.NETWORK, 
        t1.PlanDrugStatus, 
        t1.PriceScheduleDefinition, 
        t1.RxNetworkID, 
        t1.RXNetworkID_CHAIN,
        t1.QUANTITYDISPENSED, 
        t1.SERVICEPROVIDERID,
        t1.Specialty_Flag, 
        t1.SuperPharmacyNetworkID, 
        t1.ZIP,
        t1.SUM_of_ApprovedIngredientCost,
        t1.SUM_of_ApprDispFeeAmount,
       /* SUM_of_SUM_of_ApprovedIngredientCost */
       (SUM(t1.SUM_of_ApprovedIngredientCost)) AS SUM_of_SUM_of_AIC, 
       /* SUM_of_SUM_of_AWP */
       (SUM(t1.SUM_of_AWP)) AS SUM_of_SUM_of_AWP, 
       /* SUM_of_SUM_of_RX */
       (SUM(t1.SUM_of_RX)) AS SUM_of_SUM_of_RX, 
       /* SUM_of_SUM_of_QUANTITYDISPENSED */
       (SUM(t1.SUM_of_QUANTITYDISPENSED)) FORMAT=13.3 AS SUM_of_SUM_of_QUANTITYDISPENSED,
       /* SUM_of_SUM_of_ApprDispFeeAmount */
       (SUM(t1.SUM_of_ApprDispFeeAmount)) AS SUM_of_SUM_of_ApprDispFeeAmount, 
       t1.AFF_NAME, 
       t1.AFF_TYPE, 
       t1.NETWORK, 
       t1.'Business Line'n, 
       t1.'Analytics Grouping 1'n, 
     FROM EGTASK.Data_Put_In_Access t1
     GROUP BY t1.AffiliationID, t1.ApprovedPriceType, t1.BRAND_GENERIC, t1.CarrierID, t1.ClaimOriginationFlag, 
              t1.ClaimStatus, t1.CLIENT, t1.CostTypeCode, t1.DISPENSEASWRITTENDAWPRODUCTSELE, 
              t1.ExtendedSupplyNetworkIndicator, t1.FilledMonth, t1.FilledQuarter, t1.FinalPlanCode, t1.GPINumber, 
              t1.MailOrderIndicator, t1.MultiSourceCode, t1.PlanDrugStatus, t1.PriceScheduleDefinition, t1.RxNetworkID, 
              t1.RXNetworkID_CHAIN, t1.SERVICEPROVIDERID, t1.SuperPharmacyNetworkID, t1.ZIP, t1.Specialty_Flag, 
              t1.SUM_of_ApprovedIngredientCost, t1.AFF_NAME, t1.AFF_TYPE, t1.NETWORK, t1.'Business Line'n, t1.
              'Analytics Grouping 1'n, (CALCULATED 'MAC Savings Grouping'n), t1.IVLIndicator;
QUIT;

And here is the second proc sql where the sum of sum of sum takes place:

PROC SQL;
   CREATE TABLE EGTASK.All_Combined_Table AS 
   SELECT t1.ApprovedDispensingFeeAmount,
          t1.ApprovedIngredientCost,
          t1.ApprovedPriceType,
          t1.AverageWholesalePriceUnitCost,
          t1.AWP,
          t1.BRAND_GENERIC,
          t1.CarrierID, 
          t1.CLIENT,
	  t1.ExtendedSupplyNetworkIndicator,
          t1.FilledMonth, 
          t1.FilledQuarter AS 'Filled Quarter'n, 
          t1.GPINumber, 
          t1.MailOrderIndicator,
          t1.MemberID, 
          t1.MultiSourceCode, 
          t1.PriceScheduleDefinition,
          t1.QUANTITYDISPENSED, 
          t1.RxNetworkID AS RXNetwork,
	  t1.RXNetworkID_CHAIN,
          t1.SuperPharmacyNetworkID, 
	  /* SUM_OF_ApprovedIngredientCost */
            (SUM(t1.SUM_of_SUM_of_AIC)) FORMAT=DOLLAR16.2 AS SUM_OF_ApprovedIngredientCost, 
          /* SUM_OF_AWP */
            (SUM(t1.SUM_of_SUM_of_AWP)) AS SUM_OF_SUM_OF_SUM_AWP, 
          /* SUM_OF_RX */
            (SUM(t1.SUM_of_SUM_of_RX)) AS SUM_OF_SUM_OF_SUM_RX, 
          /* SUM_OF_QUANTITYDISPENSED */
            (SUM(t1.SUM_of_SUM_of_QUANTITYDISPENSED)) FORMAT=13.3 AS SUM_OF_SUM_OF_SUM_QUANTITYDISPENSED,
	  /* SUM_OF_ADFA */
	    (SUM(t1.SUM_of_SUM_of_ApprDispFeeAmount)) FORMAT=DOLLAR16.2 AS SUM_OF_SUM_OF_SUM_ADFA,
          t1.AFF_NAME AS 'AFF Name'n, 
          t1.AFF_TYPE AS 'AFF Type'n, 
          t1.NETWORK,
          t1.Specialty_Flag, 
          t1.'Business Line'n AS Business_Line, 
          t1.'Analytics Grouping 1'n AS Analytics_Grouping1, 
          t1.'MAC Savings Grouping'n AS MACSavingsGrouping, 
          t1.IVLIndicator
      FROM EGTASK.DATA_WITH_MAC_GROUPING t1 LEFT JOIN EGTASK.Tables_for_SAS4 t2 ON (t1.PriceScheduleDefinition = t2.
          price_schedule)
      WHERE t1.NETWORK NOT = 'PAPERMATCH'
      GROUP BY t1.ApprovedPriceType, t1.CarrierID, t1.CLIENT, t1.FilledMonth, t1.FilledQuarter, t1.GPINumber, 
              t1.MailOrderIndicator, t1.MultiSourceCode, t1.PriceScheduleDefinition, t1.RxNetworkID, 
              t1.SuperPharmacyNetworkID, t1.AFF_NAME, t1.AFF_TYPE, t1.NETWORK, t1.'Business Line'n, t1.
              'Analytics Grouping 1'n, t1.'MAC Savings Grouping'n, (CALCULATED 'MAC List'n), t1.IVLIndicator;
QUIT;

Reeza · Posted 03-25-2016 06:53 PM

Love the naming conventions.

The first sum in the data step doesn't do anything, it only renames the variable and is pretty useless as shown.

The second sum creates an aggregate across the group by variables

the third creates another aggregate across a different set of group by variables, after joining with a different table.

for example, total by county for first SQL and then by state.

If if its set to 1 initially it could be used as a record counter to determine how many records are in each group by component.

Sum of Sum

Re: Sum of Sum

Re: Sum of Sum

Re: Sum of Sum

Re: Sum of Sum

Catch up on SAS Innovate 2026

Sum of Sum

Re: Sum of Sum

Re: Sum of Sum

Re: Sum of Sum

Re: Sum of Sum

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away