BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.

I can't post sample data because these datasets are extremely large. This is the problem I'm running into. When I run this query:

 

 

PROC SQL;
   CREATE TABLE EGTASK.DATA_WITH_MAC_GROUPING2 AS 
   SELECT t1.AffiliationID, 
          t1.ApprovedPriceType, 
          t1.BRAND_GENERIC, 
          t1.CarrierID, 
          t1.ClaimOriginationFlag, 
          t1.ClaimStatus, 
          t1.CLIENT, 
          t1.CostTypeCode, 
          t1.DISPENSEASWRITTENDAWPRODUCTSELE, 
          t1.ExtendedSupplyNetworkIndicator, 
          t1.FilledMonth, 
          t1.FilledQuarter, 
          t1.FinalPlanCode, 
          t1.GPINumber, 
          t1.MailOrderIndicator, 
          t1.MemberID, 
          t1.MultiSourceCode, 
          t1.PlanDrugStatus, 
          t1.PriceScheduleDefinition, 
          t1.RxNetworkID, 
          t1.RXNetworkID_CHAIN, 
          t1.SERVICEPROVIDERID, 
          t1.Specialty_Flag, 
          t1.SuperPharmacyNetworkID, 
          t1.ZIP, 
          t1.SUM_of_ApprovedIngredientCost, 
          t1.SUM_of_SUM_of_AIC, 
          t1.SUM_of_SUM_of_AWP, 
          t1.SUM_of_SUM_of_RX, 
          t1.SUM_of_SUM_of_QUANTITYDISPENSED, 
          t1.AFF_NAME, 
          t1.AFF_TYPE, 
          t1.NETWORK, 
          t1.'Business Line'n, 
          t1.'Analytics Grouping 1'n, 
          t1.'MAC Savings Grouping'n, 
          t1.IVLIndicator, 
          t1.Fund_Type, 
          /* GER Pharm */
            (CASE  
               WHEN t1.RXNetworkID IN('00008','00177','00608','00782','00039','00123','00207') THEN 'CVS' 
               WHEN t1.RXNetworkID IN('00A13','00226') THEN 'Walgreens'
               WHEN t1.RXNetworkID IN('00599','00886','00519') THEN 'Omnicare'
               WHEN t1.RXNetworkID IN('00631','00752','00754','00686') THEN 'Pharmerica'
               WHEN t1.RXNetworkID = '00905' AND t1.SuperPharmacyNetworkID IN('MDLTC','MDLTCS') THEN 'MHA'
               ELSE 'Other'
            END) AS 'GER Pharm'n, 
          t2.ApprovedPriceType AS Instate, 
          t3.ApprovedPriceType AS Code_MAC
      FROM EGTASK.DATA_WITH_MAC_GROUPING t1
           LEFT JOIN EGTASK.'TABLES FOR SAS'n t2 ON (t1.SuperPharmacyNetworkID = t2.SuperPharmacyNetworkID)
           LEFT JOIN EGTASK.'TABLES FOR SAS_0000'n t3 ON (t1.FinalPlanCode = t3.FinalPlanCode);
QUIT;

I'm getting about 2 million observations. If I add DISTINCT to the SELECT clause I still get 2 million observations. However when I run this one (which directly follows the previous one):

 

 

 

PROC SQL;
   CREATE TABLE EGTASK.All_Combined_Table AS 
   SELECT DISTINCT t1.ApprovedPriceType,
          t1.CarrierID, 
          t1.CLIENT,
          t1.FilledMonth, 
          t1.FilledQuarter AS 'Filled Quarter'n, 
          t1.GPINumber, 
          t1.MailOrderIndicator,
          count(distinct(t1.MemberID)) as Lives, 
          t1.MultiSourceCode, 
          t1.PriceScheduleDefinition, 
          t1.RxNetworkID AS RXNetwork,
          t1.SuperPharmacyNetworkID, 
		  /* SUM_OF_ApprovedIngredientCost */
            (SUM(t1.SUM_of_SUM_of_AIC)) FORMAT=DOLLAR16.2 AS SUM_OF_ApprovedIngredientCost, 
          /* SUM_OF_AWP */
            (SUM(t1.SUM_of_SUM_of_AWP)) AS SUM_OF_AWP, 
          /* SUM_OF_RX */
            (SUM(t1.SUM_of_SUM_of_RX)) AS SUM_OF_RX, 
          /* SUM_OF_QUANTITYDISPENSED */
            (SUM(t1.SUM_of_SUM_of_QUANTITYDISPENSED)) FORMAT=13.3 AS SUM_OF_QUANTITYDISPENSED,
          t1.AFF_NAME AS 'AFF Name'n, 
          t1.AFF_TYPE AS 'AFF Type'n, 
          t1.NETWORK, 
          t1.'Business Line'n AS Business_Line, 
          t1.'Analytics Grouping 1'n AS Analytics_Grouping1, 
          t1.'MAC Savings Grouping'n AS MACSavingsGrouping, 
          t1.IVLIndicator,
		  t1.Fund_Type,
		  t1.'GER Pharm'n,
		  t1.Instate,
		  t1.Code_MAC,
		  t1.MAC_List
      FROM EGTASK.DATA_WITH_MAC_GROUPING2 t1
      WHERE t1.NETWORK NOT = 'PAPERMATCH'
      GROUP BY t1.ApprovedPriceType, t1.CarrierID, t1.CLIENT, t1.FilledMonth, t1.FilledQuarter, t1.GPINumber, 
              t1.MailOrderIndicator, t1.MultiSourceCode, t1.PriceScheduleDefinition, t1.RxNetworkID, 
              t1.SuperPharmacyNetworkID, t1.AFF_NAME, t1.AFF_TYPE, t1.NETWORK, t1.'Business Line'n, t1.
              'Analytics Grouping 1'n, t1.'MAC Savings Grouping'n, t1.MAC_List, t1.IVLIndicator, t1.Fund_Type;
QUIT;

 

I also get 2 million records. However if I add the DISCINT to the SELECT clause on this one (The ALL_COMBINED_TABLE), I get about 800,000 observations. I don't understand why this happens.

1 ACCEPTED SOLUTION

Accepted Solutions
Shmuel
Garnet | Level 18

On second SQL program you have a GROUP statement.

With DISTINCT you got one obervation per group only.

 

Does it explain and give an answer to your question ?

View solution in original post

3 REPLIES 3
Shmuel
Garnet | Level 18

On second SQL program you have a GROUP statement.

With DISTINCT you got one obervation per group only.

 

Does it explain and give an answer to your question ?

JediApprentice
Pyrite | Level 9

Then with the DISTINCT on the second one, would I be getting rid of records I'd actually want to keep do you think?

Shmuel
Garnet | Level 18

Sorry, I don't understand what you mean by "getting rid of records I'd actually want to keep" ?

Can you give a schematic example of one group input (detailed) and output expected ?

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 4050 views
  • 0 likes
  • 2 in conversation