<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC SQL SELECT DISTINCT in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299010#M60379</link>
    <description>&lt;P&gt;On second SQL program you have a GROUP statement.&lt;/P&gt;&lt;P&gt;With DISTINCT you got one obervation per group only.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does it explain and give an answer to your question ?&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2016 18:40:01 GMT</pubDate>
    <dc:creator>Shmuel</dc:creator>
    <dc:date>2016-09-16T18:40:01Z</dc:date>
    <item>
      <title>PROC SQL SELECT DISTINCT</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/298969#M60367</link>
      <description>&lt;P&gt;I can't post sample data because these datasets are extremely large. This is the problem I'm running into. When I run this query:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
   CREATE TABLE EGTASK.DATA_WITH_MAC_GROUPING2 AS 
   SELECT t1.AffiliationID, 
          t1.ApprovedPriceType, 
          t1.BRAND_GENERIC, 
          t1.CarrierID, 
          t1.ClaimOriginationFlag, 
          t1.ClaimStatus, 
          t1.CLIENT, 
          t1.CostTypeCode, 
          t1.DISPENSEASWRITTENDAWPRODUCTSELE, 
          t1.ExtendedSupplyNetworkIndicator, 
          t1.FilledMonth, 
          t1.FilledQuarter, 
          t1.FinalPlanCode, 
          t1.GPINumber, 
          t1.MailOrderIndicator, 
          t1.MemberID, 
          t1.MultiSourceCode, 
          t1.PlanDrugStatus, 
          t1.PriceScheduleDefinition, 
          t1.RxNetworkID, 
          t1.RXNetworkID_CHAIN, 
          t1.SERVICEPROVIDERID, 
          t1.Specialty_Flag, 
          t1.SuperPharmacyNetworkID, 
          t1.ZIP, 
          t1.SUM_of_ApprovedIngredientCost, 
          t1.SUM_of_SUM_of_AIC, 
          t1.SUM_of_SUM_of_AWP, 
          t1.SUM_of_SUM_of_RX, 
          t1.SUM_of_SUM_of_QUANTITYDISPENSED, 
          t1.AFF_NAME, 
          t1.AFF_TYPE, 
          t1.NETWORK, 
          t1.'Business Line'n, 
          t1.'Analytics Grouping 1'n, 
          t1.'MAC Savings Grouping'n, 
          t1.IVLIndicator, 
          t1.Fund_Type, 
          /* GER Pharm */
            (CASE  
               WHEN t1.RXNetworkID IN('00008','00177','00608','00782','00039','00123','00207') THEN 'CVS' 
               WHEN t1.RXNetworkID IN('00A13','00226') THEN 'Walgreens'
               WHEN t1.RXNetworkID IN('00599','00886','00519') THEN 'Omnicare'
               WHEN t1.RXNetworkID IN('00631','00752','00754','00686') THEN 'Pharmerica'
               WHEN t1.RXNetworkID = '00905' AND t1.SuperPharmacyNetworkID IN('MDLTC','MDLTCS') THEN 'MHA'
               ELSE 'Other'
            END) AS 'GER Pharm'n, 
          t2.ApprovedPriceType AS Instate, 
          t3.ApprovedPriceType AS Code_MAC
      FROM EGTASK.DATA_WITH_MAC_GROUPING t1
           LEFT JOIN EGTASK.'TABLES FOR SAS'n t2 ON (t1.SuperPharmacyNetworkID = t2.SuperPharmacyNetworkID)
           LEFT JOIN EGTASK.'TABLES FOR SAS_0000'n t3 ON (t1.FinalPlanCode = t3.FinalPlanCode);
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I'm getting about 2 million observations. If I add DISTINCT to the SELECT clause I still get 2 million observations. However when I run this one (which directly follows the previous one):&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
   CREATE TABLE EGTASK.All_Combined_Table AS 
   SELECT DISTINCT t1.ApprovedPriceType,
          t1.CarrierID, 
          t1.CLIENT,
          t1.FilledMonth, 
          t1.FilledQuarter AS 'Filled Quarter'n, 
          t1.GPINumber, 
          t1.MailOrderIndicator,
          count(distinct(t1.MemberID)) as Lives, 
          t1.MultiSourceCode, 
          t1.PriceScheduleDefinition, 
          t1.RxNetworkID AS RXNetwork,
          t1.SuperPharmacyNetworkID, 
		  /* SUM_OF_ApprovedIngredientCost */
            (SUM(t1.SUM_of_SUM_of_AIC)) FORMAT=DOLLAR16.2 AS SUM_OF_ApprovedIngredientCost, 
          /* SUM_OF_AWP */
            (SUM(t1.SUM_of_SUM_of_AWP)) AS SUM_OF_AWP, 
          /* SUM_OF_RX */
            (SUM(t1.SUM_of_SUM_of_RX)) AS SUM_OF_RX, 
          /* SUM_OF_QUANTITYDISPENSED */
            (SUM(t1.SUM_of_SUM_of_QUANTITYDISPENSED)) FORMAT=13.3 AS SUM_OF_QUANTITYDISPENSED,
          t1.AFF_NAME AS 'AFF Name'n, 
          t1.AFF_TYPE AS 'AFF Type'n, 
          t1.NETWORK, 
          t1.'Business Line'n AS Business_Line, 
          t1.'Analytics Grouping 1'n AS Analytics_Grouping1, 
          t1.'MAC Savings Grouping'n AS MACSavingsGrouping, 
          t1.IVLIndicator,
		  t1.Fund_Type,
		  t1.'GER Pharm'n,
		  t1.Instate,
		  t1.Code_MAC,
		  t1.MAC_List
      FROM EGTASK.DATA_WITH_MAC_GROUPING2 t1
      WHERE t1.NETWORK NOT = 'PAPERMATCH'
      GROUP BY t1.ApprovedPriceType, t1.CarrierID, t1.CLIENT, t1.FilledMonth, t1.FilledQuarter, t1.GPINumber, 
              t1.MailOrderIndicator, t1.MultiSourceCode, t1.PriceScheduleDefinition, t1.RxNetworkID, 
              t1.SuperPharmacyNetworkID, t1.AFF_NAME, t1.AFF_TYPE, t1.NETWORK, t1.'Business Line'n, t1.
              'Analytics Grouping 1'n, t1.'MAC Savings Grouping'n, t1.MAC_List, t1.IVLIndicator, t1.Fund_Type;
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I also get 2 million records. However if I add the DISCINT to the SELECT clause on this one (The ALL_COMBINED_TABLE), I get about 800,000 observations. I don't understand why this happens.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2016 16:05:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/298969#M60367</guid>
      <dc:creator>JediApprentice</dc:creator>
      <dc:date>2016-09-16T16:05:57Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL SELECT DISTINCT</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299010#M60379</link>
      <description>&lt;P&gt;On second SQL program you have a GROUP statement.&lt;/P&gt;&lt;P&gt;With DISTINCT you got one obervation per group only.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Does it explain and give an answer to your question ?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2016 18:40:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299010#M60379</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2016-09-16T18:40:01Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL SELECT DISTINCT</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299014#M60381</link>
      <description>&lt;P&gt;Then with the DISTINCT on the second one, would I be getting rid of records I'd actually want to keep do you think?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2016 18:50:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299014#M60381</guid>
      <dc:creator>JediApprentice</dc:creator>
      <dc:date>2016-09-16T18:50:21Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL SELECT DISTINCT</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299019#M60384</link>
      <description>&lt;P&gt;Sorry, I don't understand what you mean by &lt;STRONG&gt;"getting rid of records I'd actually want to keep" ?&lt;/STRONG&gt;&lt;/P&gt;&lt;P&gt;Can you give a schematic example of one group input (detailed) and output expected ?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2016 19:01:06 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/PROC-SQL-SELECT-DISTINCT/m-p/299019#M60384</guid>
      <dc:creator>Shmuel</dc:creator>
      <dc:date>2016-09-16T19:01:06Z</dc:date>
    </item>
  </channel>
</rss>

