<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: PROC SQL match missing about 3% of cases even though they are in both files in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973358#M377677</link>
    <description>&lt;P&gt;Makes sense.&amp;nbsp; Overwriting datasets:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data foo;
  set foo;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;can cause all kinds of confusion.&amp;nbsp; Glad you got it sorted.&lt;/P&gt;</description>
    <pubDate>Tue, 26 Aug 2025 20:57:54 GMT</pubDate>
    <dc:creator>Quentin</dc:creator>
    <dc:date>2025-08-26T20:57:54Z</dc:date>
    <item>
      <title>PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973322#M377670</link>
      <description>&lt;P&gt;I'm trying to match a file with each person's contract number with another file that has information about the contracts. The problem is that about 3% of records in the person file are not matching with the contract file, even though their contract ID is present in both files (verified by manual review).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;These are alphanumeric codes, and both variables are character. The person file variable is char9, while the contract file variable is char5. I thought there might be padding blanks that were preventing a match, but I added COMPRESS function calls and it didn't make any difference. I ran frequencies of the results produced by the COMPRESS calls and they look right to me. All letters are capitalized in both files. Any other ideas about what might be preventing a match? Examples of the contracts that are not matching include 90091, H9585, and R5329.&lt;/P&gt;
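&lt;P&gt;One way to test the padding theory is to look at the stored bytes directly. The step below is only a sketch: it assumes the dataset and variable names used in this post, and the OBS= limit and the CONTAINS filter on '90091' are illustrative choices, not something taken from the real data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch: print raw value, trimmed length, and hex bytes for suspect keys */
data _null_;
  set person_file(obs=20);                 /* stop after 20 rows that pass the WHERE */
  where person_contract contains '90091';  /* illustrative filter on one example key */
  trimmed_len = lengthn(person_contract);  /* length without trailing blanks */
  put person_contract= trimmed_len= person_contract $hex18.;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;In an ASCII-based session encoding, a clean '90091' in a char9 variable should show up in the log as 393030393120202020. Trailing blanks (20) are ignored when SAS compares character values, so they would not block the match on their own; bytes such as 09 (tab), 0A (line feed), or 0D (carriage return) would.&lt;/P&gt;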
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Unfortunately, the data is confidential and I can't share any of it. I thought about creating example data, but this is only happening in about 3% of cases and I don't know how to make it happen with example data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
	Create table want
		as Select a.*, b.*
	From person_file. a JOIN contract_file b
	On COMPRESS(a.person_contract) = COMPRESS(b.contract_id);
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 26 Aug 2025 17:12:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973322#M377670</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2025-08-26T17:12:01Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973324#M377671</link>
      <description>&lt;P&gt;Oops, made a typo... here is the corrected code (which still doesn't work!)&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
	Create table want
		as Select a.*, b.*
	From person_file a JOIN contract_file b
	On COMPRESS(a.person_contract) = COMPRESS(b.contract_id);
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 26 Aug 2025 17:16:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973324#M377671</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2025-08-26T17:16:41Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973329#M377672</link>
      <description>&lt;P&gt;Use the "kad" modifier in the COMPRESS function, so it keeps letters and digits; this will filter out stuff like carriage return and linefeed characters which are not visible.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 17:48:14 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973329#M377672</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2025-08-26T17:48:14Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973330#M377673</link>
      <description>&lt;P&gt;I'm still getting the same results. Here is my updated code:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;PROC SQL;
	Create table want
		as Select a.*, b.*
	From person_file a JOIN contract_file b
	On COMPRESS(a.person_contract,,'kad') = COMPRESS(b.contract_id,,'kad');
QUIT;&lt;/CODE&gt;&lt;/PRE&gt;
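&lt;P&gt;To see exactly which keys are left over, an anti-join along these lines could help. It is a sketch under assumptions: the output table name unmatched and the column name person_contract_hex are made up for illustration, and the comparison simply mirrors the join condition above.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch: list person keys with no partner in the contract file */
proc sql;
  create table unmatched as               /* hypothetical table name */
  select distinct a.person_contract,
         /* hex view of the stored bytes; 18 hex digits cover a char9 value */
         put(a.person_contract, $hex18.) as person_contract_hex
  from person_file a
  where not exists
    (select 1
     from contract_file b
     where compress(a.person_contract,,'kad') = compress(b.contract_id,,'kad'));
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Each row is a person_contract value that found no match in contract_file, shown next to its hex bytes, which makes it easier to tell a problem in the data from a problem in the join itself.&lt;/P&gt;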
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 18:00:53 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973330#M377673</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2025-08-26T18:00:53Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973344#M377674</link>
      <description>&lt;P&gt;One simple check is to see if a WHERE statement will find records in both datasets.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;If you run:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc print data=person_file;
  where person_contract='90091';
run;

proc print data=contract_file;
  where contract_id='90091';
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Do both PROC PRINT steps return rows?&amp;nbsp; Hopefully one of them does not, meaning the value '90091' is not in one of the datasets.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 19:34:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973344#M377674</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2025-08-26T19:34:54Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973353#M377676</link>
      <description>&lt;P&gt;Well, I think I found the problem... the version of the file I was manually looking at was not the most recent version. The most recent version had been re-created and that re-introduced an earlier error I had already corrected. In my defense, I'm working with someone else's code -- I always create a new dataset name when I add or remove cases, whereas this code re-uses the same name. So it's a lot harder to tell if I'm working with a version of the dataset from before or after a critical step in the code.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 20:15:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973353#M377676</guid>
      <dc:creator>Wolverine</dc:creator>
      <dc:date>2025-08-26T20:15:23Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973358#M377677</link>
      <description>&lt;P&gt;Makes sense.&amp;nbsp; Overwriting datasets:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data foo;
  set foo;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;can cause all kinds of confusion.&amp;nbsp; Glad you got it sorted.&lt;/P&gt;</description>
      <pubDate>Tue, 26 Aug 2025 20:57:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973358#M377677</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2025-08-26T20:57:54Z</dc:date>
    </item>
    <item>
      <title>Re: PROC SQL match missing about 3% of cases even though they are in both files</title>
      <link>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973469#M377694</link>
      <description>&lt;P&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/43822"&gt;@Wolverine&lt;/a&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Glad you identified the problem.&amp;nbsp; Mark your own explanation as the solution, so that this topic no longer appears as unsolved.&lt;/P&gt;</description>
      <pubDate>Thu, 28 Aug 2025 00:52:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/PROC-SQL-match-missing-about-3-of-cases-even-though-they-are-in/m-p/973469#M377694</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2025-08-28T00:52:21Z</dc:date>
    </item>
  </channel>
</rss>

