<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Detecting Which Merge Variable Causes a Mismatch in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741399#M231761</link>
    <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Interesting question.&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":thinking_face:"&gt;🤔&lt;/span&gt;. Had to think about it.&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":thinking_face:"&gt;🤔&lt;/span&gt;.&lt;/P&gt;
&lt;P&gt;In my opinion, you can only do a cartesian join of all the non-exactly-matches and then compare the by-variables with &lt;BR /&gt;COMPGED (r&lt;SPAN style="font-family: inherit;"&gt;eturns the generalized edit distance between two strings).&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class="xis-refDictEntry"&gt;
&lt;P class="xis-shortDescription"&gt;If you see three zeros, then your mismatch is due to&amp;nbsp;the case,&amp;nbsp;leading blanks and/or quotation marks.&lt;/P&gt;
&lt;P class="xis-shortDescription"&gt;When you do not have three zeros, look for lines where you have 2 zeros and the other compged-value being very small. That can be a typo-error. And so on. You will get it when looking at the output table!!&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data list;     set list;     rownum=_N_; run;
data lab_file; set lab_file; rownum=_N_; run;
PROC SQL noprint;
 create table work.mismatch as
 select   t1.rownum as rownum_t1           , t1.sample_type as sample_type_t1 
        , t1.sample_name as sample_name_t1 , t1.temperature as temperature_t1
        , t2.rownum as rownum_t2           , t2.sample_type as sample_type_t2 
        , t2.sample_name as sample_name_t2 , t2.temperature as temperature_t2
 from   list     t1
      , lab_file t2
 where NOT 
 (    t1.sample_type = t2.sample_type 
  AND t1.sample_name = t2.sample_name 
  AND t1.temperature = t2.temperature )
; QUIT;
data work.mismatch;
 set work.mismatch;
 sample_type_match=compged(sample_type_t1,sample_type_t2,"iLN");
 sample_name_match=compged(sample_name_t1,sample_name_t2,"iLN");
 temperature_match=compged(temperature_t1,temperature_t2,"iLN");
run;
/* end of program */&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="xis-shortDescription"&gt;Cheers,&lt;/P&gt;
&lt;P class="xis-shortDescription"&gt;Koen&lt;/P&gt;
&lt;/DIV&gt;</description>
    <pubDate>Fri, 14 May 2021 12:21:57 GMT</pubDate>
    <dc:creator>sbxkoenk</dc:creator>
    <dc:date>2021-05-14T12:21:57Z</dc:date>
    <item>
      <title>Detecting Which Merge Variable Causes a Mismatch</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741144#M231643</link>
      <description>&lt;P&gt;I have an unusual assignment where I have a list of expected lab tests categorized by "Sample Type", "Sample Name", and "Temperature".&amp;nbsp; I then receive a file from the lab denoting what lab samples were received with its temperature.&amp;nbsp; I am supposed to detect if anything in the lab file is not found in the expected list.&amp;nbsp; That part would be easy to just merge by the 3 variables.&amp;nbsp; The hard part comes in when they want me to detail which variable was the cause of not finding a match.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;For example, the SAS code below shows a small sample of expected labs and a small sample of received labs.&amp;nbsp; Because there are some "typos" or a different sample received, I want to retain those mismatches, but also show why they mismatched.&amp;nbsp; So, in the LAB_FILE dataset, rows 3, 6, 7, and 8 should be retained.&amp;nbsp; However, I am not sure how to denote which variable caused it to not be found.&amp;nbsp; Line 3 should return unexpected Type and Name.&amp;nbsp; Line 6 should return unexpected Name.&amp;nbsp; Line 7 should return unexpected Type.&amp;nbsp; Line 8 should return unexpected Temperature.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have noticed that when I can't figure out how to do something, someone always comes to my rescue with SQL, so maybe I just need to use a different approach.&amp;nbsp; Does anyone have any helpful ideas?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data list;
	input Sample_Type $ 1-16 Sample_Name $ 17-28 Temperature $ 29-36;
	datalines;
Plasma EDTA K2	BIO PLSM 1	Frozen
Plasma EDTA K2	BIO PLSM 2	Frozen
WB EDTA K2		139 VARIANT	Frozen
WB EDTA K2		39 VARIANT	Frozen
WB EDTA K2		FLSEQ		Frozen
WB EDTA K2		HEMOR		Ambient
;
run;

data lab_file;
	input Sample_Type $ 1-16 Sample_Name $ 17-28 Temperature $ 29-36;
	datalines;
Plasma EDTA K2	BIO PLSM 1	Frozen
Plasma EDTA K2	BIO PLSM 2	Frozen
Plasma EDTA K3	BIO PLSM 3	Frozen
WB EDTA K2		139 VARIANT	Frozen
WB EDTA K2		39 VARIANT	Frozen
WB EDTA K2		29 VARIANT	Frozen
WB EDTA			HEMOR		Ambient
WB EDTA K2		FLSEQ		Ambient
;
run;

proc sort data=list;
	by sample_type sample_name temperature;
proc sort data=lab_file;
	by sample_type sample_name temperature;
run;

/* I want to keep rows 3, 6, 7, 8 from LAB_FILE but denote which variable causes a mismatch */
data compare;
	merge list (in=a) lab_file (in=b);
	by sample_type sample_name temperature;
	if a &amp;amp; b then delete;
	if b;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 13 May 2021 15:15:42 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741144#M231643</guid>
      <dc:creator>djbateman</dc:creator>
      <dc:date>2021-05-13T15:15:42Z</dc:date>
    </item>
    <item>
      <title>Re: Detecting Which Merge Variable Causes a Mismatch</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741399#M231761</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Interesting question.&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":thinking_face:"&gt;🤔&lt;/span&gt;. Had to think about it.&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":thinking_face:"&gt;🤔&lt;/span&gt;.&lt;/P&gt;
&lt;P&gt;In my opinion, you can only do a cartesian join of all the non-exactly-matches and then compare the by-variables with &lt;BR /&gt;COMPGED (r&lt;SPAN style="font-family: inherit;"&gt;eturns the generalized edit distance between two strings).&lt;/SPAN&gt;&lt;/P&gt;
&lt;DIV class="xis-refDictEntry"&gt;
&lt;P class="xis-shortDescription"&gt;If you see three zeros, then your mismatch is due to&amp;nbsp;the case,&amp;nbsp;leading blanks and/or quotation marks.&lt;/P&gt;
&lt;P class="xis-shortDescription"&gt;When you do not have three zeros, look for lines where you have 2 zeros and the other compged-value being very small. That can be a typo-error. And so on. You will get it when looking at the output table!!&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data list;     set list;     rownum=_N_; run;
data lab_file; set lab_file; rownum=_N_; run;
PROC SQL noprint;
 create table work.mismatch as
 select   t1.rownum as rownum_t1           , t1.sample_type as sample_type_t1 
        , t1.sample_name as sample_name_t1 , t1.temperature as temperature_t1
        , t2.rownum as rownum_t2           , t2.sample_type as sample_type_t2 
        , t2.sample_name as sample_name_t2 , t2.temperature as temperature_t2
 from   list     t1
      , lab_file t2
 where NOT 
 (    t1.sample_type = t2.sample_type 
  AND t1.sample_name = t2.sample_name 
  AND t1.temperature = t2.temperature )
; QUIT;
data work.mismatch;
 set work.mismatch;
 sample_type_match=compged(sample_type_t1,sample_type_t2,"iLN");
 sample_name_match=compged(sample_name_t1,sample_name_t2,"iLN");
 temperature_match=compged(temperature_t1,temperature_t2,"iLN");
run;
/* end of program */&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P class="xis-shortDescription"&gt;Cheers,&lt;/P&gt;
&lt;P class="xis-shortDescription"&gt;Koen&lt;/P&gt;
&lt;/DIV&gt;</description>
      <pubDate>Fri, 14 May 2021 12:21:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741399#M231761</guid>
      <dc:creator>sbxkoenk</dc:creator>
      <dc:date>2021-05-14T12:21:57Z</dc:date>
    </item>
    <item>
      <title>Re: Detecting Which Merge Variable Causes a Mismatch</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741451#M231775</link>
      <description>&lt;P&gt;Koen,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks for your help.&amp;nbsp; I never would have thought of this method.&amp;nbsp; It may still take some modifying to get what I fully need, but you gave me a good start.&amp;nbsp; I was informed that the main variable of interest is Sample_Name, so I started merging by just Sample_Name. If it is not found in the lab file, then I know it is unexpected.&amp;nbsp; Then I compare the associated Sample_Type and Temperature.&amp;nbsp; This seems to be working for what I need at the moment, but I'm not sure how reliable it will be.&amp;nbsp; Still, thanks for giving me more to think about if I need further manipulation.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Dallas&lt;/P&gt;</description>
      <pubDate>Fri, 14 May 2021 13:57:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Detecting-Which-Merge-Variable-Causes-a-Mismatch/m-p/741451#M231775</guid>
      <dc:creator>djbateman</dc:creator>
      <dc:date>2021-05-14T13:57:15Z</dc:date>
    </item>
  </channel>
</rss>

