<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Case Control Matched Population - Avoiding Duplicates in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645163#M192822</link>
    <description>Hello Reinhard!  Thank you. I will try this tomorrow and keep you posted. &lt;BR /&gt;</description>
    <pubDate>Tue, 05 May 2020 04:54:39 GMT</pubDate>
    <dc:creator>anissak1</dc:creator>
    <dc:date>2020-05-05T04:54:39Z</dc:date>
    <item>
      <title>Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644573#M192547</link>
      <description>&lt;P&gt;Hello - I've created a case control matched population with clinical trial data. &amp;nbsp;Even if you are not familiar with clinical trials, I think the SAS coding should be relatively simple - please help if you can. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&amp;nbsp;I have been able to undertake all steps except one. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I have 602 unique cases. &amp;nbsp;I have a dataset that includes possible matching controls to these cases (68K options, as many controls match each possible case). I would like to select one unique control for each of the possible cases. &amp;nbsp;My current code, below, returns a match for each case, but it sometimes gives me a control that has already been selected for another case. &amp;nbsp;If a control has already been selected for a case, I want SAS to "move on" to the next control that is an option and not include any duplicated controls in my final dataset. &amp;nbsp;So in sum, I want 602 cases and 602 controls.&lt;/P&gt;
&lt;P&gt;Partial dataset attached.&lt;/P&gt;
&lt;P&gt;Thank you for any guidance!&lt;/P&gt;
&lt;P&gt;Anissa&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data random;
  set controls_usubjid;
  call streaminit(12345);
  random = rand('uniform');
run;

proc sort data=random;
  by cases_usubjid random;
run;

data controls_usubjid2 not_enough;
  set random;
  by cases_usubjid;
  retain num;
  if first.cases_usubjid then num=1;
  if num le 1 then do;
    output controls_usubjid2;
    num=num+1;
  end;
  if last.cases_usubjid then do;
    if num le 2 then output not_enough;
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 01 May 2020 21:10:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644573#M192547</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-01T21:10:02Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644654#M192585</link>
      <description>&lt;P&gt;Don't you want this rather?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data CONTROLS_USUBJID2 NOT_ENOUGH; 
  set RANDOM;
  by CASES_USUBJID ;
  if first.CASES_USUBJID then NUM=1; 
  if NUM eq 1 then output CONTROLS_USUBJID2;
  NUM+1;
  if last.CASES_USUBJID &amp;amp; NUM le 2 then output NOT_ENOUGH; 
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 02 May 2020 04:02:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644654#M192585</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-02T04:02:39Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644715#M192619</link>
      <description>&lt;P&gt;Thank you for the note and code. &amp;nbsp;Indeed you fixed my code for the "not enough" dataset - I hadn't noticed that error because I am obsessing over how to ensure I don't have duplicate controls selected. &amp;nbsp;&lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;The code you provided still gives me duplicate controls for some cases. &amp;nbsp;Any ideas on how I can fix this? &amp;nbsp;Seems such a simple concept! &amp;nbsp;Like an "if then" clause. But I can't figure how to say in SAS "if you have already picked a certain control as a match, then go on to the next available control". &amp;nbsp;Maybe there is another way to think about the problem?&lt;/P&gt;</description>
      <pubDate>Sat, 02 May 2020 14:23:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644715#M192619</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-02T14:23:59Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644787#M192653</link>
      <description>&lt;P&gt;I don't see CONTROL in the code, only CASE. And only the first CASE record will be saved. There should be no duplicate.&lt;/P&gt;</description>
      <pubDate>Sat, 02 May 2020 23:39:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644787#M192653</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-02T23:39:54Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644793#M192656</link>
      <description>&lt;P&gt;Apologies for the confusion. &amp;nbsp;The dataset I created has 602 cases and all possible matching controls. &amp;nbsp;Cases and controls are in the same observation/row. &amp;nbsp;I ordered the controls randomly. &amp;nbsp;Then I selected one control for each case by selecting the first observation for each case. The code you sent works well.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I want to know is, how I can tell SAS not to "pick" a particular control for a case if that control was already selected.&lt;/P&gt;
&lt;P&gt;When I run the code as is, I get duplicate controls, even though the cases are unique.&lt;/P&gt;
&lt;P&gt;Hopefully that makes more sense. &amp;nbsp;Thank you!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anissa&lt;/P&gt;</description>
      <pubDate>Sun, 03 May 2020 01:26:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644793#M192656</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-03T01:26:43Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644795#M192657</link>
      <description>&lt;P&gt;You have to store the controls as they are saved, and check the new ones against that store.&lt;/P&gt;
&lt;P&gt;For 600 values, an array is enough, though a hash table can be used too.&lt;/P&gt;
&lt;P&gt;Something like:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data CONTROLS_USUBJID2 NOT_ENOUGH; 
  set RANDOM;
  by CASES_USUBJID ;
  array CONTROLS [602] $20 _temporary_;
  if first.CASES_USUBJID then do;   
    NUM=1; 
    FOUND=0;
  end; 
  if ^FOUND &amp;amp; ^whichc(CONTROL, of CONTROLS[*]) then do;
    output CONTROLS_USUBJID2;
    CONTROLS[whichc(' ', of CONTROLS[*])] = CONTROL;
    FOUND+1;
  end;
  NUM+1;
  if last.CASES_USUBJID &amp;amp; NUM le 2 then output NOT_ENOUGH; 
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Not too sure how you want to treat the NUM variable.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 03 May 2020 05:00:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644795#M192657</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-03T05:00:10Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644856#M192695</link>
      <description>&lt;P&gt;Hi Chris - Good morning! &amp;nbsp;Thanks again! &amp;nbsp;I was able to run the code - had to fix one missing bracket. &amp;nbsp;But the resulting dataset still has multiple duplicate controls assigned to particular cases. &lt;span class="lia-unicode-emoji" title=":disappointed_face:"&gt;😞&lt;/span&gt; &amp;nbsp;See screenshot. &amp;nbsp;Any other ideas? &amp;nbsp;Really appreciate you working through this with me. More than you know!!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anissa&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 03 May 2020 16:10:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644856#M192695</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-03T16:10:10Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644866#M192698</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/264621"&gt;@anissak1&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I saw your initial post, but I haven't started looking deeper into your or &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/16961"&gt;@ChrisNZ&lt;/a&gt;'s code because, contrary to your own assessment, I don't think that your task is "relatively simple" &lt;EM&gt;in general&lt;/EM&gt;. (You &lt;EM&gt;may be&lt;/EM&gt; lucky and your specific dataset is amenable to a "relatively simple" heuristic.) Instead I read up a bit on the topic of finding a&amp;nbsp;&lt;A href="https://en.wikipedia.org/wiki/Maximum_cardinality_matching" target="_blank" rel="noopener"&gt;maximum (cardinality) matching&lt;/A&gt;&amp;nbsp;in a &lt;A href="https://en.wikipedia.org/wiki/Bipartite_graph" target="_blank" rel="noopener"&gt;bipartite graph&lt;/A&gt; -- which is exactly the mathematical description of the problem you want to solve. (Regrettably, I skipped graph theory in my study back in the 1990s ...)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The good news is: There are algorithms for this task (see the first of the two Wikipedia links above). It depends on the data, though, &lt;EM&gt;if&lt;/EM&gt; a solution exists at all. In terms of SAS, I'm sure that suitable algorithms are available in one or more SAS/OR procedures. For example, &lt;A href="https://documentation.sas.com/?docsetId=procgralg&amp;amp;docsetTarget=procgralg_optgraph_overview.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en" target="_blank" rel="noopener"&gt;PROC OPTGRAPH&lt;/A&gt; should be able to solve an even more difficult problem: to find a maximum matching with an additional optimality criterion. This means you could assign "weights" to each admissible case-control pair (to indicate that one pair "fits better" than another) and the procedure would strive to find a "&lt;EM&gt;best&lt;/EM&gt;" maximum matching in this sense. See the paragraph about the case "|&lt;EM&gt;S&lt;/EM&gt;|&amp;lt;|&lt;EM&gt;T&lt;/EM&gt;|" in&amp;nbsp;&lt;A href="https://documentation.sas.com/?docsetId=procgralg&amp;amp;docsetTarget=procgralg_optgraph_details60.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en" target="_blank" rel="noopener"&gt;Linear Assignment (Matching)&lt;/A&gt;.&lt;/P&gt;
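&lt;P&gt;A small illustration of why the task is not "relatively simple" in general. This is only a sketch: the dataset TOY, its values and the hash name USED are made up for this example, and the selection rule is a plain "first control not yet used" pass, not a maximum-matching algorithm. Case 1 may take A or B, case 2 only A. The greedy pass gives A to case 1 and leaves case 2 without a control, although the complete matching 1-B, 2-A exists:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical toy data: case 1 can take A or B, case 2 only A */
data toy;
input case control $ @@;
cards;
1 A 1 B 2 A
;

/* Greedy pass: per case, keep the first control not used so far */
data greedy(drop=rc);
if _n_=1 then do;
  dcl hash used();          /* stores the controls already assigned */
  used.definekey('control');
  used.definedone();
end;
do until(last.case);
  set toy;
  by case;
  /* rc=. : nothing selected yet for this case;
     used.check() is nonzero if the control is not in the hash */
  if rc=. &amp;amp; used.check() then do;
    rc=used.add();
    output;
  end;
end;
run;
/* GREEDY contains only the pair 1 A; case 2 remains unmatched */&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With a random order of the controls within each case, it becomes a matter of chance whether such a case ends up unmatched.&lt;/P&gt;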
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;However, I don't have a SAS/OR license (nor any experience with this SAS module). Do you? If so, the expert(s) in the&amp;nbsp;&lt;A href="https://communities.sas.com/t5/Mathematical-Optimization/bd-p/operations_research" target="_blank" rel="noopener"&gt;Mathematical Optimization, Discrete-Event Simulation, and OR&lt;/A&gt; subforum can most likely help you with code.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Otherwise, it might be an idea to prevent the situation of having multiple case-control pairs to choose from. How did you create your current dataset? There's a SAS/STAT procedure called &lt;A href="https://documentation.sas.com/?docsetId=statug&amp;amp;docsetVersion=14.3&amp;amp;docsetTarget=statug_psmatch_overview.htm&amp;amp;locale=en" target="_blank" rel="noopener"&gt;PSMATCH&lt;/A&gt; which creates an equivalent type of dataset from input data containing cases and controls in separate observations with variables containing the characteristics such as your variables &lt;FONT face="courier new,courier"&gt;agegrp&lt;/FONT&gt;, &lt;FONT face="courier new,courier"&gt;sex&lt;/FONT&gt; and &lt;FONT face="courier new,courier"&gt;baseline&lt;/FONT&gt;. I'm not familiar with this procedure, but I saw in the documentation that the relevant &lt;A href="https://documentation.sas.com/?docsetId=statug&amp;amp;docsetTarget=statug_psmatch_syntax07.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en#statug.psmatch.psmmatch" target="_blank" rel="noopener"&gt;MATCH statement&lt;/A&gt; has options to specify how many controls (e.g., 1) are to be assigned to each case (depending on the &lt;A href="https://documentation.sas.com/?docsetId=statug&amp;amp;docsetTarget=statug_psmatch_syntax07.htm&amp;amp;docsetVersion=14.3&amp;amp;locale=en#statug.psmatch.methodopt" target="_blank" rel="noopener"&gt;METHOD&lt;/A&gt;&amp;nbsp;used). So, perhaps you can apply PROC PSMATCH to your original "unmatched" data and thus obtain a unique ("optimal") control for each case to begin with.&lt;/P&gt;</description>
      <pubDate>Sun, 03 May 2020 19:00:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644866#M192698</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-05-03T19:00:17Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644884#M192704</link>
      <description>&lt;P&gt;It works for me.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data CONTROLS_USUBJID;
  do CASES_USUBJID=1 to 602;
    do CONTROL=1 to 1000;
      if ranuni(1) &amp;gt; .95 then output;
    end;
  end;
run;

data RANDOM;
  set CONTROLS_USUBJID;
  call streaminit(12345);
  RANDOM= rand('uniform');
run;
 
proc sort data=RANDOM ; 
  by CASES_USUBJID RANDOM;
run;

data CONTROLS_USUBJID2 
     NOT_ENOUGH; 
  set RANDOM;
  by CASES_USUBJID ;
  array CONTROLS [602] _temporary_ (602*-1) ; 
  if first.CASES_USUBJID then do;   
    NUM=1; 
    FOUND=0;
  end; 
  if ^FOUND &amp;amp; ^whichn(CONTROL, of CONTROLS[*]) then do;
    output CONTROLS_USUBJID2;
    CONTROLS[whichn(-1, of CONTROLS[*])] = CONTROL;
    FOUND+1;
  end;
  NUM+1;
  if last.CASES_USUBJID &amp;amp; NUM le 2 then output NOT_ENOUGH; 
run;

proc sql; 
  select CONTROL, count(*) from CONTROLS_USUBJID2 group by CONTROL having count(*)&amp;gt;1;
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;NOTE: No rows were selected.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 03 May 2020 23:06:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644884#M192704</guid>
      <dc:creator>ChrisNZ</dc:creator>
      <dc:date>2020-05-03T23:06:44Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644900#M192717</link>
      <description>&lt;P&gt;Hi Reinhard - Two additional comments are that 1) I don't need an "ideal" match from a mathematical perspective. &amp;nbsp;I'm choosing exact matches (gender, age, disease type) and already have a dataset wherein these matches are contained. &amp;nbsp;So no "judgment" needs to be applied in selection. 2) I understand that my random ordering of controls should be sufficient in terms of selection criteria.&lt;/P&gt;
&lt;P&gt;If this makes any SAS code aha's come to light for you or anyone else in the community, I'd be delighted. &amp;nbsp;But I can also stick to my more limited dataset. This is a secondary analysis for a paper and I can tell from re-running the possible matches after eliminating duplicates that probably not many more possibilities remain.&lt;/P&gt;
&lt;P&gt;Thanks again.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anissa&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 04 May 2020 00:56:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/644900#M192717</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-04T00:56:38Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645054#M192746</link>
      <description>&lt;P&gt;Hello Reinhard - I posted this message to the wrong board....maybe the below simplification will help you or others with ideas...?!&lt;/P&gt;
&lt;DIV&gt;The selection and ordering are already done, so I would just like to tell SAS to do the following. &amp;nbsp;There are 2 simple columns of data. I want a new dataset containing, for each value in column 1, the first observation whose value in column 2 has not already been selected for a previous observation.&lt;/DIV&gt;
&lt;DIV&gt;&amp;nbsp;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#339966"&gt;&lt;SPAN&gt;1 A &amp;nbsp;(Selected)&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;1 B&lt;/DIV&gt;
&lt;DIV&gt;1 C&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#FF0000"&gt;&lt;SPAN&gt;2 A &amp;nbsp;(Not selected because A already "used")&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#339966"&gt;&lt;SPAN&gt;2 D &amp;nbsp;(Selected because D has not been "used")&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;SPAN&gt;2 R&amp;nbsp;&lt;/SPAN&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#339966"&gt;&lt;SPAN&gt;3 B (Selected because B has not been "used")&lt;/SPAN&gt;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;3&amp;nbsp; F&lt;/DIV&gt;
&lt;DIV&gt;3 G&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#FF0000"&gt;4 D (Not selected because D has been "used")&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#FF0000"&gt;4 A (Not selected because A has been "used")&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#339966"&gt;4 M (Selected because M has not been "used)&lt;BR /&gt;&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#333333"&gt;4 X&lt;/FONT&gt;&lt;/DIV&gt;
&lt;DIV&gt;&lt;FONT color="#333333"&gt;4 H&lt;/FONT&gt;&lt;/DIV&gt;</description>
      <pubDate>Mon, 04 May 2020 17:28:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645054#M192746</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-04T17:28:58Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645080#M192760</link>
      <description>&lt;P&gt;Thanks for the clarification and the new problem description.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Try this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
input case control $ @@;
cards;
1 A 1 B 1 C 2 A 2 D 2 R 3 B 3 F 3 G
4 D 4 A 4 M 4 X 4 H
;

data want(drop=rc);
if _n_=1 then do;
  dcl hash h();
  h.definekey('control');
  h.definedone();
end;
do until(last.case);
  set have;
  by case;
  if rc=. &amp;amp; h.check() then do;
    rc=h.add();
    output;
  end;
end;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The IF condition means:&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;&lt;FONT face="courier new,courier"&gt;rc=.&lt;/FONT&gt;: For the current BY group (i.e. case) no control has been selected yet.&lt;/LI&gt;
&lt;LI&gt;&lt;FONT face="courier new,courier"&gt;h.check()&lt;/FONT&gt;: The control of the current observation has not been selected before (and hence is not found in the hash table).&lt;/LI&gt;
&lt;/OL&gt;
&lt;P&gt;If both parts of the condition are met, the control is added (&lt;FONT face="courier new,courier"&gt;rc=h.add()&lt;/FONT&gt;) to the hash table storing the selected controls and the observation is written to dataset WANT (&lt;FONT face="courier new,courier"&gt;output&lt;/FONT&gt;).&lt;/P&gt;</description>
      <pubDate>Mon, 04 May 2020 19:21:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645080#M192760</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2020-05-04T19:21:05Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645163#M192822</link>
      <description>Hello Reinhard!  Thank you. I will try this tomorrow and keep you posted. &lt;BR /&gt;</description>
      <pubDate>Tue, 05 May 2020 04:54:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645163#M192822</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-05T04:54:39Z</dc:date>
    </item>
    <item>
      <title>Re: Case Control Matched Population - Avoiding Duplicates</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645384#M192917</link>
      <description>&lt;P&gt;Dear Reinhard - A million thanks. &amp;nbsp;I applied your code to my dataset and it worked perfectly. &amp;nbsp;&lt;/P&gt;
&lt;P&gt;Absolutely perfectly. &amp;nbsp;I was able to find a total of 579 matching controls for the 602 cases. &amp;nbsp;And no duplicates!&lt;/P&gt;
&lt;P&gt;Wow, that was so much better than my manual approach. &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
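&lt;P&gt;For anyone repeating this, both numbers can be checked along the following lines. This is only a sketch: it assumes the candidate pairs are in HAVE and the selected pairs in WANT, with variables CASE and CONTROL as in the example above and both datasets sorted by CASE. UNMATCHED and N_USES are names made up here.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Controls selected more than once; no rows are expected */
proc sql;
  select control, count(*) as n_uses
    from want
    group by control
    having count(*) &amp;gt; 1;
quit;

/* Cases that received no control (one row per such case) */
data unmatched;
  merge have(keep=case) want(keep=case in=in_want);
  by case;
  if first.case and not in_want;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;With 579 controls found for 602 cases, UNMATCHED would be expected to hold the remaining 23 cases. The second step uses a DATA step merge rather than PROC SQL because CASE is a reserved word in PROC SQL.&lt;/P&gt;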
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Much appreciation! &amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Anissa&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 05 May 2020 19:10:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Case-Control-Matched-Population-Avoiding-Duplicates/m-p/645384#M192917</guid>
      <dc:creator>anissak1</dc:creator>
      <dc:date>2020-05-05T19:10:44Z</dc:date>
    </item>
  </channel>
</rss>

