<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Identifying Unique ID Numbers in Several Datasets in SAS Enterprise Guide in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955952#M42957</link>
    <description>&lt;P&gt;how about linear count with Hash table?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data One;
input ID $12.	Var1	Var2	Var3;
cards;
111-111-1111	0	1	0
222-222-2222	0	1	1
333-333-3333	1	1	1
444-444-4444	0	1	0
555-555-5555	1	1	0
141-547-7810	1	1	0
;
run;
proc print data=One;
run;

Data Two;
input ID $12.	Var1	Var2	Var3;
cards;
111-111-1111	0	1	0
666-666-6666	0	1	1
333-333-3333	0	1	1
444-444-4444	0	1	0
555-555-5555	0	1	0
777-777-8888	0	0	0
;
run;
proc print data=Two;
run;


data _null_;
dcl hash H(ordered:"A");
H.defineKey("ID");
H.defineData("ID","inONE","inTWO");
H.defineDone();

do until(_E_);
  set 
    ONE(keep=ID in=in1) 
    TWO(keep=ID in=in2) 
  end=_E_;

  rc=H.find();
  inONE+in1;
  inTWO+in2;
  /*put _ALL_;*/ /* just for preview */
  rc=H.replace();
  call missing(inONE,inTWO);
end;


h.Output(dataset:"unique");
stop;
run;

proc print data=unique;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
    <pubDate>Mon, 13 Jan 2025 18:43:07 GMT</pubDate>
    <dc:creator>yabwon</dc:creator>
    <dc:date>2025-01-13T18:43:07Z</dc:date>
    <item>
      <title>Identifying Unique ID Numbers in Several Datasets in SAS Enterprise Guide</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955758#M42951</link>
      <description>&lt;P&gt;I am using SAS Enterprise Guide to join datasets. There are &amp;gt;100K observations in each dataset. Prior to joining the datasets, I need to determine the unique IDs that are in each respective dataset. For example, hypothetically, dataset One:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;ID&lt;/TD&gt;&lt;TD&gt;Var1&lt;/TD&gt;&lt;TD&gt;Var2&lt;/TD&gt;&lt;TD&gt;Var3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;111-111-1111&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;222-222-2222&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;333-333-3333&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;444-444-4444&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;555-555-5555&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;141-547-7810&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;and Dataset Two:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;ID&lt;/TD&gt;&lt;TD&gt;Var1&lt;/TD&gt;&lt;TD&gt;Var2&lt;/TD&gt;&lt;TD&gt;Var3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;111-111-1111&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;666-666-6666&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;333-333-3333&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;444-444-4444&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;555-555-5555&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;777-777-8888&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;How do I get a list of the IDs that are duplicates &amp;amp; a list that are unique?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I've tried using 'join tables' under Query Builder to look for a 'not equal' operator; I tried to use the 'Sort and Filter' task -&amp;gt; copying &amp;amp; pasting IDs from one dataset to another using 'in a list' operator, but the number of IDs in each dataset are greater than the list allows for. I've tried the above with the hyphens &amp;amp; without the hyphens in the ID variable.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Is there an 'easy' way to do this on SAS EG or do I need to use code? If the later, is the&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks ahead for any help!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 10 Jan 2025 17:53:20 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955758#M42951</guid>
      <dc:creator>AJ17</dc:creator>
      <dc:date>2025-01-10T17:53:20Z</dc:date>
    </item>
    <item>
      <title>Re: Identifying Unique ID Numbers in Several Datasets in SAS Enterprise Guide</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955760#M42952</link>
      <description>&lt;P&gt;You could make a dataset INDICATORS with each ID and an INDICATOR variable (with values "DATA1 Only", "DATA2 Only", and "DATA 1 and 2"), with SQL such as:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql;
  create table indicators
  as select
   coalesce(l.id,r.id) as ID
   ,case when missing(R.ID) then 'DATA1 Only'
         when missing(L.ID) then 'DATA2 Only'
         else 'DATA 1 and 2'
   end as indicator

  from data1 as L full join data2 as R
  on L.ID=R.ID ;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Fri, 10 Jan 2025 18:28:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955760#M42952</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2025-01-10T18:28:22Z</dc:date>
    </item>
    <item>
      <title>Re: Identifying Unique ID Numbers in Several Datasets in SAS Enterprise Guide</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955792#M42953</link>
      <description>&lt;P&gt;Check PROC SORT 's option&amp;nbsp; &amp;nbsp;DUPOUT= (contains the duplcated value)&amp;nbsp; UNIQUEOUT= (contains uniques value):&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;proc sort data=sashelp.class &lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;nodupkey dupout=duplicated&lt;/STRONG&gt;&lt;/FONT&gt; out=sorted;
by sex;
run;

proc sort data=sashelp.class &lt;FONT color="#FF0000"&gt;&lt;STRONG&gt;nouniquekey uniqueout=unique&lt;/STRONG&gt;&lt;/FONT&gt; out=sorted;
by age;
run;&lt;/PRE&gt;</description>
      <pubDate>Sat, 11 Jan 2025 07:44:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955792#M42953</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2025-01-11T07:44:12Z</dc:date>
    </item>
    <item>
      <title>Re: Identifying Unique ID Numbers in Several Datasets in SAS Enterprise Guide</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955949#M42956</link>
      <description>&lt;P&gt;Thanks for your reply. I had to change the code to fit my dataset names &amp;amp; the ID variable (SSN) to enter into SAS EG:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;proc sql;
create table indicators
as select
COALESCEC(t1.ssn,t2.ssn) as SSN
,case when missing(t1.SSN) then work.dataset1 
when missing(t2.SSN) then work.Dataset1 
else work.DATASET2and work.Dataset1 
end as indicator

from work.DATASET2 as t1 full join work.Dataset1 as t2
on t1.ssn=t2.ssn ;
quit;&lt;/PRE&gt;&lt;P&gt;I'm getting these error messages, with the log indicating that all errors stem from program line "on t1.ssn=t2.ssn".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;ERROR: Unresolved reference to table/correlation name work.
ERROR: Unresolved reference to table/correlation name WORK.
ERROR: Unresolved reference to table/correlation name WORK.
ERROR: Unresolved reference to table/correlation name WORK.
ERROR: Result of WHEN clause 3 is not the same data type as the preceding results.&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I understand the COALESCEC command is available under 'Query Builder' &amp;amp; setting up an Advanced Expression, but I'm not sure if going that route would work.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Would you please give me any recommendations on how to fix the errors?&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jan 2025 18:12:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955949#M42956</guid>
      <dc:creator>AJ17</dc:creator>
      <dc:date>2025-01-13T18:12:57Z</dc:date>
    </item>
    <item>
      <title>Re: Identifying Unique ID Numbers in Several Datasets in SAS Enterprise Guide</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955952#M42957</link>
      <description>&lt;P&gt;how about linear count with Hash table?&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data One;
input ID $12.	Var1	Var2	Var3;
cards;
111-111-1111	0	1	0
222-222-2222	0	1	1
333-333-3333	1	1	1
444-444-4444	0	1	0
555-555-5555	1	1	0
141-547-7810	1	1	0
;
run;
proc print data=One;
run;

Data Two;
input ID $12.	Var1	Var2	Var3;
cards;
111-111-1111	0	1	0
666-666-6666	0	1	1
333-333-3333	0	1	1
444-444-4444	0	1	0
555-555-5555	0	1	0
777-777-8888	0	0	0
;
run;
proc print data=Two;
run;


data _null_;
dcl hash H(ordered:"A");
H.defineKey("ID");
H.defineData("ID","inONE","inTWO");
H.defineDone();

do until(_E_);
  set 
    ONE(keep=ID in=in1) 
    TWO(keep=ID in=in2) 
  end=_E_;

  rc=H.find();
  inONE+in1;
  inTWO+in2;
  /*put _ALL_;*/ /* just for preview */
  rc=H.replace();
  call missing(inONE,inTWO);
end;


h.Output(dataset:"unique");
stop;
run;

proc print data=unique;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jan 2025 18:43:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Identifying-Unique-ID-Numbers-in-Several-Datasets-in-SAS/m-p/955952#M42957</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2025-01-13T18:43:07Z</dc:date>
    </item>
  </channel>
</rss>

