<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: trying to find the variables/set of variables that causes multiple duplicates for id in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/561110#M10546</link>
    <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
input id x1 x2 x3 x4;
datalines;
1 0 1 3 4
1 1 1 3 4
2 2 4 6 8
2 2 6 6 8 
3 3 3 3 3 
3 3 3 3 3 
;
run;
proc sql;
create table temp as
 select id,count(distinct x1) as x1, count(distinct x2) as x2,
 count(distinct x3) as x3,count(distinct x4) as x4
  from test 
   group by id;
quit;

proc transpose data=temp out=temp1;
by id;
run;
proc sql noprint;
select distinct _name_ into : names separated by ','
from temp1
 where col1&amp;gt;1;

create table want as
 select id,&amp;amp;names from test;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
    <pubDate>Thu, 23 May 2019 12:09:41 GMT</pubDate>
    <dc:creator>Ksharp</dc:creator>
    <dc:date>2019-05-23T12:09:41Z</dc:date>
    <item>
      <title>trying to find the variables/set of variables that causes multiple duplicates for id</title>
      <link>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/560929#M10520</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I am trying to find the variables or group of variables which cause the duplicates for id in the data during extraction. In my original data set, I have 100 variables and 500 duplicates for id. Here is a sample data set:&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
input id x1 x2 x3 x4;
datalines;
1 0 1 3 4
1 1 1 3 4
2 2 4 6 8
2 2 6 6 8 
3 3 3 3 3 
3 3 3 3 3 

;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;I want this data set: because id 1 has only discrepancy for x1 and id 2 has discrepancy for x2 so that would like to get table like this.&amp;nbsp;&lt;/P&gt;&lt;TABLE border="0" cellspacing="0" cellpadding="0"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;id&lt;/TD&gt;&lt;TD&gt;x1&lt;/TD&gt;&lt;TD&gt;x2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;4&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;2&lt;/TD&gt;&lt;TD&gt;6&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;TD&gt;3&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you for your help!!!&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Bikash&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 18:29:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/560929#M10520</guid>
      <dc:creator>bikashten</dc:creator>
      <dc:date>2019-05-22T18:29:52Z</dc:date>
    </item>
    <item>
      <title>Re: trying to find the variables/set of variables that causes multiple duplicates for id</title>
      <link>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/560959#M10521</link>
      <description>&lt;P&gt;You may to create a macro and pass variable to it to find duplicates. I have created a macro to do the same on your dummy table &amp;amp; it's working.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data test;&lt;BR /&gt;input id x1 x2 x3 x4;&lt;BR /&gt;datalines;&lt;BR /&gt;1 0 1 3 4&lt;BR /&gt;1 1 1 2 3&lt;BR /&gt;2 2 4 6 8&lt;BR /&gt;2 2 6 5 5&lt;BR /&gt;3 3 3 7 3&lt;BR /&gt;3 3 3 3 7&lt;/P&gt;&lt;P&gt;;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;proc contents&lt;BR /&gt;data = test(drop=id)&lt;BR /&gt;noprint&lt;BR /&gt;out = data_info&lt;BR /&gt;(keep = name varnum);&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;select name into: name1-&lt;BR /&gt;from data_info;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;%put &amp;amp;name1 &amp;amp;name2 &amp;amp;name3 &amp;amp;name4;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;%macro find_dup_var(tab=,key=,var=);&lt;BR /&gt;proc sql noprint;&lt;BR /&gt;create table tab&amp;amp;var. as select count(distinct &amp;amp;var.) as &amp;amp;var._cnt from &amp;amp;tab group by &amp;amp;key;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;proc sql;&lt;BR /&gt;select "&amp;amp;var." as duplicate_variable from tab&amp;amp;var. having min(&amp;amp;var._cnt) =1;&lt;BR /&gt;quit;&lt;/P&gt;&lt;P&gt;%mend find_dup_var;&lt;/P&gt;&lt;P&gt;%macro run1();&lt;BR /&gt;%do i=1 %to 4;&lt;BR /&gt;%find_dup_var(tab=test,key=id,var=&amp;amp;&amp;amp;name&amp;amp;i.)&lt;BR /&gt;%end;&lt;/P&gt;&lt;P&gt;%mend;&lt;/P&gt;&lt;P&gt;%run1;&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 19:55:00 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/560959#M10521</guid>
      <dc:creator>kulbshar</dc:creator>
      <dc:date>2019-05-22T19:55:00Z</dc:date>
    </item>
    <item>
      <title>Re: trying to find the variables/set of variables that causes multiple duplicates for id</title>
      <link>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/560985#M10524</link>
      <description>&lt;P&gt;Here's an approach for numeric variables.&amp;nbsp; It can be expanded to include character variables as well if that is important.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
   set have (drop=id);
   array nums {*} _numeric_;
   call symputx('n_nums', dim(nums));
   stop;
run;

data _null_;
   if 5=4 then set have (drop=id);
   array nums {&amp;amp;n_nums} _numeric_;
   array lastv {&amp;amp;n_nums} _temporary_;
   array keepyn {&amp;amp;n_nums} $ 1 _temporary_;
   set have end=done;
   by id;
   do _n_=1 to &amp;amp;n_nums;
      if first.id=0 and lastv{_n_} ne nums{_n_} then keepyn{_n_} = 'Y';
      lastv{_n_} = nums{_n_};
   end;
   length keeplist $ 20000;
   if done;
   do _n_=1 to &amp;amp;n_nums;
      if keepyn{_n_} = 'Y' then keeplist = catx(' ', keeplist, vname(nums{_n_}) ) ;
   end;
   call symputx('keeplist', keeplist);
run;

data want;
   set have (keep=id &amp;amp;keeplist);
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The code is untested, so you will have to try it.&amp;nbsp; That's probably a good idea before trying to expand it to include character variables as well.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Once it is debugged, the major advantage is that it only takes 2 passes through the data to handle everything that needs to be done.&lt;/P&gt;</description>
      <pubDate>Wed, 22 May 2019 23:13:40 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/560985#M10524</guid>
      <dc:creator>Astounding</dc:creator>
      <dc:date>2019-05-22T23:13:40Z</dc:date>
    </item>
    <item>
      <title>Re: trying to find the variables/set of variables that causes multiple duplicates for id</title>
      <link>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/561110#M10546</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
input id x1 x2 x3 x4;
datalines;
1 0 1 3 4
1 1 1 3 4
2 2 4 6 8
2 2 6 6 8 
3 3 3 3 3 
3 3 3 3 3 
;
run;
proc sql;
create table temp as
 select id,count(distinct x1) as x1, count(distinct x2) as x2,
 count(distinct x3) as x3,count(distinct x4) as x4
  from test 
   group by id;
quit;

proc transpose data=temp out=temp1;
by id;
run;
proc sql noprint;
select distinct _name_ into : names separated by ','
from temp1
 where col1&amp;gt;1;

create table want as
 select id,&amp;amp;names from test;
quit;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Thu, 23 May 2019 12:09:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/trying-to-find-the-variables-set-of-variables-that-causes/m-p/561110#M10546</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2019-05-23T12:09:41Z</dc:date>
    </item>
  </channel>
</rss>

