<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Can Array be used to group distinct linked-numbers from two columns  of a Data Set? in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Can-Array-be-used-to-group-distinct-linked-numbers-from-two/m-p/556014#M154818</link>
    <description>&lt;P&gt;Bharath_aavas has opened a thread with the subject "create common group id for &lt;BR /&gt;multiple link id's".&lt;/P&gt;
&lt;P&gt;He solved the problem using 8 steps, answered his question&lt;BR /&gt;himself and closed the thread.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In reply to me &lt;BR /&gt;he wanted to solve the problem using Arrays.&lt;/P&gt;
&lt;P&gt;He said later:&lt;BR /&gt;" I have inserted my SAS Code and tested and it is working in &lt;BR /&gt;all scenarios. Yet to automate for N number of level in macro &lt;BR /&gt;this is working until 6 relation id's."&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since his thread is marked as "Answered", my&lt;BR /&gt;Array solution offered to him seems to have not been&lt;BR /&gt;seen by others. For the benefit of him &lt;BR /&gt;and others, I am posting the array solution using this thread.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is his Data Set:&lt;/P&gt;
&lt;PRE&gt;data have; 
input p1 p2;
cards;
9528 9534
9534 9056
8675 7890
8675 87677
7002 7001
7001 7005
7005 7009
5422 22333
22333 78878
78878 21121
21121 787999
787999 3422
1 2
1 3
1 4
1 11
5 7
6 7
8 7
;
run;&lt;/PRE&gt;
&lt;P&gt;This is the expected answer I arrived.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;group_id 	Loan1 	Loan2 	Loan3 	Loan4 	Loan5 	Loan6
1 	9528 	9534 	9056 	. 	. 	.
2 	8675 	7890 	87677 	. 	. 	.
3 	7002 	7001 	7005 	7009 	. 	.
4 	5422 	22333 	78878 	21121 	787999 	3422
5 	1 	2 	3 	4 	11 	.
6 	5 	7 	6 	8 	. 	.&lt;/PRE&gt;
&lt;P&gt;The correct specification will be to get the list of&lt;BR /&gt;distinct numbers from both P1 and P2 subject to&lt;BR /&gt;having a linkage between elements of both columns.&lt;BR /&gt;Such elements are placed into one group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;9528 9534 9056 form one group as other numbers are far apart.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The search then continues for distinct-linked numbers&lt;BR /&gt;for another group.&lt;/P&gt;
&lt;P&gt;Initially we don't know how many groups are there and how many&lt;BR /&gt;distinct numbers will fall into a group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Approach:&lt;/P&gt;
&lt;P&gt;We need to decide when a group begins and when it ends.&lt;/P&gt;
&lt;P&gt;Externally we found the number of observations of the Data Set.&lt;/P&gt;
&lt;P&gt;Two temporary arrays are created with size equal to the&lt;BR /&gt;number of observations. They are named as PID[ ] and LID[ ].&lt;BR /&gt;Elements from P1 are copied to PID and from P2 to LID.&lt;/P&gt;
&lt;P&gt;A character variable (LIST) with an initial length 100 is&lt;BR /&gt;used. It must be increased to larger size when needed. One&lt;BR /&gt;can use a length of 32767 initially but for this example it&lt;BR /&gt;is a waste of memory.&lt;/P&gt;
&lt;P&gt;Elements PID[1] and LID[1], if they are different, are placed&lt;BR /&gt;in the LIST. Next, the end of group (or beginning of group)&lt;BR /&gt;is decided by the following 2 conditions:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;[1] PID[i] is compared with either of LID[i-1] or of PID[i-1].&lt;BR /&gt;and&lt;BR /&gt;[2] LID[i] is compared with either of PID[i-1] or of LID[i-1].&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Their values are negated to get COND1 and COND2. If they&lt;BR /&gt;take 1 each, then PID[i] and LID[i] are no longer linked&lt;BR /&gt;to previous elements. It is the beginning of next group.&lt;BR /&gt;If one of them is zero, then the same group continues.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Here are two&amp;nbsp; conditions:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;cond1 = ^(pid[i] = lid[i-1] or pid[i] = pid[i-1]);&lt;BR /&gt;cond2 = ^(lid[i] = pid[i-1] or lid[i] = lid[i-1]);&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also we count the number of elements placed in the LIST&lt;BR /&gt;and the maximum number of elements is found. This Maximum&lt;BR /&gt;number will be used in the next Data Step to size an array.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is the Data Step which gets the LIST of distinct&lt;BR /&gt;elements for each group in TEMP Data Set:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data _null_;
   call symputx('size', num);
   if 0 then set have nobs = num;
   stop;
run;


data temp;
   array pid[&amp;amp;size] _temporary_;
   array lid[&amp;amp;size] _temporary_;
   length list $100;
   if _n_ = 1 then do i = 1 by 1 until(eof);
      set have end = eof;
      pid[i] = p1;
      lid[i] = p2;
   end;
   if pid[1] = lid[1] then list = catx(' ',list, pid[1]);
   else list = catx(' ',list, pid[1], lid[1]);
   do i = 2 to dim(pid);
      cond1 = ^(pid[i] = lid[i-1] or pid[i] = pid[i-1]);
      cond2 = ^(lid[i] = pid[i-1] or lid[i] = lid[i-1]);
      if cond1 * cond2 then do; count = countw(list); output; list = ' '; end; 
      if cond1 then list = catx(' ',list, pid[i]);
      if cond2 then list = catx(' ',list, lid[i]);
      Max = max(Max, count);
      if i = dim(pid) then do; 
         Max = max(Max,countw(list)); 
         output;
         call symputx('Max', Max);
         stop; 
      end;
   end;
keep list;
run;&lt;/PRE&gt;
&lt;P&gt;The following Data Step copies the elements from the LIST,&lt;BR /&gt;converting them into numbers, into an array L[ ] sized to&lt;BR /&gt;the Maximum found in the previous Step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data want;
   retain group_id 0;
   array L Loan1 - Loan&amp;amp;Max;
   set temp;
   group_id + 1;
   n = countw(list);
   do i = 1 to n;
      L[i] = input(scan(list,i), best.);
   end;
keep group_id Loan:;
run;&lt;/PRE&gt;
&lt;P&gt;Everything is dynamically done except the length of LIST.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;DataSP&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 03 May 2019 17:18:11 GMT</pubDate>
    <dc:creator>KachiM</dc:creator>
    <dc:date>2019-05-03T17:18:11Z</dc:date>
    <item>
      <title>Can Array be used to group distinct linked-numbers from two columns  of a Data Set?</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Can-Array-be-used-to-group-distinct-linked-numbers-from-two/m-p/556014#M154818</link>
      <description>&lt;P&gt;Bharath_aavas has opened a thread with the subject "create common group id for &lt;BR /&gt;multiple link id's".&lt;/P&gt;
&lt;P&gt;He solved the problem using 8 steps, answered his question&lt;BR /&gt;himself and closed the thread.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;In reply to me &lt;BR /&gt;he wanted to solve the problem using Arrays.&lt;/P&gt;
&lt;P&gt;He said later:&lt;BR /&gt;" I have inserted my SAS Code and tested and it is working in &lt;BR /&gt;all scenarios. Yet to automate for N number of level in macro &lt;BR /&gt;this is working until 6 relation id's."&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since his thread is marked as "Answered", my&lt;BR /&gt;Array solution offered to him seems to have not been&lt;BR /&gt;seen by others. For the benefit of him &lt;BR /&gt;and others, I am posting the array solution using this thread.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is his Data Set:&lt;/P&gt;
&lt;PRE&gt;data have; 
input p1 p2;
cards;
9528 9534
9534 9056
8675 7890
8675 87677
7002 7001
7001 7005
7005 7009
5422 22333
22333 78878
78878 21121
21121 787999
787999 3422
1 2
1 3
1 4
1 11
5 7
6 7
8 7
;
run;&lt;/PRE&gt;
&lt;P&gt;This is the expected answer I arrived.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;group_id 	Loan1 	Loan2 	Loan3 	Loan4 	Loan5 	Loan6
1 	9528 	9534 	9056 	. 	. 	.
2 	8675 	7890 	87677 	. 	. 	.
3 	7002 	7001 	7005 	7009 	. 	.
4 	5422 	22333 	78878 	21121 	787999 	3422
5 	1 	2 	3 	4 	11 	.
6 	5 	7 	6 	8 	. 	.&lt;/PRE&gt;
&lt;P&gt;The correct specification will be to get the list of&lt;BR /&gt;distinct numbers from both P1 and P2 subject to&lt;BR /&gt;having a linkage between elements of both columns.&lt;BR /&gt;Such elements are placed into one group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;9528 9534 9056 form one group as other numbers are far apart.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The search then continues for distinct-linked numbers&lt;BR /&gt;for another group.&lt;/P&gt;
&lt;P&gt;Initially we don't know how many groups are there and how many&lt;BR /&gt;distinct numbers will fall into a group.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Approach:&lt;/P&gt;
&lt;P&gt;We need to decide when a group begins and when it ends.&lt;/P&gt;
&lt;P&gt;Externally we found the number of observations of the Data Set.&lt;/P&gt;
&lt;P&gt;Two temporary arrays are created with size equal to the&lt;BR /&gt;number of observations. They are named as PID[ ] and LID[ ].&lt;BR /&gt;Elements from P1 are copied to PID and from P2 to LID.&lt;/P&gt;
&lt;P&gt;A character variable (LIST) with an initial length 100 is&lt;BR /&gt;used. It must be increased to larger size when needed. One&lt;BR /&gt;can use a length of 32767 initially but for this example it&lt;BR /&gt;is a waste of memory.&lt;/P&gt;
&lt;P&gt;Elements PID[1] and LID[1], if they are different, are placed&lt;BR /&gt;in the LIST. Next, the end of group (or beginning of group)&lt;BR /&gt;is decided by the following 2 conditions:&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;[1] PID[i] is compared with either of LID[i-1] or of PID[i-1].&lt;BR /&gt;and&lt;BR /&gt;[2] LID[i] is compared with either of PID[i-1] or of LID[i-1].&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Their values are negated to get COND1 and COND2. If they&lt;BR /&gt;take 1 each, then PID[i] and LID[i] are no longer linked&lt;BR /&gt;to previous elements. It is the beginning of next group.&lt;BR /&gt;If one of them is zero, then the same group continues.&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;Here are two&amp;nbsp; conditions:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;cond1 = ^(pid[i] = lid[i-1] or pid[i] = pid[i-1]);&lt;BR /&gt;cond2 = ^(lid[i] = pid[i-1] or lid[i] = lid[i-1]);&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also we count the number of elements placed in the LIST&lt;BR /&gt;and the maximum number of elements is found. This Maximum&lt;BR /&gt;number will be used in the next Data Step to size an array.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is the Data Step which gets the LIST of distinct&lt;BR /&gt;elements for each group in TEMP Data Set:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data _null_;
   call symputx('size', num);
   if 0 then set have nobs = num;
   stop;
run;


data temp;
   array pid[&amp;amp;size] _temporary_;
   array lid[&amp;amp;size] _temporary_;
   length list $100;
   if _n_ = 1 then do i = 1 by 1 until(eof);
      set have end = eof;
      pid[i] = p1;
      lid[i] = p2;
   end;
   if pid[1] = lid[1] then list = catx(' ',list, pid[1]);
   else list = catx(' ',list, pid[1], lid[1]);
   do i = 2 to dim(pid);
      cond1 = ^(pid[i] = lid[i-1] or pid[i] = pid[i-1]);
      cond2 = ^(lid[i] = pid[i-1] or lid[i] = lid[i-1]);
      if cond1 * cond2 then do; count = countw(list); output; list = ' '; end; 
      if cond1 then list = catx(' ',list, pid[i]);
      if cond2 then list = catx(' ',list, lid[i]);
      Max = max(Max, count);
      if i = dim(pid) then do; 
         Max = max(Max,countw(list)); 
         output;
         call symputx('Max', Max);
         stop; 
      end;
   end;
keep list;
run;&lt;/PRE&gt;
&lt;P&gt;The following Data Step copies the elements from the LIST,&lt;BR /&gt;converting them into numbers, into an array L[ ] sized to&lt;BR /&gt;the Maximum found in the previous Step.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;data want;
   retain group_id 0;
   array L Loan1 - Loan&amp;amp;Max;
   set temp;
   group_id + 1;
   n = countw(list);
   do i = 1 to n;
      L[i] = input(scan(list,i), best.);
   end;
keep group_id Loan:;
run;&lt;/PRE&gt;
&lt;P&gt;Everything is dynamically done except the length of LIST.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;DataSP&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 03 May 2019 17:18:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Can-Array-be-used-to-group-distinct-linked-numbers-from-two/m-p/556014#M154818</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2019-05-03T17:18:11Z</dc:date>
    </item>
  </channel>
</rss>

