sas_Forum
Calcite | Level 5

Actually, my concept is to get the linkages: if a pan1 value is repeated in pan2, pan3, or add1 of another observation, both observations should get the same hid. The same applies to the other fields, not only pan1.

Example:

obs  pan1  pan2  pan3  add1  hid
1    aaa   bbb   ccc   ddd   1
2    qqq   rrr   www   aaa   1
3    rrr   ppp   mmm   lll   1
4    uuu   zzz   ffff  ppp   1
5    p     l     m     n     2
6    jjjj  eee   rrr   ooo   1
7    (all blanks)            3
8    sss   www   .     .     1
9    .     .     .     eee   1

In this example, the first obs gets hid 1.

In obs 2, add1 aaa matches pan1 aaa of obs 1.

In obs 3, pan1 rrr matches pan2 rrr of obs 2.

In obs 4, add1 ppp matches pan2 ppp of obs 3.

Obs 5 is unique, with no match, so it gets hid 2.

In obs 6, pan3 rrr matches pan1 rrr of obs 3.

Obs 7 has no values, so it gets hid 3.

In obs 8, pan2 www matches pan3 www of obs 2, so it gets hid 1.

In obs 9, add1 eee matches pan2 eee of obs 6, so it gets hid 1.

This is the kind of linkage I want.
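For readers following along, the linkage rule described above can be sketched outside SAS. This is a minimal Python illustration, not the poster's method and not SAS code. It assumes that a value links two observations whenever it appears in any of the four fields of both (as the obs 1/obs 2 match across pan1 and add1 suggests), that blanks never link anything, and that an all-blank row simply gets a hid of its own. The function name `assign_hids` and the union-find technique are choices made for this illustration.

```python
def assign_hids(rows):
    """Give each row a group id; rows sharing any non-blank value are linked."""
    parent = list(range(len(rows)))  # union-find forest over row indexes

    def find(i):
        # Walk up to the root, shortening the path along the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    first_seen = {}  # value -> index of the first row that contained it
    for i, row in enumerate(rows):
        for value in row:
            if not value:  # assumption: blank/missing values never link rows
                continue
            if value in first_seen:
                a, b = find(i), find(first_seen[value])
                parent[max(a, b)] = min(a, b)  # merge the two groups
            else:
                first_seen[value] = i

    # Number the groups 1, 2, 3, ... in order of first appearance.
    hid_of_root, hids = {}, []
    for i in range(len(rows)):
        root = find(i)
        hid_of_root.setdefault(root, len(hid_of_root) + 1)
        hids.append(hid_of_root[root])
    return hids


# The nine observations from the post (pan1, pan2, pan3, add1); None = blank.
rows = [
    ("aaa", "bbb", "ccc", "ddd"),
    ("qqq", "rrr", "www", "aaa"),
    ("rrr", "ppp", "mmm", "lll"),
    ("uuu", "zzz", "ffff", "ppp"),
    ("p", "l", "m", "n"),
    ("jjjj", "eee", "rrr", "ooo"),
    (None, None, None, None),
    ("sss", "www", None, None),
    (None, None, None, "eee"),
]
hids = assign_hids(rows)
print(hids)  # [1, 1, 1, 1, 2, 1, 3, 1, 1]
```

Because groups are merged rather than assigned once, a record that arrives late and connects two earlier groups pulls them together as well.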

DLing
Obsidian | Level 7

The approach illustrated above is sequential in nature, but I don't think it covers all potential problems. In particular, what happens when the "linkage info" shows up later in the process? Example:

k1  k2  k3  k4  hid
aa  bb  cc  dd  1    all new
ee  ff  gg  hh  2    no overlap with previous
ii  jj  kk  ll  3    no overlap yet
aa  ff  cc  hh  ??   records 1 and 2, which were distinct, now share a linkage through this record

When the last record shows up, it binds records 1, 2, and 4 (they are all connected now). Do you want the hid for all 3 of those records to be 1 or not? I would assume you do want them all to be hid 1.

This issue of linkage showing up after assignment is typical in finding connected subgraphs (householding, connections, etc.). It is not solvable in general through sorting; that is why this is always an iterative process that runs until the assignments become stable, i.e., no more group formation is possible.
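To make "iterate until the assignments become stable" concrete, here is a small Python sketch run on the four-record example above. It is only an illustration under assumptions, not SAS code and not necessarily the approach the reply has in mind: every record starts as its own group, and each pass gives a record the lowest group number held by any record that shares one of its key values. The names `propagate_labels` and `renumber` are made up for this sketch.

```python
def propagate_labels(rows):
    """Repeat passes over the data until no record changes its group."""
    labels = list(range(1, len(rows) + 1))  # start: each record is its own group
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        # Lowest label currently held by any record containing each key value.
        lowest = {}
        for label, row in zip(labels, rows):
            for value in row:
                if value:  # assumption: blanks never link records
                    lowest[value] = min(label, lowest.get(value, label))
        # Pull every record down to the lowest label reachable via its values.
        for i, row in enumerate(rows):
            best = min([labels[i]] + [lowest[v] for v in row if v])
            if best != labels[i]:
                labels[i] = best
                changed = True
    return labels, passes


def renumber(labels):
    """Map the surviving labels to 1, 2, 3, ... in order of appearance."""
    dense = {}
    return [dense.setdefault(label, len(dense) + 1) for label in labels]


records = [
    ("aa", "bb", "cc", "dd"),
    ("ee", "ff", "gg", "hh"),
    ("ii", "jj", "kk", "ll"),
    ("aa", "ff", "cc", "hh"),
]
labels, passes = propagate_labels(records)
print(labels, passes)    # [1, 1, 3, 1] 3
print(renumber(labels))  # [1, 1, 2, 1]
```

The first pass attaches record 4 to record 1, the second pass pulls record 2 in through record 4, and the third pass changes nothing, which is the stopping condition. A single sorted pass cannot do this because record 2 was already assigned before the linking record appeared.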

If you can describe what you desire in more detail, specifically covering all possible cases, then perhaps some sharp person familiar with hash objects can point you in the right direction.

sas_Forum
Calcite | Level 5

obs  k1  k2  k3  k4  hid
1    aa  bb  cc  dd  1
2    ee  ff  gg  hh  1
3    ii  jj  kk  ll  2
4    aa  ff  cc  hh  1

Observations 1, 2, and 4 should all get hid 1. The first row gets hid 1. In obs 4, aa repeats in k1, so obs 4 is linked to obs 1. Obs 4 also has ff next to aa, and ff appears in obs 2, so obs 2 is linked to aa as well and should also get hid 1.

DLing
Obsidian | Level 7

I suggest this thread is not necessary, since you already have two other discussions on exactly the same thing.

art297
Opal | Level 21

Interesting thought, though! Could one of the clustering algorithms be used to help solve the problem? Given the number of clusters that would be needed, I doubt that any of us has a sufficiently powerful machine to find out, but it is still an interesting thought.

