Processing by Group

Accepted Solution Solved
Reply
Contributor
Posts: 50
Accepted Solution

Processing by Group

Hello,

 

I wanted to get your opinion on how some of you would handle the problem I have. I have grouped data that i need processed a particular way. For each group (name), i would like to always retain the key associated with the best match type for that group, as well as the corresponding match type.

 

The match types for each of the "keys" are ordered from 1 to 10, with 1 being the best match for that key, and 10 the worst. So basically I want to over-write any record that isn't the "best match" within each group for each key type.

 

Please let me know if you need any further explanation of what I'm tying to accomplish...Thanks in advance!

 

data have1;

input name $ 3. order 2. key_a 4. match_type_a 2. key_b 4. match_type_b 2.;

datalines ;

aaa 1 111 1 . 10

aaa 2 . 10 222 3

bbb 1 333 1 . 10

bbb 2 . 10 444 3

;

run;

 

Want:

name order key_a match_type_a key_b match_type_b
aaa 1 111 1 222 3
aaa 2 111 1 222 3
bbb 1 333 1 444 3
bbb 2 333 1 444 3

Accepted Solutions
Solution
‎06-15-2017 08:10 AM
PROC Star
Posts: 1,561

Re: Processing by Group

Like this?


proc sql;
   select a.NAME
        , a.ORDER
        , b.KEY_A 
        , b.MATCH_TYPE_A
        , c.KEY_B 
        , c.MATCH_TYPE_B
   from   HAVE                       a
        left join
          (select NAME, KEY_A, MATCH_TYPE_A  
           from HAVE
           group by NAME
           having MATCH_TYPE_A=max(MATCH_TYPE_A)
          )                         b
        on a.NAME  = b.NAME
        left join
          (select NAME, KEY_B, MATCH_TYPE_B  
           from HAVE
           group by NAME
           having MATCH_TYPE_B=max(MATCH_TYPE_B)
           )                         c
         on a.NAME  = c.NAME ;
; quit;
NAME ORDER KEY_A MATCH_TYPE_A KEY_B MATCH_TYPE_B
aaa 1 111 1 222 3
aaa 2 111 1 222 3
bbb 1 333 1 444 3
bbb 2 333 1 444 3

 

View solution in original post


All Replies
Super User
Posts: 10,483

Re: Processing by Group

How do we know which is "BEST" match for a key? A rule please as I doubt that your example covered all of the cases.

Contributor
Posts: 50

Re: Processing by Group

The lowest number would be the best...so for the by group "name" where name = 'aaa', there are 2 match_type_a's (1 and 10) so for key_a I would always want to see where name = 'aaa' for there to be key_a = 111 and match_type_a = 1.

 

Essentially, 10 means there wasn't a match, which is why no key is associated with key_a or key_b where the match_type's = 10.

 

But there could be groups that have more than 1 non-null  key_a's for example , which is why they match_type is important (i would want to keep the key associated with the lowest match type for each name).

Super User
Posts: 17,784

Re: Processing by Group

I can't follow your logic with the data and rules you've shown. Can you try and make it more clear please?

I suspect this is why you haven't received an answer yet, as otherwise questions like this are answered fairly promptly. 

Super User
Posts: 5,256

Re: Processing by Group

If I understand you correctly, this type of logic is usually performed by first sorting the data set (by name and_match_type). Then use retain to propagate the correct let to all rows for that name.
Hint 1: use set by, and first. and last. logic.
Hint 2: you need to this twice since you have a and b set of variables.
Data never sleeps
Solution
‎06-15-2017 08:10 AM
PROC Star
Posts: 1,561

Re: Processing by Group

Like this?


proc sql;
   select a.NAME
        , a.ORDER
        , b.KEY_A 
        , b.MATCH_TYPE_A
        , c.KEY_B 
        , c.MATCH_TYPE_B
   from   HAVE                       a
        left join
          (select NAME, KEY_A, MATCH_TYPE_A  
           from HAVE
           group by NAME
           having MATCH_TYPE_A=max(MATCH_TYPE_A)
          )                         b
        on a.NAME  = b.NAME
        left join
          (select NAME, KEY_B, MATCH_TYPE_B  
           from HAVE
           group by NAME
           having MATCH_TYPE_B=max(MATCH_TYPE_B)
           )                         c
         on a.NAME  = c.NAME ;
; quit;
NAME ORDER KEY_A MATCH_TYPE_A KEY_B MATCH_TYPE_B
aaa 1 111 1 222 3
aaa 2 111 1 222 3
bbb 1 333 1 444 3
bbb 2 333 1 444 3

 

Contributor
Posts: 50

Re: Processing by Group

i had to change the "max" to "min" but overall that seemed to work. thanks

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 130 views
  • 0 likes
  • 5 in conversation