Hello,
I wanted to get your opinion on how some of you would handle the problem I have. I have grouped data that i need processed a particular way. For each group (name), i would like to always retain the key associated with the best match type for that group, as well as the corresponding match type.
The match types for each of the "keys" are ordered from 1 to 10, with 1 being the best match for that key, and 10 the worst. So basically I want to over-write any record that isn't the "best match" within each group for each key type.
Please let me know if you need any further explanation of what I'm tying to accomplish...Thanks in advance!
data have1;
input name $ 3. order 2. key_a 4. match_type_a 2. key_b 4. match_type_b 2.;
datalines ;
aaa 1 111 1 . 10
aaa 2 . 10 222 3
bbb 1 333 1 . 10
bbb 2 . 10 444 3
;
run;
Want:
name | order | key_a | match_type_a | key_b | match_type_b |
aaa | 1 | 111 | 1 | 222 | 3 |
aaa | 2 | 111 | 1 | 222 | 3 |
bbb | 1 | 333 | 1 | 444 | 3 |
bbb | 2 | 333 | 1 | 444 | 3 |
Like this?
proc sql;
select a.NAME
, a.ORDER
, b.KEY_A
, b.MATCH_TYPE_A
, c.KEY_B
, c.MATCH_TYPE_B
from HAVE a
left join
(select NAME, KEY_A, MATCH_TYPE_A
from HAVE
group by NAME
having MATCH_TYPE_A=max(MATCH_TYPE_A)
) b
on a.NAME = b.NAME
left join
(select NAME, KEY_B, MATCH_TYPE_B
from HAVE
group by NAME
having MATCH_TYPE_B=max(MATCH_TYPE_B)
) c
on a.NAME = c.NAME ;
; quit;
NAME | ORDER | KEY_A | MATCH_TYPE_A | KEY_B | MATCH_TYPE_B |
---|---|---|---|---|---|
aaa | 1 | 111 | 1 | 222 | 3 |
aaa | 2 | 111 | 1 | 222 | 3 |
bbb | 1 | 333 | 1 | 444 | 3 |
bbb | 2 | 333 | 1 | 444 | 3 |
How do we know which is "BEST" match for a key? A rule please as I doubt that your example covered all of the cases.
The lowest number would be the best...so for the by group "name" where name = 'aaa', there are 2 match_type_a's (1 and 10) so for key_a I would always want to see where name = 'aaa' for there to be key_a = 111 and match_type_a = 1.
Essentially, 10 means there wasn't a match, which is why no key is associated with key_a or key_b where the match_type's = 10.
But there could be groups that have more than 1 non-null key_a's for example , which is why they match_type is important (i would want to keep the key associated with the lowest match type for each name).
I can't follow your logic with the data and rules you've shown. Can you try and make it more clear please?
I suspect this is why you haven't received an answer yet, as otherwise questions like this are answered fairly promptly.
Like this?
proc sql;
select a.NAME
, a.ORDER
, b.KEY_A
, b.MATCH_TYPE_A
, c.KEY_B
, c.MATCH_TYPE_B
from HAVE a
left join
(select NAME, KEY_A, MATCH_TYPE_A
from HAVE
group by NAME
having MATCH_TYPE_A=max(MATCH_TYPE_A)
) b
on a.NAME = b.NAME
left join
(select NAME, KEY_B, MATCH_TYPE_B
from HAVE
group by NAME
having MATCH_TYPE_B=max(MATCH_TYPE_B)
) c
on a.NAME = c.NAME ;
; quit;
NAME | ORDER | KEY_A | MATCH_TYPE_A | KEY_B | MATCH_TYPE_B |
---|---|---|---|---|---|
aaa | 1 | 111 | 1 | 222 | 3 |
aaa | 2 | 111 | 1 | 222 | 3 |
bbb | 1 | 333 | 1 | 444 | 3 |
bbb | 2 | 333 | 1 | 444 | 3 |
i had to change the "max" to "min" but overall that seemed to work. thanks
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.