SAS Programming

DATA Step, Macro, Functions and more
BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Dregerator
Obsidian | Level 7

Hi,

 

I have two datasets that have different b values. I want the nearest c value from the second dataset to be used for my first dataset when using b between two dataset 

 

For example:

 

Dataset 1

a b
1 702
1 788
1 1148
2 417
2 441
2 458
2 465
2 495
2 502
2 707
2 766
2 863
3 1547
4 977
4 1547

 

Dataset 2

a b c
1 598 13.1
1 652 12.9
1 701 12.5
1 766 12.3
1 807 14.6
1 1095 10.1
1 1142 .
1 1219 9.7
2 295 16.1
2 305 18.1
2 362 16.4
2 565 16.9
2 642 18.4
2 740 22.5
2 834 15.5
2 922 17.8
2 1007 27.6
2 1090 13.9
2 1286 14.7
2 1568 15.1
2 1561 11.9
3 903 10.5
3 1011 7.8
3 1099 8
3 1561 .
3 1656 5.4
4 862 1.7
4 906 1.1
4 990 1.8

 

Final Dataset:

 

a b c
1 702 12.5
1 788 14.6
1 1148 10.1
2 417 16.4
2 441 16.4
2 458 16.4
2 465 16.9
2 495 16.9
2 502 16.9
2 707 22.5
2 766 22.5
2 863 15.5
3 1547 5.4
4 977 1.8
4 1547 1.8

 

 

Any suggestions as to how I can create something similar to the final dataset given the values in dataset 1 and dataset 2?

 

Thank you in advance

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

This can get you started but doesn't handle the ties yet. Unfortunately I don't have time to resolve that, but I think you could figure it out or hopefully someone else can chime in. This only looks at absolute difference so it works for exactly the question asked but if you have ties you'll end up with duplicates in the output. 

 

proc sql;
create table want as
select t1.a, t1.b, t2.c, abs(t1.b - t2.b) as diff 
from data1 as t1
left join data2 as t2
on t1.a=t2.a
and not missing(t2.c)
group by t1.a, t1.b
having diff=min(diff)
;
quit;

View solution in original post

9 REPLIES 9
Reeza
Super User

I want the nearest c value from the second dataset to be used for my first dataset when using b between two dataset 

What are the rules for nearest? Does it need to be above or below or does it not matter? If it doesn't matter, what happens if there's a tie and you have one with a value exactly below and above?

 

Depending on the criteria this could be a simple merge or something more complicated.

 

Spoiler

@Dregerator wrote:

Hi,

 

I have two datasets that have different b values. I want the nearest c value from the second dataset to be used for my first dataset when using b between two dataset 

 

For example:

 

Dataset 1

a b
1 702
1 788
1 1148
2 417
2 441
2 458
2 465
2 495
2 502
2 707
2 766
2 863
3 1547
4 977
4 1547

 

Dataset 2

a b c
1 598 13.1
1 652 12.9
1 701 12.5
1 766 12.3
1 807 14.6
1 1095 10.1
1 1142 .
1 1219 9.7
2 295 16.1
2 305 18.1
2 362 16.4
2 565 16.9
2 642 18.4
2 740 22.5
2 834 15.5
2 922 17.8
2 1007 27.6
2 1090 13.9
2 1286 14.7
2 1568 15.1
2 1561 11.9
3 903 10.5
3 1011 7.8
3 1099 8
3 1561 .
3 1656 5.4
4 862 1.7
4 906 1.1
4 990 1.8

 

Final Dataset:

 

a b c
1 702 12.5
1 788 14.6
1 1148 10.1
2 417 16.4
2 441 16.4
2 458 16.4
2 465 16.9
2 495 16.9
2 502 16.9
2 707 22.5
2 766 22.5
2 863 15.5
3 1547 5.4
4 977 1.8
4 1547 1.8

 

 

Any suggestions as to how I can create something similar to the final dataset given the values in dataset 1 and dataset 2?

 

Thank you in advance


 

 

Dregerator
Obsidian | Level 7

It does not matter...I would just like the closest value to dataset b because I'm trying to superimpose values in dataset a on a graph where dataset b is already displayed, but need the nearest/closest values of c so they are on top of each other. Not sure if this answers your question

Reeza
Super User

Unfortunately not. What happens if there's a tie for the smallest difference and one is above and one is below? Then which value do you take?

 


@Dregerator wrote:

It does not matter...I would just like the closest value to dataset b because I'm trying to superimpose values in dataset a on a graph where dataset b is already displayed, but need the nearest/closest values of c so they are on top of each other. Not sure if this answers your question


 

 

Dregerator
Obsidian | Level 7

Can we do one above for example purposes?

Reeza
Super User

Data set A

a b
1 702

 

Dataset 2

a b c
1 598 13.1
1 652 12.9
1 701 12.5
1 703 12.3

 

Now what should data set 3 look like?

 

Dregerator
Obsidian | Level 7

It will be 12.5 

 

a b c
1 702 12.5
Reeza
Super User
So below then, 701 because 703 is only 1 away as well.
Reeza
Super User

This can get you started but doesn't handle the ties yet. Unfortunately I don't have time to resolve that, but I think you could figure it out or hopefully someone else can chime in. This only looks at absolute difference so it works for exactly the question asked but if you have ties you'll end up with duplicates in the output. 

 

proc sql;
create table want as
select t1.a, t1.b, t2.c, abs(t1.b - t2.b) as diff 
from data1 as t1
left join data2 as t2
on t1.a=t2.a
and not missing(t2.c)
group by t1.a, t1.b
having diff=min(diff)
;
quit;
PeterClemmensen
Tourmaline | Level 20

Here is an alternative

 


data one;
input a b;
datalines;
1 702
1 788
1 1148
2 417
2 441
2 458
2 465
2 495
2 502
2 707
2 766
2 863
3 1547
4 977
4 1547
;

data two;
input a b c;
datalines;
1 598 13.1
1 652 12.9
1 701 12.5
1 766 12.3
1 807 14.6
1 1095 10.1
1 1142 .
1 1219 9.7
2 295 16.1
2 305 18.1
2 362 16.4
2 565 16.9
2 642 18.4
2 740 22.5
2 834 15.5
2 922 17.8
2 1007 27.6
2 1090 13.9
2 1286 14.7
2 1568 15.1
2 1561 11.9
3 903 10.5
3 1011 7.8
3 1099 8
3 1561 .
3 1656 5.4
4 862 1.7
4 906 1.1
4 990 1.8
;

data want(keep = a b c);
 
   if _N_ = 1 then do;
      dcl hash h(dataset : "two(rename = b = bb where = (c))", multidata : "Y");
      h.definekey("a");
      h.definedata("bb", "c");
      h.definedone();
   end;
 
   set one;
   bb = .; c = .; con = constant("big");
 
   do while (h.do_over() = 0);
      if abs(bb - b) < con then do;
         con = abs(bb - b);
         cc = c;
      end;
   end;
 
   c = cc;
run;

sas-innovate-white.png

Our biggest data and AI event of the year.

Don’t miss the livestream kicking off May 7. It’s free. It’s easy. And it’s the best seat in the house.

Join us virtually with our complimentary SAS Innovate Digital Pass. Watch live or on-demand in multiple languages, with translations available to help you get the most out of every session.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 9 replies
  • 2309 views
  • 3 likes
  • 3 in conversation