BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
nbs98495
Calcite | Level 5

Hello,

I have two datasets that look like the following:

 

Dataset 1:

sbdvnumloc
1500.8761
1600.9931
1710.9371

 

Dataset 2:

sbdvnumloc
1200.9870
1210.9650
1230.9450
1250.9120

 

I want to create a new dataset that contains all of dataset 1 plus a match of observations in dataset 2 of the observations with the minimum difference in 'num' between each obs in dataset 1 and all the obs in dataset 2.

 

So for sbdv = 150, I want to calculate the difference between its 'num' (0.876) and sbdv 120, 121, 123, and 125's respective 'num' values (0.987, 0.965, 0.945, 0.912). Then I want to save each 'match' to a new dataset containing all of dataset 1 and the matches for each observation.

 

So for this example, the new dataset would be:

sbdvnumloc
1500.8761
1600.9931
1710.9371
1250.9120
1200.9870
1230.9450

 

I am guessing there is a straight forward solution, but I haven't been able to find anything that explicitly will do this.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Assuming the pairing has to be one-to-one, this can be expressed as a linear assignment problem and solved easily with PROC OPTNET (a proc of SAS/OR):

 

data ds1;
input sbdv  num loc;
datalines;
150   0.876   1
160   0.993   1
171   0.937   1
;
data ds2;
input sbdv  num loc;
datalines;
120   0.987   0
121   0.965   0
123   0.945   0
125   0.912   0
;

proc sql;
create table pairs as
select 
    ds1.sbdv as from,
    ds2.sbdv as to,
    abs(ds1.num-ds2.num) as weight
from ds1, ds2;
quit;

proc optnet data_links=pairs direction=directed;
linear_assignment out=selected_pairs;
run;

proc sql;
create table want as 
select sbdv, num, loc
from
    (select sbdv, num, loc from selected_pairs, ds1 where from=sbdv) 
    union all
    (select sbdv, num, loc from selected_pairs, ds2 where to=sbdv);
quit;

 

PG

View solution in original post

7 REPLIES 7
PGStats
Opal | Level 21

You don't need to know which observation from ds 2 match each observation from ds 1?

PG
nbs98495
Calcite | Level 5

No, just match ds 1 obs to ds 2 based on ds1 num - ds2 num

PGStats
Opal | Level 21

When you say "minimum difference" do you mean absolute value of difference?

PG
nbs98495
Calcite | Level 5
Yes!
PGStats
Opal | Level 21

Assuming the pairing has to be one-to-one, this can be expressed as a linear assignment problem and solved easily with PROC OPTNET (a proc of SAS/OR):

 

data ds1;
input sbdv  num loc;
datalines;
150   0.876   1
160   0.993   1
171   0.937   1
;
data ds2;
input sbdv  num loc;
datalines;
120   0.987   0
121   0.965   0
123   0.945   0
125   0.912   0
;

proc sql;
create table pairs as
select 
    ds1.sbdv as from,
    ds2.sbdv as to,
    abs(ds1.num-ds2.num) as weight
from ds1, ds2;
quit;

proc optnet data_links=pairs direction=directed;
linear_assignment out=selected_pairs;
run;

proc sql;
create table want as 
select sbdv, num, loc
from
    (select sbdv, num, loc from selected_pairs, ds1 where from=sbdv) 
    union all
    (select sbdv, num, loc from selected_pairs, ds2 where to=sbdv);
quit;

 

PG
nbs98495
Calcite | Level 5
This works perfect, thanks!
Ksharp
Super User
data ds1;
input sbdv  num loc;
datalines;
150   0.876   1
160   0.993   1
171   0.937   1
;
data ds2;
input _sbdv  _num _loc;
datalines;
120   0.987   0
121   0.965   0
123   0.945   0
125   0.912   0
;
data want;
 if _n_=1 then do;
   if 0 then set ds2;
   declare hash h(dataset:'ds2');
   declare hiter hi('h');
   h.definekey('_sbdv');
   h.definedata(all:'y');
   h.definedone();
 end;
set ds1;
min=999999;
do while(hi.next()=0);
  dis=abs(num-_num);
  if dis<min then do;min=dis;sbdv1=_sbdv;num1=_num;loc1=_loc;end;
end;
h.remove(key:sbdv1);
drop _:  min dis;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 7 replies
  • 2200 views
  • 0 likes
  • 3 in conversation