Hi,
How can I merge two datasets where in first dataset I have only one variable city. Second dataset contains two variables (without city variable)- one of them was used to choose right values of city variable:
find(city, variable1)>0
How can I merge these datasets? By which variable and by which kind of join (inner/left/right)?
That's my input data:
place | rate |
london | 15 |
paris | 18 |
cracow | 10 |
city |
cool_london |
notknown |
paris |
That's what I want to obtain:
city | rate |
cool_london | 15 |
paris | 18 |
Untested, you may need to tweak the FIND() function but it should work.
proc sql;
create table want as
select t1.place, t1.rate, t2.city
from input_data as t1
cross join city_terms as t2
where find(t1.place, t2.city, 'i')>0;
quit;
@aaaaa34 wrote:
Hi,
How can I merge two datasets where in first dataset I have only one variable city. Second dataset contains three variables (without city variable)- one of them was used to choose right values of city variable:
find(city, variable1)>0
How can I merge these datasets? By which variable and by which kind of join (inner/left/right)?
That's my input data:
place rate london 15 paris 18 cracow 10
city cool_london notknown paris
That's what I want to obtain:
city rate cool_london 15 paris 18
I do not comprehend your question. Please provide sample data of the two data sets that you have, and sample data of what you want to produce.
If you know that you will never have more than one place found for any city, then this program would satisfy your objective.
data one;
input place :$10. rate;
datalines;
london 15
paris 18
cracow 10
run;
data two;
input city :$15. ;
datalines;
cool_london
notknown
paris
run;
data want (drop=rc);
if _n_=1 then do;
if 0 then set one;
declare hash d1 (dataset:'one');
d1.definekey('place');
d1.definedata('place','rate');
d1.definedone();
declare hiter i ('d1');
end;
set two;
rc=i.first();
do while (find(city,trim(place))=0 and rc=0);
rc=i.next();
end;
if rc=0;
run;
After establishing a lookup table base on ONE in the hash object D1, it reads each record in dataset TWO. Then it iterates through D1 until a match is found, or the hash object is exhausted. If the hash object is exhausted the return code variable RC is not a zero, and therefore you don't want an output record. This can be relatively efficient because you tell the loop to stop with a successful FIND function (i.e. the FIND function returns a position number representing the character at which a successful match for PLACE has been found in CITY).
So if you don't have many instances where an observations in TWO is never found in ONE, the above will be more efficient than the far simpler program below. The below does a cartesian comparison of every obs in TWO against every obs in ONE, even after a match has been found. Of course, this will discover instances in which a given PLACE is found in more than one CITY:
proc sql noprint;
create table want2 as
select * from one inner join two
on two.city like cats('%',one.place,'%') ;
quit;
Hi @aaaaa34 ,
Please have a look at below code and check whether this is what you are expecting.
data cityrate;
input place $6. rate;
datalines;
london 15
paris 18
cracow 10
;
run;
data city;
input place1 $11.;
datalines;
cool_london
notknown
paris
;
run;
data mergercity(keep=place1 rate) ;
merge cityrate(in=a) city(in=b) ;
if a and b;
run;
Hi @aaaaa34 ,
i ran the whole code and final merge dataset is showing 2 variables only-got output as below.please clearly suggest what you are expecting
rate place1
15 cool_london
18 notknown
10 paris
Untested, you may need to tweak the FIND() function but it should work.
proc sql;
create table want as
select t1.place, t1.rate, t2.city
from input_data as t1
cross join city_terms as t2
where find(t1.place, t2.city, 'i')>0;
quit;
@aaaaa34 wrote:
Hi,
How can I merge two datasets where in first dataset I have only one variable city. Second dataset contains three variables (without city variable)- one of them was used to choose right values of city variable:
find(city, variable1)>0
How can I merge these datasets? By which variable and by which kind of join (inner/left/right)?
That's my input data:
place rate london 15 paris 18 cracow 10
city cool_london notknown paris
That's what I want to obtain:
city rate cool_london 15 paris 18
Yes it worked. But is it possible to avoid find in proc sql (like using this function earlier)? And then merge city variable (right values) with rate? The problem is that I dont know how to merge this variables?
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.