Dear Experts,
I need to match string variables from two different datasets. Datasets look as follows:
Dataset one:
Name Rank Year
A.S. Goldmen 2.01 2002
A.S. Goldmen 3.01 2003
AG Edwards 2.51 2001
AG Edwards 2.61 2002
Credit Suisse 3.50 2002
Dataset two:
Name Year
A.S Goldmen 2002
Goldmen (xxx) 2003
AG Edwards 2001
AG Edwards 2002
A.S. GoldmenAG Edwards 2002
Credit Suisse 2002
A.S. GoldmenAG EdwardsCredit Suisse 2002
So how will I merge these datasets by Name and Year? The problem is in Name variable in dataset two.
Output should be like this:
Name Rank Year
A.S. Goldmen 2.01 2002
A.S. Goldmen 3.01 2003
AG Edwards 2.51 2001
AG Edwards 2.61 2002
A.S. GoldmenAG Edwards 2.31 2002
A.S. GoldmenAG EdwardsCredit Suisse 2.7067 2002
For the row A.S. GoldmenAG Edwards, the Rank is average rank of A.S. Goldmen and AG Edwards in 2002 ((2.01+2.61)/2 = 2.31). For the last row (A.S. GoldmenAG EdwardsCredit Suisse), the is average rank of A.S. Goldmen, AG Edwards and Credit Suisse ((2.01+2.61+ 3.50)/3 = 2.7067).
Kind regards,
Abu
This links to the current message...
Here is on approach to do this.
Hope this helps!
data one;
infile datalines dlm="09"x;
input Name:$50. Rank Year;
datalines;
A.S. Goldmen 2.01 2002
A.S. Goldmen 3.01 2003
AG Edwards 2.51 2001
AG Edwards 2.61 2002
Credit Suisse 3.50 2002
;
data two;
infile datalines dlm="09"x;
input Name:$50. Year;
datalines;
A.S. Goldmen 2002
Goldmen (xxx) 2003
AG Edwards 2001
AG Edwards 2002
A.S. GoldmenAG Edwards 2002
Credit Suisse 2002
A.S. GoldmenAG EdwardsCredit Suisse 2002
;
/* Put distinct 'units' of Name into macrovariables */
proc sql noprint;
select distinct quote(tranwrd(tranwrd(strip(Name),'.','\.'),' ','\s'),"'") into: units separated by "," from one;
select count(distinct name) into: total_units from one;
quit;
/* Separate Names into its distinct 'units' (one unit per variable) */
data two2;
set two;
if Name="Goldmen (xxx)" then Name="A.S. Goldmen"; /* Data management*/
id+1;
array _name(&total_units) $ 50;
array _unit(&total_units) $ 50 _temporary_ (&units);
do i=1 to &total_units;
if prxmatch(cats('/^.*(',_unit(i),').*$/'),Name) then
_name(i)=prxchange(cats('s/^.*(',_unit(i),').*$/$1/i'),i,Name);
end;
drop i;
proc sort data=two2;
by id name year;
run;
proc transpose data=two2 out=two2_tr (drop=_name_ where=(unit ne "") rename=(col1=unit));
var _name:;
by id name year;
run;
/* Merge with ranks */
proc sql;
create table one_two as
select b.name, b.unit, b.year, a.rank
from one as a inner join two2_tr as b
on a.name = b.unit and a.year=b.year;
quit;
/* Compute mean */
proc sql;
create table want as
select distinct name,year, avg(rank) as rank
from one_two
group by name, year;
quit;
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Select SAS Training centers are offering in-person courses. View upcoming courses for: