BookmarkSubscribeRSS Feed
AbuChowdhury
Fluorite | Level 6

Dear Experts,

I need to match string variables from two different datasets. Datasets look as follows:

 

Dataset one:

Name                            Rank            Year

A.S. Goldmen                2.01             2002

A.S. Goldmen                3.01             2003

AG Edwards                  2.51             2001

AG Edwards                  2.61             2002

Credit Suisse                 3.50             2002

 

Dataset two:

Name                                                            Year

A.S Goldmen                                                2002

Goldmen (xxx)                                              2003

AG Edwards                                                 2001

AG Edwards                                                 2002  

A.S. GoldmenAG Edwards                           2002

Credit Suisse                                                2002

A.S. GoldmenAG EdwardsCredit Suisse     2002

 

So how will I merge these datasets by Name and Year? The problem is in Name variable in dataset two.

Output should be like this:

Name                                                                 Rank            Year

A.S. Goldmen                                                    2.01             2002

A.S. Goldmen                                                    3.01             2003

AG Edwards                                                      2.51             2001

AG Edwards                                                      2.61             2002

A.S. GoldmenAG Edwards                                2.31            2002

A.S. GoldmenAG EdwardsCredit Suisse          2.7067         2002

For the row A.S. GoldmenAG Edwards, the Rank is average rank of A.S. Goldmen and AG Edwards in 2002 ((2.01+2.61)/2 = 2.31). For the last row (A.S. GoldmenAG EdwardsCredit Suisse), the is average rank of A.S. Goldmen, AG Edwards and Credit Suisse ((2.01+2.61+ 3.50)/3 = 2.7067).

 

Kind regards,

Abu

 

3 REPLIES 3
PGStats
Opal | Level 21

This links to the current message...

PG
ed_sas_member
Meteorite | Level 14

Hi @AbuChowdhury 

 

Here is on approach to do this.

Hope this helps!

data one;
	infile datalines dlm="09"x;
	input Name:$50. Rank  Year;
	datalines;
A.S. Goldmen	2.01	2002
A.S. Goldmen	3.01	2003
AG Edwards	2.51	2001
AG Edwards	2.61	2002
Credit Suisse	3.50	2002
;

data two;
	infile datalines dlm="09"x;
	input Name:$50. Year;
	datalines;
A.S. Goldmen	2002
Goldmen (xxx)	2003
AG Edwards	2001
AG Edwards	2002  
A.S. GoldmenAG Edwards	2002
Credit Suisse	2002
A.S. GoldmenAG EdwardsCredit Suisse	2002
; 

/* Put distinct 'units' of Name into macrovariables */
proc sql noprint;
	select distinct quote(tranwrd(tranwrd(strip(Name),'.','\.'),' ','\s'),"'") into: units separated by "," from one;
	select count(distinct name) into: total_units from one;
quit;

/* Separate Names into its distinct 'units' (one unit per variable) */
data two2;
	set two;
	if Name="Goldmen (xxx)" then Name="A.S. Goldmen"; /* Data management*/
	id+1;
	array _name(&total_units) $ 50;
	array _unit(&total_units) $ 50 _temporary_ (&units);
	do i=1 to &total_units;
		if prxmatch(cats('/^.*(',_unit(i),').*$/'),Name) then
		_name(i)=prxchange(cats('s/^.*(',_unit(i),').*$/$1/i'),i,Name);
	end;
	drop i;

proc sort data=two2;
	by id name year;
run;
proc transpose data=two2 out=two2_tr (drop=_name_ where=(unit ne "") rename=(col1=unit));
	var _name:;
	by id name year;
run;

/* Merge with ranks */
proc sql;
	create table one_two as
	select b.name, b.unit, b.year, a.rank
	from one as a inner join two2_tr as b
	on a.name = b.unit and a.year=b.year;
quit;

/* Compute mean */
proc sql;
	create table want as
	select distinct name,year, avg(rank) as rank
	from one_two
	group by name, year;
quit;


Capture d’écran 2020-02-15 à 14.34.55.png

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 454 views
  • 0 likes
  • 4 in conversation