10-11-2014 01:07 PM
I have simple one-to-many merge code like this...
merge baseball.players (in=InMaster) baseball.salaries (in=InSalaries);
if InMaster and InSalaries;
keep playerID birthYear birthMonth birthCountry salary;
Salaries dataset contains a row for each player for each year, so there are multiple rows for most players.
I simply want my merged dataset to include one row per player, including the max salary that player ever received (from the Salaries dataset).
How to do it?
I'm a newbie working on a final project in a SAS programming class, so I appreciate any solution or quick tip. First time trying to get help on this community!
10-11-2014 01:49 PM
1. Limit dataset before merge to include the max salary or only one record per player
2. I don't think a data step merge can handle a many2many merge so you may want to use SQL
10-11-2014 06:23 PM
Have your Salaries data set sorted by "playerID Salary".
Amend your IF-condition to: if inMaster and inSalaries and last.playerID;
I wouldn't use "and inSalaries" in your condition as this means you would remove players from your result for which there is no record in "Salaries".
10-11-2014 07:48 PM
create table baseball.playershighestsalary as
select a.playerid,birthYear, birthMonth, birthCountry, b.salary
from baseball.players a left join (select playerid,max(salary) as salary from baseball.salaries group by playerid) b
10-12-2014 08:42 AM
If there was a tie for max salary ?
data players; input id ; cards; 1 2 3 ; run; data salaries; input id salaries; cards; 1 23 1 43 2 43 2 21 3 45 4 34 ; run; proc sort data=players;by id;run; proc sort data=salaries;by id descending salaries;run; data playershighestsalary; InMaster=0; merge players (in=InMaster) salaries ; by id; if InMaster ; run;