DATA Step, Macro, Functions and more

Data step: using variable from 1 data set to create new var in another set

Reply
Contributor
Posts: 58

Data step: using variable from 1 data set to create new var in another set

Hello,
I have 2 data sets. One is called Def_Total, the other called PlayerIDs. In the PlayerIDs data set is a variable called "Bats" which corresponds to the player in question, denoted by the variable "PlayerID".

In the Def_Total data set, there is a variable called "BatterID". This corresponds to the player (variable "PlayerID") in the PlayerIDs dataset. I want to create a new variable in the Def_Total data set which corresponds to the "Bats" variable from PlayerIDs.

So it looks like this:
Take the "BatterID", match it to "PlayerID", then have the new variable, "BatterHand" = "Bats" from the PlayerIDs data set.

Now, the following code isn't correct, but gives the gist of the idea:

[pre]
proc sort data=def_total;
by BatterID;
run;

data PlayerIDs Def_total;
set Def_total;
do BatterID = 1 to 10234;
if BatterID = PlayerID then
BatterHand = Bats;
end;
run;[/pre]

but the other thing is that this would take forever. It would sort through every PlayerID when setting the variable.

So 2 questions:
1.) Is there a better way to point to the PlayerID to set "BatterHand"?
2.) What's the correct syntax of the data step to make this work?

Thanks!!
Super User
Posts: 19,769

Re: Data step: using variable from 1 data set to create new var in another set

It sounds like you're doing a lookup. You have info in one table that you'd like to map to another based on a common variable.

If that's correct you can do it multiple ways:

1) Proc SQL Merge
2) Data Step Merge
3) Proc Format
4) More that I can't think of at the moment.

See an example of PROC SQL here
http://www.ats.ucla.edu/stat/sas/modules/sqlmerge.htm

HTH,
Reeza
SAS Employee
Posts: 174

Re: Data step: using variable from 1 data set to create new var in another set

Not sure if I get it correct, but I think your looking for a merge (or SQL join). Have a look at this sample.

[pre]
data Def_Total;
input BatterID;
datalines;
1
3
4
6
;
run;

data PlayerIDs;
input PlayerID Bats;
datalines;
1 11
2 22
3 33
4 44
5 55
6 66
7 77
8 88
9 99
;
run;

data Def_Total;
merge Def_Total(in=in_Def_Total
keep=BatterID)
PlayerIDs(in=in_PlayerIDs
keep=PlayerID Bats
rename=(PlayerID=BatterID bats=BatterHand));
by BatterID;
if in_PlayerIDs and in_Def_Total then output;
run;
[/pre]
Ask a Question
Discussion stats
  • 2 replies
  • 145 views
  • 0 likes
  • 3 in conversation