BookmarkSubscribeRSS Feed
BasketballSAS
Calcite | Level 5

This is my code so far:

/* Read in data */
options validvarname=v7;
libname NBA xlsx "~/L.E.B.R.O.N. Framework/data/2012_data.xlsx";
*libname NBA clear;

/* Test printing data */
*proc print data=nba.player_usage;
*run;

proc sql;
   create table player_traditional_usage_merged as
   select a.player, a.team, a.age, a.gp, a.w, a.l,
   		a.min, a.pts, a.oreb, a.dreb, a.ast,
   		a.stl, a.blk, b._pts, b._oreb, b._dreb,
   		b._ast, b._stl, b._blk
      from nba.player_traditional a inner join nba.player_usage b
           on a.player = b.player
           and a.team = b.team
           and a.age = b.age
           and a.gp = b.gp
           and a.w = b.w
           and a.l = b.l;
quit;

proc print data=player_traditional_usage_merged;
run;

data player_traditional_usage_merged;
	set player_traditional_usage_merged;
	'Similarity to LeBron'n = ?
run;

I am trying to calculate this Similarity to LeBron column. For this, I would need to subtract all other values from LeBron's values. 

 

In R, this would look something like the following:

player_traditional_usage_merged$`Similarity to LeBron` <- player_traditional_usage_merged$PTS - player_traditional_usage_merged$PTS[player_traditional_usage_merged$PLAYER == "LeBron James"]

What the above R code does is take LeBron's points value and subtract it from every player's points value in the dataset. I am trying to accomplish the same in SAS. Any insight is appreciated, thanks. 

8 REPLIES 8
Mitesh73941
Calcite | Level 5

To achieve the same operation in SAS that you described in R, you can follow a similar approach by first isolating LeBron James' statistics and then subtracting these values from every other player's corresponding values. Here's how you can do this step-by-step:

Step-by-Step Solution
1. Identify LeBron James' Row: Extract LeBron's statistics and store them in macro variables.
2. Subtract LeBron's Statistics from All Other Rows: Use a DATA step to subtract these values from each player's statistics.

 

/* Step 1: Extract LeBron's statistics into macro variables */
data _null_;
set player_traditional_usage_merged;
if player = "LeBron James" then do;
call symputx('lebron_pts', pts);
call symputx('lebron_oreb', oreb);
call symputx('lebron_dreb', dreb);
call symputx('lebron_ast', ast);
call symputx('lebron_stl', stl);
call symputx('lebron_blk', blk);
end;
run;

/* Step 2: Subtract LeBron's statistics from all other players */
data player_traditional_usage_merged;
set player_traditional_usage_merged;
similarity_to_lebron_pts = pts - &lebron_pts;
similarity_to_lebron_oreb = oreb - &lebron_oreb;
similarity_to_lebron_dreb = dreb - &lebron_dreb;
similarity_to_lebron_ast = ast - &lebron_ast;
similarity_to_lebron_stl = stl - &lebron_stl;
similarity_to_lebron_blk = blk - &lebron_blk;
run;

/* Print the resulting dataset */
proc print data=player_traditional_usage_merged;
var player team similarity_to_lebron_pts similarity_to_lebron_oreb similarity_to_lebron_dreb similarity_to_lebron_ast similarity_to_lebron_stl similarity_to_lebron_blk;
run;

 

Explanation of the Code
1. Extract LeBron's Statistics:
The first DATA _null_ step reads through the player_traditional_usage_merged dataset and identifies the row where player = "LeBron James".
The call symputx function is used to store LeBron's statistics (like pts, oreb, dreb, ast, stl, and blk) in macro variables (e.g., &lebron_pts, &lebron_oreb).
2. Subtract LeBron's Statistics from All Players:
The second DATA step iterates through the entire dataset again.
For each player, it creates new variables (similarity_to_lebron_pts, etc.) that store the difference between the player's stats and LeBron's stats.
3. Print the Results:
The proc print step outputs the final dataset showing each player's statistics subtracted from LeBron's statistics

 

Feel free to update logs if you get error. I have not tested the code. 🙂 

Astounding
PROC Star

A DATA step can do all the calculations, if you keep track of what information is where.  I would use arrays for this:

data want;
   array lebron {6} _temporary_;
   array stats {6} pts oreb dreb ast stl blk;
   array similar {6} similar_pts similar_oreb simil_dreb similar_ast similar_stl similar_blk;
   if _n_=1 then do;
      set player_traditional_usage_merged (where=(player="LeBron James"));
      lebron{1} = pts;
      lebron{2} = oreb;
      lebron{3} = dreb;
      lebron{4} = ast;
      lebron{5} = stl;
      lebron{6} = blk;
   end;
set player_traditional_usage_merged; do _n_=1 to 6; similar{_n_} = stats{_n_} - lebron{_n_}; end; run;

The temporary array cleans up after itself.  None of the variables fields get stored permanently once the DATA step ends.  You can pick any names you would like in the SIMILAR array.  

It's untested code, but you have the data so easy enough for you to try it.

PaigeMiller
Diamond | Level 26

It would be a lot less programming if you use PROC STDIZE with METHOD=IN. Here, in this example, using data set SASHELP.CLASS, we want to subtract the age and height and weight of Jane from all the other records. No macros needed. No arrays needed.

 

data one_obs;
    set sashelp.class(where=(name='Jane'));
    scale=1;
    keep age height weight scale;
run;

proc stdize data=sashelp.class method=in(one_obs) out=want;
	location age height weight;
	scale scale scale scale;
	var age height weight;
run;

 

--
Paige Miller
FreelanceReinh
Jade | Level 19

Minor variant of PaigeMiller's approach:

/* Indicate the observation containing the reference values */

data tmp;
set sashelp.class;
_w=(name='James');
run;

/* Perform the subtraction (add a VAR statement to restrict it to selected variables) */

proc stdize data=tmp method=mean out=want(drop=_w);
weight _w;
run;
PaigeMiller
Diamond | Level 26

That is brilliant, @FreelanceReinh . I'm starting to think that SAS is very powerful!

--
Paige Miller
Ksharp
Super User

I think it is a SAS/IML thing.

 

proc iml;
use sashelp.class;
read all var _num_ where (name='James') into one_obs;
read all var _num_ into all_obs[c=vname];
read all var _char_ into char_obs[c=vname2];
close;
want=all_obs-one_obs;
create want from char_obs want[c=(vname2||vname)];
append from char_obs want;
close;
quit;
BasketballSAS
Calcite | Level 5
Thank you for all the responses! I have not tested them yet as I am still new to SAS but will be sure to reply to each response, and will be sure to provide the data as a file attachment next time.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 426 views
  • 5 likes
  • 7 in conversation