Hello everybody! I created a table with hockey players and a running total of their career points using the DATA step and an input table, but I need to recreate the same output using proc SQL, and I'm having a bit of trouble.
Here's the code with the DATA step:
data mydata.career_points_data;
set mydata.player_datamerged;
where league="NHL";
drop place_of_birth g a pim league;
by player_id league_year;
if first.player_id=1 then
CAREER_POINTS=0;
CAREER_POINTS+P;
run;
And I tried to create code with the same output with proc SQL to no avail:
proc sql;
create table career_points_sql as select player_id, first_name, last_name,
date_of_birth, height_cm, weight_kg, shoots, primary_pos, gp, league_year, p,
ppg, CAREER_POINTS+P as CAREER_POINTS
from mydata.player_datamerged where league="NHL" order by player_id, league_year;
quit;
Based on the log, I'm pretty sure the text I made red is what's causing issues, but I'm not sure how to do what I want in proc SQL. I attached my log, the errors start at line 36, where I ran the code with "CAREER_POINTS+P as CAREER_POINTS" commented out.
So basically I'm trying to create a table with proc SQL that has an accumulating row that resets with every new player. I also attached the table that I created with the DATA step for reference; that is what I'm trying to recreate using proc SQL.
Any help is appreciated,
Thanks!!
Running totals are easier to code for and perform better using a SAS data step especially when compared to the SAS SQL flavour.
Other SQL implementations (i.e. for SQL Server or Oracle have language extensions like analytics functions which better allow to implement running totals).
Below a SAS SQL which will return the same result like your data step.
proc sql;
create table want as
select
*,
(select sum(P)
from mydata.career_points_data i
where i.player_id=o.player_id
and i.league_year<=o.league_year
/* and i.league="NHL"*/
)
as CAREER_POINTS_SQL
from mydata.career_points_data o
/* where league="NHL";*/
order by player_id, league_year
;
quit;
In future posts always post the log as text using the </> icon, screenshots are unnecessary hard to read. The code you have posted has no errors (afaik), but in the screenshot i see a comma before FROM - check the beginning of your comment!
Why do you need have to re-write the code as sql?
Creating a running total like that is very hard to do in SQL. Stay with your data step.
Creating summary observations only (one per player), OTOH, is easy in SQL.
While you can eventually create SQL that will do something like that it will not be easy and may be pretty fragile (as in unexpected values breaking code that works for some specific set).
SQL is designed around operations on SETS of values with operations like Intersect and Union (in set operation terms). Sets as such do not have any particular order so the process of using data in a specific order can be quite difficult.
Consider the tool and the job. If you want to cut down a tree you want a saw, not a hammer. You can eventually pound enough bits off of a tree that it will fall down but it won't go quickly. Here you are using the hammer (SQL).
On the other hand the Data step does a very poor job of working with many to many data set merges but SQL is designed for that.
Running totals are easier to code for and perform better using a SAS data step especially when compared to the SAS SQL flavour.
Other SQL implementations (i.e. for SQL Server or Oracle have language extensions like analytics functions which better allow to implement running totals).
Below a SAS SQL which will return the same result like your data step.
proc sql;
create table want as
select
*,
(select sum(P)
from mydata.career_points_data i
where i.player_id=o.player_id
and i.league_year<=o.league_year
/* and i.league="NHL"*/
)
as CAREER_POINTS_SQL
from mydata.career_points_data o
/* where league="NHL";*/
order by player_id, league_year
;
quit;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.