Hello and thank you beforehand for reading this, first time poster and avid SAS fan.
I am working with a very simple dataset to refresh my knowledge of how to work with SAS
The SAS code which I am working with is as follows
%let path = x.wbw;
/*output only team stats i.e team winner vs team loser matchresults for the season (no playoffs)*/
/*sort by team_w (winner)*/
%macro fs_season;
proc sql;
%do i=2007 %to 2017;
create table %sysfunc(cat(&path,&i,_fs_season))
as select 'Winner/tie'n as Team_W, 'Loser/tie'n as Team_L
from %sysfunc(cat(&path,&i))(obs=256)
order by team_w;
/*cutoff at row 256 to not include the playoff data*/
%end;
quit;
%mend;
%fs_season;
/*output team / win / lose stat*/
/*does not take into account teams with zero (0) wins*/
%macro fs_season_stats;
%do i=2007 %to 2017;
data %sysfunc(cat(&path,&i,_score))(drop=Team_L);
set %sysfunc(cat(&path,&i,_fs_season));
by Team_W;
if first.Team_W then win = 0;
win+1;
lose = 16 - win;
if last.Team_W then output;
run;
%end;
%mend;
%fs_season_stats;
I have week-by-week data from season 2007 to 2017 from the NFL (see attachment). and when I run my code I get that for season 2008 and 2017 I only have 31 teams in the league, i.e two teams have played 16 games with 16 loses.
I have written some extra code to find the team with 16 loses and add them into the table but the steps seems so redundant and I feel that I am missing something.
data test1 (drop=loser);
set ffeh75.wbw2017_fs_season;
run;
data test2 (drop=winner);
set ffeh75.wbw2017_fs_season;
run;
proc sql;
create table dup1
as select distinct(winner) as team_w
from test1
quit;
proc sql;
create table dup2
as select distinct(loser) as team_l
from test2
quit;
data one two ;
merge dup1(IN=in1) dup2(IN=in2);
if in1=1 then output one;
if in2=1 then output two;
run;
/*create a variable team which imports team name from team_l team_w*/
proc sql;
create table max_lose
as select team_l
from two
where team_l not in
(select team_w from two);
quit;
proc sql;
insert into win_sort2017_score
set team_w=(select team_l from max_lose), win=0, lose=16;
quit;
I see two solutions to this problem, either a datastep / proc sql step to check to see that the numbers of rows are 32 (the amount of teams in the entire NFL league) or to create a team variable which will insert the teamname from either the team_w or team_l column and do the calculations from there.
I haven't worked with SAS for several years and this project which I thought would take a couple of hours tops has taken me days.
Would appreciate any help and sorry for the wall of text/wall of code.
Working on SAS EG 7.13
Sincerly Fredrik
Let me edit this. Step 1 should be to read the data in one datastep for all the csv files and give a year variable from the filename.
data import_all; length filename txt_file_name $256; retain txt_file_name; infile "<path>\*.csv" eov=eov filename=filename truncover; input@; if _n_ eq 1 or eov then do; txt_file_name = scan(filename, -1, "\"); eov=0; end; else input ...; run;
Then with the one file, you can start processing the dataset, with the filename data as a column you can by group that, as an example:
proc means...; by txt_file_name;
...
run;
This will save you lots of writing datasets, macro code to process multiple datasets etc.
Sorry, probably didn't make it clear. I am just after a small sample of what you have - which details each of the steps, i.e. if there are multiple files, then a couple of line from a couple of files. The reason is the "first step" you talk about is the one I would drop - this is the step that creates xyz_2017, xyz_2018 etc. This would easier to work with as xyz with a column for year. So basically data in - at the start of the process - and examples of what you want out at the end.
Hi,
Sorry, can't download files, please post test data in the form of a datastep in a code window (its the {I} above post area) following:
If needed.
What I will say from just one glance at the code is that splitting the data up into lots of datasets is not a good idea (i.e. pretty much the whole of the first code window). It is far simpler and easier to code if you add your groups to the data then use by group processing to process them further. Can't really say much more without concrete input data/output required.
Sorry for the late response. Took a while to get the macro to work on a unix enviroment.
I am rookie when it comes to SAS so I am pretty locked into my mindset on how I should solve the issue. But seeing that you are my senior I will take anything you say into consideration and try to implement. If you have any good links to how I should procede I would appreciate that very much.
I am including the entire output from the first macro step fs_season as it is the team Detroit Lions who are missing from the Team_W column.
data WORK.WBW2008_FS_SEASON; infile datalines dsd truncover; input Team_W:$20. Team_L:$20.; datalines4; Arizona Cardinals,St. Louis Rams Arizona Cardinals,Miami Dolphins Arizona Cardinals,St. Louis Rams Arizona Cardinals,Buffalo Bills Arizona Cardinals,Dallas Cowboys Arizona Cardinals,Seattle Seahawks Arizona Cardinals,San Francisco 49ers Arizona Cardinals,San Francisco 49ers Arizona Cardinals,Seattle Seahawks Atlanta Falcons,Chicago Bears Atlanta Falcons,Oakland Raiders Atlanta Falcons,Green Bay Packers Atlanta Falcons,St. Louis Rams Atlanta Falcons,Kansas City Chiefs Atlanta Falcons,Detroit Lions Atlanta Falcons,New Orleans Saints Atlanta Falcons,Minnesota Vikings Atlanta Falcons,Tampa Bay Buccaneers Atlanta Falcons,Carolina Panthers Atlanta Falcons,San Diego Chargers Baltimore Ravens,Houston Texans Baltimore Ravens,Oakland Raiders Baltimore Ravens,Washington Redskins Baltimore Ravens,Cincinnati Bengals Baltimore Ravens,Cleveland Browns Baltimore Ravens,Cleveland Browns Baltimore Ravens,Philadelphia Eagles Baltimore Ravens,Jacksonville Jaguars Baltimore Ravens,Dallas Cowboys Baltimore Ravens,Cincinnati Bengals Baltimore Ravens,Miami Dolphins Buffalo Bills,St. Louis Rams Buffalo Bills,Jacksonville Jaguars Buffalo Bills,Oakland Raiders Buffalo Bills,San Diego Chargers Buffalo Bills,Denver Broncos Buffalo Bills,Seattle Seahawks Buffalo Bills,Kansas City Chiefs Carolina Panthers,Tampa Bay Buccaneers Carolina Panthers,Arizona Cardinals Carolina Panthers,Denver Broncos Carolina Panthers,New Orleans Saints Carolina Panthers,Chicago Bears Carolina Panthers,Green Bay Packers Carolina Panthers,Atlanta Falcons Carolina Panthers,San Diego Chargers Carolina Panthers,Detroit Lions Carolina Panthers,Oakland Raiders Carolina Panthers,New Orleans Saints Carolina Panthers,Kansas City Chiefs Chicago Bears,Jacksonville Jaguars Chicago Bears,Green Bay Packers Chicago Bears,Indianapolis Colts Chicago Bears,New Orleans Saints Chicago Bears,Minnesota Vikings Chicago Bears,Philadelphia Eagles Chicago Bears,Detroit Lions Chicago Bears,Detroit Lions Chicago Bears,St. Louis Rams Cincinnati Bengals,Kansas City Chiefs Cincinnati Bengals,Cleveland Browns Cincinnati Bengals,Washington Redskins Cincinnati Bengals,Jacksonville Jaguars Cleveland Browns,Jacksonville Jaguars Cleveland Browns,Cincinnati Bengals Cleveland Browns,Buffalo Bills Cleveland Browns,New York Giants Dallas Cowboys,Tampa Bay Buccaneers Dallas Cowboys,New York Giants Dallas Cowboys,Philadelphia Eagles Dallas Cowboys,Cleveland Browns Dallas Cowboys,Green Bay Packers Dallas Cowboys,San Francisco 49ers Dallas Cowboys,Washington Redskins Dallas Cowboys,Cincinnati Bengals Dallas Cowboys,Seattle Seahawks Denver Broncos,San Diego Chargers Denver Broncos,Kansas City Chiefs Denver Broncos,Atlanta Falcons Denver Broncos,New York Jets Denver Broncos,New Orleans Saints Denver Broncos,Oakland Raiders Denver Broncos,Tampa Bay Buccaneers Denver Broncos,Cleveland Browns Green Bay Packers,Indianapolis Colts Green Bay Packers,Detroit Lions Green Bay Packers,Detroit Lions Green Bay Packers,Seattle Seahawks Green Bay Packers,Minnesota Vikings Green Bay Packers,Chicago Bears Houston Texans,Tennessee Titans Houston Texans,Cincinnati Bengals Houston Texans,Green Bay Packers Houston Texans,Jacksonville Jaguars Houston Texans,Cleveland Browns Houston Texans,Chicago Bears Houston Texans,Miami Dolphins Houston Texans,Detroit Lions Indianapolis Colts,Jacksonville Jaguars Indianapolis Colts,Detroit Lions Indianapolis Colts,Baltimore Ravens Indianapolis Colts,Minnesota Vikings Indianapolis Colts,Cincinnati Bengals Indianapolis Colts,Cleveland Browns Indianapolis Colts,San Diego Chargers Indianapolis Colts,Tennessee Titans Indianapolis Colts,Houston Texans Indianapolis Colts,Pittsburgh Steelers Indianapolis Colts,Houston Texans Indianapolis Colts,New England Patriots Jacksonville Jaguars,Houston Texans Jacksonville Jaguars,Green Bay Packers Jacksonville Jaguars,Indianapolis Colts Jacksonville Jaguars,Denver Broncos Jacksonville Jaguars,Detroit Lions Kansas City Chiefs,Oakland Raiders Kansas City Chiefs,Denver Broncos Miami Dolphins,Denver Broncos Miami Dolphins,San Francisco 49ers Miami Dolphins,Buffalo Bills Miami Dolphins,Kansas City Chiefs Miami Dolphins,Buffalo Bills Miami Dolphins,New York Jets Miami Dolphins,St. Louis Rams Miami Dolphins,Oakland Raiders Miami Dolphins,New England Patriots Miami Dolphins,Seattle Seahawks Miami Dolphins,San Diego Chargers Minnesota Vikings,New York Giants Minnesota Vikings,Houston Texans Minnesota Vikings,Arizona Cardinals Minnesota Vikings,Detroit Lions Minnesota Vikings,Chicago Bears Minnesota Vikings,Jacksonville Jaguars Minnesota Vikings,Detroit Lions Minnesota Vikings,Carolina Panthers Minnesota Vikings,New Orleans Saints Minnesota Vikings,Green Bay Packers New England Patriots,Buffalo Bills New England Patriots,Arizona Cardinals New England Patriots,Oakland Raiders New England Patriots,St. Louis Rams New England Patriots,Seattle Seahawks New England Patriots,Kansas City Chiefs New England Patriots,Denver Broncos New England Patriots,Miami Dolphins New England Patriots,New York Jets New England Patriots,San Francisco 49ers New England Patriots,Buffalo Bills New Orleans Saints,Detroit Lions New Orleans Saints,San Diego Chargers New Orleans Saints,San Francisco 49ers New Orleans Saints,Atlanta Falcons New Orleans Saints,Green Bay Packers New Orleans Saints,Oakland Raiders New Orleans Saints,Kansas City Chiefs New Orleans Saints,Tampa Bay Buccaneers New York Giants,Dallas Cowboys New York Giants,Carolina Panthers New York Giants,Pittsburgh Steelers New York Giants,San Francisco 49ers New York Giants,Washington Redskins New York Giants,Arizona Cardinals New York Giants,Baltimore Ravens New York Giants,St. Louis Rams New York Giants,Philadelphia Eagles New York Giants,Cincinnati Bengals New York Giants,Washington Redskins New York Giants,Seattle Seahawks New York Jets,Buffalo Bills New York Jets,Kansas City Chiefs New York Jets,Buffalo Bills New York Jets,Arizona Cardinals New York Jets,Miami Dolphins New York Jets,Tennessee Titans New York Jets,Cincinnati Bengals New York Jets,New England Patriots New York Jets,St. Louis Rams Oakland Raiders,Tampa Bay Buccaneers Oakland Raiders,Houston Texans Oakland Raiders,New York Jets Oakland Raiders,Denver Broncos Oakland Raiders,Kansas City Chiefs Philadelphia Eagles,Dallas Cowboys Philadelphia Eagles,Cleveland Browns Philadelphia Eagles,Atlanta Falcons Philadelphia Eagles,New York Giants Philadelphia Eagles,Arizona Cardinals Philadelphia Eagles,Pittsburgh Steelers Philadelphia Eagles,San Francisco 49ers Philadelphia Eagles,Cincinnati Bengals Philadelphia Eagles,St. Louis Rams Philadelphia Eagles,Seattle Seahawks Pittsburgh Steelers,Cleveland Browns Pittsburgh Steelers,Cleveland Browns Pittsburgh Steelers,Baltimore Ravens Pittsburgh Steelers,Houston Texans Pittsburgh Steelers,Baltimore Ravens Pittsburgh Steelers,Dallas Cowboys Pittsburgh Steelers,Cincinnati Bengals Pittsburgh Steelers,New England Patriots Pittsburgh Steelers,Cincinnati Bengals Pittsburgh Steelers,San Diego Chargers Pittsburgh Steelers,Jacksonville Jaguars Pittsburgh Steelers,Washington Redskins San Diego Chargers,Tampa Bay Buccaneers San Diego Chargers,Oakland Raiders San Diego Chargers,Kansas City Chiefs San Diego Chargers,Oakland Raiders San Diego Chargers,New York Jets San Diego Chargers,New England Patriots San Diego Chargers,Kansas City Chiefs San Diego Chargers,Denver Broncos San Francisco 49ers,Washington Redskins San Francisco 49ers,St. Louis Rams San Francisco 49ers,Seattle Seahawks San Francisco 49ers,New York Jets San Francisco 49ers,Buffalo Bills San Francisco 49ers,Detroit Lions San Francisco 49ers,St. Louis Rams Seattle Seahawks,New York Jets Seattle Seahawks,San Francisco 49ers Seattle Seahawks,St. Louis Rams Seattle Seahawks,St. Louis Rams St. Louis Rams,Dallas Cowboys St. Louis Rams,Washington Redskins Tampa Bay Buccaneers,Kansas City Chiefs Tampa Bay Buccaneers,Green Bay Packers Tampa Bay Buccaneers,Atlanta Falcons Tampa Bay Buccaneers,Seattle Seahawks Tampa Bay Buccaneers,New Orleans Saints Tampa Bay Buccaneers,Detroit Lions Tampa Bay Buccaneers,Carolina Panthers Tampa Bay Buccaneers,Minnesota Vikings Tampa Bay Buccaneers,Chicago Bears Tennessee Titans,Baltimore Ravens Tennessee Titans,Green Bay Packers Tennessee Titans,Indianapolis Colts Tennessee Titans,Pittsburgh Steelers Tennessee Titans,Minnesota Vikings Tennessee Titans,Cleveland Browns Tennessee Titans,Kansas City Chiefs Tennessee Titans,Detroit Lions Tennessee Titans,Jacksonville Jaguars Tennessee Titans,Cincinnati Bengals Tennessee Titans,Jacksonville Jaguars Tennessee Titans,Chicago Bears Tennessee Titans,Houston Texans Washington Redskins,Philadelphia Eagles Washington Redskins,Dallas Cowboys Washington Redskins,Detroit Lions Washington Redskins,Cleveland Browns Washington Redskins,Seattle Seahawks Washington Redskins,New Orleans Saints Washington Redskins,Arizona Cardinals Washington Redskins,Philadelphia Eagles ;;;;
The issue I am having is when using my second macro it just takes the unique observations from the Team_W column and tallies a score. Seeing that Detroit Lions are missing from the Team_W column they do not display in the final output. My thought was doing a proc sql where statement saying that if team = X does not exist in column Team_W they will have 0 wins / 16 loses in the final macro step
If you have good work around from this I would be grateful
Let me edit this. Step 1 should be to read the data in one datastep for all the csv files and give a year variable from the filename.
data import_all; length filename txt_file_name $256; retain txt_file_name; infile "<path>\*.csv" eov=eov filename=filename truncover; input@; if _n_ eq 1 or eov then do; txt_file_name = scan(filename, -1, "\"); eov=0; end; else input ...; run;
Then with the one file, you can start processing the dataset, with the filename data as a column you can by group that, as an example:
proc means...; by txt_file_name;
...
run;
This will save you lots of writing datasets, macro code to process multiple datasets etc.
Sorry, probably didn't make it clear. I am just after a small sample of what you have - which details each of the steps, i.e. if there are multiple files, then a couple of line from a couple of files. The reason is the "first step" you talk about is the one I would drop - this is the step that creates xyz_2017, xyz_2018 etc. This would easier to work with as xyz with a column for year. So basically data in - at the start of the process - and examples of what you want out at the end.
Looking at the original code, it appears that it intends to summarize the data. Every team will be accounted for, but not every team will appear in the final data set. You would get similar output by simply counting:
%macro fs_season;
%do i=2007 %to 2017;
proc freq data=&path.&i._fs_season;
tables 'Winner/tie'n 'Loser/tie'n;
run;
%end;
%mend fs_season;
Of course, you could place the results into an output data set, and you could rename the variables. But other than that, it's difficult to guide you since you haven't actually explained what your code is intended to accomplish.
Since the Browns didn't win a game in 2017 (and 2016?)...
1) Your attachments are ; delimited, not CSV AND there is a not obvious field without a header between winner/tie and Loser/tie and another without header that seems to have only the value boxscor
2) they are not data sets as you make choices when bringing the data into SAS that means your data would look different than if I do it.
For example I might exclude the playoff data when reading the files.
I probably as a first step after reading would have restructured the data to
week day date time team win/loss flag (1=win 0=loss) Points Yards and whatever the heck TOW TOL might mean
The I would have all of the teams regardless and could
Then something like
Proc tabulate data=restructured;
class team winflag;
table team,
winflag*n
;
run;
would give me wins and loses. If the winflag has a value for tie then also would have ties.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.