I have several CSV files of NBA stats with about 30 variables each, and I am trying to use PROC IMPORT to create the data sets in SAS University Edition. SAS reads the data successfully but doesn't pick up the variable names and instead assigns VAR1, VAR2, VAR3, ...
The actual data (mostly numeric, with a few character columns) sometimes begins on the 2nd row; other times it starts in row 3 or 4. The category names (which would become the variable names in the data set) are always in the row just before the data. Here's my code and three CSV files for reference.
I've successfully imported nearly identical files (sports statistics) with PROC IMPORT, so I'm puzzled why it isn't working this time. (FYI: the CSV file name is in the respective PROC IMPORT code.)
proc import datafile='leagues_NBA_2014_team.csv' out=team replace;
datarow=2; getnames=yes;
run;
proc import datafile='leagues_NBA_2014_shooting.csv' out=shooting replace;
datarow=4; getnames=yes;
run;
proc import datafile='leagues_NBA_2014_misc.csv' out=misc replace;
datarow=3; getnames=yes;
run;
I don't think your problem is where the data starts but, rather, how many rows are needed to build the variable names.
The following will read your three example files:
%macro importc(type);
  /* Number of columns to read for each file type */
  %if %upcase(&type.) eq TEAM %then %let nvar=26;
  %else %if %upcase(&type.) eq MISC %then %let nvar=24;
  %else %let nvar=28;

  filename tempdata temp;

  /* Write a cleaned copy of the file: one combined header row, then the data */
  data _null_;
    infile "c:\art\leagues_NBA_2014_&type..csv" delimiter=','
      MISSOVER DSD LRECL=32760;
    file tempdata dlm=',' lrecl=32760;
    array vnames(&nvar.) $50.;
    retain vnames: stopit;
    informat var1-var&nvar. $50.;
    format var1-var&nvar. $50.;
    array _vnames(&nvar.) $50. var1-var&nvar.;
    input var1-var&nvar.;
    if _n_ eq 1 then do;
      /* First header row: turn each label into a usable name fragment */
      do i=1 to dim(vnames);
        _vnames(i)=tranwrd(_vnames(i),'%','pct');
        vnames(i)=compress(translate(strip(_vnames(i)),'___',' /-'),"'");
        if substr(vnames(i),1,1) eq '_' then
          vnames(i)=substr(vnames(i),2);
      end;
      /* The header is complete once the first column reads Rk */
      if _vnames(1) eq 'Rk' then do;
        stopit=1;
        put ( vnames(*) ) (+0);
      end;
      else stopit=0;
    end;
    else if not stopit then do;
      /* Later header rows: append each label to the name built so far */
      do i=1 to dim(vnames);
        _vnames(i)=tranwrd(_vnames(i),'%','pct');
        vnames(i)=catx('_',translate(strip(vnames(i)),'___',' /-'),
          compress(translate(strip(_vnames(i)),'___',' /-'),"'."));
      end;
      if _vnames(1) eq 'Rk' then do;
        stopit=1;
        put ( vnames(*) ) (+0);
      end;
    end;
    /* Data rows are written out as read */
    else put (_vnames(*)) (+0);
  run;

  proc import datafile=tempdata out=&type. dbms=csv replace;
  run;
%mend;

%importc(team)
%importc(misc)
%importc(shooting)
Try the namerow option to specify which row the variable names are in:
proc import datafile='leagues_NBA_2014_misc.csv' out=misc replace;
datarow=3; namerow=2; getnames=yes;
run;
I get the following error:
398 proc import datafile='/folders/myfolders/sports/nba/2013-14/leagues_NBA_2014_misc.csv' out=nba.misc replace;
NOTE: The previous statement has been deleted.
399 datarow=3; namerow=1; getnames=yes;
_______
180
ERROR 180-322: Statement is not valid or it is used out of proper order.
I'm out of ideas; hopefully someone else will have some.
dbms=csv on the proc statement perhaps?
I added dbms=csv, although usually it's not necessary because csv is part of the file extension. Unfortunately SAS still assigned VAR1, VAR2, VAR3, ... So I just deleted the rows from the raw data file that I didn't want SAS to read. I try not to make a habit of doing that, since I would rather write code around how the raw data is formatted than rely on being able to modify it. Thanks for your input.
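One way to get the same effect without editing the raw files by hand is to copy each file to a temporary fileref, starting at the header row, and import the copy. The following is only a sketch under some assumptions: the header row is taken to be the first line whose first field is Rk (the marker the macro earlier in this thread also checks for), and `trimmed` and `found` are illustrative names. It keeps only that one header row, so any labels in the rows above it are dropped, and PROC IMPORT may still adjust names that are not valid SAS variable names.

```sas
/* Temporary file that will hold the trimmed copy (illustrative name) */
filename trimmed temp;

data _null_;
    infile '/folders/myfolders/sports/nba/2013-14/leagues_NBA_2014_misc.csv'
        lrecl=32760;
    file trimmed lrecl=32760;
    retain found 0;
    /* Load the whole line into _infile_ without parsing any fields */
    input;
    /* Assumption: the row holding the variable names starts with Rk.
       The m modifier keeps an empty first field from being skipped. */
    if not found and strip(dequote(scan(_infile_, 1, ',', 'm'))) = 'Rk'
        then found = 1;
    /* Copy the header row and everything after it unchanged */
    if found then put _infile_;
run;

/* The copy now has names in row 1 and data from row 2 */
proc import datafile=trimmed out=misc dbms=csv replace;
    getnames=yes;
    datarow=2;
run;
```

Because the copy always has its names in row 1, the same step should work whether the data originally began in row 2, 3, or 4; only the input path and the out= name would change per file.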
proc import datafile='E:\leagues_NBA_2014_team.csv' out=misc
dbms=csv
replace;
datarow=3;
getnames=yes;
run;