Hi,
I import 2 csv files which has 3879 columns with column name format: AAAAAAA:BBBBB. SAS successfully imported both files BUT for 1 file, SAS gets the correct name for just a part of it. Hundreds of columns at the end have name of VAR_:
The notice shows in log is: "Number of names found is less than number of variables found."
Enclosed is my files and the SAS log for the problem file.
ROE_FQ12022.csv is NOT imported correctly.
Second_ROE_FQ12022.csv is imported correctly.
Can you please help to fix it?
Many thanks.
HHC
%let file =ROE_FQ12022;
proc import
datafile="\&FILE..csv"
out=input DBMS= csv replace;
DATAROW=3;
getnames= yes;
run;
proc import
datafile="\Second_&FILE..csv"
out=input DBMS= csv replace;
DATAROW=3;
getnames= yes;
run;
Do it with data step:
filename f1 "..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1000000 dlm=",";
input var : $ 32. @@;
var = cats("var",_N_,"=",translate(var,"_",":"));
output;
if _N_ = 3879 then stop;
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute('input var1-var3879;');
call execute('rename');
do until(eof);
set test end=eof;
call execute(var);
end;
call execute('; run;');
stop;
run;
[EDIT:] A bit more flexible version:
filename f1 "..\..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1 recfm=N;
input x $ char1. @@;
length var $ 64;
retain var;
keep var;
if (x in ("," '0a'x '0d'x)) then
do;
n+1;
var = cats("var",n,"=",translate(var,"_",":"));
output;
var = "";
if x in ('0a'x '0d'x) then stop;
end;
else var = cats(var, x);
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute(cats('input var1-var',nobs,';'));
call execute('rename');
do until(eof);
set test end=eof nobs=nobs;
call execute(var);
end;
call execute('; run;');
stop;
run;
Bart
Do it with data step:
filename f1 "..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1000000 dlm=",";
input var : $ 32. @@;
var = cats("var",_N_,"=",translate(var,"_",":"));
output;
if _N_ = 3879 then stop;
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute('input var1-var3879;');
call execute('rename');
do until(eof);
set test end=eof;
call execute(var);
end;
call execute('; run;');
stop;
run;
[EDIT:] A bit more flexible version:
filename f1 "..\..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1 recfm=N;
input x $ char1. @@;
length var $ 64;
retain var;
keep var;
if (x in ("," '0a'x '0d'x)) then
do;
n+1;
var = cats("var",n,"=",translate(var,"_",":"));
output;
var = "";
if x in ('0a'x '0d'x) then stop;
end;
else var = cats(var, x);
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute(cats('input var1-var',nobs,';'));
call execute('rename');
do until(eof);
set test end=eof nobs=nobs;
call execute(var);
end;
call execute('; run;');
stop;
run;
Bart
PROC IMPORT does not handle header lines that are longer than 32K bytes. (Even if you modified your code to tell it to read more than 32K bytes from the header line.)
Looks like all of your fields are numbers so just read them as numbers.
To get a list of names read the first line separately.
data names;
infile "&path/&fname" lrecl=1000000 obs=1 dsd ;
length old new $32 label $256 ;
input label @@;
old = cats('v',_n_);
new = translate(label,'_',':');
run;
filename rename temp;
filename label temp;
data _null_;
set names end=eof;
file rename;
if _n_=1 then put 'rename';
put old '=' new ;
if eof then put ';';
file label;
if _n_=1 then put 'label';
put old '=' label :$quote.;
if eof then put ';' ;
if eof then call symputx('nvar',_n_);
run;
Then you can just read the actual data as numbers.
data input ;
infile "&path/&fname" dsd truncover firstobs=3;
input v1-v3879;
%include rename;
%include label ;
run;
Hello @hhchenfx
your code worked as it is for me on SASONDemand for Academics. Reproducing the code for completeness
%let file=/home/myuid/second_ROE_FQ12022.csv;
proc import
datafile="&file."
out=input DBMS= csv replace;
DATAROW=3;
getnames= yes;
run;
The operative portion of the log is as follows. No error as mentioned by you seen.
NOTE: The infile '/home/myuid/second_ROE_FQ12022.csv' is:
Filename=/home/myuid/second_ROE_FQ12022.csv,
Owner Name=myuid,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=21Mar2022:10:08:38,
File Size (bytes)=166215
NOTE: 110 records were read from the infile '/home/myuid/second_ROE_FQ12022.csv'.
The minimum record length was 1298.
The maximum record length was 1592.
NOTE: The data set WORK.INPUT has 110 observations and 545 variables.
Thank you, Everyone for helping!
It is amazing to see how your code run SO FAST and correctly.
HHC
If you really have a CSV file with header line that is longer than 32K bytes and you actually need to GUESS how to define the variables (instead of just assuming everything is a number like in this case) then you might want to use a more flexible tool than PROC IMPORT.
Try this macro instead:
https://github.com/sasutils/macros/blob/master/csv2ds.sas
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.