- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi,
I import 2 csv files which has 3879 columns with column name format: AAAAAAA:BBBBB. SAS successfully imported both files BUT for 1 file, SAS gets the correct name for just a part of it. Hundreds of columns at the end have name of VAR_:
The notice shows in log is: "Number of names found is less than number of variables found."
Enclosed is my files and the SAS log for the problem file.
ROE_FQ12022.csv is NOT imported correctly.
Second_ROE_FQ12022.csv is imported correctly.
Can you please help to fix it?
Many thanks.
HHC
%let file =ROE_FQ12022;
proc import
datafile="\&FILE..csv"
out=input DBMS= csv replace;
DATAROW=3;
getnames= yes;
run;
proc import
datafile="\Second_&FILE..csv"
out=input DBMS= csv replace;
DATAROW=3;
getnames= yes;
run;
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do it with data step:
filename f1 "..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1000000 dlm=",";
input var : $ 32. @@;
var = cats("var",_N_,"=",translate(var,"_",":"));
output;
if _N_ = 3879 then stop;
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute('input var1-var3879;');
call execute('rename');
do until(eof);
set test end=eof;
call execute(var);
end;
call execute('; run;');
stop;
run;
[EDIT:] A bit more flexible version:
filename f1 "..\..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1 recfm=N;
input x $ char1. @@;
length var $ 64;
retain var;
keep var;
if (x in ("," '0a'x '0d'x)) then
do;
n+1;
var = cats("var",n,"=",translate(var,"_",":"));
output;
var = "";
if x in ('0a'x '0d'x) then stop;
end;
else var = cats(var, x);
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute(cats('input var1-var',nobs,';'));
call execute('rename');
do until(eof);
set test end=eof nobs=nobs;
call execute(var);
end;
call execute('; run;');
stop;
run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Do it with data step:
filename f1 "..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1000000 dlm=",";
input var : $ 32. @@;
var = cats("var",_N_,"=",translate(var,"_",":"));
output;
if _N_ = 3879 then stop;
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute('input var1-var3879;');
call execute('rename');
do until(eof);
set test end=eof;
call execute(var);
end;
call execute('; run;');
stop;
run;
[EDIT:] A bit more flexible version:
filename f1 "..\..\..\..\ROE_FQ12022.csv";
data test;
infile f1 lrecl=1 recfm=N;
input x $ char1. @@;
length var $ 64;
retain var;
keep var;
if (x in ("," '0a'x '0d'x)) then
do;
n+1;
var = cats("var",n,"=",translate(var,"_",":"));
output;
var = "";
if x in ('0a'x '0d'x) then stop;
end;
else var = cats(var, x);
run;
data _null_;
call execute('data test2;');
call execute('infile f1 lrecl=1000000 dlm="," firstobs=3 dsd missover;');
call execute(cats('input var1-var',nobs,';'));
call execute('rename');
do until(eof);
set test end=eof nobs=nobs;
call execute(var);
end;
call execute('; run;');
stop;
run;
Bart
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug
"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings
SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
PROC IMPORT does not handle header lines that are longer than 32K bytes. (Even if you modified your code to tell it to read more than 32K bytes from the header line.)
Looks like all of your fields are numbers so just read them as numbers.
To get a list of names read the first line separately.
data names;
infile "&path/&fname" lrecl=1000000 obs=1 dsd ;
length old new $32 label $256 ;
input label @@;
old = cats('v',_n_);
new = translate(label,'_',':');
run;
filename rename temp;
filename label temp;
data _null_;
set names end=eof;
file rename;
if _n_=1 then put 'rename';
put old '=' new ;
if eof then put ';';
file label;
if _n_=1 then put 'label';
put old '=' label :$quote.;
if eof then put ';' ;
if eof then call symputx('nvar',_n_);
run;
Then you can just read the actual data as numbers.
data input ;
infile "&path/&fname" dsd truncover firstobs=3;
input v1-v3879;
%include rename;
%include label ;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello @hhchenfx
your code worked as it is for me on SASONDemand for Academics. Reproducing the code for completeness
%let file=/home/myuid/second_ROE_FQ12022.csv;
proc import
datafile="&file."
out=input DBMS= csv replace;
DATAROW=3;
getnames= yes;
run;
The operative portion of the log is as follows. No error as mentioned by you seen.
NOTE: The infile '/home/myuid/second_ROE_FQ12022.csv' is:
Filename=/home/myuid/second_ROE_FQ12022.csv,
Owner Name=myuid,Group Name=oda,
Access Permission=-rw-r--r--,
Last Modified=21Mar2022:10:08:38,
File Size (bytes)=166215
NOTE: 110 records were read from the infile '/home/myuid/second_ROE_FQ12022.csv'.
The minimum record length was 1298.
The maximum record length was 1592.
NOTE: The data set WORK.INPUT has 110 observations and 545 variables.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Thank you, Everyone for helping!
It is amazing to see how your code run SO FAST and correctly.
HHC
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you really have a CSV file with header line that is longer than 32K bytes and you actually need to GUESS how to define the variables (instead of just assuming everything is a number like in this case) then you might want to use a more flexible tool than PROC IMPORT.
Try this macro instead:
https://github.com/sasutils/macros/blob/master/csv2ds.sas