PROC IMPORT will guess the length of a character variable is maximum length of value of the field on the line, including and quoting.
Try using this macro instead. https://github.com/sasutils/macros/blob/master/csv2ds.sas It will calculate the maximum length using the actual values, not the quoted values. But if you want that first variable to be defined as character instead of numeric you might need need to use the override feature to set the type and length.
options parmcards=csv;
filename csv temp;
parmcards;
"id",var1,var2
"0000",12,14
"1234",45,10
"5678",12,10
;
proc import dbms=csv datafile=csv out=import replace; run;
%csv2ds(csv,out=csv2ds,replace=1);
proc compare data=import compare=csv2ds;
run;
The COMPARE Procedure
Comparison of WORK.IMPORT with WORK.CSV2DS
(Method=EXACT)
Data Set Summary
Dataset Created Modified NVar NObs
WORK.IMPORT 14JAN22:18:50:19 14JAN22:18:50:19 3 3
WORK.CSV2DS 14JAN22:18:50:19 14JAN22:18:50:19 3 3
Variables Summary
Number of Variables in Common: 3.
Number of Variables with Conflicting Types: 1.
Number of Variables with Differing Attributes: 2.
Listing of Common Variables with Conflicting Types
Variable Dataset Type Length Format Informat
id WORK.IMPORT Char 6 $6. $6.
WORK.CSV2DS Num 8 Z4.
Listing of Common Variables with Differing Attributes
Variable Dataset Type Length Format Informat
var1 WORK.IMPORT Num 8 BEST12. BEST32.
WORK.CSV2DS Num 8
var2 WORK.IMPORT Num 8 BEST12. BEST32.
WORK.CSV2DS Num 8
So here is how you might force a variable to be character using %CSV2DS() macro.
Run it once and then find the maximum length from the generated _TYPES_ dataset and use that to create an OVERRIDES dataset. Setting FORMAT to a single underscore will remove any format that the macro might guess to attach.
Example:
%csv2ds(csv,out=csv2ds,replace=1);
data override;
set _types_;
where upcase(name)='ID';
length=cats('$',maxlength);
format='_';
keep varnum length format;
run;
%csv2ds(csv,out=csv2ds,replace=1,overrides=override);
So for this little example the macro ends up running this code to create the dataset from the CSV file.
1097 +data csv2ds;
1098 + infile CSV dlm=',' dsd truncover firstobs=2 ;
1099 + length id $4 var1 8 var2 8 ;
1100 + input id -- var2 ;
1101 +run;
... View more