BookmarkSubscribeRSS Feed
Chang
Quartz | Level 8

hi,

 

I have a macro that reads all CSV files in a folder and combine them into one SAS dataset. This macro is working well, except that trucation happens to same-named variables that vary in length in different CSV files. The FORCE option has to be specified; otherwise, an error would pop up. I was wondering if there is any work-around to avoid the truncation. My script is from a post which Tom solved the problem of reading multiple CSV files using PROC IMPORT and PROC APPEND. I am not sure if I can still find that post. But the macro script is not long:

 

%macro allCSV_to_1SAS(dir= /*path of the folder that contains CSV files with same data structure*/
											,out= /*name of output file, specified by libref.dataset */
											);
	/* Make sure output ds does not exist ;*/
	proc delete data=&out; run;
	* Read list of filenames and generate PROC IMPORT and PROC APPEND for each one ;
	filename code temp ;
	data _null_ ;
		infile "dir ""&dir\*.csv"" /b" pipe truncover;
		input filename $256.;
		file code ;
		put 'proc import datafile="&dir\' filename +(-1) '" out=onefile replace;'
			/ 'guessingrows=32767;' /*change default scanning first 20 rows to determine variable length, which truncates variables*/
	    	/ 'run;'
	    	/ "proc append base=&out data=onefile FORCE; run;"
	  		;
	run;

* Run the generated code ;
%inc code / source2 ;
%mend allCSV_to_1SAS ;

My CSV files are in a folder "binary01_modelFits." Are files have the same variable number and variable names. Dependent variables form the suffices of the files. You can find truncated depVar in the output SAS dataset, just by comparing the shortest and the longest depVar (first column). The macro script is also attached.

 

 

 

2 REPLIES 2
ballardw
Super User

The appropriate solution is going to be do not use Proc Import to read multiplie files with the same structure. SAS is going to be required to GUESS the lengths of variables in each data set. If in one date set the longest value for a character variable is 10 that is the length. If the same variable in the next file is 15 then that is length guessed. Also you may get changes from numeric to character depending on values encountered.

 

Better: Import one file with proc import.

Pull the datastep code generated from the proc import and paste into the editor. Edit it to reflect maximum lengths of the variables at the informat statements. This is also time to determine if some of the variables consisting of digits should really be numeric or not such as Zip code, phone number, product numbers, client accounts and such.

 

Now you have code that by pointing to the different file and changing the output data set name will read files and get the same structure/sizes every time.

 

If interested there are also some code elements that allow you to read multiple files in one data step using the infile option EOV.

Ksharp
Super User
You can use SQL to avoid truncate .

%macro allCSV_to_1SAS(dir= /*path of the folder that contains CSV files with same data structure*/
											,out= /*name of output file, specified by libref.dataset */
											);
	/* Make sure output ds does not exist ;*/
	proc delete data=&out; run;
	* Read list of filenames and generate PROC IMPORT and PROC APPEND for each one ;
	filename code temp ;
	data _null_ ;
		infile "dir ""&dir\*.csv"" /b" pipe truncover;
		input filename $256.;
		file code ;
		put 'proc import datafile="&dir\' filename +(-1) '" out=onefile'          _n_ +(-1)       'replace;'
			/ 'guessingrows=32767;' /*change default scanning first 20 rows to determine variable length, which truncates variables*/
	    	/ 'run;'
	    
	  		;
	run;

* Run the generated code ;
%inc code / source2 ;
%mend allCSV_to_1SAS ;





proc sql;
 create table want as
 select * from  onefile1
 union all corr
 select * from  onefile2
 union all corr
 select * from  onefile3
 union all corr
...............

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 3143 views
  • 0 likes
  • 3 in conversation