jordkeen
Calcite | Level 5

I am new to SAS (Stata and R user), but I have a very large dataset with SAS input statements provided so I want to use it at least to prepare and clean the data. I'm using SAS 9.4. 

 

I have a group of zipped text files that are of identical format but split into different files and zipped separately. I'm looking for an efficient way to read them all in together just stacked on top of one another so I can apply a set of input statements provided for them. 

 

Ex. of file naming structure

dme2010.file01.txt.gz
dme2010.file02.txt.gz
dme2010.file03.txt.gz
dme2011.file01.txt.gz
dme2011.file02.txt.gz
dme2011.file03.txt.gz

 

Each zip file contains just a single text file of the same name (i.e., dme2010.file01.txt.gz contains just dme2010.file01.txt). I am not able to extract the text files because they are read-only and this can't be changed.

 

I have tried using wildcards like below, but this just runs the input statements and doesn't grab any of the observations:

filename dmein pipe 'gzip -dc H:\Data\Seerm\Drive1\dme*.txt.gz';
filename dmeinput 'input lines';

data dme;
  infile dmein lrecl=635 missover pad;
  %include dmeinput;
run;

 

I have about 250 different text files to read in over different years over file numbers, but the data provider has provided SAS input statements that apply to all of them. 

 

What is the best way to read in all of these text files so I can apply the input statements code to them all at once?

 

Thanks in advance,

Jordan

4 REPLIES
Reeza
Super User

You're going to need a macro so welcome to the deep end quickly 🙂

 

Once you get it working for one file, you can wrap that in a macro, create a list of the files, and call the macro once per file using CALL EXECUTE. I'm going to point you to the different tools you may need and an example. If you need further help, post back with what your code looks like so far and any issues you're having.

 

You can use the FILENAME ZIP methods to read the file:

https://support.sas.com/documentation/cdl/en/lestmtsref/69738/HTML/default/viewer.htm#n1dn0f61yfyzto...

 

Regarding how to get list of files:

https://communities.sas.com/t5/SAS-Communities-Library/SAS-9-4-Macro-Language-Reference-Has-a-New-Ap...
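If you would rather see the idea than the macro, here is a minimal DATA step sketch that lists the .gz files in a folder with DOPEN, DNUM and DREAD. The folder is the one from the original post; the data set name gz_files and the variables fname and fullpath are made-up names for illustration, not something the linked macro produces.

```sas
/* Sketch: build a data set with one row per .gz file in a folder.   */
/* gz_files, fname and fullpath are hypothetical names.              */
data gz_files;
    length fname $200 fullpath $260;
    rc  = filename('dir', 'H:\Data\Seerm\Drive1');   /* point fileref DIR at the folder */
    did = dopen('dir');                               /* open the directory; 0 means failure */
    if did > 0 then do;
        do i = 1 to dnum(did);                        /* loop over the directory members */
            fname = dread(did, i);                    /* name of the i-th member */
            /* keep only members whose last extension is gz */
            if lowcase(scan(fname, -1, '.')) = 'gz' then do;
                fullpath = catx('\', 'H:\Data\Seerm\Drive1', fname);
                output;
            end;
        end;
        rc = dclose(did);                             /* close the directory */
    end;
    keep fname fullpath;
run;
```

A quick PROC PRINT of gz_files is a cheap check that all the expected files were found before anything gets read.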

 

You can combine some of these together to get your full solution.

 

An example of how that works when it's all put together is here:

https://github.com/statgeek/SAS-Tutorials/blob/master/Import_all_files_one_type

ChrisHemedinger
Community Manager

FILENAME ZIP doesn't work with GZ files...yet.  That's coming in SAS 9.4 Maint 5, which is due to be released Very Soon.  In the meantime, for gzip files you'll still need to rely on external tools via FILENAME PIPE.
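For reference, a rough sketch of the FILENAME PIPE route for a single file. It assumes gzip is on the system path and that the SAS session is allowed to run operating system commands (the XCMD option); the data set name dme_one is made up, and 'input lines' is still the placeholder from the original post, not a real path.

```sas
/* Sketch: read ONE gzipped file through a pipe.                      */
/* gzip -dc writes the decompressed text to standard output, and SAS */
/* reads that stream as if it were the text file itself.             */
filename dmein pipe 'gzip -dc "H:\Data\Seerm\Drive1\dme2010.file01.txt.gz"';

/* Placeholder from the original post: point this at the file that   */
/* holds the provider's INPUT statements.                             */
filename dmeinput 'input lines';

data dme_one;                       /* hypothetical output name */
    infile dmein lrecl=635 missover pad;
    %include dmeinput;              /* provider's INPUT statements */
run;

filename dmein clear;
```

Note that this names one file explicitly rather than using a wildcard. Once this works for one file, the same steps can be wrapped in a macro and repeated per file.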

jordkeen
Calcite | Level 5

Thanks for the suggestions. I have been a little sidetracked with other work but it was helpful to get me started. 

 

I was able to implement the macro for the list of files very easily. I am stuck however with trying to implement a macro to append all of the files into one. 

 

Here is what I have so far:

%macro append_files(path);
	filename in pipe 'gzip -cd &path'; /* don't know how to reference this exactly */
	options nocenter validvarname=upcase;

	data temp;
		infile in lrecl=635 missover pad;
		%include input;
	run;

	data full;
		set full temp;
	run;

%mend;
	
%list_files(H:\Data\Seerm\Drive1, gz);

%append_files([go through all of the paths in the file list]);

My questions are: 

1) is it even possible to have a macro like this?

2) how do I reference the path in the filename statement?

3) how do I have the macro loop through all of the paths in the list created by %list_files?

 

Thanks!

Reeza
Super User

1) is it even possible to have a macro like this?

Yes

 


2) how do I reference the path in the filename statement?

 


You're currently using single quotes, but macro variables only resolve inside double quotes. Switch to double quotes.
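A small sketch of the difference, using one of the file names from the original post as an example value for path:

```sas
/* Example value only; inside the macro, path is the macro parameter. */
%let path = H:\Data\Seerm\Drive1\dme2010.file01.txt.gz;

/* Single quotes: the macro processor leaves the text alone, so the  */
/* command handed to the operating system is literally gzip -cd &path */
filename in pipe 'gzip -cd &path';

/* Double quotes: &path is replaced by its value before the          */
/* statement runs, so gzip receives the real file name.              */
filename in pipe "gzip -cd &path";

/* Write the resolved value to the log as a check. */
%put Resolved command: gzip -cd &path;
```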

 


3) how do I have the macro loop through all of the paths in the list created by %list_files?

 


You don't. Look at CALL EXECUTE: run from a DATA step that reads the data set of file paths, it calls the macro once for every row. The documentation has an example, and I think my sample code does too.
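As a sketch only, assuming the file list ends up in a data set called gz_files with the full path in a character variable called fullpath (both names are assumptions, so check what %list_files actually creates):

```sas
/* Sketch: call %append_files once per row of the file list.         */
/* gz_files and fullpath are assumed names.                          */
data _null_;
    set gz_files;
    /* Builds text such as %nrstr(%append_files)(H:\...\file.txt.gz). */
    /* %nrstr delays the macro call so the generated steps run after */
    /* this DATA step finishes, not while it is still executing.     */
    call execute(cats('%nrstr(%append_files)(', fullpath, ')'));
run;
```

This simple form assumes the paths contain no commas or unbalanced parentheses, since those would be read as part of the macro call syntax.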

 

Things you may want to consider:

 

  • Instead of using a data step to append, consider PROC APPEND. If your data set is not well structured and you're not defining the types/lengths explicitly you'll run into issues with mismatching lengths and types.
  • Drop the TEMP dataset at the end so if there's some error with reading the file you don't keep appending the old data.
  • Before building a macro, first get working SAS code for a single file (perhaps you already have it), then convert that code to a macro.
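Putting those points together, one possible shape for the macro is sketched below. It keeps the names from the code posted above (filerefs in and input, data sets temp and full) and assumes the input fileref has already been assigned to the provider's INPUT statements.

```sas
/* Sketch: read one gzipped file and append it to FULL.              */
%macro append_files(path);
    /* Double quotes so &path resolves before the command is run. */
    filename in pipe "gzip -cd &path";

    data temp;
        infile in lrecl=635 missover pad;
        %include input;             /* provider's INPUT statements */
    run;

    /* PROC APPEND creates FULL on the first call and adds rows      */
    /* afterwards without rereading the rows already in FULL.        */
    proc append base=full data=temp;
    run;

    /* Drop TEMP so a failed read on the next file cannot append     */
    /* this file's rows a second time.                               */
    proc datasets library=work nolist;
        delete temp;
    quit;

    filename in clear;
%mend append_files;
```

If FULL is left over from an earlier run, delete it before starting the loop, otherwise the new rows are appended to the old ones.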

 

 
