05-30-2013 02:36 PM
I have a folder full of text files, each with names that don't follow any kind of real convention, and each with a different amount of variables. However, each has one variable in common that they will be merged by. I'm trying to code something that will, given a certain folder or directory, will import however many text files happens to be in the folder, import each text file (with different numbers of variables) into SAS, and then merge on the common variable. So basically, the tricky part of this has been coding something flexible enough to handle an undetermined number of text files, each with an undetermined number of variables.
Has anyone coded something like this before, or have any suggestions?
05-30-2013 03:18 PM
You can easily get the portion about importing all text files in, SAS Support notes has several options, I've posted three below. If Proc import won't import the files appropriately, for example coding numbers as characters, then you'll also have issues.
If you use a specific library or a have a naming convention then you could query the names of the data sets from the dictionary table and easily put them into a merge. This would assume that the variable names are unique between datasets, excluding the merge variable, because if you have the same names between data sets then you'll need a workaround for those.
07-16-2013 01:39 PM
Hi Reeza, thanks for the suggestions! Sorry I've been inattentive for awhile. I had been distracted with something non-SAS related at work and completely forgot about this post I created. The first link looks like the best option for my particular issue. However, even though it's listed as being able to convert multiple .txt files to SAS files in the title, the description doesn't mention .txt files at all, and each of the programs on the ftp site look to be concerned with excel files only. Is there something I'm missing? I have several tab-delimited .txt files, each with a different number of variables. Then each of the files gets merged together on one common ID variable.
Thanks again for your help, and any further guidance you could provide.
07-16-2013 01:46 PM
Read the programs, it takes different parameters, the default set is xls, but you can change that.
**** >>>>>>>>>>>>>>>>>> Change the value here <<<<<<<<<<<<<<<<< *;
* change the value to what you want to read, tab, csv, dbf or xls *;
* make sure your extensions in the directory are LOWER case , that is *;
* xls , not XLS *;
The problem with these programs are they're generic so need to handle many different situations and yours is more simplistic.
You can try and simplify it by removing all the excel related code.
Or you could use the simplified program here and modify it for TAB/.TXT instead:
07-16-2013 04:12 PM
Ah, thanks Reeza. I gave it a go with the program you linked. Is there any way to get it to work with text files that have spaces in the name? Here's an example: "FFIEC CDR Call Bulk POR 12312011.txt". My goal was to download a directory full of files, convert and merge them without any other interventions necessary. But if not, I can modify the titles.Thanks again!
07-16-2013 04:15 PM
The spaces won't matter for the input file, but the output file, the sas dataset, does need a name that follows the SAS convention, up to 32 characters long, no spaces, can't start with a number.
07-16-2013 05:18 PM
I'm using the one you suggested in your second post. I made all the changes that it requested, and it seems to run fine until the macro. The import statement fails because the file it tries to import "POR12312011.txt" (the new name the file was given) does not exist.
07-16-2013 05:50 PM
Post the code as modified, My guess is the scan function at the top, where it gets the dataset name. It's using " " as the delimiter and you probably want a "\" or something instead.
07-17-2013 10:07 AM
Reeza, I'm having trouble pasting into the editor here. The only changes I made to the program were the two directory references at the top (keeping the backslash on one) and setiing "db='tab'". If you're at all interested, the dataset I'm trying to import can be found at https://cdr.ffiec.gov/public/PWS/DownloadBulkData.aspx. You would then select "Single Period" and "12/31/2011". It downloads a zip file that then must be exported.