I need to read an extremely large flat file (spanning multiple tapes) into my SAS program and compare the zipcodes in it with a list of zipcodes I have in a SAS file. If the zipcode on an input record appears in the SAS zipcode file, I want to keep that record for additional processing.
I was hoping to filter out the records I do not want while I am reading them, rather than reading the entire file into a SAS dataset and only then determining which records I want to keep. Something like:
infile MASTER Length=Recl TRUNCOVER;
input @001 zip $5. @;
If zip is in ZIPCODE file, then keep record;
Any thoughts on how to do this in the Data Step that reads in the large file?
Consider using PROC FORMAT to build a lookup table from your SAS zipcode file, then use the PUT function in your DATA step to test whether the zipcode on each input record is in your selection list. A SAS dataset can be used as input to PROC FORMAT through the CNTLIN= option, provided it contains the specifically named variables the procedure expects: FMTNAME, START, and LABEL at a minimum, plus TYPE for a character format and HLO if you want an OTHER (catch-all) range.
You can use the SAS support website http://support.sas.com/ to find SAS-hosted product documentation and also SAS user community technical papers on this topic.
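The PROC FORMAT approach described above might look roughly like the following. This is only a sketch under assumptions: the dataset name WORK.ZIPLIST, its character variable ZIP, the format name $ZIPKEEP, the output dataset MATCHED, and the layout of the rest of the record (the REST field) are all made up for illustration. The MASTER fileref comes from the original question.

```sas
/* Assumption: WORK.ZIPLIST holds the wanted zipcodes in a character variable ZIP. */
/* Remove duplicates first; PROC FORMAT rejects repeated START values.             */
proc sort data=ziplist(keep=zip) out=zipuniq nodupkey;
   by zip;
run;

/* Build the control dataset: one row per wanted zipcode plus an OTHER row. */
data zipfmt;
   length fmtname $8 type $1 start $5 label $8 hlo $1;
   retain fmtname 'ZIPKEEP' type 'C';   /* character format $ZIPKEEP. (hypothetical name) */
   set zipuniq end=last;
   start = zip;
   label = 'matchyes';
   hlo   = ' ';
   output;
   if last then do;                     /* catch-all row for every other value */
      start = ' ';
      hlo   = 'O';
      label = 'matchno';
      output;
   end;
   keep fmtname type start label hlo;
run;

proc format cntlin=zipfmt;
run;

/* Read the flat file; only matching records get past the subsetting IF. */
data matched;
   infile MASTER length=recl truncover;
   input @1 zip $5. @;                  /* hold the record with a trailing @ */
   if put(zip, $zipkeep.) = 'matchyes'; /* non-matching records are dropped here */
   input @6 rest $char75.;              /* hypothetical layout for the remaining fields */
run;
```

Because the format is held in memory, each record costs one in-memory lookup, and the remaining fields are only parsed for records that are kept.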
You could also consider using SET with the KEY= option, if your zipcode table is indexed. Another approach would be to read the file with a DATA step view, then do an inner join to the zipcode table.
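A sketch of the SET with KEY= idea, again under assumptions: WORK.ZIPLIST is a hypothetical dataset whose character variable ZIP has length 5 and unique values, and REST stands in for whatever the real record layout contains.

```sas
/* Assumption: WORK.ZIPLIST has a character variable ZIP ($5) with unique values. */
proc datasets library=work nolist;
   modify ziplist;
   index create zip / unique;
quit;

data matched;
   infile MASTER length=recl truncover;
   input @1 zip $5. @;                        /* hold the record with a trailing @ */
   set ziplist(keep=zip) key=zip / unique;    /* indexed lookup on the zipcode just read */
   if _iorc_ ne 0 then do;                    /* nonzero _IORC_ means no match was found */
      _error_ = 0;                            /* a failed lookup sets _ERROR_; reset it */
      delete;
   end;
   input @6 rest $char75.;                    /* hypothetical layout for the remaining fields */
run;
```

This performs one index lookup per input record, so on a very large file it may well be slower than a format held in memory; it is probably worth timing both on a sample before committing to either.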
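You may want to create a format from your small SAS file. Each zipcode in the file gets the label 'matchyes', and the OTHER range (everything not in the file) gets the label 'matchno'. Then test the formatted value when running against the large file, for example "if put(zipcode, $yourfmt.) = 'matchyes'". Note that a WHERE statement only applies to SAS datasets, so when reading the raw file with INFILE and INPUT you need a subsetting IF instead.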