SuzanneDorinski
Lapis Lazuli | Level 10

I'll share part of the program in this post and attach the whole thing. I borrowed some code from the SAS Dummy blog to list the names of the files inside the ZIP archive.

 

I ran that, and then copied the really long file name into the INFILE statement in the data step.  

 

* borrowing code from https://blogs.sas.com/content/sasdummy/2014/01/29/using-filename-zip/ ;

%let ziploc = /folders/myfolders/NCHS Vital Statistics/Nat2018us.zip;

/* Assign a fileref with the ZIP method */
filename inzip zip "&ziploc";
 
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname);
 length memname $200;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  memname=dread(fid,i);
  output;
 end;
 rc=dclose(fid);
run;
 
/* create a report of the ZIP contents */
title "Files in the ZIP file";

proc print data=contents noobs N;
run;

title;

*options obs=100 ;
options obs=max;
*options nocenter ;

**------------------------------------------------ ;
**  by Jean Roth	Thu Oct 12 11:09:27 EDT 2017
**  This program reads the 2017 NCHS Natality Detail Data File  ;
**  Report errors to jroth@nber.org ;
**  This program is distributed under the GNU GPL. ;
**  See end of this file and 
**  http://www.gnu.org/licenses/ for details.      ;
** ----------------------------------------------- ;

*  The following line should contain the directory
   where the SAS file is to be stored  ;

*libname library "/folders/myfolders/NCHS Vital Statistics/";

*  The following line should contain
   the complete path and name of the raw data file.
   On a PC, use backslashes in paths as in C:\  ;

*FILENAME datafile pipe "7z e /homes/data/natality/2017/natl2017.zip  -so ";

*  The following line should contain the name of the SAS dataset ;

%let dataset = natl2018;

DATA &dataset ;

INFILE inzip(Nat2018PublicUS.c20190509.r20190717.txt) zip truncover LRECL = 20000 ;
attrib  dob_yy       length=4     label="Birth Year";        
attrib  dob_mm       length=3     label="Birth Month 01 January";               
attrib  dob_tt       length=4     label="Time of Birth 0000-2359 Time of Birth";

When I tested the code, the options obs=100 statement was uncommented.  Once I was sure that the code was correct, I commented out that line, then typed in the options obs=max statement. 
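As an alternative way to do a test run, the OBS= option can go on the INFILE statement instead of the global OPTIONS statement. The global option limits every later step until it is reset, while the INFILE option limits only the one data step. This is a minimal sketch, not part of the NBER program: it assumes the inzip fileref is still assigned, and the names peek and first80 are made up for illustration.

```sas
/* Sketch only: assumes the inzip fileref from the FILENAME ZIP statement is still assigned. */
/* The names peek and first80 are hypothetical, not part of the NBER program.                */
data peek;
  /* OBS=100 on INFILE stops reading after the 100th record of this file only */
  infile inzip(Nat2018PublicUS.c20190509.r20190717.txt) truncover lrecl=20000 obs=100;
  /* keep just the first 80 characters of each raw record */
  input first80 $char80.;
run;

/* look at a few of the records that were read */
proc print data=peek(obs=5) noobs;
run;
```

With this approach there is no global option to remember to set back to obs=max afterwards.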

 

Since the data set is 2 GB, I decided that I didn't want a permanent version, so I commented out the LIBNAME statement. I also commented out the FILENAME statement provided by NBER, because it pipes the file through 7z and I don't think that will work on my home computer. I'm using the FILENAME statement from the SAS blog instead.

 

I didn't have to change anything else in the NBER program after the INFILE statement in the data step.  
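Copying the long member name by hand could probably be avoided by storing it in a macro variable. The following is a sketch under some assumptions: the contents data set from the listing step exists, the ZIP file holds a .txt data file, and the macro variable name member is my own choice. If the archive has more than one .txt member, the macro variable ends up holding only one of them, so check the log.

```sas
/* Sketch only: assumes the contents data set and the inzip fileref created earlier. */
/* The macro variable name member is hypothetical.                                   */
proc sql noprint;
  /* pick up the member whose extension is txt and strip blanks from the value */
  select memname into :member trimmed
    from contents
    where lowcase(scan(memname, -1, '.')) = 'txt';
quit;

/* write the resolved name to the log so it can be checked */
%put NOTE: Reading ZIP member &member;

data _null_;
  /* same INFILE form as in the program above, with the name supplied by the macro variable */
  infile inzip(&member) truncover lrecl=20000 obs=5;
  input;
  /* echo the first few raw records to the log */
  put _infile_;
run;
```

If the log shows the expected file name, the same infile inzip(&member) reference could replace the hard-coded name in the main data step.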
