SuzanneDorinski
Lapis Lazuli | Level 10

I'll share part of the program in this post and attach the whole thing. I borrowed some code from the SAS Dummy blog to find the name of the file inside the ZIP archive.

I ran that, then copied the really long file name into the INFILE statement in the DATA step.

* borrowing code from https://blogs.sas.com/content/sasdummy/2014/01/29/using-filename-zip/ ;

%let ziploc = /folders/myfolders/NCHS Vital Statistics/Nat2018us.zip;

/* Assign a fileref with the ZIP method */
filename inzip zip "&ziploc";
 
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname);
 length memname $200;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  memname=dread(fid,i);
  output;
 end;
 rc=dclose(fid);
run;
 
/* create a report of the ZIP contents */
title "Files in the ZIP file";

proc print data=contents noobs N;
run;

title;

*options obs=100 ;
options obs=max;
*options nocenter ;

**------------------------------------------------ ;
**  by Jean Roth	Thu Oct 12 11:09:27 EDT 2017
**  This program reads the 2017 NCHS Natality Detail Data File  ;
**  Report errors to jroth@nber.org ;
**  This program is distributed under the GNU GPL. ;
**  See end of this file and 
**  http://www.gnu.org/licenses/ for details.      ;
** ----------------------------------------------- ;

*  The following line should contain the directory
   where the SAS file is to be stored  ;

*libname library "/folders/myfolders/NCHS Vital Statistics/";

*  The following line should contain
   the complete path and name of the raw data file.
   On a PC, use backslashes in paths as in C:\  ;

*FILENAME datafile pipe "7z e /homes/data/natality/2017/natl2017.zip  -so ";

*  The following line should contain the name of the SAS dataset ;

%let dataset = natl2018;

DATA &dataset ;

INFILE inzip(Nat2018PublicUS.c20190509.r20190717.txt) zip truncover LRECL = 20000 ;
attrib  dob_yy       length=4     label="Birth Year";        
attrib  dob_mm       length=3     label="Birth Month 01 January";               
attrib  dob_tt       length=4     label="Time of Birth 0000-2359 Time of Birth";

When I tested the code, the options obs=100 statement was uncommented, so each step processed only the first 100 observations. Once I was sure the code was correct, I commented out that line and added the options obs=max statement.
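
If you would rather not toggle a global option, one alternative is the OBS= option on the INFILE statement, which limits only that one DATA step. Here is a minimal sketch of the idea. The demo fileref and the sample data set are hypothetical stand-ins I made up for illustration; they are not part of the NBER program.

```sas
/* Hypothetical stand-in for the large raw file: 1,000 short records */
filename demo temp;

data _null_;
  file demo;
  do i = 1 to 1000;
    put i z4.;          /* writes 0001, 0002, ... one per line */
  end;
run;

/* OBS=100 on INFILE stops reading after record 100, in this step only */
data sample;
  infile demo truncover obs=100;
  input id 1-4;
run;

filename demo clear;
```

In the real program the same option would go on the existing INFILE statement, and removing it restores the full read without touching the system option.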

 

Since the data set is 2 GB, I decided I didn't want a permanent version, so I commented out the LIBNAME statement. I also commented out the FILENAME statement provided by NBER, which pipes the file through the 7z command, because I don't think it will work on my home computer. I'm using the FILENAME statement from the SAS blog instead.

 

I didn't have to change anything else in the NBER program after the INFILE statement in the data step.  
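
Copying the long member name by hand worked, but it could probably be automated. The sketch below assumes the inzip fileref and the CONTENTS data set from the program above already exist, and it assumes the data file is the only member whose name ends in .txt. The macro variable name member and the data set name peek are my own hypothetical choices.

```sas
/* Pick the .txt member from the CONTENTS listing created earlier.   */
/* Assumption: exactly one member of the ZIP file has a .txt suffix. */
proc sql noprint;
  select memname into :member trimmed
    from contents
    where lowcase(scan(memname, -1, '.')) = 'txt';
quit;

%put NOTE: ZIP member to read is &member;

/* Quick check: read the first 80 characters of the first 5 records */
data peek;
  infile inzip(&member) zip truncover lrecl=20000 obs=5;
  input first80 $char80.;
run;
```

If that looks right, the INFILE statement in the NBER program could use inzip(&member) in place of the pasted file name. If no member matches, the macro variable is never created, so it is worth checking the log before relying on it.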
