<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625441#M184338</link>
    <description>"Not working" meaning I can't open the file in Notepad, Notepad++ or WordPad, as the error message states the file is "too big" or "failed to open". I am not sure what you mean by a more capable editor. The URL is &lt;A href="https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Tools" target="_blank"&gt;https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Tools&lt;/A&gt; and it's the 2018 Birth data file.</description>
    <pubDate>Mon, 17 Feb 2020 20:56:11 GMT</pubDate>
    <dc:creator>Flexluthorella</dc:creator>
    <dc:date>2020-02-17T20:56:11Z</dc:date>
    <item>
      <title>Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Download</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625431#M184330</link>
      <description>&lt;P&gt;I've downloaded the 2018 Birth data files (US data files only), which are supposedly 223 MB. When the download completed on my PC, it was over 5 GB. Notepad can't read it, so I can't view the dataset/variables. I attempted to PROC IMPORT it into SAS, but that is not working.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 17 Feb 2020 20:26:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625431#M184330</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-17T20:26:38Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625436#M184334</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/297482"&gt;@Flexluthorella&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I've downloaded the 2018 Birth data files (US data files only), which are supposedly 223 MB. When the download completed on my PC, it was over 5 GB. Notepad can't read it, so I can't view the dataset/variables. I attempted to PROC IMPORT it into SAS, but that is not working.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;When you say "not working", you give us virtually no information on which to base advice.&amp;nbsp; Now, NOTEPAD finds the downloaded file too big.&amp;nbsp; How about WORDPAD (use it to view, but not save), or you could download one of many other editors, like Notepad++.&amp;nbsp; Both likely have larger size limits.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;And if you downloaded something sized 223 MB and got a 5 GB file, it was more than a simple download.&amp;nbsp; Try using a more capable editor to view the download.&amp;nbsp; BTW, what is the URL of the downloaded file?&amp;nbsp; Maybe someone on this forum can take a quick look.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Feb 2020 20:49:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625436#M184334</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-02-17T20:49:10Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625437#M184335</link>
      <description>&lt;P&gt;Have you tried the tools provided on this &lt;A href="http://data.nber.org/data/vital-statistics-natality-data.html" target="_self"&gt;NBER site&lt;/A&gt;?&lt;/P&gt;</description>
      <pubDate>Mon, 17 Feb 2020 20:50:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625437#M184335</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2020-02-17T20:50:47Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625441#M184338</link>
      <description>"Not working" meaning I can't open the file in Notepad, Notepad++ or WordPad, as the error message states the file is "too big" or "failed to open". I am not sure what you mean by a more capable editor. The URL is &lt;A href="https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Tools" target="_blank"&gt;https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Tools&lt;/A&gt; and it's the 2018 Birth data file.</description>
      <pubDate>Mon, 17 Feb 2020 20:56:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625441#M184338</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-17T20:56:11Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625442#M184339</link>
      <description>I do not know how to use the tools they provide. I did not think I could just start downloading tools to use with no idea how to use them.</description>
      <pubDate>Mon, 17 Feb 2020 20:57:10 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625442#M184339</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-17T20:57:10Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625445#M184341</link>
      <description>&lt;P&gt;By more capable editor, I meant more capable than notepad, thinking either wordpad or notepad++ would do the job.&amp;nbsp; But I see you have tried that.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;BUT I also see you have unzipped the downloaded file, so you can do this to make a sample file to visually inspect:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename filein "C:\Users\…..\Downloads\Nat2018us\Nat2018PublicUS.c20190509.r20190717.txt";

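data _null_;
  infile filein;                  /* the full downloaded text file */
  file 'c:\temp\sampledata.txt';  /* small sample file to inspect */
  input;                          /* read one raw record into _infile_ */
  put _infile_;                   /* write it unchanged to the sample file */
  if _n_&amp;gt;=10 then stop;     /* stop after 10 records */
run;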
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then take a look at sampledata.txt.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Feb 2020 21:13:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625445#M184341</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-02-17T21:13:31Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625448#M184343</link>
      <description>This gives me the first 10 lines of data. The sampledata.txt did not give variable names. I can't tell what I need to do from here. I can see a small fraction of the data.</description>
      <pubDate>Mon, 17 Feb 2020 21:23:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625448#M184343</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-17T21:23:15Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625451#M184345</link>
      <description>&lt;P&gt;If you go back to the URL you provided, you will see a column to the left of your downloaded data.&amp;nbsp; The column is titled "User's Guide (.pdf files)".&amp;nbsp; Clicking on the "&lt;EM&gt;&lt;STRONG&gt;2018 (1.7MB)&lt;/STRONG&gt;&lt;/EM&gt;" link in this self-descriptive column will provide a guide to the layout of the data in a pdf file.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This is a common practice with lots of demographic data files - one file with just data, and another file/codebook/user guide with the data layout description.&amp;nbsp; Welcome to the demographic data world.&lt;/P&gt;</description>
      <pubDate>Mon, 17 Feb 2020 21:36:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625451#M184345</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-02-17T21:36:04Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625461#M184352</link>
      <description>Right. Going back to my original issue, how do I get to read in ALL the data from the OG (large) file?</description>
      <pubDate>Mon, 17 Feb 2020 21:57:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625461#M184352</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-17T21:57:05Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625483#M184363</link>
      <description>&lt;P&gt;The .pdf User Guide provides the data dictionary/data layout. Why isn't that sufficient for you to write the SAS data step to read the data in the .txt file into a SAS data set?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Capture.JPG" style="width: 600px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/36201iAF7840B9CC85EF63/image-size/large?v=v2&amp;amp;px=999" role="button" title="Capture.JPG" alt="Capture.JPG" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;There are text editors available which can also open .txt files of multiple GB. Just Google for them.&lt;/P&gt;
&lt;P&gt;I've used UltraEdit (which doesn't come for free) to open the text file.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To get you started I've copied the first 30 lines into the attached sample_2018.txt file.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 01:31:01 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625483#M184363</guid>
      <dc:creator>Patrick</dc:creator>
      <dc:date>2020-02-18T01:31:01Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625486#M184366</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/297482"&gt;@Flexluthorella&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Right. Going back to my original issue, how do I get to read in ALL the data from the OG (large) file?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;First thing is to leave the file zipped.&amp;nbsp; No need to unzip it, as SAS can unzip it on the fly.&lt;/P&gt;
&lt;P&gt;Second, look at the description of the file and use that to write the code to read it.&lt;/P&gt;
&lt;P&gt;So you will have something like this using column oriented reads.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  infile 'where I put the file.zip' zip truncover member='*' ;
  input var 1-10  var2 $11-12 .... ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Or perhaps you will want to use formatted mode instead.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  infile 'where I put the file.zip' zip truncover member='*' ;
  input var 10.  var2 $2. .... ;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Or some mixture of the two.&lt;/P&gt;
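&lt;P&gt;As a rough sketch of such a mixture (the 12-column layout, the variable names, and the in-line sample records below are made up for illustration and are not taken from the birth file):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Hypothetical layout, for illustration only:
   cols 1-4 year, 5-6 month, 7-10 time (HHMM), 11-12 a two-digit category code */
data mixed_demo;
  infile datalines truncover;
  input year   1-4       /* column input, numeric */
        month  5-6       /* column input, numeric */
        @7  hhmm 4.      /* formatted input, numeric */
        @11 code $2. ;   /* formatted input, kept as character because it is categorical */
datalines;
201801093007
201812235901
;
run;

proc print data=mixed_demo;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;For the real file you would swap DATALINES for the INFILE statement shown above and take the column positions from the file description.&lt;/P&gt;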
&lt;P&gt;Remember to look at the data description to understand what data is in which columns and whether each field holds numbers or strings. Some variables that are coded only as digits you might want to read as strings, since they are really categorical values and not numbers you could use in operations like MEAN().&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 01:45:33 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625486#M184366</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2020-02-18T01:45:33Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625487#M184367</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/297482"&gt;@Flexluthorella&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Right. Going back to my original issue, how do I get to read in ALL the data from the OG (large) file?&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Use the layout in the pdf file to set up the necessary INPUT statement to read the data into a SAS data set.&amp;nbsp; You don't need to see the entire raw data set in any editor to do that.&amp;nbsp; And you could first do a test of your program using the 10-record (or some other small) subset of the original raw data.&lt;/P&gt;
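&lt;P&gt;A minimal sketch of such a test run, assuming the small sample was saved as c:\temp\sampledata.txt and assuming, purely for illustration, that the birth year sits in columns 9-12 and the birth month in columns 13-14 (confirm both against the pdf layout before relying on them):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  /* path is an assumption: wherever the small sample file was saved */
  infile 'c:\temp\sampledata.txt' truncover lrecl=2000;
  /* column positions are assumptions; take the real ones from the pdf layout */
  input dob_yy 9-12
        dob_mm 13-14;
  list;                  /* write the raw record and a column ruler to the log */
  put dob_yy= dob_mm=;   /* write the values that were read, for comparison */
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The LIST output in the log makes it easier to check by eye that each variable was read from the columns the layout says it should come from.&lt;/P&gt;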
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The full reference to the INPUT statement, including examples, is at&amp;nbsp;&lt;A href="https://documentation.sas.com/?docsetId=lestmtsref&amp;amp;docsetTarget=n0oaql83drile0n141pdacojq97s.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_self"&gt;Input Statement&lt;/A&gt;. There's another possibly useful SAS link at&amp;nbsp;&lt;A href="https://documentation.sas.com/?docsetId=lrcon&amp;amp;docsetTarget=n1w749t788cgi2n1txpuccsuqtro.htm&amp;amp;docsetVersion=9.4&amp;amp;locale=en" target="_self"&gt;Reading Raw Data with the SAS Input Statement&lt;/A&gt;.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you haven't used the INPUT statement before, this will be a (worthwhile) experience.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Good luck, and bring back your questions once you start trying to use it.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 02:00:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625487#M184367</guid>
      <dc:creator>mkeintz</dc:creator>
      <dc:date>2020-02-18T02:00:08Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625516#M184375</link>
      <description>&lt;P&gt;The page where you get the birth data files points users to a page on the National Bureau of Economic Research (NBER) website. If you go to &lt;A title="extremely helpful page on NBER website" href="http://data.nber.org/data/vital-statistics-natality-data.html" target="_blank" rel="noopener"&gt;that page on the NBER website&lt;/A&gt; and scroll down, you'll see a table for the United States birth data and documentation.&amp;nbsp; Jean Roth at NBER has posted SAS, Stata, and SPSS code to read the ASCII file.&amp;nbsp; She has also posted the file as a Stata file, a SAS data set, and CSV.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The bad news is that she has not done that for the 2018 file.&amp;nbsp; However, I think the record layout for the 2017 file is the same as the record layout for the 2018 file.&amp;nbsp; So,&amp;nbsp;&lt;A title="2017 SAS program to read the file posted on NBER website" href="http://data.nber.org/natality/2017/natl2017.sas" target="_blank" rel="noopener"&gt;http://data.nber.org/natality/2017/natl2017.sas&lt;/A&gt;&amp;nbsp;should help you get started reading the ASCII file into SAS.&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I was able to modify the 2017 program to read the zipped version of the 2018 file.&amp;nbsp; One odd note:&amp;nbsp; while the PDF for the 2018 file shows the record length as 1330, the record length is really 1345.&amp;nbsp; The NBER program for 2017 shows 15 variables in columns 1330 to 1345, but those columns are all missing in the 2018 file.&lt;/P&gt;</description>
      <pubDate>Tue, 18 Feb 2020 05:27:18 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625516#M184375</guid>
      <dc:creator>SuzanneDorinski</dc:creator>
      <dc:date>2020-02-18T05:27:18Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625666#M184429</link>
      <description>thank you so much!</description>
      <pubDate>Tue, 18 Feb 2020 19:06:04 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625666#M184429</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-18T19:06:04Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625717#M184459</link>
      <description>I am still having issues; I get 0 records read in. Can you share with me?</description>
      <pubDate>Tue, 18 Feb 2020 22:34:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625717#M184459</guid>
      <dc:creator>Flexluthorella</dc:creator>
      <dc:date>2020-02-18T22:34:11Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in 5GB birth dataset from https://www.cdc.gov/nchs/data_access/Vitalstatsonline.htm#Down</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625767#M184487</link>
      <description>&lt;P&gt;I'll share part of the program in this post, and attach the whole thing.&amp;nbsp; I borrowed some code from the SAS Dummy blog, to figure out the name of the file within the zipped file.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I ran that, and then copied the really long file name into the INFILE statement in the data step.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;* borrowing code from https://blogs.sas.com/content/sasdummy/2014/01/29/using-filename-zip/ ;

%let ziploc = /folders/myfolders/NCHS Vital Statistics/Nat2018us.zip;

/* Assign a fileref with the ZIP method */
filename inzip zip "&amp;amp;ziploc";
 
/* Read the "members" (files) from the ZIP file */
data contents(keep=memname);
 length memname $200;
 fid=dopen("inzip");
 if fid=0 then
  stop;
 memcount=dnum(fid);
 do i=1 to memcount;
  memname=dread(fid,i);
  output;
 end;
 rc=dclose(fid);
run;
 
/* create a report of the ZIP contents */
title "Files in the ZIP file";

proc print data=contents noobs N;
run;

title;

*options obs=100 ;
options obs=max;
*options nocenter ;

**------------------------------------------------ ;
**  by Jean Roth	Thu Oct 12 11:09:27 EDT 2017
**  This program reads the 2017 NCHS Natality Detail Data File  ;
**  Report errors to jroth@nber.org ;
**  This program is distributed under the GNU GPL. ;
**  See end of this file and 
**  http://www.gnu.org/licenses/ for details.      ;
** ----------------------------------------------- ;

*  The following line should contain the directory
   where the SAS file is to be stored  ;

*libname library "/folders/myfolders/NCHS Vital Statistics/";

*  The following line should contain
   the complete path and name of the raw data file.
   On a PC, use backslashes in paths as in C:\  ;

*FILENAME datafile pipe "7z e /homes/data/natality/2017/natl2017.zip  -so ";

*  The following line should contain the name of the SAS dataset ;

%let dataset = natl2018;

DATA &amp;amp;dataset ;

INFILE inzip(Nat2018PublicUS.c20190509.r20190717.txt) zip truncover LRECL = 20000 ;
attrib  dob_yy       length=4     label="Birth Year";        
attrib  dob_mm       length=3     label="Birth Month 01 January";               
attrib  dob_tt       length=4     label="Time of Birth 0000-2359 Time of Birth";&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;When I tested the code, the options obs=100 statement was uncommented.&amp;nbsp; Once I was sure that the code was correct, I commented out that line, then typed in the options obs=max statement.&amp;nbsp;&lt;/P&gt;
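&lt;P&gt;In outline, that test-then-full-run switch looks something like the sketch below. The data step here is only a stand-in that reads sashelp.class (a sample table that ships with SAS); in practice it would be the NBER data step.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options obs=100;       /* test run: every step reads at most 100 observations */

data test;
  set sashelp.class;   /* stand-in input for illustration */
run;

options obs=max;       /* restore the default before the full run */&lt;/CODE&gt;&lt;/PRE&gt;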
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Since the data set is 2 GB, I decided that I didn't want to have a permanent version, so I commented out the LIBNAME statement.&amp;nbsp; I commented out the FILENAME statement provided by NBER, because I don't&amp;nbsp; think it will work on my home computer.&amp;nbsp; &amp;nbsp;I'm using the FILENAME statement from the SAS blog instead.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I didn't have to change anything else in the NBER program after the INFILE statement in the data step.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 19 Feb 2020 02:50:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Reading-in-5GB-birth-dataset-from-https-www-cdc-gov-nchs-data/m-p/625767#M184487</guid>
      <dc:creator>SuzanneDorinski</dc:creator>
      <dc:date>2020-02-19T02:50:09Z</dc:date>
    </item>
  </channel>
</rss>

