<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic import a large csv in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/import-a-large-csv/m-p/808164#M318659</link>
    <description>&lt;P&gt;I have a csv file of 232,190 rows and 9 columns. When I import data into sas, sas only extracts the first 1,011 rows and 9 columns. How can I import all the rows available without losing any data from the csv file?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the log result:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;237 data WORK.announcement ;&lt;BR /&gt;238 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */&lt;BR /&gt;239 infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv'&lt;BR /&gt;239! delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;&lt;BR /&gt;240 informat CompanyName $38. ;&lt;BR /&gt;241 informat DirectorName $19. ;&lt;BR /&gt;242 informat CommitteeName $12. ;&lt;BR /&gt;243 informat JobName $27. ;&lt;BR /&gt;244 informat Description $99. ;&lt;BR /&gt;245 informat AnnouncementDate best32. ;&lt;BR /&gt;246 informat CompanyID best32. ;&lt;BR /&gt;247 informat DirectorID best32. ;&lt;BR /&gt;248 informat EffectDate best32. ;&lt;BR /&gt;249 format CompanyName $38. ;&lt;BR /&gt;250 format DirectorName $19. ;&lt;BR /&gt;251 format CommitteeName $12. ;&lt;BR /&gt;252 format JobName $27. ;&lt;BR /&gt;253 format Description $99. ;&lt;BR /&gt;254 format AnnouncementDate best12. ;&lt;BR /&gt;255 format CompanyID best12. ;&lt;BR /&gt;256 format DirectorID best12. ;&lt;BR /&gt;257 format EffectDate best12. 
;&lt;BR /&gt;258 input&lt;BR /&gt;259 CompanyName $&lt;BR /&gt;260 DirectorName $&lt;BR /&gt;261 CommitteeName $&lt;BR /&gt;262 JobName $&lt;BR /&gt;263 Description $&lt;BR /&gt;264 AnnouncementDate&lt;BR /&gt;265 CompanyID&lt;BR /&gt;266 DirectorID&lt;BR /&gt;267 EffectDate&lt;BR /&gt;268 ;&lt;BR /&gt;269 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */&lt;BR /&gt;270 run;&lt;/P&gt;&lt;P&gt;NOTE: The infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv' is:&lt;BR /&gt;Filename=D:\Dropbox\Dataset\9ee779962de5b464.csv,&lt;BR /&gt;RECFM=V,LRECL=32767,File Size (bytes)=39898957,&lt;BR /&gt;Last Modified=10,November,2020 13:48:22,&lt;BR /&gt;Create Time=08,February,2022 15:21:23&lt;/P&gt;&lt;P&gt;NOTE: 1011 records were read from the infile&lt;BR /&gt;'D:\Dropbox\Dataset\9ee779962de5b464.csv'.&lt;BR /&gt;The minimum record length was 98.&lt;BR /&gt;The maximum record length was 312.&lt;BR /&gt;NOTE: The data set WORK.ANNOUNCEMENT has 1011 observations and 9 variables.&lt;BR /&gt;NOTE: DATA statement used (Total process time):&lt;BR /&gt;real time 0.11 seconds&lt;BR /&gt;cpu time 0.07 seconds&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;1011 rows created in WORK.announcement from&lt;BR /&gt;D:\Dropbox\Dataset\9ee779962de5b464.csv.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;NOTE: WORK.ANNOUNCEMENT data set was successfully created.&lt;BR /&gt;NOTE: The data set WORK.ANNOUNCEMENT has 1011 observations and 9 variables.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Sat, 16 Apr 2022 13:08:45 GMT</pubDate>
    <dc:creator>Jarvin99</dc:creator>
    <dc:date>2022-04-16T13:08:45Z</dc:date>
    <item>
      <title>import a large csv</title>
      <link>https://communities.sas.com/t5/SAS-Programming/import-a-large-csv/m-p/808164#M318659</link>
      <description>&lt;P&gt;I have a csv file of 232,190 rows and 9 columns. When I import data into sas, sas only extracts the first 1,011 rows and 9 columns. How can I import all the rows available without losing any data from the csv file?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the log result:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;237 data WORK.announcement ;&lt;BR /&gt;238 %let _EFIERR_ = 0; /* set the ERROR detection macro variable */&lt;BR /&gt;239 infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv'&lt;BR /&gt;239! delimiter = ',' MISSOVER DSD lrecl=32767 firstobs=2 ;&lt;BR /&gt;240 informat CompanyName $38. ;&lt;BR /&gt;241 informat DirectorName $19. ;&lt;BR /&gt;242 informat CommitteeName $12. ;&lt;BR /&gt;243 informat JobName $27. ;&lt;BR /&gt;244 informat Description $99. ;&lt;BR /&gt;245 informat AnnouncementDate best32. ;&lt;BR /&gt;246 informat CompanyID best32. ;&lt;BR /&gt;247 informat DirectorID best32. ;&lt;BR /&gt;248 informat EffectDate best32. ;&lt;BR /&gt;249 format CompanyName $38. ;&lt;BR /&gt;250 format DirectorName $19. ;&lt;BR /&gt;251 format CommitteeName $12. ;&lt;BR /&gt;252 format JobName $27. ;&lt;BR /&gt;253 format Description $99. ;&lt;BR /&gt;254 format AnnouncementDate best12. ;&lt;BR /&gt;255 format CompanyID best12. ;&lt;BR /&gt;256 format DirectorID best12. ;&lt;BR /&gt;257 format EffectDate best12. 
;&lt;BR /&gt;258 input&lt;BR /&gt;259 CompanyName $&lt;BR /&gt;260 DirectorName $&lt;BR /&gt;261 CommitteeName $&lt;BR /&gt;262 JobName $&lt;BR /&gt;263 Description $&lt;BR /&gt;264 AnnouncementDate&lt;BR /&gt;265 CompanyID&lt;BR /&gt;266 DirectorID&lt;BR /&gt;267 EffectDate&lt;BR /&gt;268 ;&lt;BR /&gt;269 if _ERROR_ then call symputx('_EFIERR_',1); /* set ERROR detection macro variable */&lt;BR /&gt;270 run;&lt;/P&gt;&lt;P&gt;NOTE: The infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv' is:&lt;BR /&gt;Filename=D:\Dropbox\Dataset\9ee779962de5b464.csv,&lt;BR /&gt;RECFM=V,LRECL=32767,File Size (bytes)=39898957,&lt;BR /&gt;Last Modified=10,November,2020 13:48:22,&lt;BR /&gt;Create Time=08,February,2022 15:21:23&lt;/P&gt;&lt;P&gt;NOTE: 1011 records were read from the infile&lt;BR /&gt;'D:\Dropbox\Dataset\9ee779962de5b464.csv'.&lt;BR /&gt;The minimum record length was 98.&lt;BR /&gt;The maximum record length was 312.&lt;BR /&gt;NOTE: The data set WORK.ANNOUNCEMENT has 1011 observations and 9 variables.&lt;BR /&gt;NOTE: DATA statement used (Total process time):&lt;BR /&gt;real time 0.11 seconds&lt;BR /&gt;cpu time 0.07 seconds&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;1011 rows created in WORK.announcement from&lt;BR /&gt;D:\Dropbox\Dataset\9ee779962de5b464.csv.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;NOTE: WORK.ANNOUNCEMENT data set was successfully created.&lt;BR /&gt;NOTE: The data set WORK.ANNOUNCEMENT has 1011 observations and 9 variables.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sat, 16 Apr 2022 13:08:45 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/import-a-large-csv/m-p/808164#M318659</guid>
      <dc:creator>Jarvin99</dc:creator>
      <dc:date>2022-04-16T13:08:45Z</dc:date>
    </item>
    <item>
      <title>Re: import a large csv</title>
      <link>https://communities.sas.com/t5/SAS-Programming/import-a-large-csv/m-p/808170#M318663</link>
      <description>&lt;P&gt;The usual cause of that is an embedded "DOS" end-of-file character (^Z) in the file.&amp;nbsp; Use the IGNOREDOSEOF option on the INFILE statement so that SAS reads past it.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;BLOCKQUOTE&gt;
&lt;H4 class="xisDoc-argument"&gt;IGNOREDOSEOF&lt;/H4&gt;
&lt;DIV class="xisDoc-argumentDescription"&gt;
&lt;P class="xisDoc-paraSimpleFirst"&gt;is used in the context of&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN class="xisDoc-nobr"&gt;I/O&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;operations on variable record format files. When this option is specified, any occurrence of ^Z is interpreted as character data and not as an end-of-file marker.&lt;/P&gt;
&lt;/DIV&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;/BLOCKQUOTE&gt;
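&lt;P&gt;To confirm that an embedded ^Z is really what stops the read, a quick scan along these lines may help. This is only a sketch: it assumes the file is single-byte or UTF-8 text, so that the byte '1A'x can only be a ^Z, and the variable name and message text are made up for illustration.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  /* IGNOREDOSEOF lets the step read past any ^Z instead of stopping there */
  infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv' ignoredoseof lrecl=32767;
  input;
  /* '1A'x is the hex literal for the ^Z character */
  column = index(_infile_, '1A'x);
  if column then put 'Found ^Z on line ' _n_ 'at column ' column;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;If the log shows no such message, the ^Z explanation probably does not apply and the short read has some other cause.&lt;/P&gt;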
&lt;P&gt;Why use PROC IMPORT to GUESS how to read a file that only has NINE variables?&lt;/P&gt;
&lt;P&gt;Just write your own data step and you will have full control over how the variables are named, defined, and labeled, and over whether any formats need to be attached. Are those last four variables really just plain numbers? Why aren't the two DATE variables read with a date informat to create actual date values? Why are the two ID variables being read as numbers instead of character strings? You do not need to perform arithmetic with ID variables. What do the lines in the file actually have for those fields?&lt;/P&gt;
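&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data announcement ;
  /* IGNOREDOSEOF treats an embedded ^Z as data rather than end of file */
  infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv' dsd ignoredoseof truncover firstobs=2;
  /* Define the variables in the same order as the columns in the file */
  length
    CompanyName $50
    DirectorName $30
    CommitteeName $20
    JobName $50
    Description $200
    AnnouncementDate 8
    CompanyID 8
    DirectorID 8
    EffectDate 8
  ;
  /* The positional list reads every variable from first to last as defined */
  input CompanyName -- EffectDate ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;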
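&lt;P&gt;If the two date fields do hold dates and the ID fields are identifiers, the same step could define them that way. This is a sketch under assumptions, not a verified fix: the YYMMDD informat assumes values such as 20201110 or 2020-11-10, and the $20 lengths for the IDs are guesses. Neither has been checked against the actual file.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data announcement ;
  infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv' dsd ignoredoseof truncover firstobs=2;
  length
    CompanyName $50
    DirectorName $30
    CommitteeName $20
    JobName $50
    Description $200
    AnnouncementDate 8
    CompanyID $20      /* assumed length; character because no arithmetic is needed */
    DirectorID $20     /* assumed length */
    EffectDate 8
  ;
  /* Assumed date layout; change the informat if the file uses another one */
  informat AnnouncementDate EffectDate yymmdd10. ;
  /* Display the stored date values as readable dates */
  format AnnouncementDate EffectDate yymmdd10. ;
  input CompanyName -- EffectDate ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;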
&lt;P&gt;To see some example values from the file, use a simple data step that lists the first few lines in the log.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  infile 'D:\Dropbox\Dataset\9ee779962de5b464.csv' obs=5 ;
  input;
  list;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Sat, 16 Apr 2022 14:11:59 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/import-a-large-csv/m-p/808170#M318663</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2022-04-16T14:11:59Z</dc:date>
    </item>
  </channel>
</rss>

