<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Reading in multiple csv files...column names appearing in SAS dataset in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276993#M58700</link>
    <description>&lt;P&gt;Here is a simplified version of the technique that was used in the linked article.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Use the EOV option to name a flag variable that will let you detect when a new file starts.&lt;/LI&gt;
&lt;LI&gt;Use the FIRSTOBS=2 option to skip the first record since the EOV flag will not be set until you start a NEW file. &amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;Read the line and hold it with the trailing&amp;nbsp;@ to allow SAS to be able to set the EOV flag.&lt;/LI&gt;
&lt;LI&gt;When it is the beginning of a new file&amp;nbsp;then use another INPUT statement to throw away the&amp;nbsp;line.&lt;/LI&gt;
&lt;LI&gt;Reset the EOV variable to false because SAS will not&amp;nbsp;reset it for you.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then the rest of the data step is the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  infile "....*.csv" firstobs=2 dsd truncover eov=eov ;
  input @;
  if eov then input;
  eov=0;
  input .... ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note also the use of the TRUNCOVER option instead of MISSOVER option. &amp;nbsp;This will make writing your INPUT statement easier since you do not need to worry about reading past the end of the line. With MISSOVER if&amp;nbsp;you ask SAS to read 10 characters and only 5 more exist on the line then you get a missing value, but with TRUNCOVER you get the 5 characters that do exist.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note this will work with wildcards in the INFILE statement or when using a FILENAME statement that references multiple files.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename indata
('file1.txt'
,'file2.txt'
,'file3.txt'
);

data want ;
   infile indata eov=eov ....&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Mon, 13 Jun 2016 16:38:21 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2016-06-13T16:38:21Z</dc:date>
    <item>
      <title>Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276944#M58693</link>
      <description>&lt;P&gt;I have written SAS code to import three .csv files into a SAS dataset. &amp;nbsp;These csv files contain the exact same column names. &amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have coded my program so the first row is read as the column name (firstobs=2). &amp;nbsp;However when the program goes to the second and third csv files the first row of these are being read in as records. &amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data work.firm_count;&lt;BR /&gt;format Ref_Date $7. GEO $16. 'Geographical Classification'n $2. SIZE $21. Description $110. Vector $9. Coordinate $8. Value comma9. ;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;infile "c:\user\homareau\PredictiveModels\FirmCounts\Firm_count_NM_*.csv" firstobs=2 dlm=',' Missover DSD;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;informat Ref_Date $7. GEO $16. 'Geographical Classification'n $2. SIZE $21. Description $110. Vector $9. Coordinate $8. Value comma9. ;&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;input Ref_Date GEO 'Geographical Classification'n SIZE Description Vector Coordinate Value;&lt;BR /&gt;label Ref_Date = 'Date'&lt;BR /&gt;GEO = 'Prov_Name'&lt;BR /&gt;'Geographical Classification'n = 'Prov_Code'&lt;BR /&gt;SIZE = 'Firm_Size'&lt;BR /&gt;Description = 'NAICS_Description'&lt;BR /&gt;Vector = 'CANSIM_V_Code'&lt;BR /&gt;Coordinate = 'Coordinate'&lt;BR /&gt;Value = 'Count';&lt;BR /&gt;drop Vector Coordinate;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;My question: What modificaiton do I need to make to this code so that &amp;nbsp;don't end up with column names ending up in my dataset? &amp;nbsp;I thought "firstobs" would have done the trick?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Jack&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2016 14:25:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276944#M58693</guid>
      <dc:creator>Jack1</dc:creator>
      <dc:date>2016-06-13T14:25:43Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276969#M58698</link>
      <description>&lt;P&gt;Hi Jack,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Last month I saw a clever solution of this problem (using a combination of FIRSTOBS= and EOV= options) in this forum in a &lt;A href="https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-import-and-merge-all-TSV-files-in-a-folder/m-p/273297#M18944" target="_blank"&gt;post written by ballardw&lt;/A&gt;.&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2016 15:39:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276969#M58698</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-06-13T15:39:48Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276970#M58699</link>
      <description>&lt;P&gt;Sorry, not seeing why you are mixing things up:&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;input Ref_Date GEO 'Geographical Classification'n SIZE Description Vector Coordinate Value;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Why do you specify 'Geographicl Classification'n ? This doesn't make sense in this context. &amp;nbsp;The ''n is for select long names from a database for instance, your specifying a name for an input statement?&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&lt;SPAN&gt;Also check your file, I can't see the file, so make sure the data does start on row two. &amp;nbsp;This in cludes having a return at the end of each row - if they all appear in one row firstobs will not work.&lt;/SPAN&gt;&lt;/P&gt;
&lt;PRE&gt;data work.firm_count;
  infile  "c:\user\homareau\PredictiveModels\FirmCounts\Firm_count_NM_*.csv" firstobs=2 dlm=',' missover dsd;
  format    ref_date $7. 
            geo $16. 
            classification $2. 
            size $21. 
            description $110. 
            vector $9. 
            coordinate $8. 
            value comma9.;
  informat  ref_date $7. 
            geo $16. 
            classification $2. 
            size $21. 
            description $110. 
            vector $9. 
            coordinate $8. 
            value comma9.;
  input     ref_date $
            geo $
            classification $
            size $
            description $
            vector $
            coordinate $
            value;
  label     ref_date='Date'
            geo='Prov_Name'
            classification='Prov_Code'
            size='Firm_Size'
            description='NAICS_Description'
            vector='CANSIM_V_Code'
            coordinate='Coordinate'
            value='Count';
  drop vector coordinate;
run;
 
&lt;/PRE&gt;</description>
      <pubDate>Mon, 13 Jun 2016 15:42:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276970#M58699</guid>
      <dc:creator>RW9</dc:creator>
      <dc:date>2016-06-13T15:42:25Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276993#M58700</link>
      <description>&lt;P&gt;Here is a simplified version of the technique that was used in the linked article.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;UL&gt;
&lt;LI&gt;Use the EOV option to name a flag variable that will let you detect when a new file starts.&lt;/LI&gt;
&lt;LI&gt;Use the FIRSTOBS=2 option to skip the first record since the EOV flag will not be set until you start a NEW file. &amp;nbsp;&lt;/LI&gt;
&lt;LI&gt;Read the line and hold it with the trailing&amp;nbsp;@ to allow SAS to be able to set the EOV flag.&lt;/LI&gt;
&lt;LI&gt;When it is the beginning of a new file&amp;nbsp;then use another INPUT statement to throw away the&amp;nbsp;line.&lt;/LI&gt;
&lt;LI&gt;Reset the EOV variable to false because SAS will not&amp;nbsp;reset it for you.&lt;/LI&gt;
&lt;/UL&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then the rest of the data step is the same.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want ;
  infile "....*.csv" firstobs=2 dsd truncover eov=eov ;
  input @;
  if eov then input;
  eov=0;
  input .... ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note also the use of the TRUNCOVER option instead of MISSOVER option. &amp;nbsp;This will make writing your INPUT statement easier since you do not need to worry about reading past the end of the line. With MISSOVER if&amp;nbsp;you ask SAS to read 10 characters and only 5 more exist on the line then you get a missing value, but with TRUNCOVER you get the 5 characters that do exist.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Note this will work with wildcards in the INFILE statement or when using a FILENAME statement that references multiple files.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename indata
('file1.txt'
,'file2.txt'
,'file3.txt'
);

data want ;
   infile indata eov=eov ....&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2016 16:38:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/276993#M58700</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2016-06-13T16:38:21Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/277003#M58702</link>
      <description>&lt;P&gt;Hi Tom,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I made the modification you suggested and I couldn't get it to work.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is my original SAS code:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data work.firm_count;
format Ref_Date $7. GEO $16. 'Geographical Classification'n $2. SIZE $21. Description $110. Vector $9. Coordinate $8. Value comma9. ;
infile "c:\user\homareau\PredictiveModels\FirmCounts\Firm_count_NM_*.csv" firstobs=2 dlm=',' missover dsd ;
informat Ref_Date $7. GEO $16. 'Geographical Classification'n $2. SIZE $21. Description $110. Vector $9. Coordinate $8. Value comma9. ;
input Ref_Date GEO 'Geographical Classification'n SIZE Description Vector Coordinate Value;
label Ref_Date = 'Date'
      GEO = 'Prov_Name'
      'Geographical Classification'n = 'Prov_Code'
      SIZE = 'Firm_Size'
      Description = 'NAICS_Description'
      Vector = 'CANSIM_V_Code'
      Coordinate = 'Coordinate'
      Value = 'Count';
drop Vector Coordinate;
run;&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;How would the code above need adjusting to make it work so its not reading in headers from other sheets? &amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks..&lt;/P&gt;&lt;P&gt;Jack&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2016 17:47:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/277003#M58702</guid>
      <dc:creator>Jack1</dc:creator>
      <dc:date>2016-06-13T17:47:36Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/277007#M58703</link>
      <description>&lt;P&gt;Doesn't look like you made any modifications?&lt;/P&gt;
&lt;P&gt;Any way try this.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data work.firm_count;
format Ref_Date $7. GEO $16. 'Geographical Classification'n $2. SIZE $21. Description $110. Vector $9. Coordinate $8. Value comma9. ;
informat Ref_Date $7. GEO $16. 'Geographical Classification'n $2. SIZE $21. Description $110. Vector $9. Coordinate $8. Value comma9. ;

infile "c:\user\homareau\PredictiveModels\FirmCounts\Firm_count_NM_*.csv" firstobs=2 dsd truncover eov=eov  ;

input @;
if eov then input;
eov=0;

input Ref_Date GEO 'Geographical Classification'n SIZE Description Vector Coordinate Value;
label Ref_Date = 'Date'
      GEO = 'Prov_Name'
      'Geographical Classification'n = 'Prov_Code'
      SIZE = 'Firm_Size'
      Description = 'NAICS_Description'
      Vector = 'CANSIM_V_Code'
      Coordinate = 'Coordinate'
      Value = 'Count';
drop Vector Coordinate;
run; 

&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 13 Jun 2016 18:01:38 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/277007#M58703</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2016-06-13T18:01:38Z</dc:date>
    </item>
    <item>
      <title>Re: Reading in multiple csv files...column names appearing in SAS dataset</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/277011#M58704</link>
      <description>&lt;P&gt;OK Tom, I didn't enter all the code correctly. &amp;nbsp;I ran it as you suggested and it works great! &amp;nbsp;Thanks so much...&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Jack&lt;/P&gt;</description>
      <pubDate>Mon, 13 Jun 2016 18:05:32 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/Reading-in-multiple-csv-files-column-names-appearing-in-SAS/m-p/277011#M58704</guid>
      <dc:creator>Jack1</dc:creator>
      <dc:date>2016-06-13T18:05:32Z</dc:date>
    </item>
  </channel>
</rss>

