<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Importing multiple large CSV files with varying data formats in one column in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/972010#M377433</link>
    <description>&lt;P&gt;A FILENAME statement is used to define a FILEREF, that is a nickname you can use to point to an actual file.&lt;/P&gt;
&lt;P&gt;A FILE statement is used to tell the DATA step what file to write to.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The FILENAME statement I used&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename code temp;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;would define the fileref CODE pointing to a&amp;nbsp;temporary file. Essentially a file in the WORK directory that SAS will delete when you end your SAS session.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Then the FILE statement will use that fileref instead of a quoted physical filename.&lt;/P&gt;
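&lt;P&gt;As a rough sketch of how the two statements fit together (the generated PROC PRINT step and the use of SASHELP.CLASS are only placeholders, not part of the original program): a DATA _NULL_ step writes statements to the fileref, and %INCLUDE then runs whatever was written.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;filename code temp;

data _null_;
  file code;   /* PUT statements now write to the temporary file */
  put 'proc print data=sashelp.class(obs=3);';   /* placeholder statements */
  put 'run;';
run;

%include code / source2;   /* run the generated statements and echo them to the log */&lt;/CODE&gt;&lt;/PRE&gt;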
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;When you say "&lt;SPAN&gt;imported into SAS from a spreadsheet" do you mean you used PROC IMPORT to read from an XLSX file?&amp;nbsp; If so then the only reason MEMNAME would be numeric in that dataset would be if none of the cells were character strings.&amp;nbsp; &amp;nbsp;In which case the issue is in the data step.&amp;nbsp; When compiling a data step SAS will define the TYPE of a variable as soon as it has to.&amp;nbsp; Usually when it first SEES the variable.&amp;nbsp; If SAS cannot tell from the context that the variable should be character it will define it as numeric.&amp;nbsp;&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
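&lt;P&gt;A minimal illustration of that rule, using made-up dataset names rather than your actual program: in the first step nothing tells the compiler what MEMNAME is, so it becomes numeric; in the second a LENGTH statement defines it as character before it is used.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data guess;
  /* first reference gives no hint about the type, so MEMNAME is defined as numeric */
  put memname=;
run;

data fixed;
  length memname $32;   /* type and length are set before the first use */
  memname = 'CLASS';
  put memname=;
run;&lt;/CODE&gt;&lt;/PRE&gt;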
&lt;P&gt;&lt;SPAN&gt;Is it possible your metadata does not have a variable named MEMNAME?&amp;nbsp;&lt;/SPAN&gt;&lt;/P&gt;</description>
    <pubDate>Sat, 02 Aug 2025 16:54:15 GMT</pubDate>
    <dc:creator>Tom</dc:creator>
    <dc:date>2025-08-02T16:54:15Z</dc:date>
    <item>
      <title>Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969092#M376731</link>
      <description>&lt;P&gt;I have a bunch of large CSV files to import, each of which well exceeds the maximum number of rows an Excel spreadsheet can hold. The data are further complicated by a mixture of different formats of the same column. For instance, dates are sometimes recorded as "16/12/20" and sometimes "16 12 20". SAS fails to correctly import the data, seemingly missing those observations with atypical data formats along the course of import. Moreover, as the files are CSV's, handy options of PROC IMPORT like MIXED and SCANTEXT are not available.&lt;/P&gt;
&lt;P&gt;My objectives of import are as follows:&lt;/P&gt;
&lt;P&gt;(1) Correctly import the data into SAS;&lt;/P&gt;
&lt;P&gt;(2) If not possible without the formidable task of finding the atypical observations in the original dataset and changing their formats, then set the columns housing the data with atypical formats as missing values instead of simply skipping them.&lt;/P&gt;
&lt;P&gt;Any suggestions? Thank you!&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 07:45:15 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969092#M376731</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T07:45:15Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969094#M376733</link>
      <description>&lt;P&gt;Read the field into a temporary character variable and add code for detecting the format and doing an appropriate conversion.&lt;/P&gt;
&lt;P&gt;If it's only slashes vs. blanks, that can be handled with a single TRANSLATE call.&lt;/P&gt;
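&lt;P&gt;A minimal sketch of that idea, assuming day-month-year order and using made-up variable names (_RAW, DATE). The ?? modifier suppresses the invalid-data notes, so a value that cannot be converted simply ends up missing instead of flooding the log.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
  infile datalines truncover;
  input _raw $char20.;   /* read the field as text first */
  /* replace blanks between the parts with slashes, then convert; */
  /* ?? leaves DATE missing without log messages when conversion fails */
  date = input(translate(strip(_raw), '/', ' '), ?? ddmmyy10.);
  format date yymmdd10.;
  drop _raw;
datalines;
16/12/20
16 12 20
not a date
;&lt;/CODE&gt;&lt;/PRE&gt;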
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 08:01:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969094#M376733</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2025-06-16T08:01:03Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969101#M376736</link>
      <description>&lt;P&gt;If you are using PROC IMPORT or the Import Wizard to import the CSV files, you can use the GUESSINGROWS= statement (Number of rows to guess in the Import Wizard Options box) to tell SAS to look beyond the first 20 rows (the default) to determine a variable's type. If you have different formats in a column, you need it to be imported as character and then you can make changes after the data is read in.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc import datafile='c:\temp\test.csv'
 out=data_one dbms=csv replace;
guessingrows=1000;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Mon, 16 Jun 2025 10:58:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969101#M376736</guid>
      <dc:creator>Kathryn_SAS</dc:creator>
      <dc:date>2025-06-16T10:58:44Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969106#M376738</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13770"&gt;@Kathryn_SAS&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;...If you have different formats in a column, you need it to be imported as character and then you can make changes after the data is read in.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I've had fairly good luck with SAS using the ANYDTDTE, ANYDTDTM or ANYDTTME informats with Proc Import generated code when date, datetime or time values appear in different layouts for a single variable. But I don't remember any where space was the delimiter between the data elements.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
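&lt;P&gt;For illustration only, a small sketch of reading mixed layouts with ANYDTDTE. The sample values are invented, and the DATESTYLE= setting is an assumption about how ambiguous day/month orders should be resolved for your data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options datestyle=dmy;   /* assumed order for ambiguous values */

data test;
  infile datalines truncover;
  input date :anydtdte20.;   /* one informat for several layouts */
  format date yymmdd10.;
datalines;
16/12/2020
2020-12-16
16DEC2020
;&lt;/CODE&gt;&lt;/PRE&gt;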
&lt;P&gt;However that is best when the years are 4 digits. With 2 digit years you have to pray that the order of values entered does match the national language settings.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 12:16:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969106#M376738</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2025-06-16T12:16:07Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969113#M376745</link>
      <description>Thank you! Could you please provide some hint on how to construct temporary variables? Hopefully this does not entail the specification of each and every column in the CSV file like I see in many SAS codes importing CSV's. I have too many columns to specify them one by one.</description>
      <pubDate>Mon, 16 Jun 2025 15:28:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969113#M376745</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T15:28:36Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969115#M376747</link>
      <description>&lt;P&gt;Thank you so much! That is a very handy option.&lt;/P&gt;
&lt;P&gt;By the way, in addition to the GUESSINGROWS statement, are there any useful options and/or statements available in importing CSV's? As I have said in my original question, many of the options and/or statements of PROC IMPORT are not available for CSV's, which is frustrating.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:33:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969115#M376747</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T13:33:36Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969117#M376748</link>
      <description>&lt;P&gt;Thanks for the input! Yes, you are right. Space serves as the delimiter in none of the three informats you mentioned in your post.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;However that is best when the years are 4 digits. With 2 digit years you have to pray that the order of values entered does match the national language settings.&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I am also rather apprehensive about this issue. With no further information in my dataset, it is likely that I will end up assigning the wrong meaning to the digits. However, as dates are not the primary concern for the time being, I have to focus on the goal of successful import right now and think about the issue of dates when conditions permit.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:38:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969117#M376748</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T13:38:25Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969118#M376749</link>
      <description>&lt;P&gt;What do you mean by varying?&amp;nbsp; Do you mean that it changes from file to file?&amp;nbsp; Or do you mean it changes within a single file?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Your example is not actually a problem for SAS to read (no idea if it is a problem for PROC IMPORT to guess how to handle).&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  infile cards dsd truncover;
  input date :ddmmyy.;
  format date yymmdd10.;
cards;
"16/12/20"
"16 12 20"
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt; Obs          date

  1     2020-12-16
  2     2020-12-16

&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:47:23 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969118#M376749</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2025-06-16T13:47:23Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969119#M376750</link>
      <description>&lt;P&gt;When importing delimited files, the options and statements that are available are documented here:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/proc/n1qn5sclnu2l9dn1w61ifw8wqhts.htm" target="_self"&gt;PROC IMPORT&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:39:08 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969119#M376750</guid>
      <dc:creator>Kathryn_SAS</dc:creator>
      <dc:date>2025-06-16T13:39:08Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969120#M376751</link>
      <description>&lt;P&gt;Do you know the structure of the files?&amp;nbsp; If so just write the data step to read it.&lt;/P&gt;
&lt;P&gt;Are the files all using the same structure? If so you can write a data step that reads all of the files at once.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you need a better tool for GUESSING how to read the file try this macro instead of PROC IMPORT.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/sasutils/macros/blob/master/csv2ds.sas" target="_self"&gt;%CSV2DS()&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:43:02 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969120#M376751</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2025-06-16T13:43:02Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969121#M376752</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;What do you mean by varying?&amp;nbsp; Do you mean that it changes from file to file?&amp;nbsp; Or do you mean it changes within a single file?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I mean "within a single file".&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;&amp;nbsp;wrote:
&lt;P&gt;Your example is not actually a problem for SAS to read (no idea if it is a problem for PROC IMPORT to guess how to handle).&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  infile cards dsd truncover;
  input date :ddmmyy10.;
  format date yymmdd10.;
cards;
"16/12/20"
"16 12 20"
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt; Obs          date

  1     2020-12-16
  2     2020-12-16

&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Yes, I did try PROC IMPORT. Does importing the file with the DATA step necessitate specification of the name and informat of each and every variable in the CSV? That would be a very formidable job as I have possibly thousands of columns in all.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:43:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969121#M376752</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T13:43:50Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969122#M376753</link>
      <description>&lt;P&gt;Thanks for your response. But I beg to point out that the options listed in SAS Help are not very informative as many of those listed there are not available for CSV's.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:44:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969122#M376753</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T13:44:44Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969124#M376755</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:
&lt;P class="1750081691970"&gt;..&lt;/P&gt;
Yes, I did try PROC IMPORT. Does importing the file with the DATA step necessitate specification of the name and informat of each and every variable in the CSV? That would be a very formidable job as I have possibly thousands of columns in all.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;PROC IMPORT will not handle a delimited text file with thousands of columns. It can only read the first 32K bytes of the header row.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Use the %CSV2DS() macro instead if you need help guessing how to read a delimited file.&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/sasutils/macros/blob/master/csv2ds.sas" target="_blank"&gt;https://github.com/sasutils/macros/blob/master/csv2ds.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:49:36 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969124#M376755</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2025-06-16T13:49:36Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969125#M376756</link>
      <description>&lt;P&gt;The documentation link that I sent is specifically for delimited files - csv, tab, dlm - so it does not include statements/options that are not available for you to use for your CSV files.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Another thought would be that you can run PROC IMPORT because it uses DATA step code behind the scenes. Then you could grab that code either from the log or using Run -&amp;gt; Recall Last Submit and then you could make adjustments to that DATA step code as others have suggested. You just wouldn't have to start from scratch.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 13:50:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969125#M376756</guid>
      <dc:creator>Kathryn_SAS</dc:creator>
      <dc:date>2025-06-16T13:50:54Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969129#M376759</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13770"&gt;@Kathryn_SAS&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;The documentation link that I sent is specifically for delimited files - csv, tab, dlm - so it does not include statements/options that are not available for you to use for your CSV files.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Thank you for your reminder! I should have taken a second look.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13770"&gt;@Kathryn_SAS&lt;/a&gt;&amp;nbsp;wrote:
&lt;P&gt;Another thought would be that you can run PROC IMPORT because it uses DATA step code behind the scenes. Then you could grab that code either from the log or using Run -&amp;gt; Recall Last Submit and then you could make adjustments to that DATA step code as others have suggested. You just wouldn't have to start from scratch.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Thank you for your valuable piece of information! So that makes DATA step-based import of CSV files easier. Starting from scratch is so painful and boring for a data analyst who wishes to concentrate on data analysis instead of such dull, monotonous work.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 14:36:35 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969129#M376759</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T14:36:35Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969135#M376760</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;Thank you! Could you please provide some hint on how to construct temporary variables? Hopefully this does not entail the specification of each of every column in the CSV file like I see in many SAS codes importing CSV's. I have too many columns to specify them one by one.&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;First of all: do NOT use PROC IMPORT in production for text files. NEVER. EVER. I mean it. It is much too prone to create inconsistent results, and will sometimes result in a seemingly successful code that created garbage in reality.&lt;/P&gt;
&lt;P&gt;You can use PROC IMPORT to create a basic DATA step, which you grab from the log and then modify/expand as needed. You will find that this code is typical for machine-created code. Clumsy, hard to read, and hard to maintain. But it is a beginning.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Here is a quick example for converting your strings:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data want;
infile datalines dsd;
input _date :$10.;
format date yymmdd10.;
date = input(translate(strip(_date),"/"," "),ddmmyy10.);
datalines;
16/12/20
16 12 20
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note that this is not necessary for the examples you posted. DDMMYY10. will read a date with blanks the same as with slashes.&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 15:08:41 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969135#M376760</guid>
      <dc:creator>Kurt_Bremser</dc:creator>
      <dc:date>2025-06-16T15:08:41Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969136#M376761</link>
      <description>Thank you so much for your reminder as well as your code! I will try modifying the DATA step codes as recommended.</description>
      <pubDate>Mon, 16 Jun 2025 15:27:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969136#M376761</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-16T15:27:52Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969156#M376773</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Does importing the file with the DATA step necessitate specification of the name and informat of each and every variable in the CSV? That would be a very formidable job as I have possibly thousands of columns in all.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Proc Import will write a basic data step program to read a text file. The code will be in the log and can be copied from the log to the editor, cleaned up and rerun (or issue a RECALL command immediately after the Proc Import to bring the code into the editor.)&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The types of informats that often need to be addressed are those where the value contains all digits but you want to maintain leading zeros, such as account numbers. Change the informat to character long enough to hold the value (a $20. or similar). In most cases when modifying a Proc Import generated data step you can drop the FORMAT statements for variables except date, time and datetime variables unless you want to assign custom formats. Also to look out for are columns with mixed use of negative signs and () for negative values, or currency and percent signs that aren't on every value. These may require additional coding, such as reading the value as character and parsing it. If you have multiple currency symbols such as dollar, Yen, Pound, Franc and such this might be a very important consideration if you want to manipulate the currency values in any consistent manner.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
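&lt;P&gt;A small sketch of those two adjustments, with invented column names (ACCOUNT, AMOUNT) and invented data. The account number is read with a character informat so the leading zeros survive, and the COMMAw.d informat is used for the amount because it strips dollar signs and commas and treats a leading open parenthesis as a minus sign.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data accounts;
  infile datalines dsd truncover;
  input account :$20.      /* character, so 000123 keeps its zeros */
        amount  :comma12.; /* handles $, commas and (300) style negatives */
datalines;
000123,"$1,250.00"
004567,(300)
;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Columns that mix several currency symbols would still need to be read as character and parsed separately.&lt;/P&gt;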
&lt;P&gt;Check on the assigned informat for your problem variables. If they were read as character but should be dates that is an indication that you may need to create new variables by parsing the values. Check on your national language settings (NLS) to see what order dates are read. OR if you see lots of invalid data messages involving those variables it is one indicator that the order may be different than your NLS and override to read as character and parse.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If you have variables that would best be considered Boolean, i.e. Yes/No, True/False, and such it may be worth creating and using a custom informat so that the results are numeric 1/0 as that will be much easier to work with in most cases going forward instead of a hodgepodge of Y/N T/F character values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
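&lt;P&gt;One possible shape for such an informat, as a sketch only. The informat name YESNO, the list of accepted spellings and the variable names are all assumptions to adapt to your data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc format;
  /* UPCASE and JUST make the match ignore case and leading blanks */
  invalue yesno (upcase just)
    'Y', 'YES', 'T', 'TRUE' = 1
    'N', 'NO', 'F', 'FALSE' = 0
    other = .;
run;

data flags;
  infile datalines truncover;
  input answer :$5.;
  flag = input(answer, yesno.);   /* numeric 1/0, missing for anything else */
datalines;
yes
N
True
;&lt;/CODE&gt;&lt;/PRE&gt;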
&lt;P&gt;Another consideration not mentioned yet, is if these files are supposed to be of the same layout you should be able to use the same data step to read all of them by changing name of input file and output data set. But it is very likely that lengths of character variables will differ between files. So modify any of the $w. informats to allow for this. I generally start at 15% or so longer than the generated data step. And then check after reading that the values look right. If not make the informat wider and re-read.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
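&lt;P&gt;A sketch of one way to reuse a single data step for several files, here combining them into one data set with the FILEVAR= option of the INFILE statement rather than rerunning the step per file. The file paths, the column names and the assumption that every file has exactly one header row are placeholders, not facts about your data.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data all_files;
  infile datalines truncover;
  input path $256.;   /* next file name from the list below */
  /* FILEVAR= switches the input file whenever PATH changes */
  infile csv filevar=path dsd truncover firstobs=2 end=done;
  do while (not done);
    input account :$25. amount :comma12.;   /* widened character informat */
    output;
  end;
datalines;
c:\temp\file1.csv
c:\temp\file2.csv
;&lt;/CODE&gt;&lt;/PRE&gt;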
&lt;P&gt;A last issue relates to variable names generated from column headings that are either very long (will get truncated at 32 characters) or identical in the source file. If column headings are identical for the first 32 characters of a longer heading the first will get part of the text as the variable name. The others will get VARxxxx where xxxx may be the column number in the file. Identical shorter headings may get numeric suffixes added. For example, in a file with multiple headings of "Total", the first will have a variable name of Total, the next Total2 (or Total1; it has been a while) with incremented numbers for each following.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Recommend setting option VALIDVARNAME=V7 before Proc Import. Dealing with variable names with spaces and non-standard characters gets old real quick having to use the name literal such as 'Stupid variable name'n everywhere. The V7 option will replace all the special characters with _ and be easier to type (or rename as desired).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
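&lt;P&gt;For example, reusing the PROC IMPORT call shown earlier in this thread (the file path is just the one from that example):&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;options validvarname=v7;   /* headings become plain SAS names, no name literals needed */

proc import datafile='c:\temp\test.csv'
  out=data_one dbms=csv replace;
  guessingrows=1000;
run;&lt;/CODE&gt;&lt;/PRE&gt;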
&lt;P&gt;One tool to help deal with some of this if you don't have good documentation is to copy the header row of the CSV, assuming it has column headers, and Paste that TRANSPOSED into a spreadsheet. That will give you one "row" per variable to do such things as examine how long the variable names might be, whether different files have different headers (paste into a different column in the spreadsheet and run a comparison of values). If you have a source that has narrative column headings you can with a little work in the spreadsheet get it to create LABEL assignment statements for variables by pasting the variable names from the proc import generated data set into another column (either using the INPUT statement from the code or Proc Contents output) and use spreadsheet functions to create text like&amp;nbsp; &amp;nbsp;varname ="original column heading text goes here".&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Any data source that may have "thousands of columns" and doesn't provide documentation as to content of the file, such as expected lengths of character variables and layouts of date, time or datetime values needs to be considered with great suspicion. Without documentation how do you know what anything represents?&lt;/P&gt;</description>
      <pubDate>Mon, 16 Jun 2025 22:10:58 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969156#M376773</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2025-06-16T22:10:58Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969167#M376777</link>
      <description>&lt;P&gt;Thank you so much for your very detailed summarization of your experiences in dealing with real-world datasets! I think your response deserves much more than a mere like, but unforunately, that is all I can offer. Still, now that you have invested plenty of your time on composing these words, I strongly suggest that you go a step further and transform them into a paper that may eventually appear in SAS User Group or academic journals like &lt;A href="https://www.jstatsoft.org/index" target="_blank"&gt;Journal of Statistical Software&lt;/A&gt;. Your experience is invaluable for both SAS users and beyond.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/437457"&gt;@Season&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Does importing the file with the DATA step necessitate specification of the name and informat of each and every variable in the CSV? That would be a very formidable job as I have possibly thousands of columns in all.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Check on the assigned informat for your problem variables. If they were read as character but should be dates that is an indication that you may need to create new variables by parsing the values. Check on your national language settings (NLS) to see what order dates are read. OR if you see lots of invalid data messages involving those variables it is one indicator that the order may be different than your NLS and override to read as character and parse.&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Still, I would like to ask further about NLS. I searched on the web and saw Microsoft having a webpage on NLS, but with a slightly different meaning: national language service. I am not sure if the two NLS's are the same, but anyway, could you please briefly introduce what&amp;nbsp;national language settings are and what impact they have on importing datasets into statistical software like SAS? I used to think that the mere difference in the language of the interface and log of SAS does not really have an impact on its core capabilities such as loading and editing datasets.&lt;/P&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/13884"&gt;@ballardw&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Any data source that may have "thousands of columns" and doesn't provide documentation as to content of the file, such as expected lengths of character variables and layouts of date, time or datetime values needs to be considered with great suspicion. Without documentation how do you know what anything represents?&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Finally, I would like to make a clarification. Your reminder is of great value and I thank you for it. However, the datasets I use do have documentation. In fact, it is huge as it has a description for every variable therein. Still, only descriptions of variables instead of their intricacies are documented. In other words, for a given column, I only know that it stands for a date with a particular meaning but do not know that it can take multiple formats like "12/16/15" and "12 16 15". Only when I imported it into SAS did I realize this issue.&lt;/P&gt;</description>
      <pubDate>Tue, 17 Jun 2025 08:29:52 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/969167#M376777</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-06-17T08:29:52Z</dc:date>
    </item>
    <item>
      <title>Re: Importing multiple large CSV files with varying data formats in one column</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/971165#M377268</link>
      <description>&lt;P&gt;I have an additional question on this issue. Since my question is the same for everyone in the chat, I will only post it in a response to &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/159"&gt;@Tom&lt;/a&gt;.&lt;/P&gt;
&lt;P&gt;My question is: is it possible to import the CSV files in a parallel instead of a sequential manner? My experience in importing such large files is that the importation process of even a single file takes a lot of time. Therefore, "parallel importation", a word I coined from "parallel computing", is preferred. I wonder if SAS can realize this.&lt;/P&gt;</description>
      <pubDate>Sun, 20 Jul 2025 03:42:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Importing-multiple-large-CSV-files-with-varying-data-formats-in/m-p/971165#M377268</guid>
      <dc:creator>Season</dc:creator>
      <dc:date>2025-07-20T03:42:46Z</dc:date>
    </item>
  </channel>
</rss>

