DATA Step, Macro, Functions and more

Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

Accepted Solution Solved
Reply
Contributor
Posts: 23
Accepted Solution

Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

Hi,

I have a macro that loops through a dataset that contains the file paths of around 200 CSV's and imports them.

The CSV files do not all have the same columns, however, one column that every CSV does have is a date column, in ddmm10 format. It will always be in the 2nd column, the 1st column will also be the same across all CSV's.

Where the first few rows in the CSV have dates >=12th of the month, SAS correctly assigns ddmmyy10. informat. But when this is not the case, it incorrectly assigns informat mmdd10., then when it comes to a date >=12th of the month, it fails to import that piece of data.

I have tried adding a GUESSINGROWS=500 to force SAS to evaluate more rows before assigning an informat, but this does not resolve the issue.

I believe because of the varying columns across the CSV's, an 'infile' statement is out of the question. I have also seen some ideas to suggest copying the log where the proc import has generated datasteps into a program and remove any non SAS code, however, due to the number of CSV's involved this would be a long and tedious exercise.

Could anyone suggest a way round this and force SAS to assign a ddmmyy10. informat to date fields in the proc import?

Kind Regards,

cxkev


Accepted Solutions
Solution
‎04-10-2014 01:58 PM
Super Contributor
Posts: 1,636

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

I would create files with the first two columns using  ‘infile’ statement, then combine these files
with files created by proc import:

Data want;

  merge cvs1_part1  cvs1_part2(drop=column_1 column_2);

  Run;

View solution in original post


All Replies
Super User
Super User
Posts: 7,407

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

Hi,

I assume you have put: mixed=yes; in your proc import?

Super User
Super User
Posts: 7,407

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

You can also have some code in between imports to fix it if nothing else works:

data have1;
  attrib mytext format=$20. mydate format=mmddyy10.;
  mytext="Hello";
  mydate='12jan14'd;
  output;
  mytext="World";
  mydate='13Nov13'd;
  output;
run;

proc sql;
  create table TMP as
  select  distinct NAME
  from    DICTIONARY.COLUMNS
  where   LIBNAME="WORK"
    and   MEMNAME="HAVE1"
    and   FORMAT="MMDDYY10.";
quit;

data _null_;
  set TMP;
  call execute('data have1;
                  set have1 (rename=('||strip(name)||'=tmp_var));
                  attrib '||strip(name)||' format=ddmmyy10.;
                  '||strip(name)||'=tmp_var;
                run;');
run;

Contributor
Posts: 23

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

RW9 wrote:

You can also have some code in between imports to fix it if nothing else works:

data have1;
  attrib mytext format=$20. mydate format=mmddyy10.;
  mytext="Hello";
  mydate='12jan14'd;
  output;
  mytext="World";
  mydate='13Nov13'd;
  output;
run;

proc sql;
  create table TMP as
  select  distinct NAME
  from    DICTIONARY.COLUMNS
  where   LIBNAME="WORK"
    and   MEMNAME="HAVE1"
    and   FORMAT="MMDDYY10.";
quit;

data _null_;
  set TMP;
  call execute('data have1;
                  set have1 (rename=('||strip(name)||'=tmp_var));
                  attrib '||strip(name)||' format=ddmmyy10.;
                  '||strip(name)||'=tmp_var;
                run;');
run;

The results of the proc import, where the date informat incorrectly assigns a mmddyy means that anything >=12th of month, is imported into SAS with a . value, so there is no value there to change

Contributor
Posts: 23

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

I have tried mixed=yes, however, results in an error. I have researched this after our suggestion as it was the first time I'd heard of this option. Someone had wrote that it is only compatible with dbms=xls.

Respected Advisor
Posts: 3,777

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

I wonder if changing the datestyle system option would have any effect?

proc options option=datestyle;
   run;


15         proc options option=datestyle;
16            run;

    SAS (r) Proprietary Software Release
9.3  TS1M2

DATESTYLE=MDY     Identify sequence of month, day and year when ANYDATE informat data is ambiguous
Contributor
Posts: 23

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

data_null_; wrote:

I wonder if changing the datestyle system option would have any effect?

proc options option=datestyle;
   run;


15         proc options option=datestyle;
16            run;

    SAS (r) Proprietary Software Release
9.3  TS1M2

DATESTYLE=MDY     Identify sequence of month, day and year when ANYDATE informat data is ambiguous

Thanks for the suggestion, unfortunately, this has not solved the issue.

Super User
Posts: 17,840

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

Did you change the datestyle option or just run the code data _null_ provided?

proc options option=datestyle;

   run;

options datestyle=dmy;

proc options option=datestyle;

   run;

*Change back;

*options datestyle=mdy;

Contributor
Posts: 23

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

Reeza wrote:

Did you change the datestyle option or just run the code data _null_ provided?

proc options option=datestyle;

   run;

options datestyle=dmy;

proc options option=datestyle;

   run;

*Change back;

*options datestyle=mdy;

Hi Reeza,

I added a options datestyle=dmy; to the start of my program.

Solution
‎04-10-2014 01:58 PM
Super Contributor
Posts: 1,636

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

I would create files with the first two columns using  ‘infile’ statement, then combine these files
with files created by proc import:

Data want;

  merge cvs1_part1  cvs1_part2(drop=column_1 column_2);

  Run;

Contributor
Posts: 23

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

Linlin wrote:

I would create files with the first two columns using  ‘infile’ statement, then combine these files
with files created by proc import:

Data want;

  merge cvs1_part1  cvs1_part2(drop=column_1 column_2);

  Run;

Thanks very much, that worked! Simple when you know how Smiley Happy.

Thanks to everyone for your help.

Valued Guide
Posts: 2,175

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

If there were rules indicating the column structure from the file names   

or a consistent informat for each column with known column header 

then you would not be constrained by the "flexibility" offered by proc import. 

you would have ways to pre-determine the informats you need and data steps could be provided with suitable informats and infile  options in a programmatic way.

good luck

Super User
Posts: 10,508

Re: Proc Import DBMS=CSV, incorrectly assigning informats mmddyy10. instead of ddmmyy10.

If the many csv files are all the same format then it is not a lot of code. You only change the input file reference and the outpu data set name once it is working.

You didn't say how many records your files have, you might try a much larger value than 500 for guessingrows, especially if your data is sorted by the date.

Personally I only use proc import on CSV files to build a program to modify especially if there are more than one input file of the same layout to process. Generally I set text fields such as names, addresses, account identifiers or other to about 25% longer than found in the first file because such things tend to vary from file to file and if I guess right I have all of the data with the same lengths which can be important if the files ever get combined (since they are the same layout the likelihood is pretty large).

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 1005 views
  • 7 likes
  • 7 in conversation