DATA Step, Macro, Functions and more

Reading date from input file

Accepted Solution Solved
Reply
Contributor
Posts: 58
Accepted Solution

Reading date from input file


Hi,

I have the following problem reading date from csv file.

I've a file where the first row is something like this:

AAA BBB CCC 12/2/2014

I'm running the following code

data _null_;

   infile "my.csv"

          dlm='09'x

          missover

          dsd

          lrecl=4096

          firstobs = 1

          ;

   input a $ b $ c $ d mmddyy10.;

   call symput ('mydate', put(d, yymmddn8.));

   stop;

run;

but it does not working expecially when in the date the month or the day have only one char. (es Februray is 2, not 02)

Do there in SAS a format to read date where month or day are written with the minumum number of char and not always with the 0 ahead?

Thanks


Accepted Solutions
Solution
‎02-25-2014 11:33 AM
Super User
Super User
Posts: 7,047

Re: Reading date from input file

So your problem is what data_null_ diagnosed. Because you switched from list mode to formatted input style in the middle of your INPUT statement SAS is confused by the TAB that is following the date field. Because you told it to read 10 characters it read the TAB also and this made for an invalid value for the informat.  If you are reading a delimited file stick with list input.

data want ;

   infile 'myfile.csv' dlm='09'x dsd truncover obs=1;

   length a b c $8 d 8 ;

   informat d mmddyy10.;

    format d yymmddn8. ;

   input a b c d ;

   call symputx('mydate',put(d,yymmddn8.));

run;

View solution in original post


All Replies
Super Contributor
Posts: 543

Re: Reading date from input file

Hi.

When I run this (a modified version of what you wrote):

data _null_;

   infile datalines dlm=" " missover  dsd;

   input a $ b $ c $ d mmddyy10.;

   format d yymmddn8.;

   call symput ('mydate', put(d, yymmddn8.));

   stop;

   datalines;

AAA BBB CCC 2/2/2014

run;

%put &mydate.;

I get the date right.

When I run this (your code):

data _null_;

   infile "my.csv"

/*   infile datalines*/

   dlm=" " missover  dsd firstobs = 1 lrecl = 4096;

   input a $ b $ c $ d mmddyy10.;

   format d yymmddn8.;

   call symput ('mydate', put(d, yymmddn8.));

   stop;

/*   datalines;*/

/*AAA BBB CCC 2/2/2014*/

run;

%put &mydate.;

It doesn't work....because it reads all sorts of gibberish from the file, as you can see by modifying your data _NULL_ to a data step:

data step;

   infile "my.csv"

   dlm=" " missover  dsd firstobs = 1 lrecl = 4096;

   input a $ b $ c $ d mmddyy10.;

   format d yymmddn8.;

run;

proc print;run;

I am curious what's happening.

Do you get weird characters as well? or it's just me.

Anca.

Super User
Super User
Posts: 7,047

Re: Reading date from input file

Posted in reply to AncaTilea

Do you just want to take the last field in the input file and convert it to a string of digits in yyyymmdd format?

data _null_;

  infile 'myfile' obs=1 ;

  input;

  call symputx('mydate',put(input(scan(_infile_,-1,'090D0A20'x),mmddyy10.),yymmddn8.));

run;

Contributor
Posts: 58

Re: Reading date from input file

Here you are what my file contains:

I tahke it from the log running

data _null_;

   infile "my.csv" ;

   input;

   list;

run;

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----

1   CHAR  AAAAA:.BBBBBB.CCCCCCC.2/2/2014.. 32
    ZONE  44444304444440444444403232333300
    NUMR  11111A92222229333333392F2F20149D

the code you before provided me works, I mean I can get the date but I lost the other fields contained in the line.

The initial gol was easly (how happen in the other languageSmiley Happy) read the complete line.

But I will split my goal in two steps, read the first three fields and than the last one using your code...

Many thanks!

Solution
‎02-25-2014 11:33 AM
Super User
Super User
Posts: 7,047

Re: Reading date from input file

So your problem is what data_null_ diagnosed. Because you switched from list mode to formatted input style in the middle of your INPUT statement SAS is confused by the TAB that is following the date field. Because you told it to read 10 characters it read the TAB also and this made for an invalid value for the informat.  If you are reading a delimited file stick with list input.

data want ;

   infile 'myfile.csv' dlm='09'x dsd truncover obs=1;

   length a b c $8 d 8 ;

   informat d mmddyy10.;

    format d yymmddn8. ;

   input a b c d ;

   call symputx('mydate',put(d,yymmddn8.));

run;

Contributor
Posts: 58

Re: Reading date from input file

On my side your code works also if I write

data _null_...

but it is not a problem on the reason...I think my brain works in a way more different than SAS, anyway I couldn't understand the reason...Smiley Happy

many thanks

Super User
Super User
Posts: 7,047

Re: Reading date from input file

The special keyword _NULL_ when used as the dataset name for a DATA step just means that no real dataset is created.  So you can use all of the programming features of the DATA step but without actually creating a dataset.

Other trick I included was the OBS=1 option on the INFILE since you appeared to only want to read the first line.  You had added a STOP statement to halt the data step.

Normally with SAS you do not need to worry about the number of lines in a file as it will automatically loop over the data step code and stop when it runs out of data.  So if your input file really only has one line neither the STOP or the OBS=1 is needed. And if it doesn't have only one line then perhaps there is something strange going on with your data.

Super User
Posts: 5,429

Re: Reading date from input file

ddmmyy10. can read dates with 8 positions. I thin rather you are running to some issue due to end of record.

One work-around is to read as char (specify length $10 first), then use the input function to create your SAS date variable.

The other is to lift out the 10 part of the informat, but then you need to rely on the YEARCUTOFF= option.

Data never sleeps
Super User
Super User
Posts: 7,047

Re: Reading date from input file

See what happens when you replace MISSOVER with the more appropriate TRUNCOVER. It is almost never right to use MISSOVER as you usually do not want to throw away short fields.

Also you might have end-of-line issues.  If you are reading a file created on a PC on a UNIX system by default SAS will treat the extra Carriage Return ('0D'X) as data and this will mess up the data for the last column.  You can try adding the TERMSTR=CRLF option to your INFILE statement.

Contributor
Posts: 58

Re: Reading date from input file

HI Tom,

Thanks for you almost  useful reply.

I replaced MISSOVER with TRUNCOVER but never changed.

I also confirm that a Carriage Return is present at the end of the line but adding the option TERMSTR=CRLF to infile statement haven't changed so much: it still doesn't work.

To be honest I'm quite new in SAS, but some things that with some other language we can do in few minutes with SAS we need a lot of time and a lot of tries to make it working....also if you ask to very SAS expert...it seems that there isn't logical way to get the things working, only try and try and try...


Super User
Super User
Posts: 7,047

Re: Reading date from input file

Of course I find the opposite, things that are simple to do with SAS take forever to figure out how to do with less powerful languages with their strange syntax rules.

Another thing to check is does your file really have tabs as the delimiter?  The example you posted just has spaces between the values.

Also why not let SAS to show you what is actually in your file.  You can use the LIST statement to get SAS to dump the lines from your input file to the log.

data _null_;

   infile "my.csv" ;

   input;

   list;

run;

You can then see if there are tabs ('09'X) or CR ('0D'x) or other strange things like 'A0'X that can get into files, especially if they have been corrupted by Microsoft products.

Respected Advisor
Posts: 3,799

Re: Reading date from input file

I think the problem is exacerbated by mixing input "types".  The OPs input statement changes from LIST input to formatted input at the last variable.  This "switch" is very subtle and not well understood it would seem.  Formatted input is rarely needed when reading delimited fields.

Valued Guide
Posts: 2,177

Re: Reading date from input file

running the following code

data _null_;

   infile "my.csv"

          dlm='09'x

          missover

          dsd

          lrecl=4096

          firstobs = 1

          ;

   input a $ b $ c $ d mmddyy10.;

   call symput ('mydate', put(d, yymmddn8.));

   stop;

run;

but it does not working expecially when in the date the month or the day have only one char. (es Februray is 2, not 02)

as @Tom pointed out the explicit (or implied) width for the informat resulted in the TAB being treated as data rather than delimiter. As usual the INPUT statement has a feature to support what you want to do. It is the "informat modifier" :

Just adapt your input statement to

input a $ b $ c $ d   :  mmddyy10.;


🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 1510 views
  • 0 likes
  • 6 in conversation