INFILE problem

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 19
Accepted Solution

INFILE problem

Having an issue importing a csv... 

simple 3 columns, char, num,num

its a HUGE file... almost 2 million rows.. small sample

 

1103719801,0.27,

1103719901,1.81,

1103720001,1.99,

1103720101,1.88,

1107846101,321.02,2140.16

1103729801,0.27,

 

Note that the 3rd column often as nothing.. 

 

DATA PMA,

    INFILE '/home/xxxx/PMAccrual.csv' delimiter=',' firstobs=2;

    INFORMAT Key $10.;

    INFORMAT Accrual 8.2;

    INFORMAT UpdateAccrual 8.2;

    INPUT Key $ Accrual UpateAccrual;

RUN;

Key and Accrual look fine. but Update Accrual is all .   

nothing is importing for the 3rd column...

 


Accepted Solutions
Solution
‎11-17-2017 11:30 AM
Super User
Super User
Posts: 7,393

Re: INFILE problem

[ Edited ]

So your later post clarified the real issue, which is that you are reading a file with Windows/DOS end of line characters (CRLF) on a system that is using Unix style end of line characters (LF).  This is causing SAS to threat the carriage return ('0D'x) as data. Since your last variable on the line is numeric then that caused SAS to not be able to translate the characters to a number.

 

Add the TERMSTR=CRLF option to the INFILE statement and see if that helps.

Also add TRUNCOVER to keep SAS from running to the next line for a value when the last value is missing.

Don't specify a decimal value in a INFORMAT unless your text files has remove the dots from the value and you want to tell SAS you know where the dots used to be.  Isn't that a line from Guys and Dolls?  If you tell SAS to use 8.2 and the value being read is '1234' then SAS will interpret that as 12.34.

Also don't use INFORMAT when you mean LENGTH.

When you have already defined the variables type there is no need to add $ in the INPUT statement.

DATA PMA,
    INFILE '/home/xxxx/PMAccrual.csv' dsd firstobs=2 truncover termstr=crlf;
    LENGTH Key $10 Accrual 8 UpdateAccrual  8;
    INPUT Key Accrual UpateAccrual;
RUN;

 

 

View solution in original post


All Replies
Super User
Super User
Posts: 8,634

Re: INFILE problem

Please do not code in upper case, its the equivalent of shouting at us.  Now hard to tell from that snippet, but two simple changes to your code should make it work:

data pma;
  infile "s:\temp\x.csv" dlm="," firstobs=2 dsd;
  informat key $10.;
  informat accrual 8.2;
  informat updateaccrual 8.2;
  input key $ accrual updateaccrual;
run;

Change 1 - put dsd so when the variable is missing it moves on - otherwise the next key will appear in the variable - this is likely what is breaking your code.

Change 2 - you have mispelt updateacrrual in the input, so your reading a different variable than the input.

Super User
Posts: 21,527

Re: INFILE problem

I would also recommend adding TRUNCOVER to the INFILE statement. 

Occasional Contributor
Posts: 19

Re: INFILE problem

The updateaccrual was just a spelling issue (i didnt copy/paste) Smiley Happy

 

sorry caps (I like to do most keywords in caps)

 

but, Neither worked. Still has a null value...

Super User
Posts: 21,527

Re: INFILE problem

It works fine for the example posted, does it work for you for the sample? If not, please post your code and log - for the actual file not this test code. 

 

 data want;
 infile cards dlm="," truncover dsd;
  informat key $10.;
  informat accrual 8.;
  informat updateaccrual 8.;
  input key $ accrual updateaccrual;
  cards;
1103719801,0.27,
1103719901,1.81,
1103720001,1.99,
1103720101,1.88,
1107846101,321.02,2140.16
1103729801,0.27,
;
run;
Super User
Super User
Posts: 8,634

Re: INFILE problem

Seriously, don't know where the whole upcase coding has come from but it really just sounds like shouting.  Anyways for the sample you provided the code works fine, you need to supply a sample that doesn't work as expected so we can see.  Maybe the informat length is not great enough, maybe there is a special character in the data?

Occasional Contributor
Posts: 19

Re: INFILE problem

starting to wonder that...

 

NOTE: Invalid data for UpdateAccrual in line 2 10-14.

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0

2 CHAR AAA,0.00,0.00. 14

ZONE 44423233232330

NUMR 111C0E00C0E00D

Key=AAA Accrual=0 UpdateAccrual=. _ERROR_=1 _N_=1

 

continues on line 3,4,5,6,etc...

 

 

Key,Accrual,UpdatedAccrual
AAA,0.00,0.00
1005260501,0.00,
1004833501,1239.00,
1004805701,552.00,
1004802501,1446.00,
1004802601,1291.00,

 

I put that first line in just to test...

Super User
Super User
Posts: 8,634

Re: INFILE problem

Your data looks wrong - I can't really tell as you don't want to post it, but this:
2 CHAR AAA,0.00,0.00. 14

                              ^          ^

Doesn't look right, you cant have two decimal points, nor can you have spaces in a value.

Occasional Contributor
Posts: 19

Re: INFILE problem

that last . is not in the file

 

Key,Accrual,UpdatedAccrual
1005260501,0.00,
1004833501,1239.00,
1004805701,552.00,
1004802501,1446.00,
1004802601,1291.00,
1004802701,1084.00,
1004802801,6337.00,
1004802901,2480.00,

 

NOTE: Invalid data for UpdateAccrual in line 2 17-17.

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0

2 CHAR 1005260501,0.00,. 17

ZONE 33333333332323320

NUMR 1005260501C0E00CD

Key=1005260501 Accrual=0 UpdateAccrual=. _ERROR_=1 _N_=1

 

could there be a char on each line im not seeing?

Super User
Posts: 12,148

Re: INFILE problem

When the log shows


StaticFX wrote:

starting to wonder that...

 

NOTE: Invalid data for UpdateAccrual in line 2 10-14.

RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0

2 CHAR AAA,0.00,0.00. 14

ZONE 44423233232330

NUMR 111C0E00C0E00D

Key=AAA Accrual=0 UpdateAccrual=. _ERROR_=1 _N_=1

 

 


Notice there is a second . in the last value:  0.00. So SAS thinks the value is character. It is not impossible that you are getting "invisible" characters if copying and pasting from some sources.

 

If your data is not actually sensitive either paste a few lines into a code box opened using the forum {I} menu icon. The main window reformats text and may insert characters. The code box is plain text.

Or create a txt file with a few lines and attach it, though the code box is better so that folks can see directly what you have without downloading attachments.

Solution
‎11-17-2017 11:30 AM
Super User
Super User
Posts: 7,393

Re: INFILE problem

[ Edited ]

So your later post clarified the real issue, which is that you are reading a file with Windows/DOS end of line characters (CRLF) on a system that is using Unix style end of line characters (LF).  This is causing SAS to threat the carriage return ('0D'x) as data. Since your last variable on the line is numeric then that caused SAS to not be able to translate the characters to a number.

 

Add the TERMSTR=CRLF option to the INFILE statement and see if that helps.

Also add TRUNCOVER to keep SAS from running to the next line for a value when the last value is missing.

Don't specify a decimal value in a INFORMAT unless your text files has remove the dots from the value and you want to tell SAS you know where the dots used to be.  Isn't that a line from Guys and Dolls?  If you tell SAS to use 8.2 and the value being read is '1234' then SAS will interpret that as 12.34.

Also don't use INFORMAT when you mean LENGTH.

When you have already defined the variables type there is no need to add $ in the INPUT statement.

DATA PMA,
    INFILE '/home/xxxx/PMAccrual.csv' dsd firstobs=2 truncover termstr=crlf;
    LENGTH Key $10 Accrual 8 UpdateAccrual  8;
    INPUT Key Accrual UpateAccrual;
RUN;

 

 

Occasional Contributor
Posts: 19

Re: INFILE problem

Bingo! THANK YOU !!!
whats the diff between Informat and Length?
Super User
Super User
Posts: 7,393

Re: INFILE problem


StaticFX wrote:
Bingo! THANK YOU !!!
whats the diff between Informat and Length?

You use the LENGTH statement to define a variable. You tell it the type (character or numeric) and the number of bytes that you want it to use. You use the INFORMAT statement to attach an informat to the variable. An informat is instructions on how to convert text to the value to be stored.

 

The reason is mainly works to use INFORMAT when you mean LENGTH is because SAS will define a variable based on how you first use it.  So if the first use is in the INFORMAT (or FORMAT) statement then SAS will make a guess at what type of variable you wanted to create based on the informat (or format) you are attaching to it.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 12 replies
  • 230 views
  • 0 likes
  • 5 in conversation