Having an issue importing a csv...
simple 3 columns, char, num,num
its a HUGE file... almost 2 million rows.. small sample
1103719801,0.27,
1103719901,1.81,
1103720001,1.99,
1103720101,1.88,
1107846101,321.02,2140.16
1103729801,0.27,
Note that the 3rd column often as nothing..
DATA PMA,
INFILE '/home/xxxx/PMAccrual.csv' delimiter=',' firstobs=2;
INFORMAT Key $10.;
INFORMAT Accrual 8.2;
INFORMAT UpdateAccrual 8.2;
INPUT Key $ Accrual UpateAccrual;
RUN;
Key and Accrual look fine. but Update Accrual is all .
nothing is importing for the 3rd column...
So your later post clarified the real issue, which is that you are reading a file with Windows/DOS end of line characters (CRLF) on a system that is using Unix style end of line characters (LF). This is causing SAS to threat the carriage return ('0D'x) as data. Since your last variable on the line is numeric then that caused SAS to not be able to translate the characters to a number.
Add the TERMSTR=CRLF option to the INFILE statement and see if that helps.
Also add TRUNCOVER to keep SAS from running to the next line for a value when the last value is missing.
Don't specify a decimal value in a INFORMAT unless your text files has remove the dots from the value and you want to tell SAS you know where the dots used to be. Isn't that a line from Guys and Dolls? If you tell SAS to use 8.2 and the value being read is '1234' then SAS will interpret that as 12.34.
Also don't use INFORMAT when you mean LENGTH.
When you have already defined the variables type there is no need to add $ in the INPUT statement.
DATA PMA,
INFILE '/home/xxxx/PMAccrual.csv' dsd firstobs=2 truncover termstr=crlf;
LENGTH Key $10 Accrual 8 UpdateAccrual 8;
INPUT Key Accrual UpateAccrual;
RUN;
Please do not code in upper case, its the equivalent of shouting at us. Now hard to tell from that snippet, but two simple changes to your code should make it work:
data pma; infile "s:\temp\x.csv" dlm="," firstobs=2 dsd; informat key $10.; informat accrual 8.2; informat updateaccrual 8.2; input key $ accrual updateaccrual; run;
Change 1 - put dsd so when the variable is missing it moves on - otherwise the next key will appear in the variable - this is likely what is breaking your code.
Change 2 - you have mispelt updateacrrual in the input, so your reading a different variable than the input.
I would also recommend adding TRUNCOVER to the INFILE statement.
The updateaccrual was just a spelling issue (i didnt copy/paste) 🙂
sorry caps (I like to do most keywords in caps)
but, Neither worked. Still has a null value...
It works fine for the example posted, does it work for you for the sample? If not, please post your code and log - for the actual file not this test code.
data want;
infile cards dlm="," truncover dsd;
informat key $10.;
informat accrual 8.;
informat updateaccrual 8.;
input key $ accrual updateaccrual;
cards;
1103719801,0.27,
1103719901,1.81,
1103720001,1.99,
1103720101,1.88,
1107846101,321.02,2140.16
1103729801,0.27,
;
run;
Seriously, don't know where the whole upcase coding has come from but it really just sounds like shouting. Anyways for the sample you provided the code works fine, you need to supply a sample that doesn't work as expected so we can see. Maybe the informat length is not great enough, maybe there is a special character in the data?
starting to wonder that...
NOTE: Invalid data for UpdateAccrual in line 2 10-14.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 CHAR AAA,0.00,0.00. 14
ZONE 44423233232330
NUMR 111C0E00C0E00D
Key=AAA Accrual=0 UpdateAccrual=. _ERROR_=1 _N_=1
continues on line 3,4,5,6,etc...
Key,Accrual,UpdatedAccrual
AAA,0.00,0.00
1005260501,0.00,
1004833501,1239.00,
1004805701,552.00,
1004802501,1446.00,
1004802601,1291.00,
I put that first line in just to test...
Your data looks wrong - I can't really tell as you don't want to post it, but this:
2 CHAR AAA,0.00,0.00. 14
^ ^
Doesn't look right, you cant have two decimal points, nor can you have spaces in a value.
that last . is not in the file
Key,Accrual,UpdatedAccrual
1005260501,0.00,
1004833501,1239.00,
1004805701,552.00,
1004802501,1446.00,
1004802601,1291.00,
1004802701,1084.00,
1004802801,6337.00,
1004802901,2480.00,
NOTE: Invalid data for UpdateAccrual in line 2 17-17.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 CHAR 1005260501,0.00,. 17
ZONE 33333333332323320
NUMR 1005260501C0E00CD
Key=1005260501 Accrual=0 UpdateAccrual=. _ERROR_=1 _N_=1
could there be a char on each line im not seeing?
When the log shows
@StaticFX wrote:
starting to wonder that...
NOTE: Invalid data for UpdateAccrual in line 2 10-14.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0
2 CHAR AAA,0.00,0.00. 14
ZONE 44423233232330
NUMR 111C0E00C0E00D
Key=AAA Accrual=0 UpdateAccrual=. _ERROR_=1 _N_=1
Notice there is a second . in the last value: 0.00. So SAS thinks the value is character. It is not impossible that you are getting "invisible" characters if copying and pasting from some sources.
If your data is not actually sensitive either paste a few lines into a code box opened using the forum {I} menu icon. The main window reformats text and may insert characters. The code box is plain text.
Or create a txt file with a few lines and attach it, though the code box is better so that folks can see directly what you have without downloading attachments.
So your later post clarified the real issue, which is that you are reading a file with Windows/DOS end of line characters (CRLF) on a system that is using Unix style end of line characters (LF). This is causing SAS to threat the carriage return ('0D'x) as data. Since your last variable on the line is numeric then that caused SAS to not be able to translate the characters to a number.
Add the TERMSTR=CRLF option to the INFILE statement and see if that helps.
Also add TRUNCOVER to keep SAS from running to the next line for a value when the last value is missing.
Don't specify a decimal value in a INFORMAT unless your text files has remove the dots from the value and you want to tell SAS you know where the dots used to be. Isn't that a line from Guys and Dolls? If you tell SAS to use 8.2 and the value being read is '1234' then SAS will interpret that as 12.34.
Also don't use INFORMAT when you mean LENGTH.
When you have already defined the variables type there is no need to add $ in the INPUT statement.
DATA PMA,
INFILE '/home/xxxx/PMAccrual.csv' dsd firstobs=2 truncover termstr=crlf;
LENGTH Key $10 Accrual 8 UpdateAccrual 8;
INPUT Key Accrual UpateAccrual;
RUN;
@StaticFX wrote:
Bingo! THANK YOU !!!
whats the diff between Informat and Length?
You use the LENGTH statement to define a variable. You tell it the type (character or numeric) and the number of bytes that you want it to use. You use the INFORMAT statement to attach an informat to the variable. An informat is instructions on how to convert text to the value to be stored.
The reason is mainly works to use INFORMAT when you mean LENGTH is because SAS will define a variable based on how you first use it. So if the first use is in the INFORMAT (or FORMAT) statement then SAS will make a guess at what type of variable you wanted to create based on the informat (or format) you are attaching to it.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.