Issue working with comma delim file on Mainframes

Reply
Occasional Contributor
Posts: 13

Issue working with comma delim file on Mainframes

Hi,

I have a comma delimited file .csv created out of lotus notes db, I have renamed the file to .txt and I have FTPed the file to Mainframes. I am unable to read the file on Mainframes and receiving error invalid data on all the input lines.

Sample of the Input as seen on the Mainframe:

JReF No.,TRUS No.,TRUS Name,Trustees Bank,Stamp Code,Penaltys  Number,Type,PEN Amount ,Trust Amount ,Funds Held ,Total Penaltys     

57735,15286,ABCDE ABCDEDEFGHI COMPANY (UN),515,,35645311,L,$0.11 ,$0.00,$0.00,$0.11            

52715,55325,ABABCDEDFGHKI ADLADJFD SLK ,515,,31954258,L,$0.16 ,$0.00,$0.00,$0.16                                              

52715,55325,ABABCDEDFGHKI ADLADJFD SLK ,515,,31954258,L,$0.17 ,$0.00,$0.00,$0.17

11955,22956,LIONSDENIT INC. PP. LTD.,515,,35644876,L,$0.17 ,$0.00,$0.00,$0.17

My Code:

DATA UPLOAD1;                            

                                         

  INFILE UPLOAD DLM=',' DSD RECFM=VB LRECL=256; /* Please note, I also tried FTPing the file into a fixed 200 lrecl file and tried the same code */

                                         

  INPUT  FLD1-FLD10 :$20.;               

  PUT 'FLD1= '  FLD1;                    

  PUT 'FLD2= '  FLD2;                    

  PUT 'FLD3= '  FLD3;                    

                                         

RUN;                                     

SAS Log:

NOTE: Invalid data for FLD1 in line 1 1-8.                                     

NOTE: Invalid data for FLD2 in line 1 10-17.                                   

NOTE: Invalid data for FLD3 in line 1 19-27.                                   

2                                                          The SAS System      

                                                                               

NOTE: Invalid data for FLD4 in line 1 29-41.                                   

NOTE: Invalid data for FLD5 in line 1 43-52.                                   

NOTE: Invalid data for FLD6 in line 1 54-69.                                   

NOTE: Invalid data for FLD7 in line 1 71-74.                                   

NOTE: Invalid data for FLD8 in line 1 76-85.                                   

NOTE: Invalid data for FLD9 in line 1 88-99.                                   

FLD1= .                                                                        

FLD2= .                                                                        

FLD3= .                                                                        

RULE:     ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0

1         JReF No.,TRUS No.,TRUS Name,Trustees Bank,Stamp Code,Penaltys  Number,Type,PEN Amount ,Trust Amount 

     101  ,Funds Held ,Total Security  128                                     

FLD1=. FLD2=. FLD3=. FLD4=. FLD5=. FLD6=. FLD7=. FLD8=. FLD9=. FLD10=Funds Held

NOTE: Invalid data for FLD3 in line 2 13-42.                                   

NOTE: Invalid data for FLD7 in line 2 58-58.                                   

NOTE: Invalid data for FLD8 in line 2 60-64.                                   

NOTE: Invalid data for FLD9 in line 2 67-71.                                   

FLD1= 57735                                                                     

FLD2= 15286                                                                    

FLD3= .                                                                        

2         57735,15286,ABCDE ABCDEDEFGHI COMPANY (UN),515,,35645311,L,$0.11 ,$0.0

FLD1=57735 FLD2=15286 FLD3=. FLD4=515 FLD5=. FLD6=35645311 FLD7=. FLD8=. FLD9=. FL

Can anyone please help where I am doing the mistake?  I am using SAS 9 on Mainframes (z/os) .

Thank You!!

Rgds

HK_Endeavour Forever

Trusted Advisor
Posts: 3,215

Re: Issue working with comma delim file on Mainframes

Posted in reply to HK_EndeavourForever

There is nothing special for a  mainframe within this.

You uploaded the file successfully and all is still readable. the Ascii/Ebcdic byte translation is workable.
Using a VB (lrecl 32Kb bklsize=27998) is the best one for the record crlf convention translation, a lrecl=255 (common convention in Unix) is also good as long your records are not longer.

No zoned/number lines with not readable characters, you have mistakes in some generic issue. You could doe this also on your pc.

The file is coming in with te first record obviously being column names, You better drop this record and start with the second and after.

use the firstobs=2 to do that SAS(R) 9.3 Statements: Reference

You are reading 10 char variables Fld1-fld10. The default length is 8. This is what the column pointering is telling you. Every time increasing 8 positions, not using de comma as seperator. 

Better define variables with formats (attribute) and length you are needing. 2 numeric, 1 character (how long?) hopefully without special chars and without "," , 3? numeric, 1 character (length 1?), 3 numeric with dollar format. You are lucky with the decimal-point.  Hopefully no "," used with big ammounts. The "," is not a certain unique symbol. When conflicting you can move to the "tab" or ~ approach.

You are inputting list format. But is functioning as a formatted one. The list input is not that easy with specifying formats. The length of a format and the var is not working by conflicting logic.
SAS(R) 9.3 Statements: Reference (input list). The adjusted approach with fully specifying the variables in advance should solve this.  

---->-- ja karman --<-----
Super User
Posts: 7,844

Re: Issue working with comma delim file on Mainframes

Actually, the way he codes his example, the vars are defined as numeric, because of the :$20.

So he gets errors everytime there is non-numeric data.

The data step should look like:

data test;

infile '$HOME/intest.txt' dlm=',' dsd;

format fld1-fld20 $20.;

input fld1-fld20;

run;

I tested this on my AIX server, so the behaviour is platform-indepenent.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Super User
Posts: 10,041

Re: Issue working with comma delim file on Mainframes

Posted in reply to HK_EndeavourForever

use TRUNCOVER option ?

Trusted Advisor
Posts: 3,215

Re: Issue working with comma delim file on Mainframes

@ksharp, Why truncover?
It did read 9 numeric (8 long) and the 10-th as string (8 long). All according the coding, but as pitty not what was wanted. 
The postion/record lengt was 128 short enough to be under the 255.  

---->-- ja karman --<-----
Super User
Posts: 10,041

Re: Issue working with comma delim file on Mainframes

Just guess. :smileyplain:

Trusted Advisor
Posts: 3,215

Re: Issue working with comma delim file on Mainframes

@Ksharp, you are right to mention it. Those things are probably coming when he is proceeding with his data.

---->-- ja karman --<-----
Occasional Contributor
Posts: 13

Re: Issue working with comma delim file on Mainframes

Posted in reply to HK_EndeavourForever

Thank You Jaap and Ksharp for the suggestions!! After much struggling I realize that ',' delim will not just work because of the data challenges. For some records with character, I have "," in between the character string and that data is represented as "AAA, BBB CCC", after exporting it out of the lotus notes db. For other records do not have "," in them. And big amounts are also formatted in commas, so it is making difficult to read and format the data. I had to workaround with importing the data to MS Access and format it properly, convert to text file and upload to Mainframes as a fixed length, then I could read it.

Your suggestions would have really worked if the data were consistent, this is a limitation in SAS I guess. Character with spaces are also not being read properly, they are being set as missing values. If the default length is 8, atleast it should print the first 8 characters of the data ? And yeah, I do not intend to drop the header record during initial read, because I want to read every field as a character variable during initial read.

Super Contributor
Posts: 644

Re: Issue working with comma delim file on Mainframes

Posted in reply to HK_EndeavourForever

HK_EndeavourForever

You say "this is a limitation in SAS I guess" but it is really a limitation in the CSV conventions - because there is no standard and thus different implementations publish and resolve embedded characters differently.  It would be quite possible, though tedious, to write SAS code to import the kind of data you have, but if you are happy to add the extra Access steps in your process that is a good enough solution (for now!).


Richard

Trusted Advisor
Posts: 3,215

Re: Issue working with comma delim file on Mainframes

Posted in reply to HK_EndeavourForever

1/ using a delimiter like ~ (when unique) instead of comma will sole a lot.

2/ Embedded spaces and other characters can be read.

    SAS(R) 9.3 Formats and Informats: Reference (informats $quote )

3/ reading all data as character is possible  you must define the SAS variables and informats accordingly

    Reading them as numeric will result in showing missings.  That is what most likely has happened.
4/ You cannot change numeric/character type after having defined them. SAS is using a tabular approach like a RDBMS.
    That limitation is quite common with that.

---->-- ja karman --<-----
Super User
Posts: 7,844

Re: Issue working with comma delim file on Mainframes

Posted in reply to HK_EndeavourForever

Your problem does not stem from the fact you are working on a mainframe.

Your code makes SAS define the variables as numeric.

What you probably want is

format fld1-fld20 $20.; * this causes all vars to be character;

input fld1-fld20; * format not necessary;

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Ask a Question
Discussion stats
  • 10 replies
  • 500 views
  • 4 likes
  • 5 in conversation