DATA Step, Macro, Functions and more

Issues reading tab-delimited file on z/os

Reply
Contributor
Posts: 32

Issues reading tab-delimited file on z/os

I have a tab-delimited file which was created in Excel 2007 under Windows which looks as follows (portion shown, tabs represented by a T)
95008T 68106T 44970T 7001T
95014T 100020T 54412T 2643T

The file is uploaded to the mainframe and I have a SAS program which reads
in the file as follows:

filename census 'CENSUS.DATA' disp=shr;

Data Census;
infile census dlm='05'X expandtabs missover lrecl=300

input zipcode $5.
median_income 8.
population 8.
pop_density 8.;

run;

The code runs fine EXCEPT for lines where median_income is 100000 and higher (six digits vs.five). For those lines, SAS then complains there is invalid data for the population field and truncates the median_income field (the value should be 100020)

NOTE: Invalid data for population in line 143 14-21.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6--
143 95014 100020 54412 2643
zipcode=95014 med_Income=10002 population=. pop_density=2643

I have tried not using the expandtabs option and the file does not read in properly at all.
Super Contributor
Super Contributor
Posts: 3,174

Re: Issues reading tab-delimited file on z/os

Suggest you dump the EXPANDTABS parameter, which is causing your $5 zip-code field to be expanded to $8, resulting in your data-truncation. Otherwise, you will need to account for the adjusted field-width with each of your input data-source locations.

Scott Barry
SBBWorks, Inc.
Contributor
Posts: 32

Re: Issues reading tab-delimited file on z/os

I'll look into that, thanks for the quick response, Scott.
Frequent Contributor
Posts: 106

Re: Issues reading tab-delimited file on z/os

Putting the lengths into a LENGTH statement might solve your problem.
Ran the following (with your T delimiters) on z/OS batch:

[pre]

Data Census;
infile census dlm='T' expandtabs missover /*lrecl=300*/ ;

length zipcode $5 median_income population pop_density 8 ;

input zipcode
median_income
population
pop_density ;

run;

[/pre]

Robert
Contributor
Posts: 29

Re: Issues reading tab-delimited file on z/os

When I use a tab delimited file on the mainframe, I use LIST input and tell SAS the maximum length each variable can be. If the length is shorter than the max, SAS will allow for it. I code like this:
Data Temp;
infile in dlm='05'x dsd trancover;
input v1 : $45.
v2 : $10.
v3 : 12.
etc ;

The colon : allows you to read the value from the next nonblank column until the pointer reaches the next blank column, in this case reading from tab to tab.
Contributor
Posts: 29

Re: Issues reading tab-delimited file on z/os

Sorry, I should have TRUNCOVER instead of trancover
Valued Guide
Posts: 2,175

Re: Issues reading tab-delimited file on z/os

hi DavidJ

your best direction is coming from "garybald"

tab-delimited and csv-type data files have no fixed structure, as implied by explicit lengths on an input statement.
To support these data types SAS data steps have the features "list input" and DSD.
It is DSD which has the biggest effect because "normal" list input would expect "white space" to be the "delimiter" between data items. For that way. multiple blanks are ignored and treated as "one delimiter".
For DSD (I like to interpret as Delimiter Separated Data) each blank is a delimiter and consecutive delimiters imply a missing value. This DSD feature was introduced coinciding with the introduction of SAS on ms-windows platforms, so it is not really a new feature- but doesn't seem to be among basic SAS training, judging by the number of discussions about how to handle delimiter separated data.

Any time you have to deal with delimited data (csv, tab or pipe or whatever) remember to use DSD.

good luck
peterC
Ask a Question
Discussion stats
  • 6 replies
  • 427 views
  • 0 likes
  • 5 in conversation