SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

How to split a column from a text file into multiple columns?

Reply
Contributor
Posts: 55

How to split a column from a text file into multiple columns?

 

We a have a pipe-delimited text file where some fields are longer than 32767 characters. Our client wants to place the excess characters into another column. Example, the original column is called "COL1". If one row is found to have COL1 exceeding 32767 characters, the excess ones will be placed in "COL2" and so on. Can this be possible?

 

Respected Advisor
Posts: 4,173

Re: How to split a column from a text file into multiple columns?

Posted in reply to angeliquec

It is possible to do but it's very painful to code and it will have a heavy impact on extract performance.

 

You would need to read your file as data stream using "recfm=N" and then read character by character until you either reach 32KB, your pipe delimiter or an end-of-line indicator (like '0A'x for a line feed); and then populate your target variables accordingly. So quite a bit of processing and cumbersome coding.

 

I've had similar requirements in past projects. Whenever I started asking "why" and "how do you intend to use this data" and also explained what it means in regards of storage required and implementation effort, such requirements got dropped.

Super User
Super User
Posts: 7,997

Re: How to split a column from a text file into multiple columns?

I would totally agree with you here Patrick.  The problem lies at data source, if the OP starts "fixing" it here it becomes another source then and opens another set of issues.  For instance, even at a basic level the data which is generated needs to be throughly checked against the source data.  The question should be asked why the data is exceeding that length, as that is a huge chunk.  Someone has obviously made a boobo somewhere along the line, or not coded/split something up correctly.

Respected Advisor
Posts: 4,173

Re: How to split a column from a text file into multiple columns?

[ Edited ]

I would assume it's an extract from a database CLOB field (like free text applications or claims or attachments).

 

There can be reasons for a business user to want such data, eg. for searches or text analytics, but: A CLOB field can also get exported into separate files stored in a content server only storing the links to the content server in SAS files.

 

In real life I've been only in one project so far where we really needed to provide access to such files (for SAS Intelligence Management).

Ask a Question
Discussion stats
  • 3 replies
  • 695 views
  • 1 like
  • 3 in conversation