SASFanTodd
Calcite | Level 5

Hello All,

I'm very new to SAS (I started training in EG 5.1 today), and this may be putting the cart before the horse, but I need to ask anyway.  I need to import .txt files that are healthcare claims files in 837P/X12 format.  Within each file, segments/records are separated by a "~".  However, when I try to import a file in EG using "~" as the delimiter, it tells me I cannot.  Any help would be appreciated.

ballardw
Super User

How are you attempting to import the files?  Showing a copy of the log entry with the error would be helpful.

Also, are all of the files in the same layout? Should all of the results go to a single data set, or does each file need to create a separate data set?

You may want to post a couple of records with any sensitive information replaced with XXX or similar.

SASFanTodd
Calcite | Level 5

My apologies for the lack of detail.  An 837/x12 file is basically 1 long string of data containing segments/records which can be delimited by a "~" (in most cases) and looks something like this "ISA*00*          *00*        xxxxxxxxxx~GS*xxx    xxxxxx      xxxxxxxxxxxxx           xxx~ST*      xxxxxx~BHT*xxxxx     *xxxxxxxxxxxxxxxxxxxxxxxxxxxx  xxxxxxxxxxxxxxxxxxxxx~........and so on..............~IEA*xxxxxxxxxxx~" with each file beginning with an ISA segment/record and ending with an IEA segment/record.

As you can see, each segment/record is very different and contains different types of data, which need to be parsed out based on how the segment begins (i.e. ISA, GS, ST, etc).  I only know enough SQL to be dangerous, but I was able to load the files in SQL by having a table with 1 field/column/variable, let's say 1000 characters long.  My load package had each column delimited by the ~ and worked just fine, with 1 record per input segment/record.  Currently, we do not have SQL connected to SAS, and my company would like to avoid that if possible; however, we may not be able to if SAS cannot load this directly.  As I stated in my original post, we are brand new to SAS and this is my first attempt at this.  We knew this might be a sticky spot, so we wanted to try it as soon as possible.

A little extra background - We are a small shop with just myself and another person as developers.  Like I said, I have just enough SQL knowledge to be dangerous and basically just used it to house tables which our previous BI tool (WebFOCUS) would use.  SQL just contained the standalone tables, and we used the BI tool to create datamarts and such for reporting against.  The company would like to avoid the SQL part if at all possible.

Thanks Much!

TomKari
Onyx | Level 15

SAS is very good at this kind of transformation, so in the long run you won't have any problems.

However, it is probably beyond the scope of this support community.

I would suggest that you post it under "SAS Drug Development" and "SAS in Health Care Related Fields". Someone who monitors those groups may be aware of some code that already exists that can transform this data as you require.

Also, a general internet search for software that can input an 837/x12 file into SAS might point you in a good direction. This appears to be a fairly standard data format, so somebody has probably already done this. (However, they might charge...sigh!)

Tom

Tom
Super User

It does not look like it is any different than reading any other complex data format.  But the description of the format at Statewide Planning and Research Cooperative System does not look very complete to me.  They seem to describe the fields, but not the records.  Perhaps the fields are fixed length.

In general, if you have a raw data file where the content of the row varies, then you usually want to read in the beginning of the row to determine the record type and then read the rest of the row based on the record type.

input type $ 1-3 @ ;

if type='ISA' then input .... ;

else if type='xxx' then input .... ;
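
As a rough illustration of the same "read the type first, then branch" idea applied to "~"-delimited segments, here is a minimal sketch. It swaps the column-based INPUT above for SCAN on an already-read segment, since the sample in this thread separates elements with "*". The inline data, the variable names (seg_id, elem1, elem2) and the choice of which elements to keep are all made up for the example and are not a real 837 layout.

```sas
/* Sketch only: split a made-up X12-style string into segments, */
/* then branch on the segment ID found before the first "*".    */
data segments;
  infile datalines dlm='~';
  length f1 $ 500 seg_id $ 3 elem1 elem2 $ 50;
  input f1 @@;                     /* one "~"-delimited segment per observation */
  if missing(f1) then delete;      /* guard against a blank piece after the last "~" */
  seg_id = scan(f1, 1, '*');       /* text before the first "*" */
  select (seg_id);
    when ('ISA') do;               /* hypothetical: keep two elements for ISA */
      elem1 = scan(f1, 2, '*', 'm');
      elem2 = scan(f1, 3, '*', 'm');
    end;
    when ('GS') elem1 = scan(f1, 2, '*', 'm');
    otherwise;                     /* other segment types are left unparsed here */
  end;
datalines;
ISA*00*AAA~GS*HC*BBB~ST*837*0001~IEA*1*CCC~
;
run;
```

The 'm' modifier makes SCAN treat consecutive "*" characters as marking an empty element rather than collapsing them, which matters if element positions are significant.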

SASFanTodd
Calcite | Level 5

I was able to figure out the solution (there is a "however"), which as it turns out was quite simple. The code is as follows:

data output_file;

     infile 'input_file'

          recfm=n

          dlm='7E'x

          dsd;

     length f1 $ 500;

     input f1 @;

run;

However....there is always a however. The next issue is that I have multiple such files in a folder which I would like to load and put together.  Please keep in mind that new files are constantly being added to the folder and need to be loaded as well, on a monthly basis.  I'm using the following code:

%let dirname = s:\input_file_folder;

filename dirlist pipe "dir /b &dirname\*.txt";

data dirlist;

     length fname $256;

     infile dirlist length=reclen;

     input fname $varying256. reclen;

run;

data output_file;

     set dirlist;

     filepath = "&dirname\"||fname;

     infile dummy filevar=filepath

          lrecl=10000000

          termstr=crlf

          recfm=n

          dsd

          dlm='7E'x

          end=end;

     do while(not end);

          length f1 $ 500;

          input f1 @;

          output;

     end;

run;

Now, this will load the first file just fine but then I get an error stating the following in the log:

NOTE: Unexpected end of file for binary input.

NOTE: There were 1 observations read from the data set WORK.dirlist

.......

......

......

Keep in mind as well that each of these files is actually just 1 record, which could be anywhere from a few hundred bytes to what I think won't be larger than 10,000,000 bytes, and it does indeed end with the CRLF marker.

Again, I am quite new to SAS and am just learning the language.  Any help to get all the files to load together would be greatly appreciated.

Thanks in advance

Tom
Super User

You can just have the INFILE statement loop over the filenames without bothering to make another dataset of them. (Note it really only works with a single '*' wildcard in the string.)

data output_file;

  infile "&dirname\*.txt" recfm=n dlm='7E'x dsd;

  length f1 $ 500;

  input f1 @;

run;

If you want to save the file names then you can use the FILENAME option on the INFILE statement.  You need two variables because the one referenced in the INFILE statement will be dropped by SAS.


data output_file;

  length filename fname f1 $ 500;

  infile "&dirname\*.txt" recfm=n dlm='7E'x dsd filename=fname;

  input f1 @;

  filename=fname;

run;
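
Building on the wildcard approach, the following untested sketch shows one way the loaded segments might be tidied up. It rests on assumptions that are not confirmed anywhere in this thread: that with RECFM=N the CRLF at the end of one file ends up attached to the next value read, that only the short file name (not the full path) is wanted, and that a seg_id variable (a name invented here) holding the text before the first "*" is useful for later parsing.

```sas
%let dirname = s:\input_file_folder;   /* folder name taken from the thread */

data output_file;
  length filename fname f1 $ 500 seg_id $ 3;
  infile "&dirname\*.txt" recfm=n dlm='7E'x dsd filename=fname;
  input f1 @;
  f1 = compress(f1, '0D0A'x);          /* assumption: strip CR/LF left over from a file's end */
  if missing(f1) then delete;          /* skip a piece that is empty once CR/LF is removed */
  filename = scan(fname, -1, '\');     /* last "\"-separated part of the path = file name */
  seg_id = scan(f1, 1, '*');           /* segment ID, e.g. ISA, GS, ST */
run;
```

If the CR/LF assumption turns out not to hold for your files, the COMPRESS and DELETE lines simply have nothing to do and can be removed.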
