DATA Step, Macro, Functions and more

reading pipe delimited file with multiple record layouts

Accepted Solution Solved
Reply
New Contributor
Posts: 3
Accepted Solution

reading pipe delimited file with multiple record layouts

Hello.  I am trying to read in pipe delimited data that has multiple record layouts.  I have tried several methods that get me close but with no success.  There are 3 issues I am encountering with the data (1) There are no end-of-record pipes (2) The first field will determing the remaining record layouts (3) I do not have position information for the fields, they are only seperated by "|".  Below is the first method I have tried in SAS 9.4

 

Method 1:

data want;

  infile indata dlm='|' firstobs=1;

  input @1 rec_type $2.;

  if rec_type='RA';

  input rec_type $2.

           var2        $2

;

run;

 

Method 2:

data want    ;
  %let _EFIERR_ = 0; /* set the ERROR detection macro variable */
  infile indata
  delimiter = '|'  DSD lrecl=32767 ;
   informat rec_type $2.;
   input  rec_type $;
   if   rec_type='RA' then do;

  informat var2 $2.;

  format var2 ;

end;

run;

 

My problem is trying to address the changing record layout and telling says where the end of the record is. Attached is a sample of the data file I am working with.

 


Accepted Solutions
Solution
‎02-07-2017 09:38 AM
Super User
Super User
Posts: 6,502

Re: reading pipe delimited file with multiple record layouts

You can read that data dictionary into a SAS dataset and use it to write the SAS program. You will need to convert descriptive names into valid SAS variable names. You will need to convert the data types into SAS variable types and at least for the date variables also FORMAT and INFORMAT settings.  

 

Also to make it easier you might want to just write a separate data step for each record type, rather than doing it all in one data step. The program will be easier to write and understand. And unless you have a hugh data file it shouldn't take long to read it multiple times.

 

data ra ;
   infile 'myfile.txt' dsd dlm='|' truncover ;
   input record_type $ @ ;
   if record_type='RA';
   .....
run;

data rs ;
   infile 'myfile.txt' dsd dlm='|' truncover ;
   input record_type $ @ ;
   if record_type='RS';
   .....
run;

....

View solution in original post


All Replies
Super User
Posts: 10,516

Re: reading pipe delimited file with multiple record layouts

Did anyone give you a data dictionary or file layout document? If not, you will need to tell us in excrutiating detail how you know which record format descripter is used for any given line of data and what any given "column" characterstics should be.

If the same "variables" meaning they will have the same variable name occurs in different row layouts then you will need to specify that as well.

New Contributor
Posts: 3

Re: reading pipe delimited file with multiple record layouts

Attached are the field descriptions based on the first field of each record

Solution
‎02-07-2017 09:38 AM
Super User
Super User
Posts: 6,502

Re: reading pipe delimited file with multiple record layouts

You can read that data dictionary into a SAS dataset and use it to write the SAS program. You will need to convert descriptive names into valid SAS variable names. You will need to convert the data types into SAS variable types and at least for the date variables also FORMAT and INFORMAT settings.  

 

Also to make it easier you might want to just write a separate data step for each record type, rather than doing it all in one data step. The program will be easier to write and understand. And unless you have a hugh data file it shouldn't take long to read it multiple times.

 

data ra ;
   infile 'myfile.txt' dsd dlm='|' truncover ;
   input record_type $ @ ;
   if record_type='RA';
   .....
run;

data rs ;
   infile 'myfile.txt' dsd dlm='|' truncover ;
   input record_type $ @ ;
   if record_type='RS';
   .....
run;

....
Super User
Super User
Posts: 6,502

Re: reading pipe delimited file with multiple record layouts

[ Edited ]

You just need to know the fields that each record type contains.

Since you didn't tell us what information is in which type of record, let's just make up simplified example.

Let's assume the first field is the record type, let's call it TYPE, and the second is the identifier, lets' call it ID.

Now let's say the type1 contains five fields, let's call them A1-A5. and type2 contain three fields that we can call B1-B3.

You will probably want to write the record types to different datasets sicne they have different variables.

So let's setup a program to do that.  It is just a minor modifciation from the program you would want to make for reading any delimited file.  First you define your variables and set your input location.

length type $3 id $8 A1-A5 $20 B1-B3 $10 ;
infile cards dsd dlm='|' truncover ;

Normally then all you need is to write a simple INPUT statement that just lists the variable names in the order they are in the source file.  But in this case we need to add something to support writing out two or more output datasets and for reading in different variables based on the record type.

data
      type1 (keep=id A1-A5) 
      type2 (keep=id B1-B3) 
;
  length type $3 id $8 A1-A5 $20 B1-B3 $10 ;
  infile cards dsd dlm='|' truncover ;
  input type id @ ;
  if type='AA' then do;
     input A1-A5 ;
     output type1;
  end;
  else if type='BB' then do;
     input B1-B3 ;
     output type2;
  end;
  else do;
     put 'ERROR: Invalid record type. ' type= / _infile_;
     delete;
  end;
run;

 

New Contributor
Posts: 3

Re: reading pipe delimited file with multiple record layouts

Thank you. I was able to successufly read the files in with your help.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 174 views
  • 0 likes
  • 3 in conversation