Hello. I am trying to read in pipe delimited data that has multiple record layouts. I have tried several methods that get me close but with no success. There are 3 issues I am encountering with the data (1) There are no end-of-record pipes (2) The first field will determing the remaining record layouts (3) I do not have position information for the fields, they are only seperated by "|". Below is the first method I have tried in SAS 9.4
Method 1:
data want;
infile indata dlm='|' firstobs=1;
input @1 rec_type $2.;
if rec_type='RA';
input rec_type $2.
var2 $2
;
run;
Method 2:
data want ;
%let _EFIERR_ = 0; /* set the ERROR detection macro variable */
infile indata
delimiter = '|' DSD lrecl=32767 ;
informat rec_type $2.;
input rec_type $;
if rec_type='RA' then do;
informat var2 $2.;
format var2 ;
end;
run;
My problem is trying to address the changing record layout and telling says where the end of the record is. Attached is a sample of the data file I am working with.
You can read that data dictionary into a SAS dataset and use it to write the SAS program. You will need to convert descriptive names into valid SAS variable names. You will need to convert the data types into SAS variable types and at least for the date variables also FORMAT and INFORMAT settings.
Also to make it easier you might want to just write a separate data step for each record type, rather than doing it all in one data step. The program will be easier to write and understand. And unless you have a hugh data file it shouldn't take long to read it multiple times.
data ra ;
infile 'myfile.txt' dsd dlm='|' truncover ;
input record_type $ @ ;
if record_type='RA';
.....
run;
data rs ;
infile 'myfile.txt' dsd dlm='|' truncover ;
input record_type $ @ ;
if record_type='RS';
.....
run;
....
Did anyone give you a data dictionary or file layout document? If not, you will need to tell us in excrutiating detail how you know which record format descripter is used for any given line of data and what any given "column" characterstics should be.
If the same "variables" meaning they will have the same variable name occurs in different row layouts then you will need to specify that as well.
Attached are the field descriptions based on the first field of each record
You can read that data dictionary into a SAS dataset and use it to write the SAS program. You will need to convert descriptive names into valid SAS variable names. You will need to convert the data types into SAS variable types and at least for the date variables also FORMAT and INFORMAT settings.
Also to make it easier you might want to just write a separate data step for each record type, rather than doing it all in one data step. The program will be easier to write and understand. And unless you have a hugh data file it shouldn't take long to read it multiple times.
data ra ;
infile 'myfile.txt' dsd dlm='|' truncover ;
input record_type $ @ ;
if record_type='RA';
.....
run;
data rs ;
infile 'myfile.txt' dsd dlm='|' truncover ;
input record_type $ @ ;
if record_type='RS';
.....
run;
....
You just need to know the fields that each record type contains.
Since you didn't tell us what information is in which type of record, let's just make up simplified example.
Let's assume the first field is the record type, let's call it TYPE, and the second is the identifier, lets' call it ID.
Now let's say the type1 contains five fields, let's call them A1-A5. and type2 contain three fields that we can call B1-B3.
You will probably want to write the record types to different datasets sicne they have different variables.
So let's setup a program to do that. It is just a minor modifciation from the program you would want to make for reading any delimited file. First you define your variables and set your input location.
length type $3 id $8 A1-A5 $20 B1-B3 $10 ;
infile cards dsd dlm='|' truncover ;
Normally then all you need is to write a simple INPUT statement that just lists the variable names in the order they are in the source file. But in this case we need to add something to support writing out two or more output datasets and for reading in different variables based on the record type.
data
type1 (keep=id A1-A5)
type2 (keep=id B1-B3)
;
length type $3 id $8 A1-A5 $20 B1-B3 $10 ;
infile cards dsd dlm='|' truncover ;
input type id @ ;
if type='AA' then do;
input A1-A5 ;
output type1;
end;
else if type='BB' then do;
input B1-B3 ;
output type2;
end;
else do;
put 'ERROR: Invalid record type. ' type= / _infile_;
delete;
end;
run;
Thank you. I was able to successufly read the files in with your help.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.