@Addison
The attached sample data really helps to understand what you're dealing with.
The sample code below reads your source data into a long format (one row per lab result), which is often an easier structure to work with.
%let source_file=c:\temp\test file.txt;

data long(drop=_:);
  /* variables prefixed with _ hold the raw pipe-delimited strings and are dropped */
  attrib
    id          length=$15  informat=$15.
    fin         length=$15  informat=$15.
    _lab_dates  length=$400 informat=$400.
    _lab_names  length=$400 informat=$400.
    _lab_values length=$400 informat=$400.
    _lab_units  length=$400 informat=$400.
    lab_date    length=8    informat=mmddyy10. format=date9.
    lab_name    length=$40
/*  lab_value   length=8*/
    lab_value   length=$40
    lab_unit    length=$40
  ;
  /* tab-delimited source file, header row skipped */
  infile "&source_file" dlm='09'x dsd truncover firstobs=2 lrecl=1650;
  input id fin _lab_dates _lab_names _lab_values _lab_units;

  /* start: if rows without lab data not required then remove below code */
  if cmiss(_lab_dates, _lab_names, _lab_values, _lab_units)=4 then output;
  else
  /* end: of remove section */
  do;
    /* N pipes separate N+1 elements; take the longest of the four lists */
    _loop_cnt=1+max(countc(_lab_dates,'|'), countc(_lab_names,'|'), countc(_lab_values,'|'), countc(_lab_units,'|'));
    do _i=1 to _loop_cnt;
      /* 'M' modifier: consecutive pipes mark an empty element instead of being collapsed */
      lab_date=input(scan(_lab_dates,_i,'|','M'),mmddyy10.);
      lab_name=scan(_lab_names,_i,'|','M');
/*    lab_value=input(scan(_lab_values,_i,'|'),best32.);*/
      lab_value=scan(_lab_values,_i,'|','M');
      lab_unit=scan(_lab_units,_i,'|','M');
      output;
    end;
  end;
run;
I wasn't sure whether you also need the rows without lab data. If not, remove the section marked by the start/end comments in the code.
Changes in the new code version:
1. Changed the type of lab_value to character because there are "yes" strings in the source data (is this a data quality issue?).
2. Added the 'M' modifier to the scan() function so that missing data elements (two consecutive pipes) are read as empty values instead of being skipped.
3. Added 1 to the loop counter so that all pipe-delimited data elements are read (N pipes separate N+1 elements).
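To illustrate changes 2 and 3, here is a minimal standalone sketch. The test string '5.1||yes' and the variable names (values, elem, n_elem) are made up for the demonstration and are not taken from your file.

```sas
data _null_;
  length values $40 elem $10;
  values='5.1||yes';               /* hypothetical input: the second element is missing */

  /* two pipes separate three elements, hence the +1 */
  n_elem=1+countc(values,'|');
  put n_elem=;

  /* with 'M', the empty second element keeps its position */
  do i=1 to n_elem;
    elem=scan(values,i,'|','M');
    put i= elem=;
  end;

  /* without 'M', consecutive pipes count as one delimiter, */
  /* so the second word returned is 'yes' and the values shift left */
  elem=scan(values,2,'|');
  put 'Without M, element 2 is: ' elem;
run;
```

Without the modifier, values after a missing element would end up paired with the wrong date, name and unit, which is why all four scan() calls in the main step use 'M'.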