Hello SAS Community. I've popped outside my 'ETL DI Studio bubble' to have hit a rather challenging but tricky SAS issue. i'm on SAS 9.4M2 We have a number of file from source which have a specific file specification. The files are named D0149,D0159,D0086 and so on. Each file has its own file spec, BUT, within the file itself, there are a header, a footer, and sandwiched in between is the data we need to read in. however, the first 3 characters of each line are numbered, and that number determines the file spec... per record. for example file D0086 will look like this: ZHV|99899|D0086001|D|NEEB|X|NEEB|29990998199924||||TR01|
196|9899999799996|V|
197|L99XXX9999|D|
198|01|29990528000000|65706.0|X|Y|T|
ZPT|99899|3||1|29990998199924| Group 196 and 197 has 2 fields and group 198 has 6.. The next day we might get similar data file D0086_2, but no 198 group, instead a 199 group which has 2 fields. another example D0150 ZHV|99999|D0150001|M|LBSL|X|NEEB|20179568182352||||TR01|
288|127227343996|20179568|D|x|
289|0158|20179568|Z||
290|XXXTEST011|||40|F|L&G CL27|LBSL|||||||||||K|20179568|20179568|20200101|||H|20179568|
293|L|C|AI|9.00||9||DE|
293|N|C|AI|9.00||7|||
08A|KXXA 12387|20179568|LBSL|
ZPT|99999|6||1|20179568182352| my thoughts are, the first 3 characters is what really tells us what the file spec is going to based on the 3 numbers, but only once the file (D0086, D0086_2 or D0150) is read in, so we have to know what files exist before reading them in and maybe have a control file of every group number and the possible fields with the field lengths? i have tried a basic read in, but this still leaves the pipes in the data, and ideally would need it all on one line. this is not dynamic yet but a start. would you agree with having the file spec as a control table like FILE,GROUP,FIELD_NAME,LENGTH
D0086,196,var_1,13,
D0086,196,var_2,1,
D0086,197,var_3,10,
D0086,197,var_4,1,
D0086,198,var_5,2,
D0086,198,var_6,14,
D0086,198,var_7,10,
D0086,198,var_8,1,
D0086,198,var_9,1,
D0086,198,var_10,1 how we would dynamically build the code is another thing... having had an attempt at the very basic read in but not with the desired results. how do we omit the header and footer? has anyone else ever had data within a file which formats are spread over multiple rows? data test;
infile 'some/file/location/D0086.dat' delimiter = '|' dsd missover firstobs = 2;
input group $3. @;
if group ='196' then input @1 group var_1 $13. var_2 $1.;
if group ='197' then input @1 group var_3 $10. var_4 $1.;
if group ='198' then input @1 group var_5 $2. var_6 $14. var_7 $10. ;
run; any help is appreciated. Matt
... View more