One way to deal with lines that have different fields based on their "type" is to use trailing @ on the INPUT so you can execute different INPUT statements to read the rest of the line.
First let's create a file with your test data.
options parmcards=example;
filename example temp;
parmcrds;
LineA 1022 John Chan
LineB 2/1/2021 23,000 42
LineA 3012 Susan Kwan
LineB 16/1/2021 12,000 38
LineA 1120 Tom Wong
LineB 13/01/2021 9,000 .
LineA 2401 Judy Wang
LineB 12/01/2021 135,000.5 59
LineA 2401 George Cheung
LineB 12/01/2021 . 61
;
Now let's use an INFILE statement with the FILEREF we just created that points to that text.
data want ;
infile example truncover ;
input type $ @ ;
if type='LineA' then input cno $ name $30. ;
else if type='LineB' then do;
input appdate :ddmmyy. income :comma. age ;
output;
call missing(of _all_);
end;
retain cno name;
format appdate yymmdd10.;
drop type;
run;
Results:
Obs cno name appdate income age
1 1022 John Chan 2021-01-02 23000.0 42
2 3012 Susan Kwan 2021-01-16 12000.0 38
3 1120 Tom Wong 2021-01-13 9000.0 .
4 2401 Judy Wang 2021-01-12 135000.5 59
5 2401 George Cheung 2021-01-12 . 61
So what is the code doing?
DATA statement names the dataset the step is going to create, WANT in this case.
INFILE statement points to the file to read. I am using a fileref here, but you could also just use the actual name of the file in quotes instead. The TRUNCOVER option says to not go to the next line to hunt for data if there are not enough values to satisfy the INPUT statement. Note the MISSOVER option is almost never wanted as the only different between it and TRUNCOVER is that MISSOVER will ignore text at the end of the line if it happens to be shorter than the width of the formatted input request. TRUNCOVER will instead use that "truncated" text.
The first INPUT statement is reading that first column of text in your file so it can use that to tell what type of line it is.
Then the IF/THEN/ELSE is deciding which code to run based on the TYPE read.
When the type is LineA it reads the CNO (note this looks like an identifier and not an actual number so read it as character) and NAME. For the CNO I just used the bare $ modifier on the INPUT statement to let SAS know I want it to be character. SAS will default the length to 8 bytes. For the name I use formatted input so I can read the whole name into one field. In real life names do not come in neat first/last single word pairs. By using formatted input it will read the space into the NAME instead of treating it as a delimiter.
For LineB we want to do more than one statement so we use a DO/END block. For two of the numbers on the line we need to use a special informat to read them. But we want the INPUT to still use the space as a delimiter so we prefix the informat with the : modifier. Note that we don't need to specify the width of the informat here as it will be ignored anyway. The AGE does not need any special informat to be read. The OUTPUT statement will write the observation the dataset at this point. The CALL MISSING will clear out the values retain from the the LineA records, just in case there are any ids' that only have LineB without LineA.
The RETAIN statement says to remember the values read on the LineA line so they are still there when we read the LineB line.
The FORMAT statement says to attach a special format to the appdate variable so humans will recognize the values are dates. Note that I used YMD order on the dates. It you use DMY or MDY order then either way half of your audience will be confused. The other variables do not need formats attached to them, SAS already knows how to display numbers and strings. But you might want to attach a format to INCOME so it will always show two decimal places. Use the COMMA or DOLLAR format if you want the thousands separator to print. Make sure to include those punctuation characters when calculating what width to use on the format.
The DROP statement says we don't want to have the TYPE variable on the output dataset.
The RUN statement let's SAS know the step is done being described and it can start trying to interpret the code and run it.
... View more