The block quote below is a sample of my data stored in a .txt file. Simply, I can import my expected data if it only has headers and data values, but in this case, there are some redundant information which I should remove to make a data set. Please help to generate a set of code that I can import this data from txt file and make a data set including only ID,Name,Department,DOB and corresponding data values. Other information such as "Date: 20180128" , "THIS IS THE TITLE" , "Page 1" should be removed.
Hope for the support with appreciation.
Date:20180128 Page 1
THIS IS THE TITLE
ID Name Department DOB
1 John Math 1980/01/30
2 Peter Physics 1985/02/15
Date:20180128 Page 2
THIS IS THE TITLE
ID Name Department DOB
3 Pop Math 1982/05/30
4 Mary IT 1985/07/15
Date:20180128 Page 3
THIS IS THE TITLE
ID Name Department DOB
5 Kata Math 1982/05/30
6 Tom IT 1985/07/15
You could key on the presence of a date in the fourth field:
data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);
DOB = input(t4, ?? yymmdd10.);
if not missing(DOB) then do;
id = input(t1, best.);
name = t2;
department = t3;
output;
end;
keep id name department DOB;
format DOB yymmdd10.;
run;
proc print; run;
Try next code:
filename fin ' <file path and name> ';
data want;
length name $15 department $15; /* addapt to max expected length */
informat dob yymmdd10.;
infile fin truncover;
retain phase 0;
do while phase=0;
input a_line $80.; /* addapt to max lenght of a line */
if scan(a_file,1,' ') = 'ID' then phase=1;
end;
do while phase=1;
input id @;
if missing(id) then phase=0;
else input name department dob;
end;
run;
Dear Shmuel,
Somehow I understand your approach, but unfortunately, the ID was generated by a system, so the ID could be any number. Therefore, using the number "1" is not good for initialize the input process , I think.
However, I learn something from your approach. Many thanks 🙂
Have you tried the code ?
ID can be any numeric value, unless it may be alphanumeric - in such case add ID to the length statement.
Run the code and in case of any issue please post the log.
You could key on the presence of a date in the fourth field:
data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);
DOB = input(t4, ?? yymmdd10.);
if not missing(DOB) then do;
id = input(t1, best.);
name = t2;
department = t3;
output;
end;
keep id name department DOB;
format DOB yymmdd10.;
run;
proc print; run;
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.