The block quote below is a sample of my data, stored in a .txt file. I can import data that contains only headers and values, but this file also contains redundant information that I need to remove to build a data set. Please help me write code that imports the data from the txt file and creates a data set containing only ID, Name, Department, DOB and the corresponding values. Other information, such as "Date:20180128", "THIS IS THE TITLE" and "Page 1", should be removed.
Thank you in advance for your support.
Date:20180128 Page 1
THIS IS THE TITLE
ID Name Department DOB
1 John Math 1980/01/30
2 Peter Physics 1985/02/15
Date:20180128 Page 2
THIS IS THE TITLE
ID Name Department DOB
3 Pop Math 1982/05/30
4 Mary IT 1985/07/15
Date:20180128 Page 3
THIS IS THE TITLE
ID Name Department DOB
5 Kata Math 1982/05/30
6 Tom IT 1985/07/15
You could key on the presence of a date in the fourth field:
data want;
    infile "&sasforum\datasets\crit_viet.txt" truncover;
    length t1-t4 $32;
    input (t1-t4) (&);
    DOB = input(t4, ?? yymmdd10.);
    if not missing(DOB) then do;
        id = input(t1, best.);
        name = t2;
        department = t3;
        output;
    end;
    keep id name department DOB;
    format DOB yymmdd10.;
run;

proc print; run;
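To try the idea without the external file, here is a minimal self-contained sketch that reads the first two sample pages from in-stream data. It assumes the fields are separated by single blanks and contain no embedded blanks, so it uses plain list input instead of the & modifier (which treats only two or more consecutive blanks as a delimiter). The subsetting IF is just a shorter way of writing the same date test.

```sas
data want;
    infile datalines truncover;
    length t1-t4 $32;
    input t1-t4;                       /* list input: one blank separates the fields (assumption) */
    DOB = input(t4, ?? yymmdd10.);     /* ?? suppresses log notes when t4 is not a date */
    if not missing(DOB);               /* subsetting IF: keep only rows whose 4th field is a date */
    id = input(t1, best.);
    name = t2;
    department = t3;
    keep id name department DOB;
    format DOB yymmdd10.;
datalines;
Date:20180128 Page 1
THIS IS THE TITLE
ID Name Department DOB
1 John Math 1980/01/30
2 Peter Physics 1985/02/15
Date:20180128 Page 2
THIS IS THE TITLE
ID Name Department DOB
3 Pop Math 1982/05/30
4 Mary IT 1985/07/15
;
run;

proc print data=want; run;
```

Page headers, titles and column headers never have a valid date in the fourth position, so only the four data rows should remain in WANT.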
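Try the following code:
filename fin ' <file path and name> ';
data want;
    length name $15 department $15; /* adapt to max expected length */
    informat dob yymmdd10.;
    format dob yymmdd10.;
    infile fin truncover;
    retain phase 0;
    /* phase 0: skip lines until the column header line is found */
    do while (phase=0);
        input a_line $80.; /* adapt to max length of a line */
        if scan(a_line,1,' ') = 'ID' then phase=1;
    end;
    /* phase 1: read data rows until a line does not start with a number */
    do while (phase=1);
        input id ?? @; /* ?? suppresses invalid-data notes on non-numeric lines */
        if missing(id) then phase=0;
        else do;
            input name department dob;
            output;
        end;
    end;
    drop a_line phase;
run;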
Dear Shmuel,
I think I understand your approach, but unfortunately the ID is generated by a system, so it could be any number. Therefore, I don't think using the number "1" to initialize the input process is a good idea.
However, I learn something from your approach. Many thanks 🙂
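Have you tried the code?
ID can be any numeric value. Note that phase is only a flag telling the step whether it is inside a block of data rows; its value 1 is never compared with the ID. If the ID may be alphanumeric, add ID to the length statement.
Run the code and, if there is any issue, please post the log.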
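For the alphanumeric case, here is a hedged sketch rather than a drop-in change to the code above. Once ID is declared as character, it is no longer missing on header lines such as "Date:20180128", so a missing(id) test cannot separate headers from data. This variant therefore identifies data rows by the date in the fourth word instead. The IDs A01 and B17 and the lengths are hypothetical, chosen only for illustration, and fields are assumed to be separated by single blanks.

```sas
data want;
    infile datalines truncover;
    length id $10 name department $15;   /* lengths are assumptions, adapt as needed */
    input;                               /* load the line into _infile_ without reading variables */
    dob = input(scan(_infile_, 4, ' '), ?? yymmdd10.);
    if not missing(dob);                 /* keep only lines whose 4th word is a valid date */
    id         = scan(_infile_, 1, ' ');
    name       = scan(_infile_, 2, ' ');
    department = scan(_infile_, 3, ' ');
    format dob yymmdd10.;
datalines;
Date:20180128 Page 1
THIS IS THE TITLE
ID Name Department DOB
A01 John Math 1980/01/30
B17 Peter Physics 1985/02/15
;
run;

proc print data=want; run;
```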