BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Crit_Viet
Fluorite | Level 6

The block quote below is a sample of my data stored in a .txt file. Simply, I can import my expected data if it only has headers and data values, but in this case, there are some redundant information which I should remove to make a data set. Please help to generate a set of code  that I can import this data from txt file and make a data set including only ID,Name,Department,DOB and corresponding data values. Other information such as "Date: 20180128" , "THIS IS THE TITLE" , "Page 1" should be removed.

 

Hope for the support with appreciation.

 

 

Date:20180128                                                                                                   Page 1

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            1      John     Math               1980/01/30

            2      Peter    Physics           1985/02/15

 

 

Date:20180128                                                                                                   Page 2

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            3      Pop     Math                1982/05/30

            4      Mary    IT                    1985/07/15

 

Date:20180128                                                                                                   Page 3

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            5      Kata     Math                1982/05/30

            6      Tom      IT                    1985/07/15

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

You could key on the presence of a date in the fourth field:

 

data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);
DOB = input(t4, ?? yymmdd10.);
if not missing(DOB) then do;
    id = input(t1, best.);
    name = t2;
    department = t3;
    output;
    end;
keep id name department DOB;
format DOB yymmdd10.;
run;

proc print; run;
PG

View solution in original post

5 REPLIES 5
Shmuel
Garnet | Level 18

Try next code:

filename fin ' <file path and name> ';

data want;
     length name $15 department $15;  /* addapt to max expected length */
     informat dob yymmdd10.;

      infile fin truncover;
      retain phase 0;

     do while phase=0;
           input a_line $80.;  /* addapt to max lenght of a line */
           if scan(a_file,1,' ') = 'ID' then phase=1;
    end;

    do while phase=1;
          input id @;
          if missing(id) then phase=0;
          else input name department dob;
    end;
run;
            
          
          
Crit_Viet
Fluorite | Level 6

Dear Shmuel,

 

Somehow I understand your approach, but unfortunately, the ID was generated by a system, so the ID could be any number. Therefore, using the number "1" is not good for initialize the input process , I think.

 

However, I learn something from your approach. Many thanks 🙂

Shmuel
Garnet | Level 18

Have you tried the code ?

ID can be any numeric value, unless it may be alphanumeric - in such case add ID to the length statement.

 

Run the code and in case of any issue please post the log.

PGStats
Opal | Level 21

You could key on the presence of a date in the fourth field:

 

data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);
DOB = input(t4, ?? yymmdd10.);
if not missing(DOB) then do;
    id = input(t1, best.);
    name = t2;
    department = t3;
    output;
    end;
keep id name department DOB;
format DOB yymmdd10.;
run;

proc print; run;
PG
Crit_Viet
Fluorite | Level 6
Hi PG Stats,

Sorry for this late reply, I found the solution based on your approach. When the value imported is blank or specific values such as date:.... which we can identify, just remove them and "keep" the expected values.

Many thanks all

sas-innovate-wordmark-2025-midnight.png

Register Today!

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1604 views
  • 1 like
  • 3 in conversation