Crit_Viet
Fluorite | Level 6

The block below is a sample of my data, which is stored in a .txt file. I can import data when the file contains only headers and data values, but this file also contains redundant information that must be removed before I can build a data set. Please help me write code that imports this file and creates a data set containing only ID, Name, Department, DOB and the corresponding values. Other information, such as "Date:20180128", "THIS IS THE TITLE" and "Page 1", should be removed.

 

Thank you in advance for your help.

 

 

Date:20180128                                                                                                   Page 1

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            1      John     Math               1980/01/30

            2      Peter    Physics           1985/02/15

 

 

Date:20180128                                                                                                   Page 2

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            3      Pop     Math                1982/05/30

            4      Mary    IT                    1985/07/15

 

Date:20180128                                                                                                   Page 3

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            5      Kata     Math                1982/05/30

            6      Tom      IT                    1985/07/15

 

 

Accepted Solution
PGStats
Opal | Level 21

You could key on the presence of a date in the fourth field:

 

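data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);               /* & : a value ends at two or more blanks, so single embedded blanks are kept */
DOB = input(t4, ?? yymmdd10.);   /* ?? : return missing, without log notes, when t4 is not a date */
if not missing(DOB) then do;     /* only data rows have a valid date in the fourth field */
    id = input(t1, best.);
    name = t2;
    department = t3;
    output;
    end;
keep id name department DOB;
format DOB yymmdd10.;
run;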

proc print; run;
PG
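
To see why keying on the date works, here is a small check. It is a sketch, not part of the program above, and the three test values are chosen only for illustration. It prints what the input function with the ?? modifier returns for a date, a header word, and a blank:

```sas
/* Illustrative check only: the test values are assumptions */
data _null_;
    length t4 $32;
    do t4 = '1980/01/30', 'DOB', ' ';
        dob = input(t4, ?? yymmdd10.);   /* missing unless t4 is a valid date */
        put t4= dob= yymmdd10.;          /* write both values to the log */
    end;
run;
```

Only the first value yields a non-missing date, so title lines, page banners, header rows and blank lines all fail the "not missing(DOB)" test and are never output.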

Shmuel
Garnet | Level 18

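Try the following code:

filename fin ' <file path and name> ';

data want;
    length name $15 department $15;   /* adapt to max expected length */
    informat dob yymmdd10.;
    format dob yymmdd10.;
    infile fin truncover;
    retain phase 0;                   /* 0 = looking for the header row, 1 = reading data rows */
    drop phase a_line;

    input a_line $80. @;              /* adapt to max length of a line; @ holds the line */
    if missing(a_line) then delete;   /* skip blank lines */

    if phase = 0 then do;
        if scan(a_line, 1, ' ') = 'ID' then phase = 1;   /* header found: data rows follow */
    end;
    else do;
        id = input(scan(a_line, 1, ' '), ?? best32.);
        if missing(id) then phase = 0;    /* a non-numeric first field ends the data block */
        else do;
            input @1 id name department dob;   /* re-read the held line as a data row */
            output;
        end;
    end;
run;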
Crit_Viet
Fluorite | Level 6

Dear Shmuel,

 

I think I understand your approach, but the ID is generated by a system, so it could be any number. Using the number "1" to start the input process therefore does not seem reliable to me.

I did learn something from your approach, though. Many thanks 🙂

Shmuel
Garnet | Level 18

Have you tried the code?

ID can be any numeric value. If it may be alphanumeric, add ID to the length statement as a character variable.

Run the code and, if there is any issue, please post the log.
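
As a minimal sketch of the alphanumeric case (the ID values and the length $10 below are made up for illustration), declaring ID in the length statement makes SAS read it as character:

```sas
/* Hypothetical example: the IDs and lengths are assumptions */
data demo;
    length id $10 name $15 department $15;   /* id declared as character */
    informat dob yymmdd10.;
    format dob yymmdd10.;
    input id name department dob;
    datalines;
A001 John Math 1980/01/30
B17 Peter Physics 1985/02/15
;
run;

proc print data=demo; run;
```

Note that any test that relies on ID being numeric, such as checking for a missing numeric value to detect the end of a page, would need a different condition once ID is character.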

Crit_Viet
Fluorite | Level 6
Hi PGStats,

Sorry for the late reply. I found a solution based on your approach: when an imported value is blank, or is a specific value that we can identify, such as date:...., remove it and "keep" only the expected values.

Many thanks, all.
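
The filtering idea described above might look roughly like the sketch below. This is not the poster's actual code, which was not shared. The file reference, the variable lengths and the exact filter conditions are assumptions based on the sample in the question.

```sas
/* Hypothetical sketch: fileref, lengths and filters are assumptions */
filename fin ' <file path and name> ';

data want;
    infile fin truncover;
    length line $200 name department $15;
    input line $char200.;
    line = left(line);                              /* drop leading blanks */
    if missing(line) then delete;                   /* blank lines */
    if upcase(line) =: 'DATE:' then delete;         /* page banner */
    if line =: 'THIS IS THE TITLE' then delete;     /* report title */
    if scan(line, 1, ' ') = 'ID' then delete;       /* repeated column header */
    id         = input(scan(line, 1, ' '), ?? best32.);
    name       = scan(line, 2, ' ');
    department = scan(line, 3, ' ');
    dob        = input(scan(line, 4, ' '), ?? yymmdd10.);
    format dob yymmdd10.;
    keep id name department dob;
run;
```

Splitting on single blanks with scan assumes that names and departments contain no embedded spaces. If they might, the two-blank delimiter used in the accepted solution is the safer choice.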
