Help using Base SAS procedures

How to import header of columns and data values only without redundant information

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

How to import header of columns and data values only without redundant information

The block quote below is a sample of my data stored in a .txt file. Simply, I can import my expected data if it only has headers and data values, but in this case, there are some redundant information which I should remove to make a data set. Please help to generate a set of code  that I can import this data from txt file and make a data set including only ID,Name,Department,DOB and corresponding data values. Other information such as "Date: 20180128" , "THIS IS THE TITLE" , "Page 1" should be removed.

 

Hope for the support with appreciation.

 

 

Date:20180128                                                                                                   Page 1

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            1      John     Math               1980/01/30

            2      Peter    Physics           1985/02/15

 

 

Date:20180128                                                                                                   Page 2

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            3      Pop     Math                1982/05/30

            4      Mary    IT                    1985/07/15

 

Date:20180128                                                                                                   Page 3

                                                  THIS IS THE TITLE

 

            ID    Name    Department    DOB

            5      Kata     Math                1982/05/30

            6      Tom      IT                    1985/07/15

 

 


Accepted Solutions
Solution
‎03-04-2018 01:24 AM
Esteemed Advisor
Posts: 5,543

Re: How to import header of columns and data values only without redundant information

Posted in reply to Crit_Viet

You could key on the presence of a date in the fourth field:

 

data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);
DOB = input(t4, ?? yymmdd10.);
if not missing(DOB) then do;
    id = input(t1, best.);
    name = t2;
    department = t3;
    output;
    end;
keep id name department DOB;
format DOB yymmdd10.;
run;

proc print; run;
PG

View solution in original post


All Replies
Trusted Advisor
Posts: 1,837

Re: How to import header of columns and data values only without redundant information

[ Edited ]
Posted in reply to Crit_Viet

Try next code:

filename fin ' <file path and name> ';

data want;
     length name $15 department $15;  /* addapt to max expected length */
     informat dob yymmdd10.;

      infile fin truncover;
      retain phase 0;

     do while phase=0;
           input a_line $80.;  /* addapt to max lenght of a line */
           if scan(a_file,1,' ') = 'ID' then phase=1;
    end;

    do while phase=1;
          input id @;
          if missing(id) then phase=0;
          else input name department dob;
    end;
run;
            
          
          
Occasional Contributor
Posts: 5

Re: How to import header of columns and data values only without redundant information

Dear Shmuel,

 

Somehow I understand your approach, but unfortunately, the ID was generated by a system, so the ID could be any number. Therefore, using the number "1" is not good for initialize the input process , I think.

 

However, I learn something from your approach. Many thanks Smiley Happy

Trusted Advisor
Posts: 1,837

Re: How to import header of columns and data values only without redundant information

Posted in reply to Crit_Viet

Have you tried the code ?

ID can be any numeric value, unless it may be alphanumeric - in such case add ID to the length statement.

 

Run the code and in case of any issue please post the log.

Solution
‎03-04-2018 01:24 AM
Esteemed Advisor
Posts: 5,543

Re: How to import header of columns and data values only without redundant information

Posted in reply to Crit_Viet

You could key on the presence of a date in the fourth field:

 

data want;
infile "&sasforum\datasets\crit_viet.txt" truncover;
length t1-t4 $32;
input (t1-t4) (&);
DOB = input(t4, ?? yymmdd10.);
if not missing(DOB) then do;
    id = input(t1, best.);
    name = t2;
    department = t3;
    output;
    end;
keep id name department DOB;
format DOB yymmdd10.;
run;

proc print; run;
PG
Occasional Contributor
Posts: 5

Re: How to import header of columns and data values only without redundant information

Hi PG Stats,

Sorry for this late reply, I found the solution based on your approach. When the value imported is blank or specific values such as date:.... which we can identify, just remove them and "keep" the expected values.

Many thanks all
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 221 views
  • 1 like
  • 3 in conversation