BookmarkSubscribeRSS Feed
Season_TT
Calcite | Level 5

Question:

txt file name:ABC

LineA 1022 John Chan
LineB 2/1/2021 23,000 42
LineA 3012 Susan Kwan
LineB 16/1/2021 12,000 38
LineA 1120 Tom Wong
LineB 13/01/2021 9,000 .
LineA 2401 Judy Wang
LineB 12/01/2021 135,000.5 59
LineA 2401 George Cheung
LineB 12/01/2021 . 61

 

I don't know how to import/infile correctly

Name can't read all

AppDate can't be changed into date form

income will be missing

Age also can't be read

 

12 REPLIES 12
PaigeMiller
Diamond | Level 26

What is the desired result?

 

Are there ALWAYS two rows for each person; there is never one row for a person and there are never three (or more) rows for a person?

--
Paige Miller
Season_TT
Calcite | Level 5

it is two rows for each person.
I only need to read this txt correctly just mean finish task1.

I have read this code:

Season_TT_0-1635507250166.png

and output

Season_TT_1-1635507269176.png

but it is not correct because it should be Name rather than FName and SName.

Also, Income and Age can't read success.

(Actually, AppDate should be date in form of DD/MM/YYYY.)

 

I think i can finish other after read txt correctly.

PaigeMiller
Diamond | Level 26
data have;
input string $200.;
infile cards truncover;
cards;
LineA 1022 John Chan
LineB 2/1/2021 23,000 42
LineA 3012 Susan Kwan
LineB 16/1/2021 12,000 38
LineA 1120 Tom Wong
LineB 13/01/2021 9,000 .
LineA 2401 Judy Wang
LineB 12/01/2021 135,000.5 59
LineA 2401 George Cheung
LineB 12/01/2021 . 61
;
--
Paige Miller
Season_TT
Calcite | Level 5

no, I don't need to use datafile. I need to read into SAS data set.

Also, LineA and LineB is not necessary.

 

I want the output is this:

Season_TT_0-1635507905816.png

 

But my result now:

Season_TT_0-1635508825144.png

 

 

 

 

PaigeMiller
Diamond | Level 26

 

data have;
input @7 cno name $char30. / @7 appdate ddmmyy10. income :comma12. age;
format appdate ddmmyys8.;
cards;
LineA 1022 John Chan
LineB 2/1/2021 23,000 42
LineA 3012 Susan Kwan
LineB 16/1/2021 12,000 38
LineA 1120 Tom Wong
LineB 13/01/2021 9,000 .
LineA 2401 Judy Wang
LineB 12/01/2021 135,000.5 59
LineA 2401 George Cheung
LineB 12/01/2021 . 61
;
--
Paige Miller
Season_TT
Calcite | Level 5

Based on your code:

Season_TT_1-1635509246178.png

 

Output:

Season_TT_2-1635509262015.png

 

Problems:
1. Name is missing

2. first row Income is wrong

 

PaigeMiller
Diamond | Level 26

@Season_TT 

From the inputs you provided, it works for me properly. So something in the inputs you provided does not match the actual file loan_app.txt.

--
Paige Miller
Season_TT
Calcite | Level 5

I just copy and paste the txt file

 

but i have correct something

 

Season_TT_0-1635509735808.png

 

 

OUTPUT:

Season_TT_1-1635509754813.png

 

Only the first Income is wrong which should be 23,000

PaigeMiller
Diamond | Level 26

Add a colon to the INPUT statement before ddmmyy10.

 

input @7 cno name $char30. / @7 appdate :ddmmyy10. income :comma12. age;
--
Paige Miller
Season_TT
Calcite | Level 5
it's okay.
I have submitted.
Thanks for helping.
At least I can finish it because of your way.
Thank you so much.
Ksharp
Super User

You could try this one :

 

data have;
infile cards dlm=' ';
input dummy1 : $20. cno name & $30. / 
      dummy2 : $20. appdate : ddmmyy10. income :comma12. age;
format appdate ddmmyys8.;
cards;
LineA 1022 John Chan
LineB 2/1/2021 23,000 42
LineA 3012 Susan Kwan
LineB 16/1/2021 12,000 38
LineA 1120 Tom Wong
LineB 13/01/2021 9,000 .
LineA 2401 Judy Wang
LineB 12/01/2021 135,000.5 59
LineA 2401 George Cheung
LineB 12/01/2021 . 61
;
Tom
Super User Tom
Super User

One way to deal with lines that have different fields based on their "type" is to use trailing @ on the INPUT so you can execute different INPUT statements to read the rest of the line.

 

First let's create a file with your test data.

options parmcards=example;
filename example temp;
parmcrds;
LineA 1022 John Chan
LineB 2/1/2021 23,000 42
LineA 3012 Susan Kwan
LineB 16/1/2021 12,000 38
LineA 1120 Tom Wong
LineB 13/01/2021 9,000 .
LineA 2401 Judy Wang
LineB 12/01/2021 135,000.5 59
LineA 2401 George Cheung
LineB 12/01/2021 . 61
;

Now let's use an INFILE statement with the FILEREF we just created that points to that text.

data want ;
  infile example truncover ;
  input type $ @ ;
  if type='LineA' then input cno $ name $30. ;
  else if type='LineB' then do;
     input appdate :ddmmyy. income :comma. age ;
     output;
     call missing(of _all_);
  end;
  retain cno name;
  format appdate yymmdd10.;
  drop type;
run;

Results:

Obs    cno     name                appdate     income     age

 1     1022    John Chan        2021-01-02     23000.0     42
 2     3012    Susan Kwan       2021-01-16     12000.0     38
 3     1120    Tom Wong         2021-01-13      9000.0      .
 4     2401    Judy Wang        2021-01-12    135000.5     59
 5     2401    George Cheung    2021-01-12          .      61

So what is the code doing?

DATA statement names the dataset the step is going to create, WANT in this case.

INFILE statement points to the file to read. I am using a fileref here, but you could also just use the actual name of the file in quotes instead.  The TRUNCOVER option says to not go to the next line to hunt for data if there are not enough values to satisfy the INPUT statement.  Note the MISSOVER option is almost never wanted as the only different between it and TRUNCOVER is that MISSOVER will ignore text at the end of the line if it happens to be shorter than the width of the formatted input request.  TRUNCOVER will instead use that "truncated" text.

The first INPUT statement is reading that first column of text in your file so it can use that to tell what type of line it is.

Then the IF/THEN/ELSE is deciding which code to run based on the TYPE read.

When the type is LineA it reads the CNO (note this looks like an identifier and not an actual number so read it as character) and NAME.  For the CNO I just used the bare $ modifier on the INPUT statement to let SAS know I want it to be character.  SAS will default the length to 8 bytes.  For the name I use formatted input so I can read the whole name into one field.  In real life names do not come in neat first/last single word pairs. By using formatted input it will read the space into the NAME instead of treating it as a delimiter.

For LineB we want to do more than one statement so we use a DO/END block.  For two of the numbers on the line we need to use a special informat to read them.  But we want the INPUT to still use the space as a delimiter so we prefix the informat with the  : modifier.  Note that we don't need to specify the width of the informat here as it will be ignored anyway.  The AGE does not need any special informat to be read.  The OUTPUT statement will write the observation the dataset at this point.  The CALL MISSING will clear out the values retain from the the LineA records, just in case there are any ids' that only have LineB without LineA.

The RETAIN statement says to remember the values read on the LineA line so they are still there when we read the LineB line.

The FORMAT statement says to attach a special format to the appdate variable so humans will recognize the values are dates.  Note that I used YMD order on the dates. It you use DMY or MDY order then either way half of your audience will be confused.  The other variables do not need formats attached to them, SAS already knows how to display numbers and strings.  But you might want to attach a format to INCOME so it will always show two decimal places.  Use the COMMA or DOLLAR format if you want the thousands separator to print. Make sure to include those punctuation characters when calculating what width to use on the format.

The DROP statement says we don't want to have the TYPE variable on the output dataset.

The RUN statement let's SAS know the step is done being described and it can start trying to interpret the code and run it.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 12 replies
  • 974 views
  • 1 like
  • 4 in conversation