Solved: Re: Reading raw data from a text file

NR13 · Posted 05-17-2018 02:21 PM

Hi - I'm trying to read in a raw text file (.f03) that isn't delimited consistently throughout. I can read it in with PROC IMPORT using the dbms=csv option, which puts everything into one column. I essentially want to pull 2 different statistics out of the file. Is there a way to scan that one variable and keep the statistic that comes after "ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED ="? This text only appears once in the file.

PaigeMiller · Posted 05-17-2018 02:31 PM

Sure.

In a data step, use the FIND function to see if the line contains ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED.

If so, then use the SCAN function to retrieve the last "word" (splitting on blanks, so the leading decimal point is kept), which in this case is .875, convert it from character to numeric, and output it to the data set.
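
The FIND and SCAN steps assume the file is already in a data set with one line of text per observation. A minimal sketch of that first step, reading each line whole with INFILE rather than PROC IMPORT, might look like this. The fileref `raw`, the temporary file, the 200-character length, and the header line are assumptions for illustration; `testing` and `var1` are the names used elsewhere in this thread, and in practice the INFILE statement would point at the real .f03 file.

```sas
/* Hypothetical stand-in for the .f03 file: write two sample lines to a temp file */
filename raw temp;

data _null_;
    file raw;
    put 'SOME HEADER LINE';
    put 'ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED = .875';
run;

/* Read every line as-is into a single character variable */
data testing;
    infile raw truncover;      /* truncover: short lines are not joined with the next one */
    input var1 $char200.;      /* assumed maximum line length of 200 */
run;
```

$CHAR keeps leading blanks, which does not matter for FIND or SCAN but preserves the original layout if the line is printed.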

--
Paige Miller

NR13 · Posted 05-17-2018 02:44 PM

Thank you! Is there any way you can give me a syntax example of what you're thinking? I'm having trouble getting the scan function to work without explicitly stating what that statistic is.

PaigeMiller · Posted 05-17-2018 02:55 PM

If the line of text has been found to contain the desired text string ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED, then you can use the SCAN function like this (assuming we are not very creative and the text string is in a variable named TEXTSTRING):

/* split on blanks and '=' only; the default SCAN delimiters include the period */
number = input(scan(textstring,-1,' ='),32.);

--
Paige Miller

NR13 · Posted 05-17-2018 03:04 PM

I'm sorry - I know I'm probably doing something stupid. This is what I'm trying:

Data Try;
Set Testing;
Correct=Scan(find(Var1,"ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED"),-1);
Run;

PaigeMiller · Posted 05-17-2018 03:15 PM

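data try;
    set testing;
    /* 'i' makes the search case-insensitive */
    if find(var1,'estimated proportion consistently classified','i')>0 then do;
        /* split on blanks and '=' only: the default SCAN delimiters include
           the period, so scan(var1,-1) would return 875 rather than .875 */
        value=input(scan(var1,-1,' ='),32.);
        output;
    end;
run;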

--
Paige Miller

NR13 · Posted 05-17-2018 03:21 PM

You're the best - thank you!!

ballardw · Posted 05-17-2018 02:45 PM

You can also position the column pointer based on a text value: INPUT @"text" moves the pointer to just after that text. If the key text is not found, nothing is read and the variable is left missing, so keep only the records with values. A rough example:

data junk;
   infile datalines missover;
   input @"Some text" x;
datalines;
Some text 123
No key text 456
893 Some text 333
;
run;

And to keep only records with the value:

data junk;
   infile datalines missover;
   input @"Some text" x;
   if x ne .;
datalines;
Some text 123
No key text 456
893 Some text 333
;
run;
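
Applied to the original question, the same pointer technique could be sketched as follows. The data set name `stat`, the variable name `value`, and the two filler lines are made up for illustration; the key text and the .875 value come from earlier in the thread. Note that the text in @"..." is matched case-sensitively.

```sas
data stat;
   infile datalines truncover;
   /* Move the pointer to just past the key text, then read the number.
      On lines without the key text, value is left missing. */
   input @"ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED =" value;
   if value ne .;   /* keep only the line where the statistic was found */
datalines;
SOME HEADER LINE
ESTIMATED PROPORTION CONSISTENTLY CLASSIFIED = .875
ANOTHER LINE 12
;
run;
```

To read from the actual file instead, the DATALINES block would be dropped and the INFILE statement pointed at the file path.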

LOVE_SAA · Posted 09-11-2018 04:23 AM

I have the data below in an input file. I want to scan each line for a keyword and then produce the output shown below.

Input File:

Code 201 Created
Date: Mon, 10 Sep 2018 08:30:56 GMT
Location: Asia, Singapore
Length: 150

Output:

Col1	Col2	Col3	Col4
201	Mon, 10 Sep 2018 08:30:56 GMT	Asia, Singapore	150

Col1: After Code, get the 3 digits

Col2: After "Date: " and till end

Col3: After "Location: " and till end

Col4: After "Length: " and till end
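
One possible approach, offered as a sketch rather than a definitive answer: read each line whole, test which keyword it starts with, hold the extracted pieces with RETAIN, and write one row when the "Length:" line arrives. This assumes every block ends with a "Length:" line and that the keywords start in column 1; the data set name `want`, the helper variable `line`, and the $40 lengths are illustrative choices.

```sas
data want;
   infile datalines truncover;
   length col1 8 col2 $40 col3 $40 col4 8;
   retain col1 col2 col3;          /* carry values forward until the row is written */
   input line $char80.;

   /* =: compares only as many characters as the shorter operand has */
   if line =: 'Code' then col1 = input(scan(line, 2, ' '), 8.);
   else if line =: 'Date:' then col2 = left(substr(line, 6));
   else if line =: 'Location:' then col3 = left(substr(line, 10));
   else if line =: 'Length:' then do;
      col4 = input(substr(line, 8), 8.);
      output;                      /* one observation per Code ... Length block */
   end;
   drop line;
datalines;
Code 201 Created
Date: Mon, 10 Sep 2018 08:30:56 GMT
Location: Asia, Singapore
Length: 150
;
run;
```

SUBSTR is used for the Date line instead of SCAN with a colon delimiter because the time portion itself contains colons.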
