BookmarkSubscribeRSS Feed
skt
Calcite | Level 5 skt
Calcite | Level 5

Hi all,

 

Lets say the data is as below 

Bellatorum School: CSULA Time: 1:40.5
The Kraken School: ASU Time: 1:45.35
Black Widow School: UoA Time: 1:33.7
Koicrete School: CSUF Time: 1:40.25
Khaos School: UNLV Time: 2:03.45
Max School: UCSD Time: 1:26.47
Hakuna Matata School: UCLA Time: 1:20.64
Prospector School: CPSLO Time: 1:12.08
Andromeda School: CPP Time: 1:25.1
Kekoapohaku School: UHM Time: 1:24.49

 

if I want to read the text before the word "School:" how to go about it.

 

@n is useful to move to a column, but I want to identify a column @'School:' and move back from there.

 

Alternatively can we track the @n column where the dataline is held and use it to a calculation

 

*being a noob, pls be kind ;

5 REPLIES 5
Norman21
Lapis Lazuli | Level 10

You can read the whole thing, then extract the text that you need. Here is an example:

 

https://communities.sas.com/t5/SAS-Procedures/Identifying-and-removing-part-of-a-string-of-text/td-p...

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation

FreelanceReinh
Jade | Level 19

@skt wrote:

@n is useful to move to a column, but I want to identify a column @'School:' and move back from there.

 

Alternatively can we track the @n column where the dataline is held and use it to a calculation


Hi @skt,

 

I guess many people would indeed read the whole data line (e.g. using an "empty" INPUT statement so that the text is captured in automatic variable _INFILE_), then search for 'School' (e.g. using the FIND function to determine the position) and finally extract the appropriate substring (e.g. name=substr(_infile_,1,pos-2)), as suggested by Norman21.

 

But more specifically to your question: Yes, we can move back from the column we have found using the @'character-string' syntax and yes, we can track the position where the input pointer is currently located and use that in a calculation:

 

data want;
infile cards col=c;
input @'School' @;
c=c-8;
input @1 name $varying40. c;
cards;
Bellatorum School: CSULA Time: 1:40.5
The Kraken School: ASU Time: 1:45.35
Black Widow School: UoA Time: 1:33.7
;
  • I use the INFILE statement to make the COL= (or COLUMN=) option available.
  • C is the (user-specified) name of a variable to store the current column location of the input pointer.
  • After the first INPUT statement, C will contain the position (column number) of the first character following the word "School".
  • The trailing "@" holds the input record so that the second INPUT statement can read from it, too.
  • By subtracting 8 (or 7), C gets the position of the last character of the name (or of the following blank, respectively).
  • This value is used as the length specification for the $VARYING40. informat. which reads the name as a string of the specified length (max. 40 -- increase the number if you're not sure that 40 is sufficient) after the input pointer has been moved back to column 1 by means of the "@1" column pointer control.

Note that the @'character-string' syntax lets SAS go to the next line if 'character-string' is not found. So, records not containing the word "School" would be skipped and a note in the log "NOTE: SAS went to a new line ..." would indicate this. The other suggested method would be more flexible regarding this situation. For example, if "School" was not found, a different assignment statement could be executed (e.g. name=scan(_infile_,1,':')).

 

Edit: Changed all occurrences of "schoolname" and "school name" to "name" after realizing that CSULA, ASU etc. are the real "school" (i.e. university) names. Thanks @Tom for clarifying this for the non-native speakers. 🙂

andreas_lds
Jade | Level 19

If you want a more arcane solution, you could use a regular expression:

 

data work.want;

    length 
        SchoolName $ 50
        rx 8
    ;
    drop rx;
	
    input;
	
    rx = prxparse('/(.+) school.*/i');
	
    if prxmatch(rx, _infile_) then do;
        SchoolName = prxposn(rx, 1, _infile_);
    end;
	
    datalines;
Bellatorum School: CSULA Time: 1:40.5
The Kraken School: ASU Time: 1:45.35
Black Widow School: UoA Time: 1:33.7
Koicrete School: CSUF Time: 1:40.25
Khaos School: UNLV Time: 2:03.45
Max School: UCSD Time: 1:26.47
Hakuna Matata School: UCLA Time: 1:20.64
Prospector School: CPSLO Time: 1:12.08
Andromeda School: CPP Time: 1:25.1
Kekoapohaku School: UHM Time: 1:24.49
;
run;
Tom
Super User Tom
Super User

Use the COL= option on the INFILE statement to have SAS tell you where the current pointer is.

You can use @ and + in INPUT statement to move the pointer if you want.

 

For your particular example I wouldn't bother.  Just read the data using ':' as the delimiter.  Read the last field using a format so that it will read the colon's into the value.  Then remove the "School" and "Time" from the values and convert the time string into a format that SAS can interpret.

data want;
  length name $50 school $20 time 8;
  infile datalines dsd dlm=':' truncover ;
  input x1 :$100. x2 :$100. x3 $100. ;
  drop x1-x3;
  name=tranwrd(x1,'School    ',' ');
  school=tranwrd(x2,'Time    ',' ');
  time=input(cats('00:',x3),time20.);
  format time time11.2;
datalines;
Bellatorum School: CSULA Time: 1:40.5
The Kraken School: ASU Time: 1:45.35
Black Widow School: UoA Time: 1:33.7
Koicrete School: CSUF Time: 1:40.25
Khaos School: UNLV Time: 2:03.45
Max School: UCSD Time: 1:26.47
Hakuna Matata School: UCLA Time: 1:20.64
Prospector School: CPSLO Time: 1:12.08
Andromeda School: CPP Time: 1:25.1
Kekoapohaku School: UHM Time: 1:24.49
;
proc print; run;
Obs    name             school           time

  1    Bellatorum       CSULA      0:01:40.50
  2    The Kraken       ASU        0:01:45.35
  3    Black Widow      UoA        0:01:33.70
  4    Koicrete         CSUF       0:01:40.25
  5    Khaos            UNLV       0:02:03.45
  6    Max              UCSD       0:01:26.47
  7    Hakuna Matata    UCLA       0:01:20.64
  8    Prospector       CPSLO      0:01:12.08
  9    Andromeda        CPP        0:01:25.10
 10    Kekoapohaku      UHM        0:01:24.49
ballardw
Super User

Having dealt with a similar issue I ask "Does every single one of your schools have "School" as the last word when present?"

I had to deal with "Idaho School for the Deaf and Blind" and "The School at <place name" constructs which meant more than a simple replacement was needed for a few cases.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 5 replies
  • 1040 views
  • 2 likes
  • 6 in conversation