Parthach
Fluorite | Level 6

Hi all,

 

data a;
input details &:$50.;
cards;
my name is parthasaradhi
i am from dowlaiswaram
it is nearer to rajahmundry
;
run;

 

I want to extract the 3rd word from each line using a character function other than the scan function.

 

The new variable's values should look like this:

 

is

from

nearer

 

Thanks in Advance.

 

11 REPLIES
Kurt_Bremser
Super User

"I want to extract 3rd word from each line using character function other than scan function."

 

And next you try to run without using your feet?

The scan() function is THE tool for this, use it.

Kurt_Bremser
Super User

If this is actually some kind of homework, then you are supposed to solve it, not us.

Hints:

Use a loop with the findc() function to find the second occurrence of a blank. Then find the next occurrence of a blank (keep in mind that all character variables are padded with blanks), and use substr() with the positions you determined.

Parthach
Fluorite | Level 6
Hi Mr. Bremser,
I am trying to do it, but how do I find the second occurrence of a blank?

data b;
set a;
info=findc(details," ");
run;
It shows the first occurrence of a blank but not the second.
Please help me.
Thanks
Kurt_Bremser
Super User

As I said, you need to do a loop to find the second occurrence. One feature of findc is that you can supply a start position for the search.

start = findc(details,' ');

will find the first occurrence.

start = findc(details,' ',start+1);

will find the next occurrence after that. If you repeat that until you get the nth occurrence (2 if you want to find the 3rd word), you have the start for your substring. How you find the end should be quite obvious now. Note that the third parameter for substr() has to be a length, not a position, so you need to do a little calculation there.
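
The steps above could be sketched roughly as follows. This is only an illustration, not the one definitive solution: the dataset name b and the variables n, start, stop and thirdword are made up for the example, and it assumes every line has at least three words, single blanks between words, no leading blanks, and a value shorter than the variable's length (so that a trailing pad blank exists).

```sas
data b;                       /* hypothetical output dataset name */
  set a;
  length thirdword $50;
  /* Repeat findc() twice to reach the 2nd blank,
     which sits just before the 3rd word. */
  start = 0;
  do n = 1 to 2;
    start = findc(details, ' ', start + 1);
  end;
  /* The next blank after that marks the end of the 3rd word. */
  stop = findc(details, ' ', start + 1);
  /* substr() takes a length, not an end position. */
  thirdword = substr(details, start + 1, stop - start - 1);
  drop n start stop;
run;
```

With the three sample lines, the blanks fall at positions 8/11, 5/10 and 6/13, so thirdword should come out as is, from and nearer.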

Parthach
Fluorite | Level 6
Hi Kurt Bremser,
Thanks for your suggestions.
Here is the code:

/*without scan function*/
data c;
set a;
info=findc(details,"",4+1);
infon=findc(details,"",9+1);
ext=substr(details,info,infon-info);
run;

Finally I got it.
Thanks for the support.

Partha.

Shmuel
Garnet | Level 18

As you do not want to use the SCAN function, you can loop through the line character by character using the SUBSTR function, check for a space (or any other delimiter), count it, and select all characters between the 2nd and the 3rd space/delimiter.

Amir
PROC Star

Hi,

 

Well done on presenting your input data in the form of a data step.

 

Is there a reason you want to avoid using the scan() function, e.g., this is a homework exercise, curiosity, you want to practice using other character manipulation functions, your boss said so (in which case ask your boss why), etc.?

 

As @Kurt_Bremser has already indicated, the words are separated by spaces, so scan() would ordinarily be the way to go. Telling us why you don't want to use scan() would help us understand your thinking, so that we can advise accordingly.

 

 

Regards,

Amir.

Amir
PROC Star

Hi,

 

A less serious, but potentially still valid, response: if you're not allowed to use the scan() function, then how about using the %scan() function?

 

EDIT: Or even the scanq() function?

 

Regards,

Amir.

 

s_lassen
Meteorite | Level 14

Many ways to skin that cat. I would suggest using PRX (Perl Regular Expressions), as they are worth learning:

data want;
  set a;
  prxid=prxparse('/^\s*\S+\s+\S+\s+(\S+)/');
  drop prxid;
  length thirdword $10;
  if prxmatch(prxid,details) then
    thirdword=prxposn(prxid,1,details);
run;

A short explanation:

^ Means beginning of string

\s* Means zero or more blanks (or other whitespace characters, such as tabs)

\S+ Means one or more non-blanks (non-whitespace)

\s+ Means one or more blanks

(\S+) puts the third occurrence of one or more non-blanks in a capture buffer (the one and only in this case).

PRXPOSN then retrieves the contents of the first capture buffer.
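
As a variation on the pattern explained above, the two repeated "word plus blanks" pieces can be folded into a non-capturing group with a repetition count, which makes it easier to change which word is captured. This is just a sketch: the dataset name want2 is made up for the example, and it assumes the same whitespace-separated input as before.

```sas
data want2;                   /* hypothetical output dataset name */
  set a;
  /* (?:\S+\s+){2} skips two words together with the blanks that follow them;
     change the 2 to n-1 to capture the nth word instead. */
  prxid = prxparse('/^\s*(?:\S+\s+){2}(\S+)/');
  drop prxid;
  length thirdword $50;
  if prxmatch(prxid, details) then
    thirdword = prxposn(prxid, 1, details);
run;
```

Because the group is non-capturing, the wanted word is still in capture buffer 1, so the PRXPOSN call stays the same.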

 

But sometimes you need to do stuff where it would be very handy if the variable you wanted to parse was in a file, so that you could use INPUT statements to parse it. In that case you do not have to write the whole dataset to a file and then read it, you can just use the fact that the _INFILE_ variable contains the input buffer, and it can be written to:

data want;
  infile sasautos(verify.sas) ;
  if _N_= 1 then input @@;
  set a;
  _infile_=details;
  input @1 thirdword $ thirdword $ thirdword $ @@ ;
run;

I used SASAUTOS(VERIFY.SAS) as the infile, as the file must exist, and this macro seems to exist on most SAS installations.

Ksharp
Super User
 

data a;
input details &:$50.;
details=compbl(details);
n=0;
do i=1 to length(details);
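 if char(details,i)=' ' then do;
  n+1;
  if n=2 then s=i; /* blank before the 3rd word */
  if n=3 then e=i; /* blank after the 3rd word */
 end;
end;
want=substr(details,s+1,e-s-1); /* start after the blank so the result has no leading blank */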
cards;
my name is parthasaradhi
i am from dowlaiswaram
it is nearer to rajahmundry
;
run;
novinosrin
Tourmaline | Level 20
data a;
input details & $50.;
cards;
my name is parthasaradhi
i am from dowlaiswaram
it is nearer to rajahmundry
;
run;

data output;
set a;
k=substr(details,anyspace(details)+1);
k1=substr(k,anyspace(strip(k))+1);
want=substr(k1,1,anyspace(k1));
drop k:;
run;
