data s ;
name='hfisfae afwa ffjeudad judsh sewla' ;
run;
I want to find all positions of the letter 'a' in the above string.
Please post full and complete questions in future, as has been requested a few times. Show what you want the output to look like: do you want a variable for each position, a row for each position, the last one, the first one?
data s ;
name='hfisfae afwa ffjeudad judsh sewla' ;
run;
%let l=a;
data want;
set s;
do position=1 by 1 until(position=length(name));
l=char(name,position);
if l="&l" then output;
end;
run;
@novinosrin: You could code the loop a bit more simply:
do position = 1 to length (name) ;
But it's a minor thing ;). I've been thinking more in terms of whether it's necessary to scan through the whole target string to locate a few relatively rare characters. Obviously, if the string contains mostly "a"s (~60 per cent, per my testing), it's faster just to scan all the way, like you do, rather than resort to something else.
But suppose that the search-for character is in the minority, as is the case with "a" in 'hfisfae afwa ffjeudad judsh sewla'. Since the algorithms behind the string search functions are much faster than a linear scan, they can be expected to offer some advantage. Here's the basic idea:
1. Search for the given character.
2. If found, search again from the next higher position.
3. Otherwise, stop.
This way, we progress to the next lookup stage without having to examine every position one at a time. What makes it possible is the FINDC function's ability to begin search from a given position (by contrast, INDEXC cannot do that). Thus:
data _null_ ;
retain ch "a" str "hfisfae afwa ffjeudad judsh sewla" ;
do pos = findc (str, ch) by 0 while (pos) ;
put pos @ ;
pos = findc (str, ch, pos + 1) ;
end ;
run ;
Which duly prints:
6 9 12 20 33
We can now compare the two algorithms against our test string by running each, say, 10 million times:
data _null_ ;
retain c "a" s "hfisfae afwa ffjeudad judsh sewla" N 1E7 ;
t = time() ;
do k = 1 to N ;
link lscan ;
end ;
t1 = time() - t ;
t = time() ;
do k = 1 to N ;
link findc ;
end ;
t2 = time() - t ;
put t1= t2= ;
stop ;
findc: do p = findc (s, c) by 0 until (p=0) ;
p = findc (s, c, p + 1) ;
end ;
return ;
lscan: do p = 1 to length (s) ;
if char (s, p) = c then ;
end ;
return ;
run ;
As a result, I get time(findc):time(lscan)~1:3.5. It is then reduced to 1:1 when about 20-21 out of 33 characters are "a"s; and after that, your linear search becomes progressively faster. Quod erat inveniendum.
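To probe the crossover on your own machine, a sketch along the following lines could be used. It is only an illustration under stated assumptions: the test strings are synthetic (the "a"s are packed at the front and padded with "b"s rather than scattered as in the original string), the counts 5, 20 and 30 and the repetition count of 1E6 are arbitrary choices, and the timings will differ by hardware and SAS release.

```sas
data _null_ ;
  length s $ 33 ;
  c = "a" ;
  N = 1E6 ;                  /* assumed repetition count; raise it for steadier timings */
  do n_a = 5, 20, 30 ;       /* assumed numbers of "a"s to try */
    /* build n_a "a"s followed by (33 - n_a) "b"s; REPEAT (x, n) returns n + 1 copies */
    s = cats (repeat ("a", n_a - 1), repeat ("b", 32 - n_a)) ;
    /* linear scan over every position */
    t = time () ;
    do k = 1 to N ;
      do p = 1 to length (s) ;
        if char (s, p) = c then ;
      end ;
    end ;
    t1 = time () - t ;
    /* repeated FINDC, restarting just past the previous hit */
    t = time () ;
    do k = 1 to N ;
      do p = findc (s, c) by 0 until (p = 0) ;
        p = findc (s, c, p + 1) ;
      end ;
    end ;
    t2 = time () - t ;
    put n_a= t1= t2= ;
  end ;
run ;
```

Each line written to the log shows the number of "a"s together with the linear-scan time (t1) and the FINDC time (t2), so the point where the two meet can be read off directly.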
Paul D.
Boss, brilliant intuition. I also see some similarities between this and the golf fun post that all of us were part of.
To be honest, I thought of that idea earlier and I do not know why I didn't attempt it. Well, perhaps I was meant to learn from the boss 🙂
My initial idea was something like
loop : 1 to countc(var,'a') /* loop only over the count of a's rather than the full length of the string */
and findc or indexc
end
I just didn't feel confident enough to suggest it to the OP. Of course, I am excited to see you being more active here these days; your extensive, diligent posts help me get up to speed.
Actually, using COUNTC is a great idea. Though it won't make it any faster, it makes it more straightforward to code:
data _null_ ;
retain ch "a" str "hfisfae afwa ffjeudad judsh sewla" ;
pos = 0 ;
do _n_ = 1 to countc (str, ch) ;
pos = findc (str, ch, pos + 1) ;
put pos @ ;
end ;
run ;
Best
Paul D.
I can help you with the petty things, and we'd rather bother the boss with the harder stuff, lol.
So change
data _null_ ;
to
data want ;
to save the observations in a dataset.
Ordinary blokes like me can do the easy stuff. Let's bother the boss with the harder stuff 🙂
It's easy.
Just change _null_ to the name of the data set you want and replace the entire PUT statement with the OUTPUT statement.
Voila.
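Spelled out, the change might look like the sketch below. It assumes the input data set S and the variable NAME from the original question, and that one observation per occurrence is what is wanted; the output name WANT is just a placeholder.

```sas
data want ;
  set s ;                                /* assumed input: data set S with variable NAME */
  pos = 0 ;
  do _n_ = 1 to countc (name, "a") ;     /* one pass per "a" found in NAME */
    pos = findc (name, "a", pos + 1) ;   /* resume the search just past the previous hit */
    output ;                             /* OUTPUT replaces PUT: one observation per position */
  end ;
run ;
```

For the sample string this should leave five observations in WANT, with POS holding the positions printed earlier (6, 9, 12, 20, 33).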
Paul D.
@thanikondharish wrote:
Finally, it gives results in the log or output window, but I want the observations to be saved in a dataset.
You have been asked to show what the desired output should look like. Please do not be surprised if the result of the process does not match an output format that was NOT STATED.
Still not clear. Do you want one observation for each occurrence of the found letter? Do you want one observation with multiple variables holding the positions?
Show what the output should look like.
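To make the second option concrete, here is a minimal sketch of the layout with one observation and multiple variables. It assumes the data set S and variable NAME from the question; the output name WANT_WIDE, the array P and its cap of 10 positions are hypothetical choices for illustration only.

```sas
data want_wide ;
  set s ;                      /* assumed input: data set S with variable NAME */
  array p [10] ;               /* assumption: no more than 10 occurrences; creates P1-P10 */
  pos = 0 ;
  do _n_ = 1 to min (countc (name, "a"), dim (p)) ;
    pos = findc (name, "a", pos + 1) ;   /* next occurrence after the previous one */
    p [_n_] = pos ;                      /* store the position in the next array slot */
  end ;
  drop pos ;
run ;
```

With the sample string, P1 through P5 would hold the five positions and P6 through P10 would stay missing.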