Help using Base SAS procedures

Help with using index or string functions WITH array function.

New Contributor KDA
New Contributor
Posts: 4

I have a dataset with 14000 obs and 200 variables.
There is a set of 54 character variables, named goname1-goname54, for which I want to set up an array and then search for a word/phrase within the variable values for goname1-goname54. Previously, I've successfully used both the array and the index functions, but I am unable to figure out how to do both in the same step.
For example, I want to create a new variable, called POT, that will be an index of whether any of the variables, goname1-goname54, contain the word 'potassium' (i.e., POT=1 if this word appears, and POT=0 if none of the variables contain this word.) In an attempt to do so, I've written the following program, but get all zeros for POT (and I know this is not correct):

data new;
set old;
array goname{1:54} $ 255 goname1-goname54;
flag = "potassium";
POT=0;
do i=1 to 54;
POT = index(goname{i}, flag);
end;
run;

I've also tried listing out all of the variable names (i.e., goname1 goname2 goname3...) after the array length statement, and also tried omitting the element length, among other attempts to manipulate the order of the program.
Any advice?
KDA
N/A
Posts: 0

Re: Help with using index or string functions WITH array function.

An excellent SUGI 29 paper (by Paul Dorfman and a co-author) discusses how to optimise this kind of search.
In the paper at http://www2.sas.com/proceedings/sugi29/264-29.pdf, entitled "A-P-P Advanced Data Management Functions", the statement that addresses this challenge is[pre]
found = ^^ indexw (peekc (addr(a), 81 ), srchfor) ; * ^^ normalizes to std boolean ;[/pre]
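Implemented for your challenge, it would look something like [pre]
data result_data ;
array goname{54} $ 255 ;
set old_wide_large_enough_data ;
srchfor = "potassium" ;
* peek at all 54 elements of 255 bytes each, starting at goname1 ;
POT= ^^ indexw( peekc( addr( goname1 ), %eval(54*255) ), srchfor ) ;
run;[/pre]
The array is defined before the data is SET so that goname1-goname54 are allocated together, which PEEKC() relies on to read them as one string. The length passed to PEEKC() should be the number of elements times the element length (here 54*255 = 13770 bytes), not the 81 used in the paper's own example. Note that INDEXW() is case-sensitive and, by default, matches whole blank-delimited words only, so a value such as "potassium," with attached punctuation would not match.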

Much more clarification can be found in that paper, including a pointer to the alternative to peekc() for 64bit platforms.
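
For 64-bit platforms, a minimal sketch of the same idea using ADDRLONG() and PEEKCLONG() might look like the following. It assumes, as above, that declaring the array before SET places goname1-goname54 next to each other in memory, and it reuses the data set names from the example above. Treat it as an illustration rather than the paper's exact code.

```sas
data result_data ;
  /* declare the array first so the 54 variables are allocated together (assumption) */
  array goname{54} $ 255 ;
  set old_wide_large_enough_data ;
  srchfor = "potassium" ;
  /* ADDRLONG returns a character pointer; PEEKCLONG reads 54*255 bytes from it */
  POT = ^^ indexw( peekclong( addrlong( goname1 ), 54*255 ), srchfor ) ;
run ;
```

The total length requested (13770 bytes here) has to stay within the 32767-byte limit for a character value.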

PeterC.
Occasional Contributor BPD
Occasional Contributor
Posts: 12

Re: Help with using index or string functions WITH array function.

KDA,

You could set POT to 1 and LEAVE the do loop as soon as a match is found. Viz:

data new;
set old;
array goname{1:54} $ 255 goname1-goname54;
flag = "potassium";
POT=0;
do i=1 to 54;
* index returns the position of the match, so test for > 0 rather than = 1 ;
if index(goname{i}, flag) > 0 then do;
POT = 1;
LEAVE;
end;
end;
run;

This stops POT being overwritten by later elements. In the original loop POT ends up holding only the result for goname54, which is probably why you get all zeros.
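
A small self-contained test can help confirm the loop logic before running it on the real data. The sketch below is only an illustration: it uses three hypothetical goname variables read from inline data, and it adds LOWCASE() so that matching is case-insensitive, which the original code does not do.

```sas
data demo;
  length goname1-goname3 $ 255;
  infile datalines dsd truncover;
  input goname1-goname3;
  array goname{3} goname1-goname3;
  POT = 0;
  do i = 1 to dim(goname);
    /* INDEX returns the position of the match, 0 if not found */
    if index(lowcase(goname{i}), 'potassium') > 0 then do;
      POT = 1;
      leave;  /* stop at the first match so POT is not overwritten */
    end;
  end;
  drop i;
datalines;
sodium,potassium chloride,calcium
sodium,magnesium,calcium
Potassium iodide,,
;
run;

proc print data=demo;
  var goname1-goname3 POT;
run;
```

With these three rows, POT should come out as 1, 0 and 1 respectively.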

Regards,

BPD
Respected Advisor
Posts: 4,173

Re: Help with using index or string functions WITH array function.

Hi KDA

I have no doubt that Peter and Paul's solution can't be beaten in terms of performance.
Peter: Thanks for posting this link. Very interesting.

I believe the code below would also work for the example given:

data new;
set old;
flag = "potassium";
pot= find(cats(of goname1-goname54),flag)>0;
run;
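
One thing to watch with CATS() is that it joins the values with nothing between them, so text from two neighbouring variables could combine into a false match. A hedged variant, assuming a '|' character never occurs in the data, uses CATX() with a delimiter and the 'i' modifier of FIND() to ignore case:

```sas
data new;
  set old;
  /* '|' keeps the values apart; 'i' makes the search case-insensitive */
  pot = find(catx('|', of goname1-goname54), 'potassium', 'i') > 0;
run;
```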


HTH
Patrick
Valued Guide
Posts: 2,177

Re: Help with using index or string functions WITH array function.

Hi Patrick

>
> Peter: Thanks for posting this link. Very interesting.
>
> I believe the code below would also work for the example given:
>
> data new;
> set old;
> flag = "potassium";
> pot= find(cats(of goname1-goname54),flag)>0;

* issues like trailing blanks in FLAG, and rejecting matches inside longer words, are addressed in SAS 9.2 with the FINDW() function.
For SAS 9 platforms where FINDW() is not available, the following work-around looks tedious, and it matches only when an entire goname value equals FLAG ;
pot = find( '|'!! catx( '||', of goname1-goname54) !!'|', cats('|',flag,'|') )>0 ;

> run;
>

The enhanced functions in SAS9.2 are really making a difference.
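
For reference, a minimal sketch of the FINDW() approach on SAS 9.2 or later could look like this. It assumes a blank is the only word delimiter that matters in the goname values, and it uses the 'i' modifier to ignore case.

```sas
data new;
  set old;
  /* join the values with blanks, then look for 'potassium' as a whole word */
  pot = findw(catx(' ', of goname1-goname54), 'potassium', ' ', 'i') > 0;
run;
```

If the values can contain punctuation next to the word, the delimiter list in the third argument would need to include those characters as well.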

regards
peterC
Super User
Posts: 10,046

Re: Help with using index or string functions WITH array function.

Hi. I think you should add
[pre]
retain flag;
[/pre]


Because the flag variable does not come from the data set, it will be set to missing when the DATA step enters its next iteration.


Ksharp
SAS Super FREQ
Posts: 8,868

Re: Help with using index or string functions WITH array function.

Hi:
You are correct that flag is initialized to MISSING at the top of the program for each iteration of the DATA step. However, flag is also assigned the value 'potassium' on each iteration of the DATA step program, so the original code was OK in that respect. The problem was more likely one of the other issues noted.

It would only be better to use a retain for the FLAG variable if there was also a statement that assigned the value to FLAG only 1 time...something like:
[pre]
retain flag;
if _n_ = 1 then flag = 'potassium';
[/pre]

cynthia

ps...you can prove to yourself that potassium gets set on every iteration of the DATA step by using a test program that reads ANY data and uses similar logic:
[pre]
3097 data new;
3098 length flag $9;
3099 set sashelp.class;
3100 put 'before assignment statement: ' _n_= flag=;
3101
3102 flag = "potassium";
3103 put 'after assignment statement...' ;
3104 put _n_= name= flag=;
3105 run;

before assignment statement: _N_=1 flag=
after assignment statement...
_N_=1 Name=Alfred flag=potassium
before assignment statement: _N_=2 flag=
after assignment statement...
_N_=2 Name=Alice flag=potassium
before assignment statement: _N_=3 flag=
after assignment statement...
_N_=3 Name=Barbara flag=potassium
before assignment statement: _N_=4 flag=
after assignment statement...
_N_=4 Name=Carol flag=potassium
before assignment statement: _N_=5 flag=
after assignment statement...
_N_=5 Name=Henry flag=potassium
before assignment statement: _N_=6 flag=
after assignment statement...
_N_=6 Name=James flag=potassium
before assignment statement: _N_=7 flag=
after assignment statement...
_N_=7 Name=Jane flag=potassium
before assignment statement: _N_=8 flag=
after assignment statement...
_N_=8 Name=Janet flag=potassium
before assignment statement: _N_=9 flag=
after assignment statement...
_N_=9 Name=Jeffrey flag=potassium
before assignment statement: _N_=10 flag=
after assignment statement...
_N_=10 Name=John flag=potassium
before assignment statement: _N_=11 flag=
after assignment statement...
_N_=11 Name=Joyce flag=potassium
before assignment statement: _N_=12 flag=
after assignment statement...
_N_=12 Name=Judy flag=potassium
before assignment statement: _N_=13 flag=
after assignment statement...
_N_=13 Name=Louise flag=potassium
before assignment statement: _N_=14 flag=
after assignment statement...
_N_=14 Name=Mary flag=potassium
before assignment statement: _N_=15 flag=
after assignment statement...
_N_=15 Name=Philip flag=potassium
before assignment statement: _N_=16 flag=
after assignment statement...
_N_=16 Name=Robert flag=potassium
before assignment statement: _N_=17 flag=
after assignment statement...
_N_=17 Name=Ronald flag=potassium
before assignment statement: _N_=18 flag=
after assignment statement...
_N_=18 Name=Thomas flag=potassium
before assignment statement: _N_=19 flag=
after assignment statement...
_N_=19 Name=William flag=potassium
NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set WORK.NEW has 19 observations and 6 variables.
NOTE: DATA statement used (Total process time):
real time 0.03 seconds
cpu time 0.03 seconds

[/pre]
Super User
Posts: 10,046

Re: Help with using index or string functions WITH array function.

Posted in reply to Cynthia_sas
Hi.
You are right. I think I am too sensitive.
N/A
Posts: 0

Re: Help with using index or string functions WITH array function.

You might want to be careful when using string-related functions such as PEEKC and SCAN on array data if the array is particularly large, i.e. number of elements * element length.

Once the overall array size goes above 32K (32,767 bytes is the maximum length of a SAS character value), these functions end up producing unexpected results.

For example, try changing the number of elements in the examples above to 200 (200 * 255 = 51,000 bytes) and you'll notice that the functions fail to work correctly.
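
A quick way to check the arithmetic before relying on a single-string search is a small macro like the one below. The macro and variable names are hypothetical and chosen only for illustration.

```sas
%let n_elem   = 200;   /* hypothetical number of array elements */
%let elem_len = 255;   /* declared length of each element */
%let total    = %eval(&n_elem * &elem_len);

%macro check_size;
  %if &total > 32767 %then
    %put WARNING: &total bytes exceeds 32767, so search the array in a loop or in chunks.;
  %else
    %put NOTE: &total bytes fits in one character value.;
%mend check_size;
%check_size
```

With 54 elements of length 255 the total is 13770 bytes, which fits. With 200 elements it is 51000 bytes, which does not.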