Basic array naming question

Contributor
Posts: 38
Basic array naming question

Is it possible to use an array name multiple times? For example, I am using an array to set certain variable values to missing if they satisfy a certain condition. Instead of using a new array for each condition, I want to reinitialize the array and reuse it. See the example below:

Current way of doing things (using implicitly-indexed arrays)

data blah;
     input var1 var2 var3 var4 var5;
     array miss1 var1 var2 var3;
     do over miss1; if miss1=99 then miss1=.; end;
     array miss2 var4 var5;
     do over miss2; if miss2=999 then miss2=.; end;
run;

Is it possible to reuse the miss array so I can run the code like:

data blah;
     input var1 var2 var3 var4 var5;
     array miss var1 var2 var3;
     do over miss; if miss=99 then miss=.; end;
     array miss var4 var5;
     do over miss; if miss=999 then miss=.; end;
run;

Thanks!


Accepted Solutions
Solution
‎01-07-2014 09:33 AM
Super User
Posts: 5,081

Re: Basic array naming question

A couple of items to add for your consideration ...

Yes, SAS does ban the re-use of array names.  The ban is so strong that even this program would fail:

data fails;
   array vars {3} v1 v2 v3;
   array vars {3} v1 v2 v3;
run;

Second, have you considered using special missing values?  Instead of these statements:

if var1=99 then var1=.;

if var5=999 then var5=.;

Consider these instead:

if var1=99 then var1=.B;

if var5=999 then var5=.C;

That way, the variables can remain missing for analysis purposes, yet you can distinguish what the value was originally.

Good luck.
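
To illustrate the special-missing suggestion above, here is a minimal sketch. The datasets HAVE and WANT, the sample rows, and the flag variables var1_is_missing and var1_was_99 are invented for illustration; only the .B and .C recodes come from the post.

```sas
/* Invented sample data: 99 and 999 stand for "missing" in the raw values */
data have;
  input var1 var5;
  datalines;
1 5
99 5
1 999
99 999
;

data want;
  set have;
  if var1 = 99  then var1 = .B;   /* .B records that the original value was 99  */
  if var5 = 999 then var5 = .C;   /* .C records that the original value was 999 */

  /* missing() is true for the ordinary missing value and for special missings alike */
  var1_is_missing = missing(var1);

  /* Comparing against .B singles out the values that were originally 99 */
  var1_was_99 = (var1 = .B);
run;

proc print data=want;
run;
```

Because .B and .C are still missing values, statistical procedures should leave them out of calculations just as they would an ordinary missing value.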


All Replies
Respected Advisor
Posts: 4,644

Re: Basic array naming question

There are syntax errors in both versions of your code. You could use this instead (untested):

data want;
array missVal{5} _temporary_ (3*99 2*999);
input var1 var2 var3 var4 var5;
array miss{*} var1-var5;
do _n_ = 1 to dim(missVal);
     if miss{_n_}=missVal{_n_} then call missing(miss{_n_});
end;
run;

PG
Valued Guide
Posts: 2,175

Re: Basic array naming question

The answer to your question "can an array name be reused?" is no.

But why would you wish to do this?

Your explanation of the purpose might lead to a solution -  if @PGStats has not already solved it.

Contributor
Posts: 38

Re: Basic array naming question

Peter -

the reason I posed the question was so I wouldn't have to keep defining new arrays for each new condition. The array code remains virtually the same for each new condition; only the array name (miss) and the variable list and condition value (var1 var2 var3 and 99) change in the code below:

array miss var1 var2 var3;

do over miss; if miss=99 then miss=.; end;

So I was just wondering if there was a way to reuse the "miss" array name each time so I wouldn't have to define a new array each time. My only use for the miss array is to assign missing values; after it has done its job for the variable/condition combination, it is no longer required.

Super User
Posts: 6,499

Re: Basic array naming question

You are probably going to be better off just getting the 999's and 99's out of the data from the beginning rather than having to deal with them later.

It looks like you are reading from a text file, so enter a period (.) instead of 999 for missing. Or use one of the special missing values.

If the data files have already been entered, or are created using a data entry system that prevents using periods or letters, then use an INFORMAT that will do the conversion for you.

proc format ;
  invalue miss999x '999'=. ;
  invalue miss99x '99'=. ;
  invalue miss9x '9'=. ;
run;

data want;
  input var1-var5 ;
  informat var1 var3 miss999x. ;
  informat var2 var5 miss99x. ;
  informat var4 miss9x. ;
  put (_all_) (=);
cards;
1 2 3 4 5
9 9 9 9 9
99 99 99 99 99
999 999 999 999 999
run;

var1=1 var2=2 var3=3 var4=4 var5=5
var1=9 var2=9 var3=9 var4=. var5=9
var1=99 var2=. var3=99 var4=99 var5=.
var1=. var2=999 var3=. var4=999 var5=999

Contributor
Posts: 38

Re: Basic array naming question

Tom,

Intriguing approach. I will actually be reading in from another SAS dataset. Can your approach also be applied when reading from another SAS dataset, or is it only applicable when reading from text files?

Super User
Posts: 6,499

Re: Basic array naming question

Not really. Again, you are better off fixing the underlying problem of having missing values represented in your data as valid numbers instead of using SAS's built-in missing value. So if your existing dataset is called FRED, then create a new permanent dataset called FRED_MISSING with the values recoded to missing, and just use that dataset as the basis for your analyses going forward.

You could take advantage of the fact that SAS can query the format or informat attached to members in an array to help. But I am not sure that it actually saves you anything over just creating multiple arrays and DO loops.

proc format ;
  invalue miss999x '999'=. ;
  invalue miss99x '99'=. ;
  invalue miss9x '9'=. ;
run;

data have;
  input var1-var5 ;
cards;
1 2 3 4 5
9 9 9 9 9
99 99 99 99 99
999 999 999 999 999
run;

data want ;
  set have ;
  informat var1 var3 miss999x. ;
  informat var2 var5 miss99x. ;
  informat var4 miss9x. ;
  array miss var1-var5 ;
  do over miss ;
     if miss = input(scan(substr(vinformat(miss),5),1,'X'),??10.) then miss=.;
  end;
  put (_all_) (=);
run;
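
For comparison, the plainer route mentioned above (build FRED_MISSING once with several arrays and DO loops) might look roughly like this. It is a sketch only: the libref mylib, the array names m999, m99 and m9, and the index _i are hypothetical, and the variable-to-code mapping simply mirrors the informats in the example.

```sas
/* Hypothetical libref. Point it at a real permanent directory in practice;
   the WORK path is used here only so the sketch runs as written. */
libname mylib "%sysfunc(pathname(work))";

/* Stand-in for the existing dataset FRED */
data mylib.fred;
  input var1-var5;
  datalines;
1 2 3 4 5
9 9 9 9 9
99 99 99 99 99
999 999 999 999 999
;

/* One array per missing-value code, each with its own name */
data mylib.fred_missing;
  set mylib.fred;
  array m999 {*} var1 var3;
  array m99  {*} var2 var5;
  array m9   {*} var4;
  do _i = 1 to dim(m999); if m999{_i} = 999 then m999{_i} = .; end;
  do _i = 1 to dim(m99);  if m99{_i}  = 99  then m99{_i}  = .; end;
  do _i = 1 to dim(m9);   if m9{_i}   = 9   then m9{_i}   = .; end;
  drop _i;
run;
```

Later analyses would then read mylib.fred_missing and never see the 9/99/999 codes.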

Valued Guide
Posts: 2,175

Re: Basic array naming question

There is no need to worry about reusing an array name, because with very little engineering you can generate a unique array name (defined at data step compilation). I hope someone will correct me if I am wrong, but I believe that an array definition consumes very few resources.

The automatic macro variable &sysindex counts the macros that have started executing in the current SAS session, so it has a new value at each macro invocation. So syntax like

%macro aname( namePref= ar );
&namePref&sysindex
%mend  aname ;

will prepare a unique name.

Then for each variable list you need to manage this way, generate a new array name by invoking the macro like:

%let arraN = %aname ;

Array &arraN &variable_list ;

do over &arraN ; if &arraN =99 then &arraN =. ; end;
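
Putting the pieces above together, one possible arrangement is to let a single macro generate the array definition and the loop. This is a rough sketch, not a tested production macro: the macro name recode_missing, its parameters vars= and value=, the _miss prefix, the index _i, and the sample data are all made up for illustration.

```sas
%macro recode_missing(vars=, value=);
  /* &sysindex differs on every macro invocation, so each call
     defines an array with its own name (_miss<n>) */
  %local arr;
  %let arr = _miss&sysindex;
  array &arr {*} &vars;
  do _i = 1 to dim(&arr);
    if &arr{_i} = &value then &arr{_i} = .;
  end;
%mend recode_missing;

/* Invented sample data */
data have;
  input var1-var5;
  datalines;
1 99 3 999 5
99 2 99 4 999
;

data want;
  set have;
  /* Each call expands to an ARRAY statement plus a DO loop */
  %recode_missing(vars=var1 var2 var3, value=99)
  %recode_missing(vars=var4 var5, value=999)
  drop _i;
run;
```

Each new variable list and condition then costs one macro call, and the array names never collide within the data step.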

Contributor
Posts: 38

Re: Basic array naming question

Thank you all for your thoughts!! I guess the short answer is no, you cannot reuse the same array name in a DATA step, but I see now there are many ways to get around this.
