Array Syntax Question

New Contributor
Posts: 2

Hello! I'm trying to easily create a dichotomous yes/no variable by searching for specific values across an array of variables. Can you tell me what's wrong with this syntax? I'm getting the following error message:  "Illegal reference to the array diagnoses"

 

data neds_2014_array;
set neds_2014_core;
array diagnoses(30) DX1 dx2 dx3 dx4 dx5 dx6 dx7 dx8 dx9 dx10 dx11 dx12 dx13 dx14 dx15 dx16 dx17 dx18 dx19 dx20 dx21 dx22 dx23 dx24 dx25 dx26 dx27 dx28 dx29 dx30;
if diagnoses in ('5793' '99552' '99584' '7994' '260' '261' '262' '630' '2631' '2632' '2638' '2639' '78321' '7833' '78341' '7837' '78322' 'v850' 'v8551')
then malnutrition = 1;
else malnutrition = 0;
run;

Super Contributor
Posts: 320

Re: Array Syntax Question

Posted in reply to DavidLanctin

You are handling an array of character values, but the ARRAY statement does not declare it as character. Add $ with the correct length (w). If all values are 8 characters or fewer, you can use $ without a length. If any value is longer than 8, give the length as $w, as in:

 

array diagnoses(30) $w DX1 dx2 dx3 dx4 dx5 dx6 dx7 dx8 dx9 dx10 dx11 dx12 dx13 dx14 dx15 dx16 dx17 dx18 dx19 dx20 dx21 dx22 dx23 dx24 dx25 dx26 dx27 dx28 dx29 dx30;

Super User
Posts: 6,543

Re: Array Syntax Question

Posted in reply to DavidLanctin

The IN operator supports only a single value before the word IN.  You would have to search the items in the array individually, for example:

 

malnutrition=0;
do k=1 to 30 until (malnutrition=1);
   if diagnoses{k} in ('5793' '99552' '99584' '7994' '260' '261' '262' '630' '2631' '2632'
                       '2638' '2639' '78321' '7833' '78341' '7837' '78322' 'v850' 'v8551')
   then malnutrition = 1;
end;

 

New Contributor
Posts: 2

Re: Array Syntax Question

Posted in reply to Astounding

I suspect that the combination of these two comments is the answer. However, I am working with a very large dataset, and the step has been running for over an hour now. Does anyone have ideas for a more efficient way to create this dichotomous variable from 30 other variables?

Esteemed Advisor
Posts: 5,403

Re: Array Syntax Question

Posted in reply to DavidLanctin

Try this

 

data neds_2014_array;
set neds_2014_core;
array diagnoses $8 dx1-dx30;
malnutrition = 0;
/* Check each diagnosis in turn; stop at the first match */
do i = 1 to dim(diagnoses);
    if diagnoses{i} in ('5793' '99552' '99584' '7994' '260' '261' '262' '630' '2631' '2632' '2638' '2639' '78321' '7833' '78341' '7837' '78322' 'v850' 'v8551') then do;
        malnutrition = 1;
        leave;
    end;
end;
drop i;
run;

(untested)

 

PG
Super Contributor
Posts: 320

Re: Array Syntax Question

Posted in reply to DavidLanctin


array diagnoses(30) DX1 dx2 dx3 dx4 dx5 dx6 dx7 dx8 dx9 dx10 dx11 dx12 dx13 dx14 dx15 dx16 dx17 dx18 dx19 dx20 dx21 dx22 dx23 dx24 dx25 dx26 dx27 dx28 dx29 dx30;

 

You are asking for a time-saving solution. What does an observation in your input data set (neds_2014_core) look like? I wonder whether you need the array above at all. I presume that a FORMAT (which uses binary search) or a HASH object might be faster than the IN operator used here. Would a sorted list of ('5793' '99552' '99584' '7994' '260' '261' '262' '630' '2631' '2632' '2638' '2639' '78321' '7833' '78341' '7837' '78322' 'v850' 'v8551') be just as fast with the IN operator? I have no resources at present to test these questions. In any case, seeing the structure of an observation (its variables) would help in suggesting a better solution.
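
For reference, a minimal and untested sketch of the FORMAT idea. It assumes DX1-DX30 are character variables in neds_2014_core. The format name $malnut and the loop index i are hypothetical names chosen for illustration. Whether this actually runs faster than the IN operator is a guess that would need to be timed on the real data.

```sas
/* Hypothetical lookup format: listed codes map to '1', all other values to '0' */
proc format;
    value $malnut
        '5793', '99552', '99584', '7994', '260', '261', '262', '630',
        '2631', '2632', '2638', '2639', '78321', '7833', '78341',
        '7837', '78322', 'v850', 'v8551' = '1'
        other = '0';
run;

data neds_2014_array;
    set neds_2014_core;
    /* DX1-DX30 come from the input data set, so the array takes their type */
    array diagnoses{*} dx1-dx30;
    malnutrition = 0;
    do i = 1 to dim(diagnoses);
        /* PUT applies the lookup format to one diagnosis code at a time */
        if put(diagnoses{i}, $malnut.) = '1' then do;
            malnutrition = 1;
            leave;   /* stop at the first match */
        end;
    end;
    drop i;
run;
```

The loop still visits up to 30 variables per observation, so any gain would come only from the lookup itself, not from reading the data.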
