Solved: Re: Array processing with a proc step

Hello_Melinda · Posted 08-09-2022 10:59 AM

Hello,

I have a large number of variables identifying some data characteristics (frequency of environmental factors identified), and a total cost variable. I'd like to get average costs for each of the characteristics where that characteristic is not missing.

I'm having trouble writing a code that loops through each and creates an average. For the output, I would like a table that lists the average cost (total cost divided by lines where characteristic is not missing) for each characteristic, i.e. average cost where a1 is present, average cost where a2 is present, etc.

Thanks in advance for any help!

data have;
infile datalines;
input a1 1. a2 1. a3 1. cost 10.;
datalines;
1 . 3 100
4 6 1 200
. 2 1 300
2 . . 400
2 2 2 500;
run;

*test out sql step;

proc sql;
create table averages as
select avg(cost) as avgcost1
from have
where a1 ne .
;quit;
* Result is 1200/4=300;

*convert to macro;

%macro avg(i);
proc sql;
create table averages as
select avg(cost) as avgcost&i
from have
where a&i ne .
;quit;
%mend;

%avg(1);

*array step using macro;

data temp;
set have;
array a_s {3} a:;
do i = 1 to 3;
%avg%i;
end;
run;

PaigeMiller · Posted 08-09-2022 11:11 AM

I think a better idea is to use an array to create a new set of COST_A variables where the value of COST_A is missing if A is missing and has the value of COST if A is not missing. Then means can be computed by PROC SUMMARY or PROC MEANS.

data have;
infile datalines;
input a1 a2 a3 cost;
datalines;
1 . 3 100
4 6 1 200
. 2 1 300
2 . . 400
2 2 2 500
;
run;

data have1;
    set have;
    array a a1-a3;
    array cost_a cost_a1-cost_a3;
    do i=1 to dim(a);
        if not missing(a(i)) then cost_a(i)=cost;
    end;
    drop i;
run;
proc summary data=have1;
    var cost_a:;
    output out=stats mean=/autoname;
run;

Adding: what you are trying to do in your code is insert a PROC (in this case PROC SQL) inside a DATA step. This can't be done. This is invalid syntax. It doesn't matter if you try to do it with a macro or without a macro, you can't put a PROC inside a DATA step.

--
Paige Miller

View solution in original post

PaigeMiller · Posted 08-09-2022 11:11 AM

I think a better idea is to use an array to create a new set of COST_A variables where the value of COST_A is missing if A is missing and has the value of COST if A is not missing. Then means can be computed by PROC SUMMARY or PROC MEANS.

data have;
infile datalines;
input a1 a2 a3 cost;
datalines;
1 . 3 100
4 6 1 200
. 2 1 300
2 . . 400
2 2 2 500
;
run;

data have1;
    set have;
    array a a1-a3;
    array cost_a cost_a1-cost_a3;
    do i=1 to dim(a);
        if not missing(a(i)) then cost_a(i)=cost;
    end;
    drop i;
run;
proc summary data=have1;
    var cost_a:;
    output out=stats mean=/autoname;
run;

Adding: what you are trying to do in your code is insert a PROC (in this case PROC SQL) inside a DATA step. This can't be done. This is invalid syntax. It doesn't matter if you try to do it with a macro or without a macro, you can't put a PROC inside a DATA step.

--
Paige Miller

Hello_Melinda · Posted 08-09-2022 11:54 AM

@PaigeMiller , that works great, thank you very much!

Array processing with a proc step

Re: Array processing with a proc step

Re: Array processing with a proc step

Re: Array processing with a proc step

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away