Hello I am trying to sum a large amount of obs called NAICS4 which have 390 distinct values. The taxable??? colume has 90 distinct columes and will increase monthly. I can do a sum statement in proc sql like this:
proc sql; create table state4SortSum as select
naics4, sum(taxable190) as sumTaxable190, sum(taxable191) as sumTaxable191, sum(taxable390) as sumTaxable390
from rg.state4sort
group by naics4; quit;
But I am not wanting to write 90+ sum statements, so I tried to write a array:
data stateTest; length item $4;
array txSum{*} taxable190-taxable211 ;
do i = 1 to dim(txSum);
set rg.state4sort;
by naics4;
item =naics4;
tx = sum( of txSum{i}) ;
if last.naics4 then output;
end;
drop tx;
run;
However this array is only giving me the value of the last obs it runs through for each NAICS4 and I want it to give me the sum of all like it would in SQL. Any help with a better way to get the equivialent of the SQL without writing out everything would be great.
naics4 | taxable190 | taxable290 | taxable390 | taxable490 | taxable191 | taxable291 | taxable391 | taxable491 | taxable192 | taxable292 | taxable392 | taxable492 |
1111 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 |
1111 | ||||||||||||
1111 | ||||||||||||
1112 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | |
1112 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | |||
1112 | ||||||||||||
1112 | ||||||||||||
1113 | 23398 | 15296 | 21359 | 64786 | 23673 | 15961 | 17453 | 159439 | 27648 | 24140 | -0.001 | 236359 |
1113 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | |
1113 | -0.001 | -0.001 | -0.001 | |||||||||
1113 | -0.001 | -0.001 | -0.001 | |||||||||
1113 | -0.001 | -0.001 | 131411 | -0.001 | -0.001 | -0.001 | 4772 | |||||
1113 | ||||||||||||
1113 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 | -0.001 |
The ARRAY statement is to allow you to work across mulitple variables in the same observation.
The SUM() function in PROC SQL is to allow you to work across multiple obsevations.
You want to use PROC SUMMARY to sum your variables.
proc summary data=rg.state4sort nway ;
class naics4;
var Taxable: ;
output out=state4SortSum sum=;
run;
You can work out how to rename the variables from Taxable ... to SumTaxable...
The ARRAY statement is to allow you to work across mulitple variables in the same observation.
The SUM() function in PROC SQL is to allow you to work across multiple obsevations.
You want to use PROC SUMMARY to sum your variables.
proc summary data=rg.state4sort nway ;
class naics4;
var Taxable: ;
output out=state4SortSum sum=;
run;
You can work out how to rename the variables from Taxable ... to SumTaxable...
anothe approach:
example:
data have;
input naics4 $ taxable1-taxable5;
cards;
1 20 30 40 50 60
1 25 35 45 55 65
1 30 40 50 60 70
2 20 30 40 50 60
2 25 35 45 55 65
3 30 40 50 60 70
;
run;
proc sql noprint;
select catt('sum','(',name,')',' as ',catt(' sum',name))into : names separated by ','
from dictionary.columns
where libname='WORK' and memname='HAVE' and upcase(name) like 'TAXABLE%' ;
quit;
%put &names;
proc sql;
create table want as select naics4,&names
from have
group by naics4;
quit;
Linlin
An array approach, borrowed examples from LinLin's post:
data have;
input naics4 $ taxable1-taxable5;
cards;
1 20 30 40 50 60
1 25 35 45 55 65
1 30 40 50 60 70
2 20 30 40 50 60
2 25 35 45 55 65
3 30 40 50 60 70
;
run;
data want (drop=tax:);
do until (last.naics4);
set have;
by naics4;
array tax(*) taxable1-taxable5;
array st(*) sum_tax1-sum_tax5;
do _n_=1 to dim(tax);
st(_n_)+tax(_n_);
end;
end;
output;
call missing (of _all_);
run;
proc print;run;
Regards,
Haikuo
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.