If I have a data for employees with ID 1 2 3 4 5 and each one has a row with year 2000-2005 and each ear has a salary of $10,000 --- the question how can I run ARRAY and DO in a single DATA statement to obtain a total salary???
You don't usually use DO loops to do summaries in SAS. You can use PROC MEANS or SUMMARY or if it's required for an array, the SUM/MEAN functions work on array references so you don't need a DO loop either.
If you post an example of your data we can further assist. It does not have to be your real data but should reflect your data structure.
1 | 101 | 2000 | $107,000 |
2 | 101 | 2001 | $106,000 |
3 | 101 | 2002 | $109,000 |
4 | 101 | 2003 | $110,000 |
5 | 101 | 2005 | $114,000 |
6 | 101 | 2006 | $115,000 |
7 | 101 | 2007 | $118,000 |
8 | 101 | 2008 | $119,000 |
9 | 102 | 2000 | $78,000 |
10 | 102 | 2001 | $74,000 |
11 | 102 | 2002 | $76,000 |
12 | 102 | 2005 | $78,000 |
13 | 102 | 2006 | $76,000 |
14 | 102 | 2007 | $72,000 |
15 | 102 | 2008 | $68,000 |
16 | 102 | 2009 | $70,000 |
17 | 103 | 2000 | $99,000 |
18 | 103 | 2001 | $96,000 |
19 | 103 | 2003 | $92,000 |
20 | 103 | 2004 |
Still the same as @Reeza suggested-->
data have;
infile datalines truncover;
input row_num id year salary :dollar10.;
format salary dollar10.;
drop row_num;
datalines;
1 101 2000 $107,000
2 101 2001 $106,000
3 101 2002 $109,000
4 101 2003 $110,000
5 101 2005 $114,000
6 101 2006 $115,000
7 101 2007 $118,000
8 101 2008 $119,000
9 102 2000 $78,000
10 102 2001 $74,000
11 102 2002 $76,000
12 102 2005 $78,000
13 102 2006 $76,000
14 102 2007 $72,000
15 102 2008 $68,000
16 102 2009 $70,000
17 103 2000 $99,000
18 103 2001 $96,000
19 103 2003 $92,000
20 103 2004
;
proc sql;
create table total_salary as
select id, sum(salary) as tot_salary
from have
group by id;
quit;
alternatively try procs means/summary etc
As you might have guessed by the replies you have already received, there is no way. Arrays are not applicable here. However, what if your concept of the data is wrong and it really looks like this:
ID Sal2000 Sal2001 Sal2002 Sal2003 Sal2004 Sal2005 Sal2006 Sal2007
1 107000 106000 109000 110000 114000 115000 118000
2 74000 76000 78000 76000 72000
For that structure to the data, arrays could come into play. While the total could be obtained without arrays:
totsal = sum(of sal2000 - sal2007);
It could also be done with arrays. For example:
data want;
set have;
array sals {8} sal2000 - sal2007;
totsal = 0;
do j=1 to 8;
totsal + sals{j};
end;
run;
But if you are confident about the structure of the data, then forget about arrays. They just don't apply.
Thank for the response- looks like an array is a must in this particular assignment. The solution was found.
The DATA step will look something like that:
DATA outcome;
SET have;
BY id;
first_record=FIRST.id;
last_record=LAST.id;
RETAIN total_salary;
FORMAT total_salary dollar8.;
IF first_record= 1 THEN total_salary = salary;
ELSE total_salary=total_salary+salary;
IF last_record=1 THEN OUTPUT;
DROP year salary first_record last_record;
RUN;
Thanks all
If you absolutely need an array in your solution (and it would make sense to use one for more sophisticated statistics) you could do:
data have;
infile datalines truncover;
input row_num id year salary :dollar10.;
format salary dollar10.;
drop row_num;
datalines;
1 101 2000 $107,000
2 101 2001 $106,000
3 101 2002 $109,000
4 101 2003 $110,000
5 101 2005 $114,000
6 101 2006 $115,000
7 101 2007 $118,000
8 101 2008 $119,000
9 102 2000 $78,000
10 102 2001 $74,000
11 102 2002 $76,000
12 102 2005 $78,000
13 102 2006 $76,000
14 102 2007 $72,000
15 102 2008 $68,000
16 102 2009 $70,000
17 103 2000 $99,000
18 103 2001 $96,000
19 103 2003 $92,000
20 103 2004
;
data want;
array s{99} _temporary_;
call missing(of s{*});
do i = 1 by 1 until(last.id);
set have; by id;
s{i} = salary;
end;
totalSalary = sum(of s{*});
format totalSalary dollar10.;
keep id totalSalary;
run;
proc print noobs; run;
the data looks like that
I am looking to DROP the last column (salary) and include total salaries for the years from 2000 to 2009 in one row
Are you after something like this?
/*your input dataset*/
data have;
do id = 1 to 5;
do year=2000 to 2005;
salary=10000;
output;
end;
end;
run;
/*your wanted output*/
proc sql;
create table total_salary as
select id, sum(salary) as tot_salary
from have
group by id;
quit;
If yes, it's just a summation, why bother with array and l
Hi,
If you are looking for a data step then following may get you the desired output:
data want(keep=total_salary);
set have end=last;
total_salary+salary;
if last;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.