I have a longitudinal dataset and I want to sum var for IDs where indicate = 1. How can I do this in sql?
Data format:
ID Indicate var
1 1 10
1 0 20
1 0 10
2 0 5
2 0 15
2 0 30
3 1 30
3 0 10
Desired data:
ID indicate sumvar
1 1 40
3 1 40
data have;
input ID Indicate var;
cards;
1 1 10
1 0 20
1 0 10
2 0 5
2 0 15
2 0 30
3 1 30
3 0 10
;
proc sql;
create table want as
select id,max(indicate=1) as indicate, sum(var) as sumvar
from have
group by id
having indicate=1;
quit;
SUM + WHERE works as expected.
https://github.com/statgeek/SAS-Tutorials/blob/master/add_average_value_to_dataset.sas
Proc sql;
Select sex, sum(weight) as tot_weight
From sashelp.class
where sex='F';
Quit;
proc summary data=have nway;
Class id;
var indicate var;
output out=want(where=(max_indicate=1)) max(indicate)=max_indicate
sum(var)=sum_var;
run;
data have;
input ID Indicate var;
cards;
1 1 10
1 0 20
1 0 10
2 0 5
2 0 15
2 0 30
3 1 30
3 0 10
;
proc sql;
create table want as
select id,max(indicate=1) as indicate, sum(var) as sumvar
from have
group by id
having indicate=1;
quit;
You can also try below sql approach
proc sql;
create table want as select a.indicate,b.* from have as a left join (select id, sum(var) as sum from have group by id) as b on a.id=b.id having a.indicate=1;
quit;
Though the datastep approach is not asked, I am writing this if it helps
proc sort data=have;
by id indicate ;
run;
data want(where=(id=1));
set have;
by id indicate;
retain sum;
if first.id then sum=var;
else sum+var;
if last.id;
run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.