https://communities.sas.com/message/151801#151801
I want a slight change in the output. I am using the dataset in the previous thread for reference.
ID Year Bill Charlie
1 2000 25 1
1 2000 35 2
1 2001 25 3
2 2001 11 4
2 2001 23 5
4 2000 2 6
4 2000 1 7
4 2000 3 8
I would like my output in this manner (a variable equal to the sum of observations for the same ID and year in Bill & Charlie)
ID Year Bill1 Charlie1
1 2000 60 3
1 2001 25 3
2 2001 34 9
4 2000 6 21
The codes used are:
proc sql;
create table base1 as
select *, sum(bill) as bill1,sum(Charlie) as charlie1
from have
group by ID, year;
quit;
Now, I use the nodupkey option get my result but is there a more efficient way- to use only Proc SQL or only Data Step?
proc sort data=base1 nodupkey;
by id year;
run;
Thanks!
Namrata
Not sure of the question here. If you want PROC SQL to print the result instead of making a dataset then just remove the "create table base1 as" part of the statement. Or add a PROC PRINT step.
If you want to make a report then a reporting tool like PROC REPORT might be what you want.
proc report data=have headline nofs ;
column id year bill charlie ;
define id / group ;
define year /group;
define bill / sum ;
define charlie / sum;
run;
What I am trying to get is an output that contains only one observation(sum of all obs for the same year). My proc sql does not give me the output in the way I desire. What I get is:
ID Year Bill Charlie Bill1 Charlie1
1 2000 25 1 60 3
1 2000 35 2 60 3
1 2001 25 3 25 3
2 2001 11 4 34 9
2 2001 23 5 34 9
4 2000 2 6 6 21
4 2000 1 7 6 21
4 2000 3 8 6 21
And that is precisely why I thought I could just use nodupkey to delete repetitions in the years for each ID.
your codes work but I need to retain a couple more variables from my original dataset.
You are getting the extra rows because of the extra columns you have selected.
Instead of
select *, sum(bill) as bill1,sum(Charlie) as charlie1
You should replace the * (which means ALL columns) with just the columns in your GROUP BY statement.
select ID, year, sum(bill) as bill1,sum(Charlie) as charlie1
Aahh! That makes sense.
I am sure that is what Loko's codes were pointing at. I had tried his codes but kept most of the other variables(including the ones I am summing) instead of *.
Thanks a ton, Tom!
I meant to add to check out the ID statement - it lets you retain variables.
I guess, you don't need to use nodupkey while printing the data base1.
You Proc sql is giving the result without any duplicates.
Hello,
Adjust a little bit the sql code and it will be allright:
proc sql;
create table base1 as
select ID, year, sum(bill) as bill1,sum(Charlie) as charlie1
from have
group by ID, year;
quit;
assuming your data set is named t:
PROC MEANS data=T nway noprint;
CLASS id year;
VAR bill charlie;
OUTPUT OUT=sum_t sum(bill charlie)=;
RUN;
no need to sort, class statement does it for you.
output dataset is sum_t
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.