Re: Vertical addition

namrata · Posted 09-19-2014 01:58 AM

https://communities.sas.com/message/151801#151801

I want a slight change in the output. I am using the dataset in the previous thread for reference.

ID Year Bill Charlie

1 2000 25 1

1 2000 35 2

1 2001 25 3

2 2001 11 4

2 2001 23 5

4 2000 2 6

4 2000 1 7

4 2000 3 8

I would like my output in this manner (a variable equal to the sum of observations for the same ID and year in Bill & Charlie)

ID Year Bill1 Charlie1

1 2000 60 3

1 2001 25 3

2 2001 34 9

4 2000 6 21

The codes used are:

proc sql;

create table base1 as

select *, sum(bill) as bill1,sum(Charlie) as charlie1

from have

group by ID, year;

quit;

Now, I use the nodupkey option get my result but is there a more efficient way- to use only Proc SQL or only Data Step?

proc sort data=base1 nodupkey;

by id year;

run;

Thanks!
Namrata

Tom · Posted 09-19-2014 02:31 AM

Not sure of the question here. If you want PROC SQL to print the result instead of making a dataset then just remove the "create table base1 as" part of the statement. Or add a PROC PRINT step.

If you want to make a report then a reporting tool like PROC REPORT might be what you want.

proc report data=have headline nofs ;

column id year bill charlie ;

define id / group ;

define year /group;

define bill / sum ;

define charlie / sum;

run;

namrata · Posted 09-21-2014 12:35 AM

What I am trying to get is an output that contains only one observation(sum of all obs for the same year). My proc sql does not give me the output in the way I desire. What I get is:

ID Year Bill Charlie Bill1 Charlie1

1 2000 25 1 60 3

1 2000 35 2 60 3

1 2001 25 3 25 3

2 2001 11 4 34 9

2 2001 23 5 34 9

4 2000 2 6 6 21

4 2000 1 7 6 21

4 2000 3 8 6 21

And that is precisely why I thought I could just use nodupkey to delete repetitions in the years for each ID.

your codes work but I need to retain a couple more variables from my original dataset.

Tom · Posted 09-21-2014 04:46 AM

You are getting the extra rows because of the extra columns you have selected.

Instead of

select *, sum(bill) as bill1,sum(Charlie) as charlie1

You should replace the * (which means ALL columns) with just the columns in your GROUP BY statement.

select ID, year, sum(bill) as bill1,sum(Charlie) as charlie1

namrata · Posted 09-21-2014 03:02 PM

Aahh! That makes sense.

I am sure that is what Loko's codes were pointing at. I had tried his codes but kept most of the other variables(including the ones I am summing) instead of *.

Thanks a ton, Tom!

RCW · Posted 09-21-2014 10:37 PM

I meant to add to check out the ID statement - it lets you retain variables.

RaviKommuri · Posted 09-19-2014 03:14 AM

I guess, you don't need to use nodupkey while printing the data base1.

You Proc sql is giving the result without any duplicates.

Loko · Posted 09-19-2014 03:18 AM

Hello,

Adjust a little bit the sql code and it will be allright:

proc sql;

create table base1 as

select ID, year, sum(bill) as bill1,sum(Charlie) as charlie1

from have

group by ID, year;

quit;

namrata · Posted 09-21-2014 12:52 AM

I tried your codes. It returns a sum column(as I have shown in my reply above) but that is not what I want.

rcwright · Posted 09-19-2014 10:19 AM

assuming your data set is named t:

PROC MEANS data=T nway noprint;

CLASS id year;

VAR bill charlie;

OUTPUT OUT=sum_t sum(bill charlie)=;

RUN;

no need to sort, class statement does it for you.

output dataset is sum_t

SAS Innovate 2025: Save the Date

SAS Training: Just a Click Away