Solved
Contributor
Posts: 67

How sum selected rows for given dates in a panel data-set

Hi,

I have a dataset with annual sales of selected companies. I want to account for merges&acquisitions that took place and aggregate histrorical (pre-merge) data for both companies involved in a merge.

For example, I have a following dataset with 5 companies A-E. There were two merges:
1. in year 3, company B took over company A (merge No. 1)
2. in year 4, company C took over company D (merge No. 2)

```data have;
input year company \$1. takeover :\$1. sales;
datalines;
1 A . 100
1 B . 120
1 C . 130
1 D . 140
1 E . 150
2 A . 105
2 B . 125
2 C . 135
2 D . 145
2 E . 155
3 B A 240
3 C . 140
3 D . 150
3 E . 160
4 B . 245
4 C D 300
4 E . 170
;
run;

proc sgplot data=have;
series x=year y=sales/group=company;
run;```

I want to get rid of the structural breaks in data for companies B and C. To do this, I want to sum sales of both companies invloved in each merge before it took place. In other words, I want to:
1. Sum the sales of A and B in each year 1 and 2
2. Sum the sales of C and D in each year 1,2 and 3

Can you advice me how to do it in SAS?

I was thinking of creating another variable merge_number that would equal to:
- 1 for companies A and B in year 1 and 2 (to indicate the companies involved in merge No. 1)
- 2 for companies C and D in year 1,2 and 3 (to indicate the companies involved in merge No. 2)
- 0 otherwise

Then I could use:

```proc sql;
create table want as
select merge_number, sum(sales) as new_sales
from have
group by merge_number;
quit;```

The problem is that I don't know how to create a variable merge_number. Could you help? Or maybe there is some other (simpler) way of summing the sales?

Accepted Solutions
Solution
‎04-04-2017 11:20 AM
Super User
Posts: 9,423

Re: How sum selected rows for given dates in a panel data-set

Right, got SAS back, this should work:

```data have;
input year company \$1. takeover :\$1. sales;
datalines;
1 A . 100
1 B . 120
1 C . 130
1 D . 140
1 E . 150
2 A . 105
2 B . 125
2 C . 135
2 D . 145
2 E . 155
3 B A 240
3 C . 140
3 D . 150
3 E . 160
4 B . 245
4 C D 300
4 E . 170
;
run;

proc sql;
create table WANT as
select  A.YEAR,
case when B.CMP is not null then B.CMP else A.COMPANY end as COMPANY,
A.SALES
from    HAVE A
left join (
select  YEAR,
CATS(COMPANY,TAKEOVER) as CMP,
TAKEOVER,
SALES
from   HAVE
where  TAKEOVER is not null
) B
on      index(B.CMP,A.COMPANY)
order by YEAR,
COMPANY;
quit;```

All Replies
Occasional Contributor
Posts: 7

Re: How sum selected rows for given dates in a panel data-set

proc sql;
select year,company,sum(sales) as sale_amt
from have
group by company,year
order by company,year;
quit;

Contributor
Posts: 67

Re: How sum selected rows for given dates in a panel data-set

No, in this way I would get the exactly the same dataset. I want to sum sales of companies A and B, C and D only
Occasional Contributor
Posts: 7

Re: How sum selected rows for given dates in a panel data-set

You can try with case statements  in proc sql for creating a merge_number  variable

Contributor
Posts: 67

Re: How sum selected rows for given dates in a panel data-set

How exactly?
Regular Contributor
Posts: 153

Re: How sum selected rows for given dates in a panel data-set

[ Edited ]

Hi,

try this

``````data want;
length sumSales 8;
set have;
call missing(sumSales);
if not missing(takeover) then do;
call execute(compbl('
PROC SQL;
UPDATE want
SET sumSales=(SELECT sum(sales)
FROM have
WHERE company in ("'||strip(company)||'","'||strip(takeover)||'")
AND year le '||strip(put(year,best.))||'
)
WHERE company eq "'||strip(company)||'" AND takeover eq "'||strip(takeover)||'";
QUIT;
'));

end;
run;``````

________________________

- Cheers -

Super User
Posts: 9,423

Re: How sum selected rows for given dates in a panel data-set

My SAS has died at the moment, so this is just a guess, but could you not just replace the company based on the smaller dataset?

E.g.;

```proc sql;
create table WANT as
select  A.YEAR,
case when B.COMPANY is not null then B.COMPANY else A.COMPANY end as COMPANY,
A.TAKEOVER,
A.SALES
from   HAVE A
left join HAVE B
on     A.TAKEOVER=B.COMPANY
and    A.YEAR < B.YEAR;
quit;```
Contributor
Posts: 67

Re: How sum selected rows for given dates in a panel data-set

@RW9

@Oligolas

I would like to have a following dataset at the end:

``````data want;
input Year new_company :\$2. new_sales ;
datalines;
1 AB 220
1 CD 270
1 E 150
2 AB 230
2 CD 280
2 E 155
3 AB 240
3 CD 290
3 E 160
4 AB 245
4 CD 300
4 E 170
;
run;``````

where AB means sales of synthetic company combined of A and B in each year (e.g. 100+120=220) and CD - joint sales of C and D (e.g. 270=130+140). Of course, starting from year3, sales of AB=sales of B and in year4 sales of CD=sales of C as companies formally merged.

Super User
Posts: 9,423

Re: How sum selected rows for given dates in a panel data-set

Hi,

Well, I still dont have SAS to try this out but my idea should still work.  The idea is to combine data from rows with takeover, with all the other companies in that pair, and creating a code from that:

```proc sql;
create table WANT as
select  A.YEAR,
case when B.COMPANY is not null then cats(A.COMPANY,B.COMPANY) else A.COMPANY end as COMPANY,
A.TAKEOVER,
A.SALES
from   HAVE A
left join HAVE B
on     A.TAKEOVER=B.COMPANY
and    A.YEAR < B.YEAR;
quit;```

What the idea is to get your original data to look something like:

```1 AD . 100
1 B . 120
1 CD . 130
1 CD . 140
1 E . 150
2 AB . 105
2 B . 125
2 CD . 135
2 CD . 145
2 E . 155
3 AB A 240
3 CD . 140
3 CD . 150
3 E . 160
4 B . 245
4 CD D 300
4 E . 170```

Which you can then sum up as normal.

Contributor
Posts: 67

Re: How sum selected rows for given dates in a panel data-set

@RW9

Ok, I think I get the idea now.The problem is that with the code I don't get the "new" company name (the one that "cats(A.COMPANY,B.COMPANY)" should produce) - see the attached file. Any idea why?

Solution
‎04-04-2017 11:20 AM
Super User
Posts: 9,423

Re: How sum selected rows for given dates in a panel data-set

Right, got SAS back, this should work:

```data have;
input year company \$1. takeover :\$1. sales;
datalines;
1 A . 100
1 B . 120
1 C . 130
1 D . 140
1 E . 150
2 A . 105
2 B . 125
2 C . 135
2 D . 145
2 E . 155
3 B A 240
3 C . 140
3 D . 150
3 E . 160
4 B . 245
4 C D 300
4 E . 170
;
run;

proc sql;
create table WANT as
select  A.YEAR,
case when B.CMP is not null then B.CMP else A.COMPANY end as COMPANY,
A.SALES
from    HAVE A
left join (
select  YEAR,
CATS(COMPANY,TAKEOVER) as CMP,
TAKEOVER,
SALES
from   HAVE
where  TAKEOVER is not null
) B
on      index(B.CMP,A.COMPANY)
order by YEAR,
COMPANY;
quit;```
Contributor
Posts: 67

Re: How sum selected rows for given dates in a panel data-set

@RW9 Thanks, now it works.

@kaverisonu1 Thanks, I guess that's another way of doing it, though some fine-tuning would be neccesary, as the output dataset is still a bit different from the one I need.

Super User
Posts: 13,321

Re: How sum selected rows for given dates in a panel data-set

Since you already know the problem cases perhaps this alternative:

```proc format library=work;
value \$NewComp
'A','B' = 'BA'
'C','D' = 'CD'
;
run;

data want;
set have;
NewCompany=put(company,\$NewComp.);
run;```

I often find that Formats are good for mapping /combining changed coded or names.

With the format you may not even need a new data set but apply the format to the variable in the summary or display procedure that does grouping.

Such as

proc summary data=have;

class company;

format company \$newcomp. ;

var sales;

run;

Occasional Contributor
Posts: 7

Re: How sum selected rows for given dates in a panel data-set

You can use the merge column post the code to sum the sales accordingly.

data have;
input year company \$1. takeover :\$1. sales;
datalines;
1 A . 100
1 B . 120
1 C . 130
1 D . 140
1 E . 150
2 A . 105
2 B . 125
2 C . 135
2 D . 145
2 E . 155
3 B A 240
3 C . 140
3 D . 150
3 E . 160
4 B . 245
4 C D 300
4 E . 170
;
run;

data have1;
set have;
select;
when(year=1 and company in('A','B'))do;
merge="AandB in 1";
end;
when(year=1 and company in ('C','D'))do;
merge="CandD in 2";
end;
when(year=1 and company="E") do;
merge="E in 1";
end;
when(year=2 and company in('A','B'))do;
merge="AandB in 2";
end;
when(year=2 and company in ('C','D'))do;
merge="CandD in 2";
end;
when(year=2 and company='E')do;
merge="E in 2";
end;
otherwise;
end;

☑ This topic is solved.