Hi,
I have a dataset with annual sales of selected companies. I want to account for merges&acquisitions that took place and aggregate histrorical (pre-merge) data for both companies involved in a merge.
For example, I have a following dataset with 5 companies A-E. There were two merges:
1. in year 3, company B took over company A (merge No. 1)
2. in year 4, company C took over company D (merge No. 2)
data have; input year company $1. takeover :$1. sales; datalines; 1 A . 100 1 B . 120 1 C . 130 1 D . 140 1 E . 150 2 A . 105 2 B . 125 2 C . 135 2 D . 145 2 E . 155 3 B A 240 3 C . 140 3 D . 150 3 E . 160 4 B . 245 4 C D 300 4 E . 170 ; run; proc sgplot data=have; series x=year y=sales/group=company; run;
I want to get rid of the structural breaks in data for companies B and C. To do this, I want to sum sales of both companies invloved in each merge before it took place. In other words, I want to:
1. Sum the sales of A and B in each year 1 and 2
2. Sum the sales of C and D in each year 1,2 and 3
Can you advice me how to do it in SAS?
I was thinking of creating another variable merge_number that would equal to:
- 1 for companies A and B in year 1 and 2 (to indicate the companies involved in merge No. 1)
- 2 for companies C and D in year 1,2 and 3 (to indicate the companies involved in merge No. 2)
- 0 otherwise
Then I could use:
proc sql; create table want as select merge_number, sum(sales) as new_sales from have group by merge_number; quit;
The problem is that I don't know how to create a variable merge_number. Could you help? Or maybe there is some other (simpler) way of summing the sales?
Right, got SAS back, this should work:
data have; input year company $1. takeover :$1. sales; datalines; 1 A . 100 1 B . 120 1 C . 130 1 D . 140 1 E . 150 2 A . 105 2 B . 125 2 C . 135 2 D . 145 2 E . 155 3 B A 240 3 C . 140 3 D . 150 3 E . 160 4 B . 245 4 C D 300 4 E . 170 ; run; proc sql; create table WANT as select A.YEAR, case when B.CMP is not null then B.CMP else A.COMPANY end as COMPANY, A.SALES from HAVE A left join ( select YEAR, CATS(COMPANY,TAKEOVER) as CMP, TAKEOVER, SALES from HAVE where TAKEOVER is not null ) B on index(B.CMP,A.COMPANY) order by YEAR, COMPANY; quit;
If i have undeerstood your scenario,if below code can help you
proc sql;
select year,company,sum(sales) as sale_amt
from have
group by company,year
order by company,year;
quit;
You can try with case statements in proc sql for creating a merge_number variable
Hi,
try this
data want;
length sumSales 8;
set have;
call missing(sumSales);
if not missing(takeover) then do;
call execute(compbl('
PROC SQL;
UPDATE want
SET sumSales=(SELECT sum(sales)
FROM have
WHERE company in ("'||strip(company)||'","'||strip(takeover)||'")
AND year le '||strip(put(year,best.))||'
)
WHERE company eq "'||strip(company)||'" AND takeover eq "'||strip(takeover)||'";
QUIT;
'));
end;
run;
- Cheers -
My SAS has died at the moment, so this is just a guess, but could you not just replace the company based on the smaller dataset?
E.g.;
proc sql; create table WANT as select A.YEAR, case when B.COMPANY is not null then B.COMPANY else A.COMPANY end as COMPANY, A.TAKEOVER, A.SALES from HAVE A left join HAVE B on A.TAKEOVER=B.COMPANY and A.YEAR < B.YEAR; quit;
Thanks for your answers, but they don't produce what I need.
I would like to have a following dataset at the end:
data want;
input Year new_company :$2. new_sales ;
datalines;
1 AB 220
1 CD 270
1 E 150
2 AB 230
2 CD 280
2 E 155
3 AB 240
3 CD 290
3 E 160
4 AB 245
4 CD 300
4 E 170
;
run;
where AB means sales of synthetic company combined of A and B in each year (e.g. 100+120=220) and CD - joint sales of C and D (e.g. 270=130+140). Of course, starting from year3, sales of AB=sales of B and in year4 sales of CD=sales of C as companies formally merged.
Hi,
Well, I still dont have SAS to try this out but my idea should still work. The idea is to combine data from rows with takeover, with all the other companies in that pair, and creating a code from that:
proc sql; create table WANT as select A.YEAR, case when B.COMPANY is not null then cats(A.COMPANY,B.COMPANY) else A.COMPANY end as COMPANY, A.TAKEOVER, A.SALES from HAVE A left join HAVE B on A.TAKEOVER=B.COMPANY and A.YEAR < B.YEAR; quit;
What the idea is to get your original data to look something like:
1 AD . 100 1 B . 120 1 CD . 130 1 CD . 140 1 E . 150 2 AB . 105 2 B . 125 2 CD . 135 2 CD . 145 2 E . 155 3 AB A 240 3 CD . 140 3 CD . 150 3 E . 160 4 B . 245 4 CD D 300 4 E . 170
Which you can then sum up as normal.
Ok, I think I get the idea now.The problem is that with the code I don't get the "new" company name (the one that "cats(A.COMPANY,B.COMPANY)" should produce) - see the attached file. Any idea why?
Right, got SAS back, this should work:
data have; input year company $1. takeover :$1. sales; datalines; 1 A . 100 1 B . 120 1 C . 130 1 D . 140 1 E . 150 2 A . 105 2 B . 125 2 C . 135 2 D . 145 2 E . 155 3 B A 240 3 C . 140 3 D . 150 3 E . 160 4 B . 245 4 C D 300 4 E . 170 ; run; proc sql; create table WANT as select A.YEAR, case when B.CMP is not null then B.CMP else A.COMPANY end as COMPANY, A.SALES from HAVE A left join ( select YEAR, CATS(COMPANY,TAKEOVER) as CMP, TAKEOVER, SALES from HAVE where TAKEOVER is not null ) B on index(B.CMP,A.COMPANY) order by YEAR, COMPANY; quit;
@RW9 Thanks, now it works.
@kaverisonu1 Thanks, I guess that's another way of doing it, though some fine-tuning would be neccesary, as the output dataset is still a bit different from the one I need.
Since you already know the problem cases perhaps this alternative:
proc format library=work; value $NewComp 'A','B' = 'BA' 'C','D' = 'CD' ; run; data want; set have; NewCompany=put(company,$NewComp.); run;
I often find that Formats are good for mapping /combining changed coded or names.
With the format you may not even need a new data set but apply the format to the variable in the summary or display procedure that does grouping.
Such as
proc summary data=have;
class company;
format company $newcomp. ;
var sales;
run;
You can use the merge column post the code to sum the sales accordingly.
data have;
input year company $1. takeover :$1. sales;
datalines;
1 A . 100
1 B . 120
1 C . 130
1 D . 140
1 E . 150
2 A . 105
2 B . 125
2 C . 135
2 D . 145
2 E . 155
3 B A 240
3 C . 140
3 D . 150
3 E . 160
4 B . 245
4 C D 300
4 E . 170
;
run;
data have1;
set have;
select;
when(year=1 and company in('A','B'))do;
merge="AandB in 1";
end;
when(year=1 and company in ('C','D'))do;
merge="CandD in 2";
end;
when(year=1 and company="E") do;
merge="E in 1";
end;
when(year=2 and company in('A','B'))do;
merge="AandB in 2";
end;
when(year=2 and company in ('C','D'))do;
merge="CandD in 2";
end;
when(year=2 and company='E')do;
merge="E in 2";
end;
otherwise;
end;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.