Hi everyone,
Below is small part of the dataset that I'm working with. The only difference is that my dataset contains many more company_Nm, branch_id, product_id and date.
Company_Nm | Branch_ID | City | Product_ID | Date | Sale | Competitors_sale |
AAA | AAA1 | Gainesville | XX | Jan-11 | 10000 | |
AAA | AAA2 | Yreka | XX | Jan-11 | 8000 | |
AAA | AAA3 | Yreka | XX | Jan-11 | 5000 | |
AAA | AAA4 | San Francisco | XX | Jan-11 | 4000 | |
AAA | AAA5 | Los Angeles | XX | Jan-11 | 7000 | |
AAA | AAA6 | Donalsonville | XX | Jan-11 | 12000 | |
AAA | AAA7 | Savannah | XX | Jan-11 | 80000 | |
BBB | BBB1 | Yreka | XX | Jan-11 | 1000 | |
BBB | BBB2 | San Francisco | XX | Jan-11 | 1000 | |
BBB | BBB3 | Los Angeles | XX | Jan-11 | 1000 | |
BBB | BBB4 | Donalsonville | XX | Jan-11 | 1000 | |
BBB | BBB5 | Savannah | XX | Jan-11 | 1000 |
For each specific Company_nm and branch_id,each row, I want to find other branches located in the same city, let's say competitors. Then I need to take the average of sales among the competitors for the same product_id and date in that row, and put the value in the competitor_sale column.
Previously, I picked only one branch_id as a pilot and did the same procedure for that. I may be able to extend my code to all rows by do loop, but I'm not sure what's the most efficient way. Below is my code, even though I'm think it's so ugly and inefficient.
data AAA1;
set have;
if branch_id='AAA1';
run;
proc sql;
create table comp1 as
select *
from have
where city in (Select city from AAA1);
quit;
proc sql;
create comp2 as
select *
from comp1
where company_nm not in (Select company_nm from AAA1);
quit;
proc sql ;
create table comp_rate as
select product_id, date,
mean(rate) as Ave_comp format = 5.3
from comp2
group by product_id, date;
quit;
data want;
merge AAA1 comp_rate;
by product_id date;
run;
I really appreciate any and all suggestions.
The following will give you the monthly competitor mean sales by product and city, assuming that other branches from the same company are competitors:
data have;
infile datalines truncover;
input Company_Nm $ Branch_Id $ City &$12. Product_ID $ Date :anydtdte6. Sales;
format date yymmdd10.;
datalines;
AAA AAA1 Gainesville XX Jan-11 10000
AAA AAA2 Yreka XX Jan-11 8000
AAA AAA3 Yreka XX Jan-11 5000
AAA AAA4 San Francisco XX Jan-11 4000
AAA AAA5 Los Angeles XX Jan-11 7000
AAA AAA6 Donalsonville XX Jan-11 12000
AAA AAA7 Savannah XX Jan-11 80000
BBB BBB1 Yreka XX Jan-11 1000
BBB BBB2 San Francisco XX Jan-11 1000
BBB BBB3 Los Angeles XX Jan-11 1000
BBB BBB4 Donalsonville XX Jan-11 1000
BBB BBB5 Savannah XX Jan-11 1000
;
proc sql;
select Company_Nm, Branch_Id, Product_Id, Date, Sales,
(select mean(Sales)
from have
where Product_Id=a.Product_Id and date=a.date and not
(Company_Nm = a.Company_Nm and Branch_Id = a.Branch_Id)
) as Competitor_Sale
from have as a;
quit;
PG
This is an untested code , for every branch_id a dataset is created. Hope this is what you are expecting
proc sql;
select distinct branch_id, count(distinct branch_id) into :id1-:id&sysmaxlong, :obs from have;
quit;
%put &obs;
%macro test;
%do i = 1 %to &obs;
data &&id&i;
set have;
if branch_id="&&id&i";
run;
proc sql;
create table comp1 as
select *
from have
where city in (Select city from &&id&i);
quit;
proc sql;
create table comp2 as
select *
from comp1
where company_nm not in (Select company_nm from &&id&i);
quit;
proc sql ;
create table comp_rate as
select product_id, date,
mean(rate) as Ave_comp format = 5.3
from comp2
group by product_id, date;
quit;
data &&id&i;
merge &&id&i comp_rate;
by product_id date;
run;
%end;
%mend;
Thanks,
Jag
The following will give you the monthly competitor mean sales by product and city, assuming that other branches from the same company are competitors:
data have;
infile datalines truncover;
input Company_Nm $ Branch_Id $ City &$12. Product_ID $ Date :anydtdte6. Sales;
format date yymmdd10.;
datalines;
AAA AAA1 Gainesville XX Jan-11 10000
AAA AAA2 Yreka XX Jan-11 8000
AAA AAA3 Yreka XX Jan-11 5000
AAA AAA4 San Francisco XX Jan-11 4000
AAA AAA5 Los Angeles XX Jan-11 7000
AAA AAA6 Donalsonville XX Jan-11 12000
AAA AAA7 Savannah XX Jan-11 80000
BBB BBB1 Yreka XX Jan-11 1000
BBB BBB2 San Francisco XX Jan-11 1000
BBB BBB3 Los Angeles XX Jan-11 1000
BBB BBB4 Donalsonville XX Jan-11 1000
BBB BBB5 Savannah XX Jan-11 1000
;
proc sql;
select Company_Nm, Branch_Id, Product_Id, Date, Sales,
(select mean(Sales)
from have
where Product_Id=a.Product_Id and date=a.date and not
(Company_Nm = a.Company_Nm and Branch_Id = a.Branch_Id)
) as Competitor_Sale
from have as a;
quit;
PG
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.