Solved: Re: Repeat the same procedure for all observations

m1986MM · Posted 08-08-2015 11:06 AM

Hi everyone,

Below is small part of the dataset that I'm working with. The only difference is that my dataset contains many more company_Nm, branch_id, product_id and date.

Company_Nm	Branch_ID	City	Product_ID	Date	Sale	Competitors_sale
AAA	AAA1	Gainesville	XX	Jan-11	10000
AAA	AAA2	Yreka	XX	Jan-11	8000
AAA	AAA3	Yreka	XX	Jan-11	5000
AAA	AAA4	San Francisco	XX	Jan-11	4000
AAA	AAA5	Los Angeles	XX	Jan-11	7000
AAA	AAA6	Donalsonville	XX	Jan-11	12000
AAA	AAA7	Savannah	XX	Jan-11	80000
BBB	BBB1	Yreka	XX	Jan-11	1000
BBB	BBB2	San Francisco	XX	Jan-11	1000
BBB	BBB3	Los Angeles	XX	Jan-11	1000
BBB	BBB4	Donalsonville	XX	Jan-11	1000
BBB	BBB5	Savannah	XX	Jan-11	1000

For each specific Company_nm and branch_id,each row, I want to find other branches located in the same city, let's say competitors. Then I need to take the average of sales among the competitors for the same product_id and date in that row, and put the value in the competitor_sale column.

Previously, I picked only one branch_id as a pilot and did the same procedure for that. I may be able to extend my code to all rows by do loop, but I'm not sure what's the most efficient way. Below is my code, even though I'm think it's so ugly and inefficient.

data AAA1;

set have;

if branch_id='AAA1';

run;

proc sql;

create table comp1 as

select *

from have

where city in (Select city from AAA1);

quit;

proc sql;

create comp2 as

select *

from comp1

where company_nm not in (Select company_nm from AAA1);

quit;

proc sql ;

create table comp_rate as

select product_id, date,

mean(rate) as Ave_comp format = 5.3

from comp2

group by product_id, date;

quit;

data want;

merge AAA1 comp_rate;

by product_id date;

run;

I really appreciate any and all suggestions.

PGStats · Posted 08-08-2015 02:49 PM

The following will give you the monthly competitor mean sales by product and city, assuming that other branches from the same company are competitors:

data have;

infile datalines truncover;

input Company_Nm $ Branch_Id $ City &$12. Product_ID $ Date :anydtdte6. Sales;

format date yymmdd10.;

datalines;

AAA AAA1 Gainesville XX Jan-11 10000

AAA AAA2 Yreka XX Jan-11 8000

AAA AAA3 Yreka XX Jan-11 5000

AAA AAA4 San Francisco XX Jan-11 4000

AAA AAA5 Los Angeles XX Jan-11 7000

AAA AAA6 Donalsonville XX Jan-11 12000

AAA AAA7 Savannah XX Jan-11 80000

BBB BBB1 Yreka XX Jan-11 1000

BBB BBB2 San Francisco XX Jan-11 1000

BBB BBB3 Los Angeles XX Jan-11 1000

BBB BBB4 Donalsonville XX Jan-11 1000

BBB BBB5 Savannah XX Jan-11 1000

;

proc sql;

select Company_Nm, Branch_Id, Product_Id, Date, Sales,

(select mean(Sales)

from have

where Product_Id=a.Product_Id and date=a.date and not

(Company_Nm = a.Company_Nm and Branch_Id = a.Branch_Id)

) as Competitor_Sale

from have as a;

quit;

PG

View solution in original post

Jagadishkatam · Posted 08-08-2015 01:02 PM

This is an untested code , for every branch_id a dataset is created. Hope this is what you are expecting

proc sql;

select distinct branch_id, count(distinct branch_id) into :id1-:id&sysmaxlong, :obs from have;

quit;

%put &obs;

%macro test;

%do i = 1 %to &obs;

data &&id&i;

set have;

if branch_id="&&id&i";

run;

proc sql;

create table comp1 as

select *

from have

where city in (Select city from &&id&i);

quit;

proc sql;

create table comp2 as

select *

from comp1

where company_nm not in (Select company_nm from &&id&i);

quit;

proc sql ;

create table comp_rate as

select product_id, date,

mean(rate) as Ave_comp format = 5.3

from comp2

group by product_id, date;

quit;

data &&id&i;

merge &&id&i comp_rate;

by product_id date;

run;

%end;

%mend;

Thanks,

Jag

Thanks,
Jag

PGStats · Posted 08-08-2015 02:49 PM

The following will give you the monthly competitor mean sales by product and city, assuming that other branches from the same company are competitors:

data have;

infile datalines truncover;

input Company_Nm $ Branch_Id $ City &$12. Product_ID $ Date :anydtdte6. Sales;

format date yymmdd10.;

datalines;

AAA AAA1 Gainesville XX Jan-11 10000

AAA AAA2 Yreka XX Jan-11 8000

AAA AAA3 Yreka XX Jan-11 5000

AAA AAA4 San Francisco XX Jan-11 4000

AAA AAA5 Los Angeles XX Jan-11 7000

AAA AAA6 Donalsonville XX Jan-11 12000

AAA AAA7 Savannah XX Jan-11 80000

BBB BBB1 Yreka XX Jan-11 1000

BBB BBB2 San Francisco XX Jan-11 1000

BBB BBB3 Los Angeles XX Jan-11 1000

BBB BBB4 Donalsonville XX Jan-11 1000

BBB BBB5 Savannah XX Jan-11 1000

;

proc sql;

select Company_Nm, Branch_Id, Product_Id, Date, Sales,

(select mean(Sales)

from have

where Product_Id=a.Product_Id and date=a.date and not

(Company_Nm = a.Company_Nm and Branch_Id = a.Branch_Id)

) as Competitor_Sale

from have as a;

quit;

PG

Repeat the same procedure for all observations

Re: Repeat the same procedure for all observations

Re: Repeat the same procedure for all observations

Re: Repeat the same procedure for all observations

Click image to register for webinar

Classroom Training Available!