DATA Step, Macro, Functions and more

Repeat the same procedure for all observations

Accepted Solution Solved
Reply
Contributor
Posts: 65
Accepted Solution

Repeat the same procedure for all observations

Hi everyone,

Below is small part of the dataset that I'm working with. The only difference is that my dataset contains many more company_Nm, branch_id, product_id and date.

Company_NmBranch_IDCity Product_IDDate SaleCompetitors_sale
AAAAAA1Gainesville XXJan-1110000
AAAAAA2Yreka XXJan-118000
AAAAAA3Yreka XXJan-115000
AAAAAA4San Francisco XXJan-114000
AAAAAA5Los Angeles XXJan-117000
AAAAAA6Donalsonville XXJan-1112000
AAAAAA7Savannah XXJan-1180000
BBBBBB1Yreka XXJan-111000
BBBBBB2San Francisco XXJan-111000
BBBBBB3Los Angeles XXJan-111000
BBBBBB4Donalsonville XXJan-111000
BBBBBB5Savannah XXJan-111000

For each specific Company_nm and branch_id,each row, I want to find other branches located in the same city, let's say competitors. Then I need to take the average of sales among the competitors for the same product_id and date in that row, and put the value in the competitor_sale column.

Previously, I picked only one branch_id as a pilot and did the same procedure for that. I may be able to extend my code to all rows by do loop, but I'm not sure what's the most efficient way. Below is my code, even though I'm think it's so ugly and inefficient.

     data AAA1;

set have;

if branch_id='AAA1';

run;

     proc sql;

create table comp1 as

select *

from  have

where city in (Select city from AAA1);

quit;

      proc sql;

create comp2 as

select *

from  comp1

where company_nm not in (Select company_nm from AAA1);

quit;

     proc sql ;

create table comp_rate as

select product_id, date, 

mean(rate) as Ave_comp format = 5.3

from comp2

group by  product_id, date;

quit;

     data want;

merge AAA1 comp_rate;

by product_id date;

run;

I really appreciate any and all suggestions.


Accepted Solutions
Solution
‎08-08-2015 02:49 PM
Respected Advisor
Posts: 4,919

Re: Repeat the same procedure for all observations

The following will give you the monthly competitor mean sales by product and city, assuming that other branches from the same company are competitors:

data have;

infile datalines truncover;

input Company_Nm $ Branch_Id $ City &$12. Product_ID $ Date :anydtdte6. Sales;

format date yymmdd10.;

datalines;

AAA AAA1 Gainesville  XX Jan-11 10000

AAA AAA2 Yreka  XX Jan-11 8000

AAA AAA3 Yreka  XX Jan-11 5000

AAA AAA4 San Francisco  XX Jan-11 4000

AAA AAA5 Los Angeles  XX Jan-11 7000

AAA AAA6 Donalsonville  XX Jan-11 12000

AAA AAA7 Savannah  XX  Jan-11 80000

BBB BBB1 Yreka  XX Jan-11 1000

BBB BBB2 San Francisco  XX Jan-11 1000

BBB BBB3 Los Angeles  XX Jan-11 1000

BBB BBB4 Donalsonville  XX Jan-11 1000

BBB BBB5 Savannah  XX Jan-11 1000

;

proc sql;

select Company_Nm, Branch_Id, Product_Id, Date, Sales,

    (select mean(Sales)

     from have

     where Product_Id=a.Product_Id and date=a.date and not

      (Company_Nm = a.Company_Nm and Branch_Id = a.Branch_Id)

     ) as Competitor_Sale

from have as a;

quit;

PG

PG

View solution in original post


All Replies
Trusted Advisor
Posts: 1,137

Re: Repeat the same procedure for all observations

This is an untested code , for every branch_id a dataset is created. Hope this is what you are expecting

proc sql;

select distinct branch_id, count(distinct branch_id) into :id1-:id&sysmaxlong, Smiley Surprisedbs from have;

quit;

%put &obs;

%macro test;

%do i = 1 %to &obs;

data &&id&i;

set have;

if branch_id="&&id&i";

run;

     proc sql;

create table comp1 as

select *

from  have

where city in (Select city from &&id&i);

quit;

      proc sql;

create table comp2 as

select *

from  comp1

where company_nm not in (Select company_nm from &&id&i);

quit;

     proc sql ;

create table comp_rate as

select product_id, date,

mean(rate) as Ave_comp format = 5.3

from comp2

group by  product_id, date;

quit;

  data &&id&i;

merge &&id&i comp_rate;

by product_id date;

run;

%end;

%mend;

Thanks,

Jag

Thanks,
Jag
Solution
‎08-08-2015 02:49 PM
Respected Advisor
Posts: 4,919

Re: Repeat the same procedure for all observations

The following will give you the monthly competitor mean sales by product and city, assuming that other branches from the same company are competitors:

data have;

infile datalines truncover;

input Company_Nm $ Branch_Id $ City &$12. Product_ID $ Date :anydtdte6. Sales;

format date yymmdd10.;

datalines;

AAA AAA1 Gainesville  XX Jan-11 10000

AAA AAA2 Yreka  XX Jan-11 8000

AAA AAA3 Yreka  XX Jan-11 5000

AAA AAA4 San Francisco  XX Jan-11 4000

AAA AAA5 Los Angeles  XX Jan-11 7000

AAA AAA6 Donalsonville  XX Jan-11 12000

AAA AAA7 Savannah  XX  Jan-11 80000

BBB BBB1 Yreka  XX Jan-11 1000

BBB BBB2 San Francisco  XX Jan-11 1000

BBB BBB3 Los Angeles  XX Jan-11 1000

BBB BBB4 Donalsonville  XX Jan-11 1000

BBB BBB5 Savannah  XX Jan-11 1000

;

proc sql;

select Company_Nm, Branch_Id, Product_Id, Date, Sales,

    (select mean(Sales)

     from have

     where Product_Id=a.Product_Id and date=a.date and not

      (Company_Nm = a.Company_Nm and Branch_Id = a.Branch_Id)

     ) as Competitor_Sale

from have as a;

quit;

PG

PG
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 2 replies
  • 232 views
  • 3 likes
  • 3 in conversation