Hello!
I would sincerely appreciate any insight on the following question below. Basically I need a distinct count of the cust_ids to identify item codes that were ordered on the same day by the same person. I have to complete this analysis by tomorrow. I am grateful for your time and help!
There is a better solution with hash object, I'm sure. But assuming I understand you correctly...
data codes;
infile datalines dlm=',';
length date 8 item_code 8 item_desc $ 10 cust_id 8;
format date date9.;
informat date mmddyy.;
input date Item_Code item_desc cust_id;
datalines;
08/18/2016,1245,Pizza,5555
09/17/2014,1245,Pizza,4357
08/18/2016,3344,Chicken,5555
09/17/2014,6742,Bread,4357
10/10/2015,3217,Salad,3218
08/18/2016,1245,Pizza,9999
09/17/2014,1245,Pizza,4357
08/18/2016,3344,Chicken,9999
;
run;
proc sort data=codes;
by date cust_id Item_Code;
run;
/* assemble combos per day, per customer */
data sameday_combos(keep=date combo cust_id);
length combo $ 40;
retain combo prev_item;
set codes;
by date cust_id Item_Code ;
if prev_item ^= item_code then do;
combo = catx(',',item_code,combo);
prev_item=item_code;
end;
if last.cust_id then do;
output; call missing (combo,prev_item) ;
end;
run;
/* sum the # customers per combo, per day */
proc sql;
create table cust_totals as
select date, combo, count(cust_id) as Customers
from sameday_combos
group by date, combo;
quit;
Output:
You might have wanted each item code to be in its own variable (item_code1, item_code2, etc.) per day. That's more of a transpose operation, and you may be able to adopt the above to get that...
Chris
Please let me know if I need to clariy my question. Thanks!
There is a better solution with hash object, I'm sure. But assuming I understand you correctly...
data codes;
infile datalines dlm=',';
length date 8 item_code 8 item_desc $ 10 cust_id 8;
format date date9.;
informat date mmddyy.;
input date Item_Code item_desc cust_id;
datalines;
08/18/2016,1245,Pizza,5555
09/17/2014,1245,Pizza,4357
08/18/2016,3344,Chicken,5555
09/17/2014,6742,Bread,4357
10/10/2015,3217,Salad,3218
08/18/2016,1245,Pizza,9999
09/17/2014,1245,Pizza,4357
08/18/2016,3344,Chicken,9999
;
run;
proc sort data=codes;
by date cust_id Item_Code;
run;
/* assemble combos per day, per customer */
data sameday_combos(keep=date combo cust_id);
length combo $ 40;
retain combo prev_item;
set codes;
by date cust_id Item_Code ;
if prev_item ^= item_code then do;
combo = catx(',',item_code,combo);
prev_item=item_code;
end;
if last.cust_id then do;
output; call missing (combo,prev_item) ;
end;
run;
/* sum the # customers per combo, per day */
proc sql;
create table cust_totals as
select date, combo, count(cust_id) as Customers
from sameday_combos
group by date, combo;
quit;
Output:
You might have wanted each item code to be in its own variable (item_code1, item_code2, etc.) per day. That's more of a transpose operation, and you may be able to adopt the above to get that...
Chris
That is perfect Chris! I wish I could thank you a million times. I researched several white papers but couldn't find what I was looking for. I really appreciate your time.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.