Hi....I have a dataset where I would like to select or keep only those records(rows) where either the First_Choice or Second_Choice is greater than zero (0) and only keep the records(rows) where the Third_Choice=1 if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID. Thanks.
Have:
Store_ID | Customer _Number | Product_Number | First_Choice | Second_Choice | Third_Choice |
1231 | 23345 | 102011 | 3 | 0 | 0 |
1231 | 23345 | 102333 | 0 | 0 | 1 |
1231 | 23345 | 102441 | 0 | 1 | 0 |
1231 | 24898 | 101457 | 0 | 0 | 1 |
1231 | 24898 | 149845 | 0 | 0 | 1 |
1354 | 28779 | 112786 | 1 | 2 | 0 |
1354 | 28779 | 122654 | 0 | 2 | 0 |
1354 | 28779 | 125234 | 0 | 0 | 1 |
1354 | 28779 | 145889 | 1 | 0 | 0 |
Want:
Store_ID | Customer _Number | Product_Number | First_Choice | Second_Choice | Third_Choice |
1231 | 23345 | 102011 | 3 | 0 | 0 |
1231 | 23345 | 102441 | 0 | 1 | 0 |
1231 | 24898 | 101457 | 0 | 0 | 1 |
1354 | 28779 | 112786 | 1 | 2 | 0 |
1354 | 28779 | 122654 | 0 | 2 | 0 |
1354 | 28779 | 145889 | 1 | 0 | 0 |
"if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID."
You mean Store_ID and Customer_ID appears only once, right?
data have;
infile cards expandtabs truncover;
input Store_ID Customer_Number Product_Number First_Choice Second_Choice Third_Choice;
cards;
1231 23345 102011 3 0 0
1231 23345 102333 0 0 1
1231 23345 102441 0 1 0
1231 24898 101457 0 0 1
1231 24898 149845 0 0 1
1354 28779 112786 1 2 0
1354 28779 122654 0 2 0
1354 28779 125234 0 0 1
1354 28779 145889 1 0 0
;
run;
data key;
set have;
if First_Choice>0 or Second_Choice>0;
run;
data want;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key');
h.definekey('Store_ID','Customer_Number');
h.definedone();
end;
set have;
if First_Choice>0 or Second_Choice>0 or
(Third_Choice=1 and h.check() ne 0) then do;
output;h.replace();
end;
run;
"if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID."
You mean Store_ID and Customer_ID appears only once, right?
data have;
infile cards expandtabs truncover;
input Store_ID Customer_Number Product_Number First_Choice Second_Choice Third_Choice;
cards;
1231 23345 102011 3 0 0
1231 23345 102333 0 0 1
1231 23345 102441 0 1 0
1231 24898 101457 0 0 1
1231 24898 149845 0 0 1
1354 28779 112786 1 2 0
1354 28779 122654 0 2 0
1354 28779 125234 0 0 1
1354 28779 145889 1 0 0
;
run;
data key;
set have;
if First_Choice>0 or Second_Choice>0;
run;
data want;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key');
h.definekey('Store_ID','Customer_Number');
h.definedone();
end;
set have;
if First_Choice>0 or Second_Choice>0 or
(Third_Choice=1 and h.check() ne 0) then do;
output;h.replace();
end;
run;
Hi Xia....thanks you for your quick response.
"if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID."
You mean Store_ID and Customer_ID appears only once, right?
What I meant was that a record(row) could be the unique grouped by Store_ID and Customer_ID with Third_Choice = 1, then that record(row) is to be kept. If not and all the entries for Third_Choice is equal to 1 for the same Store_ID and Customer_ID, then only 1 (anyone of how many records) of these records(rows) is kept. I hope that clarifies things a little better. Thanks.
please try the DOW loop approach
data have;
infile cards missover;
input Store_ID Customer_Number Product_Number First_Choice Second_Choice
Third_Choice;
cards;
1231 23345 102011 3 0 0
1231 23345 102333 0 0 1
1231 23345 102441 0 1 0
1231 24898 101457 0 0 1
1231 24898 149845 0 0 1
1354 28779 112786 1 2 0
1354 28779 122654 0 2 0
1354 28779 125234 0 0 1
1354 28779 145889 1 0 0
;
proc sort data=have;
by Store_ID Customer_Number Third_Choice;
where First_Choice>0 or Second_Choice >0 or Third_Choice=1;
run;
data want;
i=0;
do until(last.Third_Choice);
set have;
by Store_ID Customer_Number Third_Choice;
i+1;
end;
do until(last.Third_Choice);
set have;
by Store_ID Customer_Number Third_Choice;
if (First_Choice>0 or Second_Choice >0) or (Third_Choice=1 and i=1);
output;
end;
run;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.