Hi....I have a dataset where I would like to select or keep only those records(rows) where either the First_Choice or Second_Choice is greater than zero (0) and only keep the records(rows) where the Third_Choice=1 if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID. Thanks.
Have:
Store_ID | Customer _Number | Product_Number | First_Choice | Second_Choice | Third_Choice |
1231 | 23345 | 102011 | 3 | 0 | 0 |
1231 | 23345 | 102333 | 0 | 0 | 1 |
1231 | 23345 | 102441 | 0 | 1 | 0 |
1231 | 24898 | 101457 | 0 | 0 | 1 |
1231 | 24898 | 149845 | 0 | 0 | 1 |
1354 | 28779 | 112786 | 1 | 2 | 0 |
1354 | 28779 | 122654 | 0 | 2 | 0 |
1354 | 28779 | 125234 | 0 | 0 | 1 |
1354 | 28779 | 145889 | 1 | 0 | 0 |
Want:
Store_ID | Customer _Number | Product_Number | First_Choice | Second_Choice | Third_Choice |
1231 | 23345 | 102011 | 3 | 0 | 0 |
1231 | 23345 | 102441 | 0 | 1 | 0 |
1231 | 24898 | 101457 | 0 | 0 | 1 |
1354 | 28779 | 112786 | 1 | 2 | 0 |
1354 | 28779 | 122654 | 0 | 2 | 0 |
1354 | 28779 | 145889 | 1 | 0 | 0 |
"if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID."
You mean Store_ID and Customer_ID appears only once, right?
data have;
infile cards expandtabs truncover;
input Store_ID Customer_Number Product_Number First_Choice Second_Choice Third_Choice;
cards;
1231 23345 102011 3 0 0
1231 23345 102333 0 0 1
1231 23345 102441 0 1 0
1231 24898 101457 0 0 1
1231 24898 149845 0 0 1
1354 28779 112786 1 2 0
1354 28779 122654 0 2 0
1354 28779 125234 0 0 1
1354 28779 145889 1 0 0
;
run;
data key;
set have;
if First_Choice>0 or Second_Choice>0;
run;
data want;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key');
h.definekey('Store_ID','Customer_Number');
h.definedone();
end;
set have;
if First_Choice>0 or Second_Choice>0 or
(Third_Choice=1 and h.check() ne 0) then do;
output;h.replace();
end;
run;
"if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID."
You mean Store_ID and Customer_ID appears only once, right?
data have;
infile cards expandtabs truncover;
input Store_ID Customer_Number Product_Number First_Choice Second_Choice Third_Choice;
cards;
1231 23345 102011 3 0 0
1231 23345 102333 0 0 1
1231 23345 102441 0 1 0
1231 24898 101457 0 0 1
1231 24898 149845 0 0 1
1354 28779 112786 1 2 0
1354 28779 122654 0 2 0
1354 28779 125234 0 0 1
1354 28779 145889 1 0 0
;
run;
data key;
set have;
if First_Choice>0 or Second_Choice>0;
run;
data want;
if _n_=1 then do;
if 0 then set key;
declare hash h(dataset:'key');
h.definekey('Store_ID','Customer_Number');
h.definedone();
end;
set have;
if First_Choice>0 or Second_Choice>0 or
(Third_Choice=1 and h.check() ne 0) then do;
output;h.replace();
end;
run;
Hi Xia....thanks you for your quick response.
"if that record(row) is unique (appears only once) grouped by Store_ID and Customer_ID."
You mean Store_ID and Customer_ID appears only once, right?
What I meant was that a record(row) could be the unique grouped by Store_ID and Customer_ID with Third_Choice = 1, then that record(row) is to be kept. If not and all the entries for Third_Choice is equal to 1 for the same Store_ID and Customer_ID, then only 1 (anyone of how many records) of these records(rows) is kept. I hope that clarifies things a little better. Thanks.
please try the DOW loop approach
data have;
infile cards missover;
input Store_ID Customer_Number Product_Number First_Choice Second_Choice
Third_Choice;
cards;
1231 23345 102011 3 0 0
1231 23345 102333 0 0 1
1231 23345 102441 0 1 0
1231 24898 101457 0 0 1
1231 24898 149845 0 0 1
1354 28779 112786 1 2 0
1354 28779 122654 0 2 0
1354 28779 125234 0 0 1
1354 28779 145889 1 0 0
;
proc sort data=have;
by Store_ID Customer_Number Third_Choice;
where First_Choice>0 or Second_Choice >0 or Third_Choice=1;
run;
data want;
i=0;
do until(last.Third_Choice);
set have;
by Store_ID Customer_Number Third_Choice;
i+1;
end;
do until(last.Third_Choice);
set have;
by Store_ID Customer_Number Third_Choice;
if (First_Choice>0 or Second_Choice >0) or (Third_Choice=1 and i=1);
output;
end;
run;
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.