I have data like this and I want to flag or delete below highlighted records, i.e. value is 0 before 1 per each category and each subject.
Sub category date value
1 a 3/8/2022 0
1 a 3/9/2022 0
1 a 3/10/2022 1
1 c 3/11/2022 0
1 c 3/12/2022 1
2 a 3/8/2022 0
2 a 3/9/2022 1
2 a 3/10/2022 0
2 b 3/11/2022 1
2 b 3/12/2022 0
;
I'm not understanding the condition to create a flag. It seems like you want to flag all zeros. When would you not flag a zero? Please explain further.
data intermediate;
set have;
by sub category;
if first.category then count=0;
count+1;
if value=1 then one_location=count;
run;
proc summary data=intermediate nway;
class sub category;
var one_location;
output out=_max_ max=max_one_location;
run;
data want;
merge intermediate _max_(drop=_:);
by sub category;
if value=0 and count<max_one_location then delete;
run;
You apparently want to delete "leading zeroes" for each sub/category combination. If so then:
data want (drop=_:);
set have;
by sub category;
if first.category=1 then _n_of_ones=value;
else _n_of_ones+value;
if _n_of_ones>0;
run;
This assumes that data are already sorted by sub/category (and presumably you require the data to be sorted by date within each sub/category). So assume data sorted by sub/category/date.
What you haven't explained is what you want to do if a given sub/category has no observations with value^=0. In such a case, the program above will drop all obs for that sub/category.
Additional comment:
You haven't answer @Kurt_Bremser 's question (or my comment above). I.e. what do you want to do if a given sub/category has only zeroes?
If you want to delete them all, then above program works. If you want to keep them all, then the following program will do:
data want (drop=_:);
set have (where=(value=1) in=firstpass) have (in=secondpass);
by sub category;
if first.category then _cutoff_date=date;
retain _cutoff_date;
if secondpass and date>=_cutoff_date;
run;
If there are no 1's, then the _cutoff_date is the first value=0 date. But if there are some 1's, then it is the first value=1 date.
It should be there, I just want to delete 0 before 1 per each subject per category.
i.e. if subject has 1 then only I want to delete 0 which are prior to 1, and 0 value at post 1 or only 0 shouldn't be affect.
You have not answered my question: what if a sub has only 0's and no 1 at all? Should all observations of such a group be deleted or kept?
So we need an additional check for the "non-1" cases.
data want;
set have;
by sub;
if _n_ = 1
then do;
declare hash w1 (dataset:"have (where=(value = 1))");
w1.definekey("sub");
w1.definedone(),
end;
retain keep_flag;
if first.sub
then if w1.check() ne 0
then keep_flag = 1;
else keep_flag = 0;
if value = 1 then keep_flag = 1;
if keep_flag;
drop keep_flag;
run;
data have; input Sub category $ date : $20. value; cards; 1 a 3/8/2022 0 1 a 3/9/2022 0 1 a 3/10/2022 1 1 c 3/11/2022 0 1 c 3/12/2022 1 2 a 3/8/2022 0 2 a 3/9/2022 1 2 a 3/10/2022 0 2 b 3/11/2022 1 2 b 3/12/2022 0 ; data want; set have; by Sub category; retain found; if first.category then found=0; if value then found=1; if found; drop found; run;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.