I have data like this and I want to flag or delete below highlighted records, i.e. value is 0 before 1 per each category and each subject.
Sub category date value
1 a 3/8/2022 0
1 a 3/9/2022 0
1 a 3/10/2022 1
1 c 3/11/2022 0
1 c 3/12/2022 1
2 a 3/8/2022 0
2 a 3/9/2022 1
2 a 3/10/2022 0
2 b 3/11/2022 1
2 b 3/12/2022 0
;
I'm not understanding the condition to create a flag. It seems like you want to flag all zeros. When would you not flag a zero? Please explain further.
data intermediate;
set have;
by sub category;
if first.category then count=0;
count+1;
if value=1 then one_location=count;
run;
proc summary data=intermediate nway;
class sub category;
var one_location;
output out=_max_ max=max_one_location;
run;
data want;
merge intermediate _max_(drop=_:);
by sub category;
if value=0 and count<max_one_location then delete;
run;
You apparently want to delete "leading zeroes" for each sub/category combination. If so then:
data want (drop=_:);
set have;
by sub category;
if first.category=1 then _n_of_ones=value;
else _n_of_ones+value;
if _n_of_ones>0;
run;
This assumes that data are already sorted by sub/category (and presumably you require the data to be sorted by date within each sub/category). So assume data sorted by sub/category/date.
What you haven't explained is what you want to do if a given sub/category has no observations with value^=0. In such a case, the program above will drop all obs for that sub/category.
Additional comment:
You haven't answer @Kurt_Bremser 's question (or my comment above). I.e. what do you want to do if a given sub/category has only zeroes?
If you want to delete them all, then above program works. If you want to keep them all, then the following program will do:
data want (drop=_:);
set have (where=(value=1) in=firstpass) have (in=secondpass);
by sub category;
if first.category then _cutoff_date=date;
retain _cutoff_date;
if secondpass and date>=_cutoff_date;
run;
If there are no 1's, then the _cutoff_date is the first value=0 date. But if there are some 1's, then it is the first value=1 date.
It should be there, I just want to delete 0 before 1 per each subject per category.
i.e. if subject has 1 then only I want to delete 0 which are prior to 1, and 0 value at post 1 or only 0 shouldn't be affect.
You have not answered my question: what if a sub has only 0's and no 1 at all? Should all observations of such a group be deleted or kept?
So we need an additional check for the "non-1" cases.
data want;
set have;
by sub;
if _n_ = 1
then do;
declare hash w1 (dataset:"have (where=(value = 1))");
w1.definekey("sub");
w1.definedone(),
end;
retain keep_flag;
if first.sub
then if w1.check() ne 0
then keep_flag = 1;
else keep_flag = 0;
if value = 1 then keep_flag = 1;
if keep_flag;
drop keep_flag;
run;
data have; input Sub category $ date : $20. value; cards; 1 a 3/8/2022 0 1 a 3/9/2022 0 1 a 3/10/2022 1 1 c 3/11/2022 0 1 c 3/12/2022 1 2 a 3/8/2022 0 2 a 3/9/2022 1 2 a 3/10/2022 0 2 b 3/11/2022 1 2 b 3/12/2022 0 ; data want; set have; by Sub category; retain found; if first.category then found=0; if value then found=1; if found; drop found; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.