BookmarkSubscribeRSS Feed
KalaBhairava
Quartz | Level 8

I have data like this and I want to flag or delete below highlighted records, i.e. value is  0 before 1 per each category and each subject.

Sub  category   date           value
1          a            3/8/2022        0
1          a            3/9/2022        0
1          a            3/10/2022      1
1          c            3/11/2022      0
1          c            3/12/2022      1
2          a            3/8/2022        0
2          a           3/9/2022        1
2          a            3/10/2022      0
2          b            3/11/2022     1
2          b            3/12/2022      0

11 REPLIES 11
PaigeMiller
Diamond | Level 26

I'm not understanding the condition to create a flag. It seems like you want to flag all zeros. When would you not flag a zero? Please explain further.

--
Paige Miller
KalaBhairava
Quartz | Level 8
Can you check Now, updated my question.
PaigeMiller
Diamond | Level 26
data intermediate;
    set have;
    by sub category;
    if first.category then count=0;
    count+1;
    if value=1 then one_location=count;
run;
proc summary data=intermediate nway;
	class sub category;
	var one_location;
	output out=_max_ max=max_one_location;
run;
data want;
    merge intermediate _max_(drop=_:);
    by sub category;
    if value=0 and count<max_one_location then delete;
run;
--
Paige Miller
Astounding
PROC Star
When you come across a 0, you obviously don't know whether a 1 will appear later. So you have to go through the data twice. Here is one approach. The top loop goes through a sub/category combination and counts the 1's. Then the bottom loop examines the same observations and chooses which observations to output.

data want;
last1 = 0;
n_1 = 0;
Obsno = 0;
do until (last.category);
set have;
by sub category;
if value = 1 then last1 + 1;
end;
do until (last.category);
set have;
by sub category;
if value = 1 then n_1 + 1;
if value = 1 or n_1 < last_1 then output;
end;
run;

Working from a cell phone screen so I will need to look this over after I post it.
mkeintz
PROC Star

You apparently want to delete "leading zeroes" for each sub/category combination.  If so then:

 

data want (drop=_:);
  set have;
  by sub category;
  if first.category=1 then _n_of_ones=value;
  else _n_of_ones+value;
  if _n_of_ones>0;
run;

This assumes that data are already sorted by sub/category (and presumably you require the data to be sorted by date within each sub/category).  So assume data sorted by sub/category/date.

 

What you haven't explained is what you want to do if a given sub/category has no observations with value^=0.  In such a case, the program above will drop all obs for that sub/category.

 

 

Additional comment:  

You haven't answer @Kurt_Bremser 's question (or my comment above).  I.e. what do you want to do if a given sub/category has only zeroes?

 

If you want to delete them all, then above program works.  If you want to keep them all, then the following program will do:

 

data want (drop=_:);
  set have (where=(value=1) in=firstpass) have (in=secondpass);
  by sub category;
  if first.category then _cutoff_date=date;
  retain _cutoff_date;
  if secondpass and date>=_cutoff_date;
run;

If there are no 1's, then the _cutoff_date is the first value=0 date.  But if there are some 1's, then it is the first value=1 date.

 

 

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
KalaBhairava
Quartz | Level 8

It should be there, I just want to delete 0 before 1 per each subject per category.

i.e. if subject has 1 then only I want to delete 0 which are prior to 1, and 0 value at post 1 or only 0 shouldn't be affect.

Kurt_Bremser
Super User

So we need an additional check for the "non-1" cases.

data want;
set have;
by sub;
if _n_ = 1
then do;
  declare hash w1 (dataset:"have (where=(value = 1))");
  w1.definekey("sub");
  w1.definedone(),
end;
retain keep_flag;
if first.sub
then if w1.check() ne 0
  then keep_flag = 1;
  else keep_flag = 0;
if value = 1 then keep_flag = 1;
if keep_flag;
drop keep_flag;
run;
Ksharp
Super User
data have;
input Sub  category  $ date   : $20.        value;
cards;
1          a            3/8/2022        0
1          a            3/9/2022        0
1          a            3/10/2022      1
1          c            3/11/2022      0
1          c            3/12/2022      1
2          a            3/8/2022        0
2          a           3/9/2022        1
2          a            3/10/2022      0
2          b            3/11/2022     1
2          b            3/12/2022      0
; 

data want;
 set have;
 by  Sub  category;
 retain found;
 if first.category then found=0;
 if value then found=1;
 if  found;
 drop found;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 2998 views
  • 1 like
  • 6 in conversation