BookmarkSubscribeRSS Feed
KalaBhairava
Quartz | Level 8

I have data like this and I want to flag or delete below highlighted records, i.e. value is  0 before 1 per each category and each subject.

Sub  category   date           value
1          a            3/8/2022        0
1          a            3/9/2022        0
1          a            3/10/2022      1
1          c            3/11/2022      0
1          c            3/12/2022      1
2          a            3/8/2022        0
2          a           3/9/2022        1
2          a            3/10/2022      0
2          b            3/11/2022     1
2          b            3/12/2022      0

11 REPLIES 11
PaigeMiller
Diamond | Level 26

I'm not understanding the condition to create a flag. It seems like you want to flag all zeros. When would you not flag a zero? Please explain further.

--
Paige Miller
KalaBhairava
Quartz | Level 8
Can you check Now, updated my question.
PaigeMiller
Diamond | Level 26
data intermediate;
    set have;
    by sub category;
    if first.category then count=0;
    count+1;
    if value=1 then one_location=count;
run;
proc summary data=intermediate nway;
	class sub category;
	var one_location;
	output out=_max_ max=max_one_location;
run;
data want;
    merge intermediate _max_(drop=_:);
    by sub category;
    if value=0 and count<max_one_location then delete;
run;
--
Paige Miller
Astounding
PROC Star
When you come across a 0, you obviously don't know whether a 1 will appear later. So you have to go through the data twice. Here is one approach. The top loop goes through a sub/category combination and counts the 1's. Then the bottom loop examines the same observations and chooses which observations to output.

data want;
last1 = 0;
n_1 = 0;
Obsno = 0;
do until (last.category);
set have;
by sub category;
if value = 1 then last1 + 1;
end;
do until (last.category);
set have;
by sub category;
if value = 1 then n_1 + 1;
if value = 1 or n_1 < last_1 then output;
end;
run;

Working from a cell phone screen so I will need to look this over after I post it.
mkeintz
PROC Star

You apparently want to delete "leading zeroes" for each sub/category combination.  If so then:

 

data want (drop=_:);
  set have;
  by sub category;
  if first.category=1 then _n_of_ones=value;
  else _n_of_ones+value;
  if _n_of_ones>0;
run;

This assumes that data are already sorted by sub/category (and presumably you require the data to be sorted by date within each sub/category).  So assume data sorted by sub/category/date.

 

What you haven't explained is what you want to do if a given sub/category has no observations with value^=0.  In such a case, the program above will drop all obs for that sub/category.

 

 

Additional comment:  

You haven't answer @Kurt_Bremser 's question (or my comment above).  I.e. what do you want to do if a given sub/category has only zeroes?

 

If you want to delete them all, then above program works.  If you want to keep them all, then the following program will do:

 

data want (drop=_:);
  set have (where=(value=1) in=firstpass) have (in=secondpass);
  by sub category;
  if first.category then _cutoff_date=date;
  retain _cutoff_date;
  if secondpass and date>=_cutoff_date;
run;

If there are no 1's, then the _cutoff_date is the first value=0 date.  But if there are some 1's, then it is the first value=1 date.

 

 

 

 

 

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
KalaBhairava
Quartz | Level 8

It should be there, I just want to delete 0 before 1 per each subject per category.

i.e. if subject has 1 then only I want to delete 0 which are prior to 1, and 0 value at post 1 or only 0 shouldn't be affect.

Kurt_Bremser
Super User

So we need an additional check for the "non-1" cases.

data want;
set have;
by sub;
if _n_ = 1
then do;
  declare hash w1 (dataset:"have (where=(value = 1))");
  w1.definekey("sub");
  w1.definedone(),
end;
retain keep_flag;
if first.sub
then if w1.check() ne 0
  then keep_flag = 1;
  else keep_flag = 0;
if value = 1 then keep_flag = 1;
if keep_flag;
drop keep_flag;
run;
Ksharp
Super User
data have;
input Sub  category  $ date   : $20.        value;
cards;
1          a            3/8/2022        0
1          a            3/9/2022        0
1          a            3/10/2022      1
1          c            3/11/2022      0
1          c            3/12/2022      1
2          a            3/8/2022        0
2          a           3/9/2022        1
2          a            3/10/2022      0
2          b            3/11/2022     1
2          b            3/12/2022      0
; 

data want;
 set have;
 by  Sub  category;
 retain found;
 if first.category then found=0;
 if value then found=1;
 if  found;
 drop found;
run;
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 11 replies
  • 3063 views
  • 1 like
  • 6 in conversation