BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Joshua017
Fluorite | Level 6

I have about 200,000 records of cases and I want to identify the true consecutive Negative result (can't have Positive result after it). 

Here's some examples:

A: Keep only the result after first two positive (Obs 3-5)

B: Obs 6-7 need to be drop because these is a Positive after these Negative.

C: Need to be drop

😧 Only 15-17 can be keep since 13 have Positive after them as well.

 

The basic concept is to keep the Negative results after the last positive for each case (group?) and give them a seq# if possible.

Thank you so much for your help. I tried to add seq# but many times it starting with 2 or even 4 when I use

if first.Case then Seq+1;

ObsCaseTest DateTest ResultRecord WantedSeq#
1A01/01/2020Missing  
2A01/02/2021Positive  
3A01/03/2021NegativeV1
4A01/04/2021NegativeV2
5A01/05/2021NegativeV3
6B01/01/2020Negative  
7B01/02/2021Negative  
8B01/03/2021Positive  
9C01/02/2021Missing  
10D01/01/2020Positive  
11D01/02/2021Positive  
12D01/03/2021Missing  
13D01/04/2021Negative  
14D01/01/2021Positive  
15D01/02/2022NegativeV1
16D01/03/2022NegativeV2
17D01/04/2022NegativeV3
1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

Welcome to the SAS Communities. 

You need to read the source table twice because you can only figure out which rows to select once you've processed the last row within a group.

Below code should do what you're after.

data want;
  set have;
  by notsorted case;
  retain start_row n_row;
  if first.case then call missing(start_row, n_row);
  if Test_Result ne 'Negative' then call missing(start_row, n_row);
  else if missing(start_row) then start_row=_n_;
  if not missing(start_row) then n_row+1;
  if last.case and not missing(start_row) then 
    do;
      stop_row=start_row+n_row-1;
      do i=start_row to stop_row;
        set have point=i;
        output;
      end;
    end;
/*  drop start_row stop_row n_row i;*/
run;

proc print data=want;
run;

View solution in original post

4 REPLIES 4
Joshua017
Fluorite | Level 6

I got something like below but it's not working expected.

 

data Test_Negative;

   set Tests;

   by Case;

   if Test_Result^="Negative" then T_Seq=0;

   else if Test_Result="Negative" then T_Seq+1;

   if first.Case & Test_Result^="Negative" then T_Seq=0;

   else if first.Case & Test_Result="Negative" then T_Seq+1;

run;

Patrick
Opal | Level 21

Welcome to the SAS Communities. 

You need to read the source table twice because you can only figure out which rows to select once you've processed the last row within a group.

Below code should do what you're after.

data want;
  set have;
  by notsorted case;
  retain start_row n_row;
  if first.case then call missing(start_row, n_row);
  if Test_Result ne 'Negative' then call missing(start_row, n_row);
  else if missing(start_row) then start_row=_n_;
  if not missing(start_row) then n_row+1;
  if last.case and not missing(start_row) then 
    do;
      stop_row=start_row+n_row-1;
      do i=start_row to stop_row;
        set have point=i;
        output;
      end;
    end;
/*  drop start_row stop_row n_row i;*/
run;

proc print data=want;
run;
PGStats
Opal | Level 21

A double do until() comes in handy for this task:

 

data have;
input Obs Case :$1. Date :mmddyy10. Result :$12.;
format date yymmdd10.;
datalines;
1   A   01/01/2020  Missing      
2   A   01/02/2021  Positive         
3   A   01/03/2021  Negative    V   1
4   A   01/04/2021  Negative    V   2
5   A   01/05/2021  Negative    V   3
6   B   01/01/2020  Negative         
7   B   01/02/2021  Negative         
8   B   01/03/2021  Positive         
9   C   01/02/2021  Missing      
10  D   01/01/2020  Positive         
11  D   01/02/2021  Positive         
12  D   01/03/2021  Missing      
13  D   01/04/2021  Negative         
14  D   01/01/2021  Positive         
15  D   01/02/2022  Negative    V   1
16  D   01/03/2022  Negative    V   2
17  D   01/04/2022  Negative    V   3
;

data want;
do n = 1 by 1 until (last.case);
	set have; by case;
	if result = "Negative" then do;
		if missing(firstPos) then firstPos = n;
		lastPos = n;
		end;
	else call missing(firstPos);
	end;
seq = 0;
do n = 1 by 1 until (last.case);
	set have; by case;
	if firstPos > 0 then
		if lastPos > firstPos + 1 and n >= firstPos then do;
			seq + 1;
			output;
			end;
	end;
drop n firstPos lastPos;
run;
PG
Ksharp
Super User
data have;
input Obs Case :$1. Date :mmddyy10. Result :$12.;
format date yymmdd10.;
datalines;
1   A   01/01/2020  Missing      
2   A   01/02/2021  Positive         
3   A   01/03/2021  Negative    V   1
4   A   01/04/2021  Negative    V   2
5   A   01/05/2021  Negative    V   3
6   B   01/01/2020  Negative         
7   B   01/02/2021  Negative         
8   B   01/03/2021  Positive         
9   C   01/02/2021  Missing      
10  D   01/01/2020  Positive         
11  D   01/02/2021  Positive         
12  D   01/03/2021  Missing      
13  D   01/04/2021  Negative         
14  D   01/01/2021  Positive         
15  D   01/02/2022  Negative    V   1
16  D   01/03/2022  Negative    V   2
17  D   01/04/2022  Negative    V   3
;
data want;
do i=1 by 1 until(last.case);
 set have;
 by case ;
 if result='Positive' then  _i=i;
end;

do i=1 by 1 until(last.case);
 set have;
 by case result notsorted;
 if first.result then count=.;
 count+1;
 if i le _i or result ne 'Negative' then call missing(count);
 output;
end;
drop i _i;
run;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 1455 views
  • 3 likes
  • 4 in conversation