BookmarkSubscribeRSS Feed
ChiSAS25
Fluorite | Level 6

Can you help me program this algorithm that is giving me trouble?

I have a dataset (as the one I've attached) with patient ID and the difference in days between two dates, with a range of -21 to +7.

I need to identify patients who have two consecutive values of flag_KPI = 1 within the range [-21, 0).

If they have only one in the range [-21, 0), then I consider the case where they have two consecutive values of flag_KPI = 1, one in the window [-21, 0) and the other in the window [0, 7].

4 REPLIES 4
LinusH
Tourmaline | Level 20
Please attach data using a data step and datalines.
Also, try to visualize your requirement as an output data set.
What have yo tried so far?
Data never sleeps
ChiSAS25
Fluorite | Level 6

Thnks for your reply
 
Hereafter:
data kpi_data;
    infile datalines dsd truncover;
    input SubjectId :8.
          differences_in_days :8.
          flag_KPI :8.;
datalines;
1,-21,1
1,-14,1
1,-7,1
2,-16,1
2,0,1
2,6,1
3,-16,
3,-9,1
3,5,
4,1,
5,-21,1
5,0,1
6,0,
7,-11,
7,1,
7,6,
8,-21,1
8,-11,1
8,0,1
9,-14,
9,-9,
9,-6,
9,1,
9,2,
10,-19,
10,-14,
10,-8,1
10,-5,
10,0,
11,-10,
11,-3,1
11,0,1
11,7,
12,-21,1
12,-4,1
12,1,1
12,2,1
12,2,1
12,7,
13,-15,1
13,2,
14,-15,1
14,-8,
14,3,1
15,-8,
15,1,1
16,-18,1
16,-15,1
16,-11,
16,-8,1
16,-4,
16,-3,
16,-2,
16,-1,1
16,5,1
16,6,1
17,-20,1
17,-14,1
17,-7,1
17,0,1
17,7,1
18,-21,1
18,-20,1
18,-13,1
18,-4,1
18,0,1
18,4,1
18,7,1
13,-17,1
13,-1,1
14,-15,1
14,-8,1
14,0,1
14,7,1
15,-21,1
15,-13,1
15,0,1
16,-21,1
16,-14,1
16,-7,1
16,0,1
16,1,1
17,-16,
17,-9,
17,-2,
17,0,
17,3,
17,4,
18,-17,1
18,-5,1
18,-2,
18,2,
18,5,
19,-15,
19,-1,1
20,-6,
20,-2,
20,6,
21,-21,
21,-20,
21,-18,1
21,-7,1
22,-20,1
22,-15,1
22,-6,1
22,4,
23,-18,
23,-14,
23,-11,
23,1,1
24,-14,1
24,-10,1
24,5,1
25,-1,1
25,-18,
25,-3,
26,-15,
26,0,1
27,-3,1
27,-4,1
28,-14,1
28,3,1
29,-10,1
29,6,1
30,-14,1
30,-11,
30,-8,
30,-1,1
30,3,1
;
run;
 
I tried using retain by creating two counters (one for days < 0 and one for days ≥ 0), but I wasn’t successful.
 
Thanks for any help
 

 

 

LinusH
Tourmaline | Level 20

Ok, here's my shot at it:

data want;
	set work.kpi_data;
	retain prev_flag_kpi prev_diff;
	by SubjectId notsorted;
	if first.subjectId then do;
		prev_flag_kpi = .;
		prev_diff = .;
	end;
	else do;
		if prev_flag_kpi = 1 and flag_KPI then do;
			if  (-21 <= prev_diff <= 0 and 0 <= differences_in_days <=7) or
				(-21 <= differences_in_days <= 0 and 0 <= prev_diff <=7)
				then Flag_Consecutive = 1;
		end;
	end;
	if not last.SubjectId then do;
		prev_flag_kpi = flag_KPI;
		prev_diff = differences_in_days;
	end; 
run;

I'm noting the data is not sorted on SubjectId. As a data person I'm missing a clear unique key for each row, like a sequence no, or date.

 

Data never sleeps
ballardw
Super User

@ChiSAS25 wrote:

Can you help me program this algorithm that is giving me trouble?

I have a dataset (as the one I've attached) with patient ID and the difference in days between two dates, with a range of -21 to +7.

I need to identify patients who have two consecutive values of flag_KPI = 1 within the range [-21, 0).

If they have only one in the range [-21, 0), then I consider the case where they have two consecutive values of flag_KPI = 1, one in the window [-21, 0) and the other in the window [0, 7].


Somethings that will help get a good result:

What is the output data set supposed to look like after this process?

What is the result if a patient only has one observation in the data?

What is the result if no consecutive results meet either criteria?

How does the result indicate which criteria was met?

Do any of the patients involved have more than one sequence of values? If so, how might different starts and intervals modify the result?

 

A general note about about any data set involving date or time intervals. It is usually a very good idea to include variables that contain actual date, time or datetime values with the "base" value included. There are procedures that are intended to work with such for some of these interval types of questions.

A data set as shown without actual dates might get sorted and if you have the same patient with 2 or more sequences then it might be quite awkward getting observations back into the correct order.

sas-innovate-2026-white.png



April 27 – 30 | Gaylord Texan | Grapevine, Texas

Registration is open

Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!

Register now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 128 views
  • 0 likes
  • 3 in conversation