BookmarkSubscribeRSS Feed
AmitParmar
Obsidian | Level 7

Difference between end of current drug and start of next same drug is more than 60,consider them two different usages otherwise club them as single usages with minimum start and maximum end.
INPUT DATA
DRUG   S.DATE           E.DATE
A           24-Sep-19      03-Dec-19
A         17-Jan-20            21-Feb-20
A       21 -MAY-20           20-Jun-20
B      30-Jun-20           29-Aug-20
B       13-OCT-20           12-NOV-20


OUPUT DATA
DRUG     S.DATE      E.DATE
A         24-Sep-19            21-Feb-20
A       21-May-20            21-Jun-20
B     30-Jul-20               12-NOV-20

12 REPLIES 12
ChrisNZ
Tourmaline | Level 20

Something like this should work. Untested as you did not provide the data as a program that we could use.

data WANT;
  set HAVE;
  by DRUG S_DATE;
  retain S;
  L=lag(E_DATE);
  if first.DRUG then S=S_DATE;
  if S_DATE - L > 60 then do;
    E=L;
    output;
    S=S_DATE;
  end;
  if last.DRUG then do;
    E=L;
    output;
  end;
  keep DRUG S E;
run;
AmitParmar
Obsidian | Level 7

data a;
length sdate edate$15.;
input DRUG$       SDATE $        EDATE $;
cards;
A                     24-Sep-19                  03-Dec-19
A                   17-Jan-20                     21-Feb-20
A                      21-MAY-20              20-Jun-20
B                30-Jun-20                    29-Aug-20
B                    13-OCT-20                  12-NOV-20
;
run;

 

Here is the data and the solution provided does not provide the desired result it gives the following result

Obs DRUG S E12
A17-Jan-2020-Jun-20
B13-OCT-2012-NOV-20
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @AmitParmar 

 

The general idea would be to look ahead using a second set statement with firstobs=2. 

Save the start date and compare the end date to the start date in next record.

If the difference > 60 or next record is a new drug then set start = saved start date and output.

Otherwise continue.

 

If you provide working code to create the input data set I will make a working example for you. Your real data must contain other variables beside the drug and dates, and these will be taken from the last record unless something is done to save these too.   

AmitParmar
Obsidian | Level 7

Specify first consecutive pair where the gap between two visit date is >50.
ID VISIT_DATE
A 24-Jul-15
A 13-Aug-15
A 12-Sep-15
A 16-Dec-15
A 15-Mar-16
B 24-Jul-15
B 22-Oct-15
B 25-Jan-16
B 04-Feb-16
B 15-Mar-16

 

I guess we cannot use INTCK function so how gonna we calculate the time period and happy the condition.

andreas_lds
Jade | Level 19

I have merged, please don't double-post questions.

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @AmitParmar 

I have tried to make a different code, you could use for experiments. It is split up in two parts:

The first data step performs a full analysis, and the next extracts relevant records based on the analysis. 

 


data have;
	informat Visit_Date anydtdte.;
	format Visit_Date date.;
	input ID$ Visit_Date;
	cards;
A 24-Jul-15
A 13-Aug-15
A 12-Sep-15
A 16-Dec-15
A 15-Mar-16
B 24-Jul-15
B 22-Oct-15
B 25-Jan-16
B 04-Feb-16
B 15-Mar-16
;
run;

* Sort data to make sure input is ordered for further processing;
proc sort data=have;
	by ID Visit_Date;
run;

* Perform af full analysis;
* Using LAG function to get previous date and Look-ahead SET statement to get nex date;
data temp (drop=Next_ID);
	set have end=eof;
	by ID;
	format Prev_Visit_Date date.;

	* Initiate break point variables;
	Retain Break_Before Break_After;
	if first.ID then call missing(Break_Before, Break_After);

	* Calculate difference to previous record within same ID;
	Prev_Visit_Date = lag(Visit_Date);
	if not first.ID then  Diff_To_Prev = (Visit_Date - Prev_Visit_Date);
	else Diff_To_Prev = .;

	* Get values from next record; 
	if not eof then set have ( firstobs=2 keep=Visit_Date ID rename=(ID = Next_ID Visit_Date=Next_Visit_Date));

	* Calculate difference to next record within same ID;
	if not last.ID then Diff_To_Next = (Next_Visit_Date - Visit_Date);

	* Set Break Points;
	if Diff_To_Prev > 50 then Break_Before + 1;
	else Break_Before = .;
	if Diff_To_Next > 50 then Break_After + 1;	
	else Break_After = .;
run;

* Get wanted data;
* This gives the first pair;
data want;
	set temp;
 	keep ID Visit_Date;
	if Break_Before = 1 or Break_After = 1;
run;

* Get wanted data;
* This gives all pairs, doubling records being in two pairs;
data want2;
	set temp;
	keep ID Visit_Date;
	if Break_Before ne . then output; 
	if Break_After ne . then output;
run;
AmitParmar
Obsidian | Level 7

Thank you so much I appreciate your help, sir...

AmitParmar
Obsidian | Level 7

Hi Erik,

I have provided the dataset above please check and help me with the desired output.

 

Thanks

ErikLund_Jensen
Rhodochrosite | Level 12

Hi @AmitParmar 

 

Please check

data a;
length sdate edate$15.;
input DRUG$       SDATE $        EDATE $;
cards;
A                     24-Sep-19                  03-Dec-19
A                   17-Jan-20                     21-Feb-20
A                      21-MAY-20              20-Jun-20
B                30-Jun-20                    29-Aug-20
B                    13-OCT-20                  12-NOV-20
;
run;

* Convert the dates to SAS dates;
* It must be "real" SAS dates to perform date calculations;
* Internal SAS date value is the number of days since 01jan1960;
data have (drop = sdate edate);
	set a;
	format Date_Start Date_End date.;
	Date_Start = input(sdate,anydtdte.);
	Date_End = input(edate,anydtdte.);
run;

* Sort data;
* Date_End included in sort to get highest Date_End
    in case of more than one record with same drug/Date_Start; 
proc sort data=have;
	by DRUG Date_Start Date_End;
run;

data want (drop=Date_Start_current Date_Start_Next Drug_Next);
	set have (rename=(Date_Start=Date_Start_current)) end=eof;
	by DRUG;
	retain Date_Start;
	format Date_Start date.;

	if not eof then do;
		* Get values from next record;
		set have (firstobs=2 keep=Date_Start DRUG rename=(Date_Start=Date_Start_Next DRUG=Drug_Next));
	end;

	if first.DRUG then Date_Start = Date_Start_Current;
	if Drug_Next ne DRUG or eof or (Date_Start_Next - Date_End) > 50 then do;
		output;
		Date_Start = Date_Start_Current;
		Date_Start_Next = .;
	end;
	else do;
		if Date_Start_Next = . then Date_Start_Next = Date_Start_Current;
	end;
run;

if this helps:

 

AmitParmar
Obsidian | Level 7

Thank you so much for this explanation it's crystal clear.

AmitParmar
Obsidian | Level 7

Hi Sir, Can you please look this up for me

 

Specify first consecutive pair where the gap between two visit date is >50.
ID VISIT_DATE
A 24-Jul-15
A 13-Aug-15
A 12-Sep-15
A 16-Dec-15
A 15-Mar-16
B 24-Jul-15
B 22-Oct-15
B 25-Jan-16
B 04-Feb-16
B 15-Mar-16

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 2711 views
  • 6 likes
  • 5 in conversation