huhuhu
Obsidian | Level 7

Hello,

 

I'm stuck trying to calculate the difference between two dates under a specific condition. My data looks like the table below (Diff and Count are the variables I want to create):

ID  Date      FirstPositive  Diff  Count
A   1/1/2020  .              .     .
A   1/3/2020  1              .     1
A   1/4/2020  .              1     2
A   1/5/2020  .              1     3
B   1/4/2020  1              .     1
B   1/5/2020  .              1     2

I want to calculate the difference between consecutive dates within each ID, but only starting from the row where FirstPositive = 1. I would also like a Count variable that numbers the rows within each ID, starting from the row where FirstPositive = 1.

 

Any advice would be appreciated!

Thank you!

1 ACCEPTED SOLUTION

ballardw
Super User

See if this gets you started.

data have;
   input ID $  Date :mmddyy10.  FirstPositive;
   format date mmddyy10.;
datalines;
A      1/1/2020        .    
A      1/3/2020      1     
A      1/4/2020       .    
A      1/5/2020       .    
B      1/4/2020      1     
B      1/5/2020       .    
;

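data want;
   set have;
   by id;
   retain count  posflag  ;
   difdate= dif(date);                            /* run DIF on every row so its queue stays current */
   if first.id then call missing(count,posflag);  /* reset at the start of each ID */
   if posflag then diff=difdate;                  /* only rows after the first positive get a diff */
   if firstPositive=1 then posflag=1;             /* set after diff, so this row's diff stays missing */
   if posflag then count+1;                       /* count from the first positive row onward */
   drop posflag difdate;
run;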

You may have to sort your data set by ID and Date before running the Want data step.
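
A minimal sketch of that sort, assuming the input data set is named have as in the example above:

```sas
/* Sort by ID, then by Date within each ID, so that BY-group
   processing and DIF() see the rows in chronological order. */
proc sort data=have;
   by ID Date;
run;
```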

If you have not seen these statements and functions before:

RETAIN keeps variable values from one iteration of the data step to the next.

DIF is a function that returns the current value of a variable minus the value it had the last time DIF was executed (the previous observation, if DIF runs on every row).
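
To illustrate why DIF is usually called on every row, here is a small sketch that assumes the have data set above; the names dif_demo, d_every and d_cond are made up for the illustration. A DIF call that is skipped on some rows subtracts the value from the last row on which it actually ran, not from the previous row.

```sas
data dif_demo;
   set have;
   by id;
   /* Executed on every row: subtracts the previous row's date. */
   d_every = dif(date);
   /* Executed only on some rows: subtracts the date from the last
      row on which this particular call ran, which may be several
      rows back. */
   if FirstPositive ne 1 then d_cond = dif(date);
   /* DIF knows nothing about BY groups, so blank out the first
      row of each ID to avoid a difference that crosses IDs. */
   if first.id then d_every = .;
run;
```

With the sample data, the third row (A, 1/4/2020) should get d_every = 1 but d_cond = 3, because the conditional call last ran on the 1/1/2020 row.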

When you use a BY statement, SAS creates automatic variables First.variable and Last.variable that indicate whether the current observation is the first or last within that level of a BY variable.
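
The automatic variables are not written to the output data set, so one way to look at them is to copy them into ordinary variables. This is only a sketch, again assuming the have data set; flags, is_first and is_last are illustrative names.

```sas
data flags;
   set have;
   by id;
   is_first = first.id;   /* 1 on the first row of each ID, otherwise 0 */
   is_last  = last.id;    /* 1 on the last row of each ID, otherwise 0 */
run;

proc print data=flags noobs;
run;
```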

CALL MISSING is a routine that sets one or more variables to missing values.

Timing is the main part of this problem: what matters is when the diff value is assigned relative to when the flag is set and the count is incremented.
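
As a hedged illustration of that timing point, the sketch below is the same data step with two statements swapped (want_wrong_order is a made-up name). Because the flag is set before diff is assigned, the FirstPositive row itself receives a diff, which is not what was asked for.

```sas
data want_wrong_order;
   set have;
   by id;
   retain count posflag;
   difdate = dif(date);
   if first.id then call missing(count, posflag);
   if firstPositive = 1 then posflag = 1;  /* flag set BEFORE diff is assigned */
   if posflag then diff = difdate;         /* so the FirstPositive row gets a diff too */
   if posflag then count + 1;
   drop posflag difdate;
run;
```

With the sample data, the A row for 1/3/2020 would get diff = 2 instead of missing, and the first B row would get a difference that crosses over from the last A row.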

6 REPLIES
Jagadishkatam
Amethyst | Level 16

Please try the code below.

 

data have;
input ID$ Date:mmddyy10. FirstPositive;
cards;
A 1/1/2020 .         
A 1/3/2020 1          
A 1/4/2020 .         
A 1/5/2020 .         
B 1/4/2020 1          
B 1/5/2020 .        
;
 
data want;
set have;
by id notsorted;
retain FirstPositive2;
prevdate=lag(date); /* call LAG on every row so its queue always holds the previous row's date */
if first.id then do;FirstPositive2=.;count=.;end;
if FirstPositive ne . then FirstPositive2=FirstPositive;
count+FirstPositive2;
if count>1 then diff=Date-prevdate; /* rows after the first positive row only */
drop prevdate;
run;
Thanks,
Jag
ballardw
Super User

(Same post as the accepted solution above.)

huhuhu
Obsidian | Level 7
Thank you so much. That's exactly what I'm looking for. I also appreciate your detailed explanation of the functions used here. So helpful!
smantha
Lapis Lazuli | Level 10
data sample(Drop = diff count);
informat ID $3. Date mmddyy10. FirstPositive Diff Count 3.;
input ID $ Date FirstPositive Diff Count;
format date mmddyy10.;
datalines;
A 1/1/2020 . . .
A 1/3/2020 1 . 1 
A 1/4/2020 . 1 2
A 1/5/2020 . 1 3
B 1/4/2020 1 . 1
B 1/5/2020 . 1 2
;
proc sort; by ID Date; run;

data sample(drop=prv_date);
set sample;
by id;
retain count Diff prv_date 0;
if first.id then do;
count=.; Diff=.;
prv_date = .;
end;

if FirstPositive = 1 then do;
count=1;
prv_date = Date;
end;
else if count > 0 then do;
count = count+1;
Diff = Date-prv_date;
prv_date=date;
end;
run;
novinosrin
Tourmaline | Level 20

Hi @huhuhu, your case presents a nice scenario for yet another "Dorfmanism", aka automatic-variable usage:

data have;
   input ID $  Date :mmddyy10.  FirstPositive;
   format date mmddyy10.;
datalines;
A      1/1/2020        .    
A      1/3/2020      1     
A      1/4/2020       .    
A      1/5/2020       .    
B      1/4/2020      1     
B      1/5/2020       .    
;

data want;
 do until(last.id);
  set have;
  by id;
  diff=dif(date);
  if _n_ then diff=.;
  if FirstPositive then _n_=0;
  if _n_=0  then  count=sum(count,1);
  output;
 end;
run;
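
For readers who find the reuse of _n_ as a flag hard to follow, the same DoW-loop idea can be sketched with an explicit flag. This is a rewording under assumptions, not the poster's code: found is a hypothetical variable name and want_flag a hypothetical output name. Neither found nor count is retained, so both are reset to missing each time the data step starts a new iteration, which here means each new ID.

```sas
data want_flag;
   do until (last.id);
      set have;
      by id;
      diff = dif(date);                     /* DIF runs on every row */
      if not found then diff = .;           /* no diff until after the first positive row */
      if FirstPositive = 1 then found = 1;  /* found: hypothetical flag variable */
      if found then count = sum(count, 1);  /* count from the first positive row onward */
      output;
   end;
   drop found;
run;
```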
r_behata
Barite | Level 11
data have;
input ID  $ Date : mmddyy10.  FirstPositive;
format date mmddyy10.;
cards;
A 1/1/2020 .
A 1/3/2020 1
A 1/4/2020 .
A 1/5/2020 .
B 1/4/2020 1
B 1/5/2020 .
;
run;



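data want;
	Set have;
	by ID FirstPositive notsorted;

	retain FirstPositive_;

	/* reset the count and the retained flag at the start of each ID */
	if first.id then call missing(count, FirstPositive_);

	if FirstPositive=1 then FirstPositive_=FirstPositive;

	if FirstPositive_=1  then count+ (1 * FirstPositive_);	else count =.;

	/* IFN evaluates all of its arguments, so dif(date) runs on every row */
	diff=ifn(first.ID=0 and count > FirstPositive_ ,dif(date),.);

	drop FirstPositive_ ;
run;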
