BookmarkSubscribeRSS Feed
knveraraju91
Barite | Level 11

Dear,

 

In my data I need to calculate number of days between visit dates.  The "date" variable in the below data set contains numeric dates. With my below do loop code i am able to calculate difference two adjacent dates. But I need to calculate difference between all.

 

eg: In first row there are 5 date values. My code calculating difference between  first and second, third and second and so on. But I also need to calculate third and first, fourth and first, fourth and second and so on. How to code this process to work in the below code

Please help. Thank you

data one;
input a $1-26 date $27-80;
datalines;
PR PR PR PR CR CR          21109 21171 21227 21290 21350 21433
PR CR CR CR CR CR CR CR CR 20803 20853 20915 20978 21041 21105 21188 21279 21357
CR CR CR CR CR             21119 21187 21243 21312 21370
CR CR PR                   21272 21329 21384
;

data two;
set one;
cw = countw(a);
array diff{9} ;
do i= 1 to cw;
 diff(i)=scan(date,i+1)-scan(date,i);
end;
run;
    
5 REPLIES 5
PGStats
Opal | Level 21

Put your data in long form first. Then perform a cartesian self join with SQL:

 

data one;
input a $1-26 date $27-80;
datalines;
PR PR PR PR CR CR          21109 21171 21227 21290 21350 21433
PR CR CR CR CR CR CR CR CR 20803 20853 20915 20978 21041 21105 21188 21279 21357
CR CR CR CR CR             21119 21187 21243 21312 21370
CR CR PR                   21272 21329 21384
;

data two;
set one;
line = _n_;
length r $8;
do i = 1 to countw(a);
    r = scan(a, i);
    dt = input(scan(date, i), best.);
    output;
    end;
format dt date9.;
keep line r dt i;
run;

proc sql;
select 
    a.line, 
    a.r as a_1, 
    a.dt as dt_1, 
    b.r as a_2, 
    b.dt as dt_2,
    intck('day', a.dt, b.dt)
from two as a inner join two as b on a.line=b.line and a.i < b.i
order by line, a.i, b.i;
quit;

 

 

PG
knveraraju91
Barite | Level 11

Dear,

 

Thank you very much for the code.  The worked for me. But i need to add two more logic in my code to make efficient when new data added.

I need help in proc sql code below for id=7 and 8. These subjects have two or more 'NE' between two 'CR' and two 'PR'.  I am attaching a document showing how the output look like in after proc sql step. For highlighted in yellow OBS, I need to apply "intck('day', a.rdate,a.dt)"  if two or more "NE" between two 'CR' (for id=7).  I need to apply "intck('day', a.dt, b.dt)" if two or 'NE' between two PR (for id=8). Please help. Thank you very much.

 

output needed:

for OBS=70 the 'nod' variable value  should be 665.

for obs =73 the 'nod' variable value should be as shown in output(193)

 

  

 

 

 

data one;
input id rd $10. a $14-39 date $41-93;
datalines;
1 2016-01-01 PR CR CR CR CR CR CR CR CR 20803 20853 20915 20978 21041 21105 21188 21279 21357
2 2016-01-01 PR PR PR PR CR CR          21109 21171 21227 21290 21350 21433
3 2016-01-01 CR CR CR CR CR             21119 21187 21243 21312 21370
4 2016-01-01 CR CR PR                   21272 21329 21384
5 2016-01-01 CR NE                      21272 21329
6 2016-01-01 CR NE CR                   21272 21329 21384
7 2016-01-01 CR NE NE CR                21119 21187 21243 21312
8 2016-01-01 PR NE NE PR                21119 21187 21243 21312
;

data two;
set one;
line = _n_;
length r $8;
do i = 1 to countw(a);
    r = scan(a, i);
    dt = input(scan(date, i), best.);
	rdate=input(rd,yymmdd10.);
    output;
    end;
	
format dt rdate date9.;
keep id line r dt i rdate;
run;

proc sql;
create table four as
select 
    a.id, a.rdate,
    a.r as a_1, 
    a.dt as dt_1, 
    b.r as a_2, 
    b.dt as dt_2,
	case 
     when a_1 in ('CR' 'PR') and a_2 in ('CR' 'PR')then 
    intck('day', a.dt, b.dt) 
    when a_1 in ('CR' 'PR') and a_2 not in ('CR' 'PR') then intck('day', a.rdate,a.dt) end as nod
from two as a inner join two as b on a.id=b.id and a.i < b.i
where calculated nod ^= .
order by id, a.i, b.i;
quit;
PGStats
Opal | Level 21

I don't see how to express te condition " if two or more "NE" between two 'CR' " in SQL. It is not clear which comparisons will be affected; those between NEs, or those between NEs and CRs ?

PG
knveraraju91
Barite | Level 11

Thank you very much for reply. I will try to explain what i need in my pgm. If not clear sorry. I will try if i can write the logic in data step. 

 

1. In my pgm, I calculate the difference between any two dates if both responses are in 'CR' 'PR'.

 

eg. if the first response is either 'CR' or 'PR'  and second(NEXT) response also has to be 'CR' and 'PR'.

 

2. if the first response is either 'CR' or 'PR'  and second(NEXT) response not in 'CR' or 'PR'. then i calculate difference between date of first response date and randamization date(rdate in my pgm).

 

The Proc sql pgm works fine for this. 

 

But for id=7, where responses are  CR NE NE CR ,  the sql code calculates as below.

 

1.CR(first) -NE(first)  (date of CR-date of rdate)

2.CR-NE(second one)   (date of CR-date of rdate)

3. CR(first)-CR(second) (date of CR-date of CR)  

 

I need help in  step 3 where it should be calculating    (date of CR(first) -date of rdate) as there are 2 or more NE  between two CR

 

 

 

 

 

 

 

 

 

 

Astounding
PROC Star

With up to 9 dates, that would require 36 differences.  What do you plan to name those variables?

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1648 views
  • 3 likes
  • 3 in conversation