Dear,
In my data I need to calculate number of days between visit dates. The "date" variable in the below data set contains numeric dates. With my below do loop code i am able to calculate difference two adjacent dates. But I need to calculate difference between all.
eg: In first row there are 5 date values. My code calculating difference between first and second, third and second and so on. But I also need to calculate third and first, fourth and first, fourth and second and so on. How to code this process to work in the below code
Please help. Thank you
data one;
input a $1-26 date $27-80;
datalines;
PR PR PR PR CR CR 21109 21171 21227 21290 21350 21433
PR CR CR CR CR CR CR CR CR 20803 20853 20915 20978 21041 21105 21188 21279 21357
CR CR CR CR CR 21119 21187 21243 21312 21370
CR CR PR 21272 21329 21384
;
data two;
set one;
cw = countw(a);
array diff{9} ;
do i= 1 to cw;
diff(i)=scan(date,i+1)-scan(date,i);
end;
run;
Put your data in long form first. Then perform a cartesian self join with SQL:
data one;
input a $1-26 date $27-80;
datalines;
PR PR PR PR CR CR 21109 21171 21227 21290 21350 21433
PR CR CR CR CR CR CR CR CR 20803 20853 20915 20978 21041 21105 21188 21279 21357
CR CR CR CR CR 21119 21187 21243 21312 21370
CR CR PR 21272 21329 21384
;
data two;
set one;
line = _n_;
length r $8;
do i = 1 to countw(a);
r = scan(a, i);
dt = input(scan(date, i), best.);
output;
end;
format dt date9.;
keep line r dt i;
run;
proc sql;
select
a.line,
a.r as a_1,
a.dt as dt_1,
b.r as a_2,
b.dt as dt_2,
intck('day', a.dt, b.dt)
from two as a inner join two as b on a.line=b.line and a.i < b.i
order by line, a.i, b.i;
quit;
Dear,
Thank you very much for the code. The worked for me. But i need to add two more logic in my code to make efficient when new data added.
I need help in proc sql code below for id=7 and 8. These subjects have two or more 'NE' between two 'CR' and two 'PR'. I am attaching a document showing how the output look like in after proc sql step. For highlighted in yellow OBS, I need to apply "intck('day', a.rdate,a.dt)" if two or more "NE" between two 'CR' (for id=7). I need to apply "intck('day', a.dt, b.dt)" if two or 'NE' between two PR (for id=8). Please help. Thank you very much.
output needed:
for OBS=70 the 'nod' variable value should be 665.
for obs =73 the 'nod' variable value should be as shown in output(193)
data one;
input id rd $10. a $14-39 date $41-93;
datalines;
1 2016-01-01 PR CR CR CR CR CR CR CR CR 20803 20853 20915 20978 21041 21105 21188 21279 21357
2 2016-01-01 PR PR PR PR CR CR 21109 21171 21227 21290 21350 21433
3 2016-01-01 CR CR CR CR CR 21119 21187 21243 21312 21370
4 2016-01-01 CR CR PR 21272 21329 21384
5 2016-01-01 CR NE 21272 21329
6 2016-01-01 CR NE CR 21272 21329 21384
7 2016-01-01 CR NE NE CR 21119 21187 21243 21312
8 2016-01-01 PR NE NE PR 21119 21187 21243 21312
;
data two;
set one;
line = _n_;
length r $8;
do i = 1 to countw(a);
r = scan(a, i);
dt = input(scan(date, i), best.);
rdate=input(rd,yymmdd10.);
output;
end;
format dt rdate date9.;
keep id line r dt i rdate;
run;
proc sql;
create table four as
select
a.id, a.rdate,
a.r as a_1,
a.dt as dt_1,
b.r as a_2,
b.dt as dt_2,
case
when a_1 in ('CR' 'PR') and a_2 in ('CR' 'PR')then
intck('day', a.dt, b.dt)
when a_1 in ('CR' 'PR') and a_2 not in ('CR' 'PR') then intck('day', a.rdate,a.dt) end as nod
from two as a inner join two as b on a.id=b.id and a.i < b.i
where calculated nod ^= .
order by id, a.i, b.i;
quit;
I don't see how to express te condition " if two or more "NE" between two 'CR' " in SQL. It is not clear which comparisons will be affected; those between NEs, or those between NEs and CRs ?
Thank you very much for reply. I will try to explain what i need in my pgm. If not clear sorry. I will try if i can write the logic in data step.
1. In my pgm, I calculate the difference between any two dates if both responses are in 'CR' 'PR'.
eg. if the first response is either 'CR' or 'PR' and second(NEXT) response also has to be 'CR' and 'PR'.
2. if the first response is either 'CR' or 'PR' and second(NEXT) response not in 'CR' or 'PR'. then i calculate difference between date of first response date and randamization date(rdate in my pgm).
The Proc sql pgm works fine for this.
But for id=7, where responses are CR NE NE CR , the sql code calculates as below.
1.CR(first) -NE(first) (date of CR-date of rdate)
2.CR-NE(second one) (date of CR-date of rdate)
3. CR(first)-CR(second) (date of CR-date of CR)
I need help in step 3 where it should be calculating (date of CR(first) -date of rdate) as there are 2 or more NE between two CR
With up to 9 dates, that would require 36 differences. What do you plan to name those variables?
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.