Dear All,
I have the data file below. I want to use the counting process approach to analyze the effect of status on event. visit_date is the CT measurement date, and lastvisit is the last clinical visit date.
data A;
/* status is character; the dates need a date informat to be read as SAS dates */
input ID status $ Event visit_date :date9. lastvisit :date9.;
format visit_date lastvisit date9.;
datalines;
1 Normal 0 8-Jun-09 18-Aug-11
1 Normal 0 10-Dec-09 18-Aug-11
1 Normal 0 1-Jul-10 18-Aug-11
1 Normal 0 18-Aug-11 18-Aug-11
2 Normal 0 30-Jul-09 7-Jun-11
2 Normal 0 26-Jan-10 7-Jun-11
3 Low 0 7-Aug-09 13-Jan-11
3 Normal 0 3-Feb-10 13-Jan-11
3 Normal 1 12-Aug-10 13-Jan-11
;
run;
I want to create two variables, 'Start' and 'Exit'. The new data file should look like this:
ID | status | event | visit_date | lastvisit | Start | Exit |
---|---|---|---|---|---|---|
1 | Normal | 0 | 8-Jun-09 | 18-Aug-11 | | |
1 | Normal | 0 | 10-Dec-09 | 18-Aug-11 | | |
1 | Normal | 0 | 1-Jul-10 | 18-Aug-11 | | |
1 | Normal | 0 | 18-Aug-11 | 18-Aug-11 | | |
2 | Normal | 0 | 30-Jul-09 | 7-Jun-11 | | |
2 | Normal | 0 | 26-Jan-10 | 7-Jun-11 | | |
3 | Low | 1 | 7-Aug-09 | 13-Jan-11 | | |
3 | Normal | 1 | 3-Feb-10 | 13-Jan-11 | | |
3 | Normal | 1 | 12-Aug-10 | 13-Jan-11 | | |
For each subject, 1st observation: start=0, exit=2nd visit_date - 1st visit_date;
2nd observation: start=2nd visit_date - 1st visit_date, exit=3rd visit_date - 1st visit_date;
3rd observation: start=3rd visit_date - 1st visit_date, exit=4th visit_date - 1st visit_date;
4th or last observation: if visit_date ≥ lastvisit, then start and exit are set to missing; if visit_date < lastvisit, then start=exit of upper observation and exit=lastvisit - 1st visit_date.
Please help,
Thanks
I think you need to clarify what is meant by "exit of upper observation".
And to clarify, when you say "2nd visit_date - 1st visit_date" do you mean the number of days between the two dates?
Do you ever have more than 4 records per ID value?
I ran the code below (the data want step is from an earlier post by mohamed_zaki):
data A;
input ID $ status $ Event $ visit_date:DATE8. lastvisit:date8.;
format visit_date lastvisit DATE8.;
datalines;
1 Normal 0 8-Jun-09 18-Aug-11
1 Normal 0 10-Dec-09 18-Aug-11
1 Normal 0 1-Jul-10 18-Aug-11
1 Normal 0 18-Aug-11 18-Aug-11
2 Normal 0 30-Jul-09 7-Jun-11
2 Normal 0 26-Jan-10 7-Jun-11
3 Low 0 7-Aug-09 13-Jan-11
3 Normal 0 3-Feb-10 13-Jan-11
3 Normal 1 12-Aug-10 13-Jan-11
;
run;
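data want;
set A;
by ID;
retain interval; /* carry the running day count from row to row */
lagdate=lag(visit_date); /* visit date of the previous row */
if first.ID then interval=0; /* reset at each subject's first visit */
else interval=interval+intck('day',lagdate,visit_date); /* days since the subject's first visit */
drop lagdate;
run;
proc print data=want; run;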
Output:
ID | status | Event | visit_date | lastvisit | interval |
---|---|---|---|---|---|
1 | Normal | 0 | 08JUN09 | 18AUG11 | 0 |
1 | Normal | 0 | 10DEC09 | 18AUG11 | 185 |
1 | Normal | 0 | 01JUL10 | 18AUG11 | 388 |
1 | Normal | 0 | 18AUG11 | 18AUG11 | 801 |
2 | Normal | 0 | 30JUL09 | 07JUN11 | 0 |
2 | Normal | 0 | 26JAN10 | 07JUN11 | 180 |
3 | Low | 0 | 07AUG09 | 13JAN11 | 0 |
3 | Normal | 0 | 03FEB10 | 13JAN11 | 180 |
3 | Normal | 1 | 12AUG10 | 13JAN11 | 370 |
I am having difficulty writing code to create the variables 'start' and 'exit' based on 'interval'.
I have 100 subjects; each subject has 1 to 4 measurements. The exam date is 'visit_date' and the last clinic visit date is 'lastvisit'.
For example, for ID 2: 1st observation, start=0, exit=180; 2nd observation, because 'lastvisit' > 'visit_date', start=180 and exit='lastvisit' - 1st 'visit_date' (30JUL09). If 'lastvisit' <= 'visit_date', then 'start' and 'exit' are set to missing.
Thanks for your help,
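One possible way to build 'start' and 'exit' from the rules described above is a look-ahead merge, which lets each row see the next row's visit date. This is only a sketch under a few assumptions: data set A is already sorted by ID and visit_date, "exit of upper observation" means the previous row's exit (which equals the current visit_date minus the subject's first visit_date), and a subject with a single measurement is treated like a last observation. The names want_se, prev_ID, next_ID, next_visit and first_visit are made up for this illustration.

```sas
data want_se;
   /* One-to-one merge of A with a copy of itself shifted up by one row, */
   /* so next_ID and next_visit hold the values from the following row.  */
   merge A
         A(firstobs=2 keep=ID visit_date
           rename=(ID=next_ID visit_date=next_visit));

   retain first_visit;              /* subject's first visit date */
   prev_ID = lag(ID);               /* ID from the previous row   */
   if _n_ = 1 or ID ne prev_ID then first_visit = visit_date;

   /* Days from the subject's first visit to the current visit. */
   start = visit_date - first_visit;

   if ID = next_ID then
      /* Not the subject's last row: the interval ends at the next visit. */
      exit = next_visit - first_visit;
   else if visit_date < lastvisit then
      /* Last row with later follow-up: the interval ends at lastvisit. */
      exit = lastvisit - first_visit;
   else do;
      /* Last row with no follow-up after this visit: no interval. */
      start = .;
      exit = .;
   end;

   drop prev_ID next_ID next_visit first_visit;
run;

proc print data=want_se; run;
```

With the sample data this should give ID 2 the rows (start=0, exit=180) and (start=180, exit=677), ID 3 a last row of (start=370, exit=524), and missing start and exit on the fourth row of ID 1, where visit_date equals lastvisit. For the counting process model itself, PROC PHREG accepts the form model (start, exit)*Event(0) = ...; note that Event is read as character in the data step above and would likely need to be read as numeric first.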