Dear All,
I have the data file below. I want to use the counting process approach to analyze the effect of status on event. visit_date is the CT measurement date and lastvisit is the last clinical visit date.
data A;
input ID $ status $ Event visit_date :date9. lastvisit :date9.;
format visit_date lastvisit date9.;
datalines;
1 Normal 0 8-Jun-09 18-Aug-11
1 Normal 0 10-Dec-09 18-Aug-11
1 Normal 0 1-Jul-10 18-Aug-11
1 Normal 0 18-Aug-11 18-Aug-11
2 Normal 0 30-Jul-09 7-Jun-11
2 Normal 0 26-Jan-10 7-Jun-11
3 Low 0 7-Aug-09 13-Jan-11
3 Normal 0 3-Feb-10 13-Jan-11
3 Normal 1 12-Aug-10 13-Jan-11
;
I want to create two variables, 'Start' and 'Exit'. The new data file should look like this:
ID | status | event | visit_date | lastvisit | Start | Exit |
---|---|---|---|---|---|---|
1 | Normal | 0 | 8-Jun-09 | 18-Aug-11 | | |
1 | Normal | 0 | 10-Dec-09 | 18-Aug-11 | | |
1 | Normal | 0 | 1-Jul-10 | 18-Aug-11 | | |
1 | Normal | 0 | 18-Aug-11 | 18-Aug-11 | | |
2 | Normal | 0 | 30-Jul-09 | 7-Jun-11 | | |
2 | Normal | 0 | 26-Jan-10 | 7-Jun-11 | | |
3 | Low | 1 | 7-Aug-09 | 13-Jan-11 | | |
3 | Normal | 1 | 3-Feb-10 | 13-Jan-11 | | |
3 | Normal | 1 | 12-Aug-10 | 13-Jan-11 | | |
For each subject:
1st observation: start = 0, exit = 2nd visit_date - 1st visit_date;
2nd observation: start = 2nd visit_date - 1st visit_date, exit = 3rd visit_date - 1st visit_date;
3rd observation: start = 3rd visit_date - 1st visit_date, exit = 4th visit_date - 1st visit_date;
4th or last observation: if visit_date ≥ lastvisit, then start and exit are set to missing; if visit_date < lastvisit, then start = exit of the previous observation and exit = lastvisit - 1st visit_date.
Please help,
Thanks
I think you need to clarify what is meant by "exit of upper observation".
And to clarify, when you say "2nd visit_date - 1st visit_date" do you mean the number of days between the two dates?
Do you ever have more than 4 records per ID value?
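If the differences are meant as day counts, the small check below may help. It is only a sketch: the variable names d1, d2, diff_sub and diff_intck are made up for illustration, and the two dates are the first two visit_date values for ID 1 from the post. SAS stores dates as day counts, so plain subtraction and intck('day', ...) should agree.

```sas
/* Sketch: subtracting two SAS dates gives the number of days between them. */
data _null_;
    d1 = '08JUN2009'd;                   /* 1st visit_date for ID 1 */
    d2 = '10DEC2009'd;                   /* 2nd visit_date for ID 1 */
    diff_sub   = d2 - d1;                /* plain subtraction */
    diff_intck = intck('day', d1, d2);   /* day boundaries crossed */
    put diff_sub= diff_intck=;           /* both should be 185 */
run;
```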
I ran the syntax below (the data want step is from mohamed_zaki's earlier post):
data A;
input ID $ status $ Event $ visit_date:DATE8. lastvisit:date8.;
format visit_date lastvisit DATE8.;
datalines;
1 Normal 0 8-Jun-09 18-Aug-11
1 Normal 0 10-Dec-09 18-Aug-11
1 Normal 0 1-Jul-10 18-Aug-11
1 Normal 0 18-Aug-11 18-Aug-11
2 Normal 0 30-Jul-09 7-Jun-11
2 Normal 0 26-Jan-10 7-Jun-11
3 Low 0 7-Aug-09 13-Jan-11
3 Normal 0 3-Feb-10 13-Jan-11
3 Normal 1 12-Aug-10 13-Jan-11
;
run;
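data want;
    set A;
    by ID;
    retain interval;   /* start and exit are not computed in this step */
    /* call LAG on every row so its queue stays in step with the data */
    lagdate = lag(visit_date);
    /* interval = days since the subject's first visit_date */
    if first.ID then interval = 0;
    else interval = interval + intck('day', lagdate, visit_date);
    drop lagdate;
run;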
proc print data=want; run;
Output:
ID | status | Event | visit_date | lastvisit | interval |
---|---|---|---|---|---|
1 | Normal | 0 | 08JUN09 | 18AUG11 | 0 |
1 | Normal | 0 | 10DEC09 | 18AUG11 | 185 |
1 | Normal | 0 | 01JUL10 | 18AUG11 | 388 |
1 | Normal | 0 | 18AUG11 | 18AUG11 | 801 |
2 | Normal | 0 | 30JUL09 | 07JUN11 | 0 |
2 | Normal | 0 | 26JAN10 | 07JUN11 | 180 |
3 | Low | 0 | 07AUG09 | 13JAN11 | 0 |
3 | Normal | 0 | 03FEB10 | 13JAN11 | 180 |
3 | Normal | 1 | 12AUG10 | 13JAN11 | 370 |
I am having difficulty writing the syntax to create the variables 'start' and 'exit' based on 'interval'.
I have 100 subjects; each subject has 1 to 4 measurements. The exam date is 'visit_date' and the last clinic visit date is 'lastvisit'.
For example, ID 2: for the 1st observation, start=0 and exit=180; for the 2nd observation, because 'lastvisit' > 'visit_date', start=180 and exit = 'lastvisit' - 1st 'visit_date' (30JUL09). If 'lastvisit' <= 'visit_date', then 'start' and 'exit' are set to missing.
Thanks for your help,
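One possible way to apply the rules described above is a one-row look-ahead, sketched below. It assumes data set A as created in the post, with rows grouped by ID and in visit_date order within each ID. The helper variables next_ID, next_date, prev_ID and first_date are hypothetical names introduced only for this sketch, and it works for any number of records per ID, including one.

```sas
/* Sketch only, not a tested solution for the full 100-subject file. */
data want;
    /* One-to-one merge of A with itself shifted up one row, so each row
       also carries the next row's ID and visit_date (look-ahead). */
    merge A
          A(firstobs=2 keep=ID visit_date
            rename=(ID=next_ID visit_date=next_date));
    retain first_date;
    prev_ID = lag(ID);
    /* remember the first visit_date of each subject */
    if _n_ = 1 or ID ne prev_ID then first_date = visit_date;

    if ID = next_ID then do;                 /* not the subject's last record */
        start = visit_date - first_date;
        exit  = next_date  - first_date;
    end;
    else if visit_date < lastvisit then do;  /* last record, follow-up continues */
        start = visit_date - first_date;
        exit  = lastvisit  - first_date;
    end;
    else call missing(start, exit);          /* last record at or after lastvisit */

    drop next_ID next_date prev_ID first_date;
run;

proc print data=want; run;
```

With the sample data, ID 2 should come out as start=0, exit=180 on the first row and start=180, exit=677 on the second (7-Jun-11 minus 30-Jul-09), while the last row for ID 1 gets missing start and exit because its visit_date equals lastvisit. If the result feeds PROC PHREG in counting process form, note that Event is read as character in the data step above and would likely need to be numeric first.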