Good evening everyone,
I am working on clinical trials data and want to flag all the ID's who have less than 30 day gap between studies, also calculating the mean, sd and range of number of days in between trials. Usually this kind of data is little complicated and easy to be anlalysed in excel, but this time I had to submit a SAS report. Could someone please help me coding this report.
data trial;
input ID Nproj proj_id start_dt end_dt;
format start_dt end_dt date9.;
informat start_dt end_dt date9.;
datalines;
2 3 72 20-Jan-11 20-Jan-11
2 3 33 31-May-11 20-Feb-12
2 3 65 13-Jun-12 27-Jun-12
3 2 68 3-Feb-16 15-Feb-16
3 2 82 16-Jul-16 .
4 2 86 11-Dec-12 21-Jan-13
4 2 177 7-Nov-13 16-Dec-13
5 2 183 28-Apr-08 24-Jun-08
5 2 102 17-Feb-10 15-Apr-10
6 2 107 21-May-13 18-Jun-13
6 2 81 22-May-13 18-Jun-13
7 2 21 26-Jan-09 20-Feb-09
7 2 208 30-Apr-09 3-Aug-09
8 7 67 19-Oct-07 12-Feb-08
8 7 47 17-Jan-08 17-Apr-08
8 7 3 27-Jun-08 13-Oct-08
8 7 208 16-Mar-09 3-Aug-09
8 7 129 12-Oct-12 25-Oct-12
8 7 213 9-Apr-13 22-May-13
8 7 164 19-Dec-13 20-Dec-13
;
For all the individuals who participated in two studies, the total number of days between studies is (152+290+603-27+69)= 1087. The mean is 1087/5= 217.4 and the range is (-27, 603). The ID with '-27 days' should be flagged as it did not maintain a gap of 30 days between studies.
The report I am looking for is:
Number of studies | |||
2 | 3 | 7 | |
Days between studies | 1087 | 245 | 1742 |
Mean | 217.4 | 122.5 | 290.3 |
SD | |||
Range | (-27,603) | (114,131) | (-26,1166) |
Many many thanks in advance.
Jennifer
Here's a start:
data want; set trial; by id; daysbetween= start_dt-lag(end_dt); if first.id then daysbetween=.; run; proc tabulate data=want; class nproj; var daysbetween; table daysbetween * (sum mean min max) , nproj ; run;
I'll leave the trivial flag for daysbetween to you and if you really need to reformat the min and max you can find a way.
1. What is the difficult part in doing this in SAS?
In general, asking for a full report isn't a fair ask.
You also haven't clearly defined your rules. How are the durations between trials calculated? Is that dealing with the data on teh same row, or diff rows?
Some things that may be helpful:
LAG()
DIF()
BY Group
PROC MEANS/TABULATE - will give you your summary report.
Here's a start:
data want; set trial; by id; daysbetween= start_dt-lag(end_dt); if first.id then daysbetween=.; run; proc tabulate data=want; class nproj; var daysbetween; table daysbetween * (sum mean min max) , nproj ; run;
I'll leave the trivial flag for daysbetween to you and if you really need to reformat the min and max you can find a way.
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.