I need to sum some total event times.
For example
Event | Start | Stop |
a | 1 | 10 |
b | 8 | 12 |
c | 16 |
20 |
Now if I wanted to sum the total time an event was occurring it would be days 1-12 and 16-20 as I want to not double count overlapping dates.
Now this could be done simply with the following code.
DATA part1;
SET have;
DO day=start to stop;
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE want AS SELECT
distinct(count(distinct(day)) as totaldays
FROM part1;
QUIT;
However once we get to many events and many subjects the dataset in part1 can become very very large. Is there a more elegant solution I am missing?
* There may be syntax errors in the code above, I am not bothered about that, it is just my typing out loud without running it.
This works:
data have;
infile datalines dsd dlm=",";
input Grp $ Event $ Start Stop;
datalines;
X,a,1,10
X,b,8,12
X,c,16,20
X,d,1,10
Y,e,3,4
;
run;
DATA want(keep = grp totaldays);
SET have;
by grp;
array days [1:100] _temporary_;
DO day=start to stop;
days(day) = 1;
END;
if last.grp then do;
totaldays = sum(of days(*));
output;
call missing(of days(*));
end;
RUN;
Your query is nothing more than
PROC SQL;
CREATE TABLE want AS SELECT
*,stop-start as totaldays
FROM have;
QUIT;
Sorry no this would not work,I was not clear enough.
I want the sum of total days . In this case 17. (days 1,2,3,4,5,6,7,8,9,10,11,12,16,17,18,19,20).
So as a better example
Grp | Event | Start | Stop |
X | a | 1 | 10 |
X | b | 8 | 12 |
X | c | 16 | 20 |
X | d | 1 | 10 |
Y | e | 3 | 4 |
Cleaning up the code some.
DATA part1;
SET have;
DO day=start to stop;
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE want AS SELECT
distinct grp, count(distinct(day)) as totaldays
FROM part1
GROUP BY grp;
QUIT;
Grp | totaldays |
X | 17 |
Y | 2 |
This works:
data have;
infile datalines dsd dlm=",";
input Grp $ Event $ Start Stop;
datalines;
X,a,1,10
X,b,8,12
X,c,16,20
X,d,1,10
Y,e,3,4
;
run;
DATA want(keep = grp totaldays);
SET have;
by grp;
array days [1:100] _temporary_;
DO day=start to stop;
days(day) = 1;
END;
if last.grp then do;
totaldays = sum(of days(*));
output;
call missing(of days(*));
end;
RUN;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.