I need to sum some total event times.
For example
| Event | Start | Stop |
| a | 1 | 10 |
| b | 8 | 12 |
| c | 16 |
20 |
Now if I wanted to sum the total time an event was occurring it would be days 1-12 and 16-20 as I want to not double count overlapping dates.
Now this could be done simply with the following code.
DATA part1;
SET have;
DO day=start to stop;
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE want AS SELECT
distinct(count(distinct(day)) as totaldays
FROM part1;
QUIT;
However once we get to many events and many subjects the dataset in part1 can become very very large. Is there a more elegant solution I am missing?
* There may be syntax errors in the code above, I am not bothered about that, it is just my typing out loud without running it.
This works:
data have;
infile datalines dsd dlm=",";
input Grp $ Event $ Start Stop;
datalines;
X,a,1,10
X,b,8,12
X,c,16,20
X,d,1,10
Y,e,3,4
;
run;
DATA want(keep = grp totaldays);
SET have;
by grp;
array days [1:100] _temporary_;
DO day=start to stop;
days(day) = 1;
END;
if last.grp then do;
totaldays = sum(of days(*));
output;
call missing(of days(*));
end;
RUN;
Your query is nothing more than
PROC SQL;
CREATE TABLE want AS SELECT
*,stop-start as totaldays
FROM have;
QUIT;
Sorry no this would not work,I was not clear enough.
I want the sum of total days . In this case 17. (days 1,2,3,4,5,6,7,8,9,10,11,12,16,17,18,19,20).
So as a better example
| Grp | Event | Start | Stop |
| X | a | 1 | 10 |
| X | b | 8 | 12 |
| X | c | 16 | 20 |
| X | d | 1 | 10 |
| Y | e | 3 | 4 |
Cleaning up the code some.
DATA part1;
SET have;
DO day=start to stop;
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE want AS SELECT
distinct grp, count(distinct(day)) as totaldays
FROM part1
GROUP BY grp;
QUIT;
| Grp | totaldays |
| X | 17 |
| Y | 2 |
This works:
data have;
infile datalines dsd dlm=",";
input Grp $ Event $ Start Stop;
datalines;
X,a,1,10
X,b,8,12
X,c,16,20
X,d,1,10
Y,e,3,4
;
run;
DATA want(keep = grp totaldays);
SET have;
by grp;
array days [1:100] _temporary_;
DO day=start to stop;
days(day) = 1;
END;
if last.grp then do;
totaldays = sum(of days(*));
output;
call missing(of days(*));
end;
RUN;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.