I need to sum some total event times.
For example
Event | Start | Stop |
a | 1 | 10 |
b | 8 | 12 |
c | 16 |
20 |
Now if I wanted to sum the total time an event was occurring it would be days 1-12 and 16-20 as I want to not double count overlapping dates.
Now this could be done simply with the following code.
DATA part1;
SET have;
DO day=start to stop;
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE want AS SELECT
distinct(count(distinct(day)) as totaldays
FROM part1;
QUIT;
However once we get to many events and many subjects the dataset in part1 can become very very large. Is there a more elegant solution I am missing?
* There may be syntax errors in the code above, I am not bothered about that, it is just my typing out loud without running it.
This works:
data have;
infile datalines dsd dlm=",";
input Grp $ Event $ Start Stop;
datalines;
X,a,1,10
X,b,8,12
X,c,16,20
X,d,1,10
Y,e,3,4
;
run;
DATA want(keep = grp totaldays);
SET have;
by grp;
array days [1:100] _temporary_;
DO day=start to stop;
days(day) = 1;
END;
if last.grp then do;
totaldays = sum(of days(*));
output;
call missing(of days(*));
end;
RUN;
Your query is nothing more than
PROC SQL;
CREATE TABLE want AS SELECT
*,stop-start as totaldays
FROM have;
QUIT;
Sorry no this would not work,I was not clear enough.
I want the sum of total days . In this case 17. (days 1,2,3,4,5,6,7,8,9,10,11,12,16,17,18,19,20).
So as a better example
Grp | Event | Start | Stop |
X | a | 1 | 10 |
X | b | 8 | 12 |
X | c | 16 | 20 |
X | d | 1 | 10 |
Y | e | 3 | 4 |
Cleaning up the code some.
DATA part1;
SET have;
DO day=start to stop;
OUTPUT;
END;
RUN;
PROC SQL;
CREATE TABLE want AS SELECT
distinct grp, count(distinct(day)) as totaldays
FROM part1
GROUP BY grp;
QUIT;
Grp | totaldays |
X | 17 |
Y | 2 |
This works:
data have;
infile datalines dsd dlm=",";
input Grp $ Event $ Start Stop;
datalines;
X,a,1,10
X,b,8,12
X,c,16,20
X,d,1,10
Y,e,3,4
;
run;
DATA want(keep = grp totaldays);
SET have;
by grp;
array days [1:100] _temporary_;
DO day=start to stop;
days(day) = 1;
END;
if last.grp then do;
totaldays = sum(of days(*));
output;
call missing(of days(*));
end;
RUN;
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.