The LAG2 values in the 5th observation of an ID would come from the 3rd observation (because 5-2=3), hence would not qualify for output if the first three observations are to be excluded. Therefore, _i>5 is a necessary condition for the OUTPUT statement to be executed.
Then you'd change the IF conditions and the length of the LAG queues as follows:
... if _i>8 then do; /* conditional only for performance improvement */ _v=lag7(value); _dt=lag7(datetime); end; if _i>15 & last.id=0 then do; ...
To make the code flexible, you could use macro expressions involving a macro variable, say, d=8, in these four places. But, again, this code doesn't care for seconds, only for observations.
@AK100 wrote:
@FreelanceReinhard. So what if I wanted to remove the first and last 8 seconds.
Are we talking about seconds here or measurements (observations)?
Then you should try my suggestion. It cuts off with the time value, not the observation count.
I fully agree with KurtBremser. But then you need to explain why the 9th observation of your sample data (with Value 87 and DateTime 7.8) did not qualify for removal, although it is clearly within the last 3 seconds -- if these "3 seconds" are measured from the last DateTime for the ID, i.e. 10.1 in this case.
@FreelanceReinh @Kurt_Bremser @PeterClemmensen
I tried all your suggestions. Im sure they all would work but the point is that it takes hours to run ... I did not wait for the result because my SAS EG started to crash and it took me to long.
In the usable data I showed you above I indeed asked for the first 3 and last 3 seconds, but in the original data 1 measurement is much much longer (approx. 40 seconds and it has around 1500 records). And every day has maybe hundreds of measurements and the I have data of more then 1,5 year.. So you can already count how much records I have. Should be millions+.
Got such a headache from this continually crashing...
@AK100 wrote:
@FreelanceReinh @Kurt_Bremser @PeterClemmensen
I tried all your suggestions. Im sure they all would work ...
Well, they yield different results even on your sample data and neither result matches your desired outcome (as stated in the initial post). So, I think you should clarify the requirements before we go on and address performance issues.
The data below has just short of 150M obs. @Kurt_Bremsers code takes about 2 minutes to run on my laptop.
Try running this code and see what happens. Otherwise, post a representable sample of your actual data.
data have(drop = n i);
do ID = 1 to (1000 * 365 * 2);
n = 100 + floor(rand("uniform") * (1 + 300 - 100));
do i = 1 to n;
DateTime = sum(DateTime, round(rand('uniform'), .1));
value = ceil(rand('uniform') * 100);
output;
end;
end;
run;
data want;
do until (last.id);
set have;
by id;
if first.id then start = datetime;
if last.id then end = datetime;
end;
do until (last.id);
set have;
by id;
if start + 3 le datetime le end - 3 then output;
end;
drop start end;
run;
I have found a little potential for performance improvement:
data want;
do until (last.id);
set have;
by id;
if first.id then start = datetime;
end;
end = datetime;
do until (last.id);
set have;
by id;
if start + 3 le datetime le end - 3 then output;
end;
drop start end;
run;
One comparison less in the first loop.
@AK100 If that code causes a crash, then something else is the issue. Run a PROC CONTENTS on your real "have" dataset and show us the output.
AND post the code you really use on that dataset.
@Kurt_Bremser This is the code I tried and its still running now for few minutes...
data remove8seconds; do until (last.UniekID); set tms.datameting; by UniekID; if first.UniekID then start = Tijd; end; end = Tijd; do until (last.UniekID); set tms.datameting; by UniekID; if start + 8 le Tijd le end - 8 then output; end; drop start end; run;
Ouch. You are running this against a DBMS table through ODBC. No miracle this performs like watching the paint dry.
Transfer the table to your SAS environment first, then run the code against it.
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.
Find more tutorials on the SAS Users YouTube channel.