BookmarkSubscribeRSS Feed
statadm
Fluorite | Level 6
I have two groups of 37 time variables (formatted time8.):

id time1-time37 timefinal1-timefinal37

I need to match up the times for the two groups of variables, but within + or - 5 seconds.

Also, time1 may not actually be the same as timefinal1, but all times are in consecutive order. I need to line up the times in timefinal1-timefinal37 to match time1-time37 while skipping over times that do not match.

Thanks!
17 REPLIES 17
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10
A DATA step using one or more ARRAY definitions (with your two sets of variables, maybe combined as one array - not sure) and a DO/END code approach to compare consecutive variables in the array(s) using IF / THEN comparison technique should work.

Suggest searching the SAS support http://support.sas.com/ website for DOC and topic-related conference/technical papers on "array processing" - here is a Google advanced search argument to get you started:

array processing numeric comparison site:sas.com

Scott Barry
SBBWorks, Inc.
deleted_user
Not applicable
I assume you know that times are stored as the number of seconds, so you only have to compare abs(t_one - t_two) to 5.

Is it possible for there to be more than one match (within 5 seconds)? This could make a big difference in the array logic, as you'd have to worry about picking the best of multiple matches.
ChrisNZ
Tourmaline | Level 20
You need to provide a bit more info.
In particular, is there a reference? i.e if there are times every 3 seconds (say t1=3, t2=6, t3=9), where do you cut off? the first 2 could go together, the last 2 too, but not all 3. How do pick the prefered reference to form a group around?
statadm
Fluorite | Level 6
The separate times should be no closer than 15 seconds apart and if they are , there is a problem. However, the seconds may not be precise due to reader variability, this is why I want to account for plus or minus 5 seconds so that I capture the correct times.

I'm still working on this, but I haven't yet figured out how to put it all together. I can match the times using an array, but I would have to ignore seconds and then I can end up having two times within the same minute. So, I have to figure out how to match the times within a range of 10 seconds.

Hope this makes sense.

Thanks for your suggestions, I will give some a try.
deleted_user
Not applicable
Why don't you post (at least) a single example row of (real) data? That way, people can get a better feel for what is actually involved.
statadm
Fluorite | Level 6
id time1 time2 time3 time4 time5 timef1 timef2 timef3 timef4 timef5
1 4:56:00 5:00:35 5:03:10 5:09:00 5:13:40 4:56:04 5:00:38 5:03:12 5:09:01 5:13:37
2 3:09:00 3:09:55 3:11:00 3:13:05 3:14:45 3:09:00 3:10:25 3:11:02 3:13:05 3:14:43

Here are two samples of real data looking at the first 5 times.
Where time1-time5 should match timef1-timef5 within a 10 second range.

Also, the original variables were entered as separate hours minutes and seconds. I took this data and created a complete time variable using:

Time1 = HMS(time1_HR,time1_MIN,time1_SEC);

If it would be easier to match using the separate variables, I could do this as well. Message was edited by: statadm
deleted_user
Not applicable
In these two examples, though, the match is perfect -- correct?
statadm
Fluorite | Level 6
Almost.

In example 2:

2 3:09:00 3:09:55 3:11:00 3:13:05 3:14:45 3:09:00 3:10:25 3:11:02 3:13:05 3:14:43

The second times would not match, so I would need to skip over this and move all other times forward, or even better, to output the non-matching times to another dataset.
deleted_user
Not applicable
Ah, yes, I missed that. The program below is a very rough first attempt. It doesn't attempt to move anything "forward" or output the mis-matched times (though that would be easy to do). It also handles the possibility that there might be more than one match, which I'm not sure is true!


data test1;

informat t1-t5 ft1-ft5 time8.;

input id t1-t5 ft1-ft5;

array _a {*} t1-t5;
array _b {*} ft1-ft5;
array _match(5) _temporary_;

do i=1 to dim(_a);
bestmatch=6;
do j=1 to dim(_b);
deltatime=abs(_a(i) - _b(j));
if deltatime le 5 then do;
if deltatime lt bestmatch then do;
bestmatch = deltatime;
_match(i) = j;
end;
end;
end;
if bestmatch le 5 then
put id= i= _match(i)= bestmatch=;
else
put id= i= +1 "no match for this time point";
end;


cards;
1 4:56:00 5:00:35 5:03:10 5:09:00 5:13:40 4:56:04 5:00:38 5:03:12 5:09:01 5:13:37
2 3:09:00 3:09:55 3:11:00 3:13:05 3:14:45 3:09:00 3:10:25 3:11:02 3:13:05 3:14:43
;
run;
statadm
Fluorite | Level 6
Thank you, I will try to modify this for my program and see if I can find away to move everything forward.

I appreciate the help with doing the plus or minus 5 seconds.
Flip
Fluorite | Level 6
This will compare the times and move them forward.

do i = 1 to 5;
if abs(itime(i) - ftime(i)) > 5 then do j = i to 4;
itime(j) = itime(j+1);
ftime(j) = ftime(J+1);
itime(j+1) = .;
ftime(J+1) = .;
if i=j then i=i-1;
end;
end;
statadm
Fluorite | Level 6
You are all great! Thanks for the code, this program is working great for what I need.
statadm
Fluorite | Level 6
I am still having trouble getting this to work and some parts have changed.

I have to match times within 5 seconds the first time around, and take those that didn't match and match them within 15 seconds. Anything that does not match the second time, will be dropped from the final dataset for those variables (missing data) .

I also have 13 additional variables that are associated with the times, so when I match up the times, those associated variables also have to stay with those times.

So, i would have ID, time1-time37, timef1-timef37, a1-a37, b1-b37, c1-c37, etc.

If I take real data:

id time1 time2 time3 time4 time5 timef1 timef2 timef3 timef4 timef5
1 5:56:33 6:01:44 6:10:47 6:16:00 6:17:55 5:56:17 6:01:45 6:10:35 6:15:58 6:17:54
2 2:05:03 2:14:05 2:18:48 2:31:15 . 2:05:04 2:08:55 2:14:00 2:18:44 2:23:35
3 8:16:19 8:18:18 8:21:57 8:25:04 8:27:41 8:16:18 8:18:07 8:20:17 8:25:04 8:27:20

Some match within 5 seconds, some within 15 seconds and some not at all.

In the end, I need to have the times ordered 1-37 in time order, but only if they match and with associated time descriptive variables (a1-a37, b1-b37, etc).

So there will be the same amount of time1-time37 values as timef1-timef37 all matching one another where Time1=timef1 time2=timef2, etc. within 5 or 15 seconds depending on how it was selected.

What I envision doing with SAS is matching up the times within 5 seconds, output to a dataset, match up the leftover times within 15 seconds (not including those that are already matched within 5) and ouput them to another dataset. Remove all other times, merge back the other two datasets and sort and reorder the variables by time. The times should all fall within an ~30 minute time range.

I've been mulling over this for a week and writing code with no success other than to determine which times match and which do not.

Thanks again!
ChrisNZ
Tourmaline | Level 20
Still not sure I fully understand what you want, but it seems to me that you don't care where times are stored (time1, time2, timef25, etc), you just want to find times which are close to each other, and match them.
How about you put all times in one variable (one record per time stamp), sort, and then derive the difference with the previous record.
So your table will be:
TIME DIF
2:08:55 231
2:14:00 305
2:14:05 5
2:14:05 0
2:18:44 279
2:18:48 4

This way you can more easily use logic to define groups of time as you like. Once you have defined groups, go back to your data and apply the groups to the times.

Another even easier way, if time groups are well separated as you first told me, is just to round all times to the nearest x seconds and get your groups from the rounded values.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 17 replies
  • 1616 views
  • 0 likes
  • 5 in conversation