BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ihlayyel
Fluorite | Level 6
TIME XYclassID
16:00:06103.78571.456331A
16:05:06109.09592.824492A2
16:08:06105.98167.844561A6
16:11:05103.46171.424523F4
16:15:05103.767471.473821C55
16:17:26107.476273.34453E4
16:18:06105.277387.581182O8
16:19:21103.785871.442932H77
16:33:18103.720171.379743XX
16:40:11107.343288.275213X8
17:43:06100.849667.380972D
17:45:06110.100666.267992A
17:50:12105.854287.770792D4
17:55:0996.1428561.997892F4
18:02:08103.782971.483272RR3
18:09:33103.927571.493212U7
18:25:12104.722690.435023S
18:30:05109.694266.226993S33
18:33:04109.712897.242883T17
18:40:4456.1242571.135223G22
18:44:0293.2967563.892213II1
18:50:13109.702371.477561S9
 

1- Above is a sample of a data frame which has thousands of records.

2- How can group it by the time (20 min) and class number. Then calculate the Euclidean distance between x, y using :

dist = sqrt( (x2 - x1)**2 + (y2 - y1)**2 )

3- An iteration of every 20 min for all data, no value duplicates in the single iteration. I have tried to implement this sample. but still not working.

The expected results for the 20min in two iteration: 

ID  CLASS  distnace 
A,A6 1 4.22695 A,C55 1 0.024806 A6, C55 1 4.251038 A2,O8 2 6.485861 A2,H77 2 22.030843 O8,H77 2 16.207033 F4, E4 3 4.4506 A,A6 1 17.279585

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards expandtabs;
input TIME $ 	X	Y	class	ID $;
cards;
16:00:06	103.785	71.45633	1	A
16:05:06	109.095	92.82449	2	A2
16:08:06	105.981	67.84456	1	A6
16:11:05	103.461	71.42452	3	F4
16:15:05	103.7674	71.47382	1	C55
16:17:26	107.4762	73.3445	3	E4
16:18:06	105.2773	87.58118	2	O8
16:19:21	103.7858	71.44293	2	H77
16:33:18	103.7201	71.37974	3	XX
16:40:11	107.3432	88.27521	3	X8
17:43:06	100.8496	67.38097	2	D
17:45:06	110.1006	66.26799	2	A
17:50:12	105.8542	87.77079	2	D4
17:55:09	96.14285	61.99789	2	F4
18:02:08	103.7829	71.48327	2	RR3
18:09:33	103.9275	71.49321	2	U7
18:25:12	104.7226	90.43502	3	S
18:30:05	109.6942	66.22699	3	S33
18:33:04	109.7128	97.24288	3	T17
18:40:44	56.12425	71.13522	3	G22
18:44:02	93.29675	63.89221	3	II1
18:50:13	109.7023	71.47756	1	S9
;
run;
proc sql;
create table want as
 select  catx(',',a.id,b.id) as id,Euclid(a.x-b.x,a.y-b.y) as distance
  from have as a,have as b
   where a.class=b.class and a.id ne b.id ;
quit;

View solution in original post

27 REPLIES 27
Kurt_Bremser
Super User

Since SAS time (and datetime) values are counts of seconds, you can normalize them to 20-minute frames by doing

tim20 = int(time / 1200) * 1200;
format tim20 time8.;

and then use that in your by statement(s).

PeterClemmensen
Tourmaline | Level 20

You can create groups based on intervals like this

 

data have;
input TIME:time8. X Y class ID$;
format time time8.;
datalines;
16:00:06 103.785 71.45633 1 A 
16:05:06 109.095 92.82449 2 A2 
16:08:06 105.981 67.84456 1 A6 
16:11:05 103.461 71.42452 3 F4 
16:15:05 103.7674 71.47382 1 C55 
16:17:26 107.4762 73.3445 3 E4 
16:18:06 105.2773 87.58118 2 O8 
16:19:21 103.7858 71.44293 2 H77 
16:33:18 103.7201 71.37974 3 XX 
16:40:11 107.3432 88.27521 3 X8 
17:43:06 100.8496 67.38097 2 D 
17:45:06 110.1006 66.26799 2 A 
17:50:12 105.8542 87.77079 2 D4 
17:55:09 96.14285 61.99789 2 F4 
18:02:08 103.7829 71.48327 2 RR3 
18:09:33 103.9275 71.49321 2 U7 
18:25:12 104.7226 90.43502 3 S 
18:30:05 109.6942 66.22699 3 S33 
18:33:04 109.7128 97.24288 3 T17 
18:40:44 56.12425 71.13522 3 G22 
18:44:02 93.29675 63.89221 3 II1 
18:50:13 109.7023 71.47756 1 S9 
;

data grouped;
	set have;
	interval= intck('minute20', '16:00:00't, time)+1;
run;

Regarding your desired output.. I don't understand why the Euclidian distance between A and A6 for Class 1 appears twice?

 

ihlayyel
Fluorite | Level 6
the output is for 2 iteration.kinldy look at the time itration 1 = 16:00:06 to 16:19:21 iteration2 = 16:33:18 to 16:40:11
Ksharp
Super User
data have;
infile cards expandtabs;
input TIME $ 	X	Y	class	ID $;
cards;
16:00:06	103.785	71.45633	1	A
16:05:06	109.095	92.82449	2	A2
16:08:06	105.981	67.84456	1	A6
16:11:05	103.461	71.42452	3	F4
16:15:05	103.7674	71.47382	1	C55
16:17:26	107.4762	73.3445	3	E4
16:18:06	105.2773	87.58118	2	O8
16:19:21	103.7858	71.44293	2	H77
16:33:18	103.7201	71.37974	3	XX
16:40:11	107.3432	88.27521	3	X8
17:43:06	100.8496	67.38097	2	D
17:45:06	110.1006	66.26799	2	A
17:50:12	105.8542	87.77079	2	D4
17:55:09	96.14285	61.99789	2	F4
18:02:08	103.7829	71.48327	2	RR3
18:09:33	103.9275	71.49321	2	U7
18:25:12	104.7226	90.43502	3	S
18:30:05	109.6942	66.22699	3	S33
18:33:04	109.7128	97.24288	3	T17
18:40:44	56.12425	71.13522	3	G22
18:44:02	93.29675	63.89221	3	II1
18:50:13	109.7023	71.47756	1	S9
;
run;
proc sql;
create table want as
 select  catx(',',a.id,b.id) as id,Euclid(a.x-b.x,a.y-b.y) as distance
  from have as a,have as b
   where a.class=b.class and a.id ne b.id ;
quit;
ihlayyel
Fluorite | Level 6
That is nice, thank you so much. but have you consider the time interval?
Ksharp
Super User

OK. No problem.

 

data have;
infile cards expandtabs;
input TIME : time8.	X	Y	class	ID $;
retain _time ;
format time _time time8.;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;end;
cards;
16:00:06	103.785	71.45633	1	A
16:05:06	109.095	92.82449	2	A2
16:08:06	105.981	67.84456	1	A6
16:11:05	103.461	71.42452	3	F4
16:15:05	103.7674	71.47382	1	C55
16:17:26	107.4762	73.3445	3	E4
16:18:06	105.2773	87.58118	2	O8
16:19:21	103.7858	71.44293	2	H77
16:33:18	103.7201	71.37974	3	XX
16:40:11	107.3432	88.27521	3	X8
17:43:06	100.8496	67.38097	2	D
17:45:06	110.1006	66.26799	2	A
17:50:12	105.8542	87.77079	2	D4
17:55:09	96.14285	61.99789	2	F4
18:02:08	103.7829	71.48327	2	RR3
18:09:33	103.9275	71.49321	2	U7
18:25:12	104.7226	90.43502	3	S
18:30:05	109.6942	66.22699	3	S33
18:33:04	109.7128	97.24288	3	T17
18:40:44	56.12425	71.13522	3	G22
18:44:02	93.29675	63.89221	3	II1
18:50:13	109.7023	71.47756	1	S9
;
run;
proc sql;
create table want as
 select  catx(',',a.id,b.id) as id,a.class as class,Euclid(a.x-b.x,a.y-b.y) as distance
  from have as a,have as b
   where a.class=b.class and a.group=b.group and a.id ne b.id ;
quit;
ihlayyel
Fluorite | Level 6
That is awesome. but iam facing a small issue here. iam reading data from csv file. so the code will be like that ?

data work.people;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;
end;
run;

proc sql;
create table want as
select catx(',',a.id,b.id) as id,a.class as class,Euclid(a.x-b.x,a.y-b.y) as distance
from have as a,have as b
where a.class=b.class and a.group=b.group and a.id ne b.id ;
quit;

Ksharp
Super User
proc import datafile='c:\temp\x.csv' out=have dbms=csv replace;
guessingrows=32767;
run;

The code above could import csv file . and check if TIME variable is numeric variable and have TIME. format .

Ksharp
Super User

Or use data step.

 

data have;
infile 'c:\temp\x.csv' truncover dsd;
input TIME : time8.	X	Y	class	ID : $40.;
ihlayyel
Fluorite | Level 6
I think you didn't get my point. I have import the data into sas. so when i call the data using this. it returns empty cell.

data work.people;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;
end;
run;
Ksharp
Super User

OK. just use SET statement to replace INFILE . And remove INPUT.

Make sure time is numeric variable. 

 

data have;
set have;

retain _time ;
format time _time time8.;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;end;
run;

  Note: fixed 

ihlayyel
Fluorite | Level 6
i have implemented that: I got this error
The time format is same as the sample I provided. Do i need to change it to other format?
ERROR: No DATALINES or INFILE statement.
Ksharp
Super User

That is weird. Your table name is PEOPLE ? and it is in WORK libarary ?

 

data have;
set people  ;

retain _time ;
format time _time time8.;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;end;
run;

proc sql;
.............
ihlayyel
Fluorite | Level 6
Thank you so much. the code works perfectly. 
when i run the code  
 
this what I get. 
 
78 where a.floor=b.floor and a.uid ne b.uid ;
ERROR: Insufficient space in file WORK.WANT.DATA.
ERROR: File WORK.WANT.DATA is damaged. I/O processing did not complete.
 
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
79 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
 
the total number of my data is 1048576 .  Do you have any suggestion for that? 
I really appreciate your help . 
Thank you 

SAS Innovate 2025: Register Now

Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 27 replies
  • 3316 views
  • 2 likes
  • 4 in conversation