BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
ihlayyel
Fluorite | Level 6
TIME XYclassID
16:00:06103.78571.456331A
16:05:06109.09592.824492A2
16:08:06105.98167.844561A6
16:11:05103.46171.424523F4
16:15:05103.767471.473821C55
16:17:26107.476273.34453E4
16:18:06105.277387.581182O8
16:19:21103.785871.442932H77
16:33:18103.720171.379743XX
16:40:11107.343288.275213X8
17:43:06100.849667.380972D
17:45:06110.100666.267992A
17:50:12105.854287.770792D4
17:55:0996.1428561.997892F4
18:02:08103.782971.483272RR3
18:09:33103.927571.493212U7
18:25:12104.722690.435023S
18:30:05109.694266.226993S33
18:33:04109.712897.242883T17
18:40:4456.1242571.135223G22
18:44:0293.2967563.892213II1
18:50:13109.702371.477561S9
 

1- Above is a sample of a data frame which has thousands of records.

2- How can group it by the time (20 min) and class number. Then calculate the Euclidean distance between x, y using :

dist = sqrt( (x2 - x1)**2 + (y2 - y1)**2 )

3- An iteration of every 20 min for all data, no value duplicates in the single iteration. I have tried to implement this sample. but still not working.

The expected results for the 20min in two iteration: 

ID  CLASS  distnace 
A,A6 1 4.22695 A,C55 1 0.024806 A6, C55 1 4.251038 A2,O8 2 6.485861 A2,H77 2 22.030843 O8,H77 2 16.207033 F4, E4 3 4.4506 A,A6 1 17.279585

 

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User
data have;
infile cards expandtabs;
input TIME $ 	X	Y	class	ID $;
cards;
16:00:06	103.785	71.45633	1	A
16:05:06	109.095	92.82449	2	A2
16:08:06	105.981	67.84456	1	A6
16:11:05	103.461	71.42452	3	F4
16:15:05	103.7674	71.47382	1	C55
16:17:26	107.4762	73.3445	3	E4
16:18:06	105.2773	87.58118	2	O8
16:19:21	103.7858	71.44293	2	H77
16:33:18	103.7201	71.37974	3	XX
16:40:11	107.3432	88.27521	3	X8
17:43:06	100.8496	67.38097	2	D
17:45:06	110.1006	66.26799	2	A
17:50:12	105.8542	87.77079	2	D4
17:55:09	96.14285	61.99789	2	F4
18:02:08	103.7829	71.48327	2	RR3
18:09:33	103.9275	71.49321	2	U7
18:25:12	104.7226	90.43502	3	S
18:30:05	109.6942	66.22699	3	S33
18:33:04	109.7128	97.24288	3	T17
18:40:44	56.12425	71.13522	3	G22
18:44:02	93.29675	63.89221	3	II1
18:50:13	109.7023	71.47756	1	S9
;
run;
proc sql;
create table want as
 select  catx(',',a.id,b.id) as id,Euclid(a.x-b.x,a.y-b.y) as distance
  from have as a,have as b
   where a.class=b.class and a.id ne b.id ;
quit;

View solution in original post

27 REPLIES 27
Kurt_Bremser
Super User

Since SAS time (and datetime) values are counts of seconds, you can normalize them to 20-minute frames by doing

tim20 = int(time / 1200) * 1200;
format tim20 time8.;

and then use that in your by statement(s).

PeterClemmensen
Tourmaline | Level 20

You can create groups based on intervals like this

 

data have;
input TIME:time8. X Y class ID$;
format time time8.;
datalines;
16:00:06 103.785 71.45633 1 A 
16:05:06 109.095 92.82449 2 A2 
16:08:06 105.981 67.84456 1 A6 
16:11:05 103.461 71.42452 3 F4 
16:15:05 103.7674 71.47382 1 C55 
16:17:26 107.4762 73.3445 3 E4 
16:18:06 105.2773 87.58118 2 O8 
16:19:21 103.7858 71.44293 2 H77 
16:33:18 103.7201 71.37974 3 XX 
16:40:11 107.3432 88.27521 3 X8 
17:43:06 100.8496 67.38097 2 D 
17:45:06 110.1006 66.26799 2 A 
17:50:12 105.8542 87.77079 2 D4 
17:55:09 96.14285 61.99789 2 F4 
18:02:08 103.7829 71.48327 2 RR3 
18:09:33 103.9275 71.49321 2 U7 
18:25:12 104.7226 90.43502 3 S 
18:30:05 109.6942 66.22699 3 S33 
18:33:04 109.7128 97.24288 3 T17 
18:40:44 56.12425 71.13522 3 G22 
18:44:02 93.29675 63.89221 3 II1 
18:50:13 109.7023 71.47756 1 S9 
;

data grouped;
	set have;
	interval= intck('minute20', '16:00:00't, time)+1;
run;

Regarding your desired output.. I don't understand why the Euclidian distance between A and A6 for Class 1 appears twice?

 

ihlayyel
Fluorite | Level 6
the output is for 2 iteration.kinldy look at the time itration 1 = 16:00:06 to 16:19:21 iteration2 = 16:33:18 to 16:40:11
Ksharp
Super User
data have;
infile cards expandtabs;
input TIME $ 	X	Y	class	ID $;
cards;
16:00:06	103.785	71.45633	1	A
16:05:06	109.095	92.82449	2	A2
16:08:06	105.981	67.84456	1	A6
16:11:05	103.461	71.42452	3	F4
16:15:05	103.7674	71.47382	1	C55
16:17:26	107.4762	73.3445	3	E4
16:18:06	105.2773	87.58118	2	O8
16:19:21	103.7858	71.44293	2	H77
16:33:18	103.7201	71.37974	3	XX
16:40:11	107.3432	88.27521	3	X8
17:43:06	100.8496	67.38097	2	D
17:45:06	110.1006	66.26799	2	A
17:50:12	105.8542	87.77079	2	D4
17:55:09	96.14285	61.99789	2	F4
18:02:08	103.7829	71.48327	2	RR3
18:09:33	103.9275	71.49321	2	U7
18:25:12	104.7226	90.43502	3	S
18:30:05	109.6942	66.22699	3	S33
18:33:04	109.7128	97.24288	3	T17
18:40:44	56.12425	71.13522	3	G22
18:44:02	93.29675	63.89221	3	II1
18:50:13	109.7023	71.47756	1	S9
;
run;
proc sql;
create table want as
 select  catx(',',a.id,b.id) as id,Euclid(a.x-b.x,a.y-b.y) as distance
  from have as a,have as b
   where a.class=b.class and a.id ne b.id ;
quit;
ihlayyel
Fluorite | Level 6
That is nice, thank you so much. but have you consider the time interval?
Ksharp
Super User

OK. No problem.

 

data have;
infile cards expandtabs;
input TIME : time8.	X	Y	class	ID $;
retain _time ;
format time _time time8.;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;end;
cards;
16:00:06	103.785	71.45633	1	A
16:05:06	109.095	92.82449	2	A2
16:08:06	105.981	67.84456	1	A6
16:11:05	103.461	71.42452	3	F4
16:15:05	103.7674	71.47382	1	C55
16:17:26	107.4762	73.3445	3	E4
16:18:06	105.2773	87.58118	2	O8
16:19:21	103.7858	71.44293	2	H77
16:33:18	103.7201	71.37974	3	XX
16:40:11	107.3432	88.27521	3	X8
17:43:06	100.8496	67.38097	2	D
17:45:06	110.1006	66.26799	2	A
17:50:12	105.8542	87.77079	2	D4
17:55:09	96.14285	61.99789	2	F4
18:02:08	103.7829	71.48327	2	RR3
18:09:33	103.9275	71.49321	2	U7
18:25:12	104.7226	90.43502	3	S
18:30:05	109.6942	66.22699	3	S33
18:33:04	109.7128	97.24288	3	T17
18:40:44	56.12425	71.13522	3	G22
18:44:02	93.29675	63.89221	3	II1
18:50:13	109.7023	71.47756	1	S9
;
run;
proc sql;
create table want as
 select  catx(',',a.id,b.id) as id,a.class as class,Euclid(a.x-b.x,a.y-b.y) as distance
  from have as a,have as b
   where a.class=b.class and a.group=b.group and a.id ne b.id ;
quit;
ihlayyel
Fluorite | Level 6
That is awesome. but iam facing a small issue here. iam reading data from csv file. so the code will be like that ?

data work.people;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;
end;
run;

proc sql;
create table want as
select catx(',',a.id,b.id) as id,a.class as class,Euclid(a.x-b.x,a.y-b.y) as distance
from have as a,have as b
where a.class=b.class and a.group=b.group and a.id ne b.id ;
quit;

Ksharp
Super User
proc import datafile='c:\temp\x.csv' out=have dbms=csv replace;
guessingrows=32767;
run;

The code above could import csv file . and check if TIME variable is numeric variable and have TIME. format .

Ksharp
Super User

Or use data step.

 

data have;
infile 'c:\temp\x.csv' truncover dsd;
input TIME : time8.	X	Y	class	ID : $40.;
ihlayyel
Fluorite | Level 6
I think you didn't get my point. I have import the data into sas. so when i call the data using this. it returns empty cell.

data work.people;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;
end;
run;
Ksharp
Super User

OK. just use SET statement to replace INFILE . And remove INPUT.

Make sure time is numeric variable. 

 

data have;
set have;

retain _time ;
format time _time time8.;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;end;
run;

  Note: fixed 

ihlayyel
Fluorite | Level 6
i have implemented that: I got this error
The time format is same as the sample I provided. Do i need to change it to other format?
ERROR: No DATALINES or INFILE statement.
Ksharp
Super User

That is weird. Your table name is PEOPLE ? and it is in WORK libarary ?

 

data have;
set people  ;

retain _time ;
format time _time time8.;
if _n_=1 then _time=time;
if time > _time+20*60 then do;group+1;_time=_time+20*60;end;
run;

proc sql;
.............
ihlayyel
Fluorite | Level 6
Thank you so much. the code works perfectly. 
when i run the code  
 
this what I get. 
 
78 where a.floor=b.floor and a.uid ne b.uid ;
ERROR: Insufficient space in file WORK.WANT.DATA.
ERROR: File WORK.WANT.DATA is damaged. I/O processing did not complete.
 
NOTE: PROC SQL set option NOEXEC and will continue to check the syntax of statements.
79 quit;
NOTE: The SAS System stopped processing this step because of errors.
NOTE: PROCEDURE SQL used (Total process time):
 
the total number of my data is 1048576 .  Do you have any suggestion for that? 
I really appreciate your help . 
Thank you 

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 27 replies
  • 2224 views
  • 2 likes
  • 4 in conversation