Hi guys!
Help me please to solve the problem. I try to find score. ID=A gives 1 point, ID=B minus 1 point, but it should be in chronological order, like in attached file. How is it possible to solve it? The result for code 111 should be 1 point.
Thanks in advance.
OK. You want this one ?
data have;
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35
B 22222222 30Mar2020 10:49:35
A 33333333 30Apr2020 16:09:14
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
B 44444444 01Oct2020 8:56:40
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
B 33333333 17Nov2020 9:31:58
B 66666666 20Nov2020 14:26:29
A 66666666 26Feb2021 11:44:59
;run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))',multidata:'y');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
max=999999999;
rc=h.find();
do while(rc=0);
if date<n_dtstamp and (n_dtstamp-date)<max then do; want=n_dtstamp;max=(n_dtstamp-date);end;
rc=h.find_next();
end;
if want then do;
rc=h.find();
do while(rc=0);
if want=n_dtstamp then h.removedup();
rc=h.find_next();
end;
end;
drop n_dtstamp rc max;
format want datetime.;
run;
Please post the data you have in usable form (a datastep using datalines, nothing else) and show the expected result.
Data:
data want;1
input id evnt_d;2
datalines; 3
A 05Mar2020 13:02:00
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
A 16Mar2020 14:26:17
A 19Mar2020 14:01:21
A 20Mar2020 18:01:21
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
A 28Mar2020 10:42:16
B 29Mar2020 10:42:16
; 3
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
Please fix your data step, so that it runs without ERROR messages and creates the dataset you have.
Log from your code:
74 data want; 75 input id evnt_d; 76 datalines; NOTE: Invalid data for id in line 77 1-1. NOTE: Invalid data for evnt_d in line 77 6-14. REGEL: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 77 A 05Mar2020 13:02:00 id=. evnt_d=. _ERROR_=1 _N_=1 NOTE: Invalid data for id in line 78 1-1. NOTE: Invalid data for evnt_d in line 78 6-14. 78 B 06Mar2020 13:45:11 id=. evnt_d=. _ERROR_=1 _N_=2 NOTE: Invalid data for id in line 79 1-1. NOTE: Invalid data for evnt_d in line 79 6-14. 79 B 08Mar2020 13:15:42 id=. evnt_d=. _ERROR_=1 _N_=3 NOTE: Invalid data for id in line 80 1-1. NOTE: Invalid data for evnt_d in line 80 6-14. 80 B 14Mar2020 12:36:22 id=. evnt_d=. _ERROR_=1 _N_=4 NOTE: Invalid data for id in line 81 1-1. NOTE: Invalid data for evnt_d in line 81 6-14. 81 A 16Mar2020 14:26:17 id=. evnt_d=. _ERROR_=1 _N_=5 NOTE: Invalid data for id in line 82 1-1. NOTE: Invalid data for evnt_d in line 82 6-14. 82 A 19Mar2020 14:01:21 id=. evnt_d=. _ERROR_=1 _N_=6 NOTE: Invalid data for id in line 83 1-1. NOTE: Invalid data for evnt_d in line 83 6-14. 83 A 20Mar2020 18:01:21 id=. evnt_d=. _ERROR_=1 _N_=7 NOTE: Invalid data for id in line 84 1-1. NOTE: Invalid data for evnt_d in line 84 6-14. 84 B 23Mar2020 8:56:01 id=. evnt_d=. _ERROR_=1 _N_=8 NOTE: Invalid data for id in line 85 1-1. NOTE: Invalid data for evnt_d in line 85 6-14. 85 B 27Mar2020 8:56:40 id=. evnt_d=. _ERROR_=1 _N_=9 NOTE: Invalid data for id in line 86 1-1. NOTE: Invalid data for evnt_d in line 86 6-14. 86 A 28Mar2020 10:42:16 id=. evnt_d=. _ERROR_=1 _N_=10 NOTE: Invalid data for id in line 87 1-1. NOTE: Invalid data for evnt_d in line 87 6-14. 87 B 29Mar2020 10:42:16 id=. evnt_d=. _ERROR_=1 _N_=11
data want;
input id$ evnt_d$21.;
cards;
A 05Mar2020 13:02:00
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
A 16Mar2020 14:26:17
A 19Mar2020 14:01:21
A 20Mar2020 18:01:21
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
A 28Mar2020 10:42:16
B 29Mar2020 10:42:16
;
run;
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
@J_J_J wrote:
data want;
input id$ evnt_d$21.;
cards;
A 05Mar2020 13:02:00
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
A 16Mar2020 14:26:17
A 19Mar2020 14:01:21
A 20Mar2020 18:01:21
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
A 28Mar2020 10:42:16
B 29Mar2020 10:42:16
;
run;
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
Sorry, but i can't see how the logic you have described in the starting message could create the dataset you want.
I'll try to explain other way. How to select next latest than group_id date from table B for each group_id from table A, provided that this date wasn't used for the previous group? Group_id and evnt_d are in chronological order.
data A;
input group_id$21.;
cards;
05Mar2020 13:02:00
16Mar2020 14:26:17
19Mar2020 14:01:21
20Mar2020 18:01:21
28Mar2020 10:42:16
;
run;
data B;
input evnt_d$21.;
cards;
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
B 29Mar2020 10:42:16
;
run;
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
For each A record, you want the closest following B record that has not already been paired with an A record:
You can do this be using two SET statements:
data have;
input id $ dtstamp datetime18.;
format dtstamp datetime18. ;
cards;
A 05Mar2020:13:02:00
B 06Mar2020:13:45:11
B 08Mar2020:13:15:42
B 14Mar2020:12:36:22
A 16Mar2020:14:26:17
A 19Mar2020:14:01:21
A 20Mar2020:18:01:21
B 23Mar2020:8:56:01
B 27Mar2020:8:56:40
A 28Mar2020:10:42:16
B 29Mar2020:10:42:16
run;
data want (drop=_:);
set have (where=(id='A') rename=(dtstamp=dtstampa));
do until (dtstampb>dtstampa or end_of_b);
if end_of_b=0 then set have (where=(_id='B') rename=(id=_id dtstamp=dtstampb)) end=end_of_b;
else dtstampb=.;
end;
run;
Note the SET statement for _ID='B' uses an "if end_of_b=0" test to make sure that the data step does not prematurely stop because all the B records have been exhausted.
As to the "rename" parameters.
Edited note: This program assumes the data are sorted by DTSTAMP.
mkeintz, thank you very much!
I like this question.
data have;
input id $ dtstamp datetime18.;
format dtstamp datetime18. ;
cards;
A 05Mar2020:13:02:00
B 06Mar2020:13:45:11
B 08Mar2020:13:15:42
B 14Mar2020:12:36:22
A 16Mar2020:14:26:17
A 19Mar2020:14:01:21
A 20Mar2020:18:01:21
B 23Mar2020:8:56:01
B 27Mar2020:8:56:40
A 28Mar2020:10:42:16
B 29Mar2020:10:42:16
;
run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(dtstamp=n_dtstamp));
declare hash h(dataset:'b(rename=(dtstamp=n_dtstamp))',ordered:'a');
declare hiter hi('h');
h.definekey('n_dtstamp');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
rc=hi.first();
do while(rc=0);
if dtstamp<n_dtstamp then do;want=n_dtstamp; leave;end;
rc=hi.next();
end;
if want then do;
rc=hi.next();
h.remove(key:want);
end;
drop n_dtstamp rc ;
format want datetime.;
run;
Could you help me to do it with a key please?
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
ID key date
A 4129 21Jan2021 15:31:53
A 5951 21Jan2021 15:33:13
A 1093 27Jan2021 13:04:08
A 5794 04Feb2021 16:25:07
A 6095 04Feb2021 16:26:41
B 5794 12Feb2021 14:55:57
B 6095 15Feb2021 8:19:39
A 1031 15Feb2021 14:08:23
A 2166 26Feb2021 11:44:59
;
run;
Expected result:
key datea dateb
4129 21Jan2021 15:31:53
5951 21Jan2021 15:33:13
1093 27Jan2021 13:04:08
5794 04Feb2021 16:25:07 12Feb2021 14:55:57
6095 04Feb2021 16:26:41 15Feb2021 8:19:39
1031 15Feb2021 14:08:23
2166 26Feb2021 11:44:59
So you just simply merge it by KEY ?
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 4129 21Jan2021 15:31:53
A 5951 21Jan2021 15:33:13
A 1093 27Jan2021 13:04:08
A 5794 04Feb2021 16:25:07
A 6095 04Feb2021 16:26:41
B 5794 12Feb2021 14:55:57
B 6095 15Feb2021 8:19:39
A 1031 15Feb2021 14:08:23
A 2166 26Feb2021 11:44:59
;
run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
call missing(n_dtstamp);
rc=h.find();
drop rc;
run;
Ksharp, something goes wrong with this code.
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35
B 22222222 30Mar2020 10:49:35
A 33333333 30Apr2020 16:09:14
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
B 44444444 01Oct2020 8:56:40
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
B 33333333 17Nov2020 9:31:58
B 66666666 20Nov2020 14:26:29
A 66666666 26Feb2021 11:44:59
;
run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
call missing(n_dtstamp);
rc=h.find();
drop rc;
run;
Expected result is:
id key date n_dtstamp
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17 30Mar2020 10:49:35
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00 01Oct2020 8:56:40
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35 20Nov2020 14:26:29
A 33333333 30Apr2020 16:09:14 17Nov2020 9:31:58
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
A 66666666 26Feb2021 11:44:59
But I get:
id key n_dtstamp date
A 11111111 31Oct2019 10:50:42
A 22222222 30Mar2020 10:49:35 31Oct2019 10:51:17
A 22222222 30Mar2020 10:49:35 02Mar2020 15:12:33
A 44444444 01Oct2020 8:56:40 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 20Nov2020 14:26:29 27Mar2020 13:35:35
A 33333333 16Mar2020 12:23:46 30Apr2020 16:09:14
A 22222222 30Mar2020 10:49:35 12Jun2020 15:32:04
A 66666666 20Nov2020 14:26:29 12Jun2020 16:42:03
A 33333333 16Mar2020 12:23:46 19Aug2020 9:16:55
A 33333333 16Mar2020 12:23:46 18Sep2020 9:40:37
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
A 66666666 20Nov2020 14:26:29 26Feb2021 11:44:59
OK. You want this one ?
data have;
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35
B 22222222 30Mar2020 10:49:35
A 33333333 30Apr2020 16:09:14
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
B 44444444 01Oct2020 8:56:40
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
B 33333333 17Nov2020 9:31:58
B 66666666 20Nov2020 14:26:29
A 66666666 26Feb2021 11:44:59
;run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))',multidata:'y');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
max=999999999;
rc=h.find();
do while(rc=0);
if date<n_dtstamp and (n_dtstamp-date)<max then do; want=n_dtstamp;max=(n_dtstamp-date);end;
rc=h.find_next();
end;
if want then do;
rc=h.find();
do while(rc=0);
if want=n_dtstamp then h.removedup();
rc=h.find_next();
end;
end;
drop n_dtstamp rc max;
format want datetime.;
run;
Ksharp, thank you very much!👍 It works!
Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.
If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website.
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.