Hi guys!
Help me please to solve the problem. I try to find score. ID=A gives 1 point, ID=B minus 1 point, but it should be in chronological order, like in attached file. How is it possible to solve it? The result for code 111 should be 1 point.
Thanks in advance.
OK. You want this one ?
data have;
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35
B 22222222 30Mar2020 10:49:35
A 33333333 30Apr2020 16:09:14
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
B 44444444 01Oct2020 8:56:40
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
B 33333333 17Nov2020 9:31:58
B 66666666 20Nov2020 14:26:29
A 66666666 26Feb2021 11:44:59
;run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))',multidata:'y');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
max=999999999;
rc=h.find();
do while(rc=0);
if date<n_dtstamp and (n_dtstamp-date)<max then do; want=n_dtstamp;max=(n_dtstamp-date);end;
rc=h.find_next();
end;
if want then do;
rc=h.find();
do while(rc=0);
if want=n_dtstamp then h.removedup();
rc=h.find_next();
end;
end;
drop n_dtstamp rc max;
format want datetime.;
run;
Please post the data you have in usable form (a datastep using datalines, nothing else) and show the expected result.
Data:
data want;1
input id evnt_d;2
datalines; 3
A 05Mar2020 13:02:00
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
A 16Mar2020 14:26:17
A 19Mar2020 14:01:21
A 20Mar2020 18:01:21
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
A 28Mar2020 10:42:16
B 29Mar2020 10:42:16
; 3
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
Please fix your data step, so that it runs without ERROR messages and creates the dataset you have.
Log from your code:
74 data want; 75 input id evnt_d; 76 datalines; NOTE: Invalid data for id in line 77 1-1. NOTE: Invalid data for evnt_d in line 77 6-14. REGEL: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8----+----9----+----0 77 A 05Mar2020 13:02:00 id=. evnt_d=. _ERROR_=1 _N_=1 NOTE: Invalid data for id in line 78 1-1. NOTE: Invalid data for evnt_d in line 78 6-14. 78 B 06Mar2020 13:45:11 id=. evnt_d=. _ERROR_=1 _N_=2 NOTE: Invalid data for id in line 79 1-1. NOTE: Invalid data for evnt_d in line 79 6-14. 79 B 08Mar2020 13:15:42 id=. evnt_d=. _ERROR_=1 _N_=3 NOTE: Invalid data for id in line 80 1-1. NOTE: Invalid data for evnt_d in line 80 6-14. 80 B 14Mar2020 12:36:22 id=. evnt_d=. _ERROR_=1 _N_=4 NOTE: Invalid data for id in line 81 1-1. NOTE: Invalid data for evnt_d in line 81 6-14. 81 A 16Mar2020 14:26:17 id=. evnt_d=. _ERROR_=1 _N_=5 NOTE: Invalid data for id in line 82 1-1. NOTE: Invalid data for evnt_d in line 82 6-14. 82 A 19Mar2020 14:01:21 id=. evnt_d=. _ERROR_=1 _N_=6 NOTE: Invalid data for id in line 83 1-1. NOTE: Invalid data for evnt_d in line 83 6-14. 83 A 20Mar2020 18:01:21 id=. evnt_d=. _ERROR_=1 _N_=7 NOTE: Invalid data for id in line 84 1-1. NOTE: Invalid data for evnt_d in line 84 6-14. 84 B 23Mar2020 8:56:01 id=. evnt_d=. _ERROR_=1 _N_=8 NOTE: Invalid data for id in line 85 1-1. NOTE: Invalid data for evnt_d in line 85 6-14. 85 B 27Mar2020 8:56:40 id=. evnt_d=. _ERROR_=1 _N_=9 NOTE: Invalid data for id in line 86 1-1. NOTE: Invalid data for evnt_d in line 86 6-14. 86 A 28Mar2020 10:42:16 id=. evnt_d=. _ERROR_=1 _N_=10 NOTE: Invalid data for id in line 87 1-1. NOTE: Invalid data for evnt_d in line 87 6-14. 87 B 29Mar2020 10:42:16 id=. evnt_d=. _ERROR_=1 _N_=11
data want;
input id$ evnt_d$21.;
cards;
A 05Mar2020 13:02:00
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
A 16Mar2020 14:26:17
A 19Mar2020 14:01:21
A 20Mar2020 18:01:21
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
A 28Mar2020 10:42:16
B 29Mar2020 10:42:16
;
run;
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
@J_J_J wrote:
data want;
input id$ evnt_d$21.;
cards;
A 05Mar2020 13:02:00
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
A 16Mar2020 14:26:17
A 19Mar2020 14:01:21
A 20Mar2020 18:01:21
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
A 28Mar2020 10:42:16
B 29Mar2020 10:42:16
;
run;
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
Sorry, but i can't see how the logic you have described in the starting message could create the dataset you want.
I'll try to explain other way. How to select next latest than group_id date from table B for each group_id from table A, provided that this date wasn't used for the previous group? Group_id and evnt_d are in chronological order.
data A;
input group_id$21.;
cards;
05Mar2020 13:02:00
16Mar2020 14:26:17
19Mar2020 14:01:21
20Mar2020 18:01:21
28Mar2020 10:42:16
;
run;
data B;
input evnt_d$21.;
cards;
B 06Mar2020 13:45:11
B 08Mar2020 13:15:42
B 14Mar2020 12:36:22
B 23Mar2020 8:56:01
B 27Mar2020 8:56:40
B 29Mar2020 10:42:16
;
run;
Expected result:
A 05Mar2020 13:02:00 06Mar2020 13:45:11
A 16Mar2020 14:26:17 23Mar2020 8:56:01
A 19Mar2020 14:01:21 27Mar2020 8:56:40
A 20Mar2020 18:01:21 29Mar2020 10:42:16
A 28Mar2020 10:42:16
For each A record, you want the closest following B record that has not already been paired with an A record:
You can do this be using two SET statements:
data have;
input id $ dtstamp datetime18.;
format dtstamp datetime18. ;
cards;
A 05Mar2020:13:02:00
B 06Mar2020:13:45:11
B 08Mar2020:13:15:42
B 14Mar2020:12:36:22
A 16Mar2020:14:26:17
A 19Mar2020:14:01:21
A 20Mar2020:18:01:21
B 23Mar2020:8:56:01
B 27Mar2020:8:56:40
A 28Mar2020:10:42:16
B 29Mar2020:10:42:16
run;
data want (drop=_:);
set have (where=(id='A') rename=(dtstamp=dtstampa));
do until (dtstampb>dtstampa or end_of_b);
if end_of_b=0 then set have (where=(_id='B') rename=(id=_id dtstamp=dtstampb)) end=end_of_b;
else dtstampb=.;
end;
run;
Note the SET statement for _ID='B' uses an "if end_of_b=0" test to make sure that the data step does not prematurely stop because all the B records have been exhausted.
As to the "rename" parameters.
Edited note: This program assumes the data are sorted by DTSTAMP.
mkeintz, thank you very much!
I like this question.
data have;
input id $ dtstamp datetime18.;
format dtstamp datetime18. ;
cards;
A 05Mar2020:13:02:00
B 06Mar2020:13:45:11
B 08Mar2020:13:15:42
B 14Mar2020:12:36:22
A 16Mar2020:14:26:17
A 19Mar2020:14:01:21
A 20Mar2020:18:01:21
B 23Mar2020:8:56:01
B 27Mar2020:8:56:40
A 28Mar2020:10:42:16
B 29Mar2020:10:42:16
;
run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(dtstamp=n_dtstamp));
declare hash h(dataset:'b(rename=(dtstamp=n_dtstamp))',ordered:'a');
declare hiter hi('h');
h.definekey('n_dtstamp');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
rc=hi.first();
do while(rc=0);
if dtstamp<n_dtstamp then do;want=n_dtstamp; leave;end;
rc=hi.next();
end;
if want then do;
rc=hi.next();
h.remove(key:want);
end;
drop n_dtstamp rc ;
format want datetime.;
run;
Could you help me to do it with a key please?
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
ID key date
A 4129 21Jan2021 15:31:53
A 5951 21Jan2021 15:33:13
A 1093 27Jan2021 13:04:08
A 5794 04Feb2021 16:25:07
A 6095 04Feb2021 16:26:41
B 5794 12Feb2021 14:55:57
B 6095 15Feb2021 8:19:39
A 1031 15Feb2021 14:08:23
A 2166 26Feb2021 11:44:59
;
run;
Expected result:
key datea dateb
4129 21Jan2021 15:31:53
5951 21Jan2021 15:33:13
1093 27Jan2021 13:04:08
5794 04Feb2021 16:25:07 12Feb2021 14:55:57
6095 04Feb2021 16:26:41 15Feb2021 8:19:39
1031 15Feb2021 14:08:23
2166 26Feb2021 11:44:59
So you just simply merge it by KEY ?
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 4129 21Jan2021 15:31:53
A 5951 21Jan2021 15:33:13
A 1093 27Jan2021 13:04:08
A 5794 04Feb2021 16:25:07
A 6095 04Feb2021 16:26:41
B 5794 12Feb2021 14:55:57
B 6095 15Feb2021 8:19:39
A 1031 15Feb2021 14:08:23
A 2166 26Feb2021 11:44:59
;
run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
call missing(n_dtstamp);
rc=h.find();
drop rc;
run;
Ksharp, something goes wrong with this code.
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35
B 22222222 30Mar2020 10:49:35
A 33333333 30Apr2020 16:09:14
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
B 44444444 01Oct2020 8:56:40
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
B 33333333 17Nov2020 9:31:58
B 66666666 20Nov2020 14:26:29
A 66666666 26Feb2021 11:44:59
;
run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
call missing(n_dtstamp);
rc=h.find();
drop rc;
run;
Expected result is:
id key date n_dtstamp
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17 30Mar2020 10:49:35
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00 01Oct2020 8:56:40
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35 20Nov2020 14:26:29
A 33333333 30Apr2020 16:09:14 17Nov2020 9:31:58
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
A 66666666 26Feb2021 11:44:59
But I get:
id key n_dtstamp date
A 11111111 31Oct2019 10:50:42
A 22222222 30Mar2020 10:49:35 31Oct2019 10:51:17
A 22222222 30Mar2020 10:49:35 02Mar2020 15:12:33
A 44444444 01Oct2020 8:56:40 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 20Nov2020 14:26:29 27Mar2020 13:35:35
A 33333333 16Mar2020 12:23:46 30Apr2020 16:09:14
A 22222222 30Mar2020 10:49:35 12Jun2020 15:32:04
A 66666666 20Nov2020 14:26:29 12Jun2020 16:42:03
A 33333333 16Mar2020 12:23:46 19Aug2020 9:16:55
A 33333333 16Mar2020 12:23:46 18Sep2020 9:40:37
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
A 66666666 20Nov2020 14:26:29 26Feb2021 11:44:59
OK. You want this one ?
data have;
data have;
input id $ key $ date datetime18.;
format date datetime18. ;
cards;
A 11111111 31Oct2019 10:50:42
A 22222222 31Oct2019 10:51:17
A 22222222 02Mar2020 15:12:33
B 33333333 16Mar2020 12:23:46
A 44444444 25Mar2020 13:02:00
A 55555555 27Mar2020 13:20:07
A 66666666 27Mar2020 13:35:35
B 22222222 30Mar2020 10:49:35
A 33333333 30Apr2020 16:09:14
A 22222222 12Jun2020 15:32:04
A 66666666 12Jun2020 16:42:03
A 33333333 19Aug2020 9:16:55
A 33333333 18Sep2020 9:40:37
B 44444444 01Oct2020 8:56:40
A 11111111 12Oct2020 12:32:40
A 11111111 09Nov2020 14:19:16
B 33333333 17Nov2020 9:31:58
B 66666666 20Nov2020 14:26:29
A 66666666 26Feb2021 11:44:59
;run;
data a b;
set have;
if id='A' then output a;
else output b;
run;
data want;
if _n_=1 then do;
if 0 then set b(rename=(date=n_dtstamp));
declare hash h(dataset:'b(rename=(date=n_dtstamp))',multidata:'y');
h.definekey('key');
h.definedata('n_dtstamp');
h.definedone();
end;
set a;
max=999999999;
rc=h.find();
do while(rc=0);
if date<n_dtstamp and (n_dtstamp-date)<max then do; want=n_dtstamp;max=(n_dtstamp-date);end;
rc=h.find_next();
end;
if want then do;
rc=h.find();
do while(rc=0);
if want=n_dtstamp then h.removedup();
rc=h.find_next();
end;
end;
drop n_dtstamp rc max;
format want datetime.;
run;
Ksharp, thank you very much!👍 It works!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.