Hi all,
I have a data file with each Person_ID having multiple records associated with Agent_IDs. The date information (Begin_Date and End_Date) is also available. I wanted to have the data sorted by Person_ID, dates and then Agent_ID and assign new ID (Agent_ID_Assigned in data file "want") for each Agent_ID by sorting order. Essentially Agent_ID_Assigned takes squence id ordered by date per Person_ID. The variable Agent_ID_Assigned is the target field to generate as shown in "want" data file below.
Any suggestion is greatly appreciated.
data have;
infile datalines dlm="09"x dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111
22222 1/11/2018 1/15/2018 A1234
22222 1/28/2018 2/2/2018 A1234
22222 2/8/2018 2/10/2018 A1200
22222 2/24/2018 2/26/2018 A1211
22222 8/17/2018 8/22/2018 A4111
33333 3/17/2018 3/23/2018 A2111
44444 3/13/2018 3/15/2018 A3111
44444 4/24/2018 4/26/2018 A3211
44444 7/13/2018 7/14/2018 A3211
44444 7/14/2018 7/19/2018 A1212
44444 7/26/2018 7/30/2018 A1212
44444 11/17/2018 11/19/2018 A1212
44444 11/19/2018 11/26/2018 A1212
;
data want;
infile datalines dlm="09"x dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
;
Using your sample data below should work.
data want;
infile datalines dlm=' ' dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
;
data derived(drop=_:);
set want;
by person_id Begin_Date;
_lag_Agent_ID=lag(Agent_ID);
if first.person_id then agent_id_derived=1;
else if _lag_Agent_ID ne Agent_ID then agent_id_derived+1;
run;
Question:
It's not a case in your sample data but in case it's possible: What should happen if for the same person a previous agent_id shows up again? like: agent A, agent B, agent A.
Using your sample data below should work.
data want;
infile datalines dlm=' ' dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
;
data derived(drop=_:);
set want;
by person_id Begin_Date;
_lag_Agent_ID=lag(Agent_ID);
if first.person_id then agent_id_derived=1;
else if _lag_Agent_ID ne Agent_ID then agent_id_derived+1;
run;
Question:
It's not a case in your sample data but in case it's possible: What should happen if for the same person a previous agent_id shows up again? like: agent A, agent B, agent A.
Just in case you need it later below a code option that assigns per person_id only one ID per agent.
data want;
infile datalines dlm=' ' dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
44444 3/13/2019 3/15/2018 A3111 1
;
data derived;
if _n_=1 then
do;
dcl hash h1();
h1.defineKey('agent_id');
h1.defineData('agent_id_derived');
h1.defineDone();
end;
set want;
by person_id Begin_Date;
length agent_id_derived 8;
if first.person_id then h1.clear();
if h1.find() ne 0 then
do;
agent_id_derived=h1.num_items+1;
h1.add();
end;
run;
proc print data=derived;
run;
Catch the best of SAS Innovate 2025 — anytime, anywhere. Stream powerful keynotes, real-world demos, and game-changing insights from the world’s leading data and AI minds.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.