Hi all,
I have a data file with each Person_ID having multiple records associated with Agent_IDs. The date information (Begin_Date and End_Date) is also available. I wanted to have the data sorted by Person_ID, dates and then Agent_ID and assign new ID (Agent_ID_Assigned in data file "want") for each Agent_ID by sorting order. Essentially Agent_ID_Assigned takes squence id ordered by date per Person_ID. The variable Agent_ID_Assigned is the target field to generate as shown in "want" data file below.
Any suggestion is greatly appreciated.
data have;
infile datalines dlm="09"x dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111
22222 1/11/2018 1/15/2018 A1234
22222 1/28/2018 2/2/2018 A1234
22222 2/8/2018 2/10/2018 A1200
22222 2/24/2018 2/26/2018 A1211
22222 8/17/2018 8/22/2018 A4111
33333 3/17/2018 3/23/2018 A2111
44444 3/13/2018 3/15/2018 A3111
44444 4/24/2018 4/26/2018 A3211
44444 7/13/2018 7/14/2018 A3211
44444 7/14/2018 7/19/2018 A1212
44444 7/26/2018 7/30/2018 A1212
44444 11/17/2018 11/19/2018 A1212
44444 11/19/2018 11/26/2018 A1212
;
data want;
infile datalines dlm="09"x dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
;
Using your sample data below should work.
data want;
infile datalines dlm=' ' dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
;
data derived(drop=_:);
set want;
by person_id Begin_Date;
_lag_Agent_ID=lag(Agent_ID);
if first.person_id then agent_id_derived=1;
else if _lag_Agent_ID ne Agent_ID then agent_id_derived+1;
run;
Question:
It's not a case in your sample data but in case it's possible: What should happen if for the same person a previous agent_id shows up again? like: agent A, agent B, agent A.
Using your sample data below should work.
data want;
infile datalines dlm=' ' dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
;
data derived(drop=_:);
set want;
by person_id Begin_Date;
_lag_Agent_ID=lag(Agent_ID);
if first.person_id then agent_id_derived=1;
else if _lag_Agent_ID ne Agent_ID then agent_id_derived+1;
run;
Question:
It's not a case in your sample data but in case it's possible: What should happen if for the same person a previous agent_id shows up again? like: agent A, agent B, agent A.
Just in case you need it later below a code option that assigns per person_id only one ID per agent.
data want;
infile datalines dlm=' ' dsd truncover;
input
person_id $
Begin_Date :mmddyy10.
End_Date :mmddyy10.
Agent_ID $
Agent_ID_Assigned
;
format
Begin_Date
End_Date yymmdd10.
;
datalines;
11111 3/2/2018 3/13/2018 A1111 1
22222 1/11/2018 1/15/2018 A1234 1
22222 1/28/2018 2/2/2018 A1234 1
22222 2/8/2018 2/10/2018 A1200 2
22222 2/24/2018 2/26/2018 A1211 3
22222 8/17/2018 8/22/2018 A4111 4
33333 3/17/2018 3/23/2018 A2111 1
44444 3/13/2018 3/15/2018 A3111 1
44444 4/24/2018 4/26/2018 A3211 2
44444 7/13/2018 7/14/2018 A3211 2
44444 7/14/2018 7/19/2018 A1212 3
44444 7/26/2018 7/30/2018 A1212 3
44444 11/17/2018 11/19/2018 A1212 3
44444 11/19/2018 11/26/2018 A1212 3
44444 3/13/2019 3/15/2018 A3111 1
;
data derived;
if _n_=1 then
do;
dcl hash h1();
h1.defineKey('agent_id');
h1.defineData('agent_id_derived');
h1.defineDone();
end;
set want;
by person_id Begin_Date;
length agent_id_derived 8;
if first.person_id then h1.clear();
if h1.find() ne 0 then
do;
agent_id_derived=h1.num_items+1;
h1.add();
end;
run;
proc print data=derived;
run;
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.