Hi, Please help me for the following query. I have two datasets and want to update “Plan” in Dataset B from Dataset A based on employee id and attendance/ admission date. If admission date is not available in dataset A, then I want to go back by 7 days from admission date and want to update latest/ previous plan from dataset A to dataset B.
Dataset A | xxx | Dataset B | xxx | Dataset Want | ||||||
Emp_Id | ATTENDANCE_DATE | Plan | Employee_Id | Date_Of_Admission | Plan | Employee_Id | Date_Of_Admission | Plan | ||
1100589 | 01-02-2023 | Silver | 1100589 | 01-02-2023 | Silver | 1100589 | 01-02-2023 | Silver | ||
1100589 | 02-02-2023 | Silver | 1100589 | 07-02-2023 | 1100589 | 07-02-2023 | Silver | |||
1100589 | 03-02-2023 | Silver | 1100589 | 21-04-2023 | 1100589 | 21-04-2023 | . | |||
1100589 | 04-02-2023 | Silver | 1100589 | 11-07-2023 | 1100589 | 11-07-2023 | Bronze | |||
1100589 | 05-02-2023 | Silver | 1100589 | 30-10-2023 | Bronze | 1100589 | 30-10-2023 | Bronze | ||
1100589 | 06-02-2023 | Silver | ||||||||
1100589 | 19-05-2023 | Platinum | ||||||||
1100589 | 19-05-2023 | Gold | ||||||||
1100589 | 23-05-2023 | Bronze | ||||||||
1100589 | 25-05-2023 | Bronze | ||||||||
1100589 | 27-05-2023 | Bronze | ||||||||
1100589 | 06-07-2023 | Gold | ||||||||
1100589 | 07-07-2023 | Bronze | ||||||||
1100589 | 12-07-2023 | Bronze | ||||||||
1100589 | 30-10-2023 | Bronze |
Should be simple if you interleave the dataset by ID and DATE and remember the date and the plan from the "A" dataset.
You will want the variable names to match so you can combine the two datasets. Let's convert your listings into actual datasets we can program with. (PS Displaying dates in either MDY or DMY order will confuse 50% of your audience.)
data A;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 02-02-2023 Silver
1100589 03-02-2023 Silver
1100589 04-02-2023 Silver
1100589 05-02-2023 Silver
1100589 06-02-2023 Silver
1100589 19-05-2023 Platinum
1100589 19-05-2023 Gold
1100589 23-05-2023 Bronze
1100589 25-05-2023 Bronze
1100589 27-05-2023 Bronze
1100589 06-07-2023 Gold
1100589 07-07-2023 Bronze
1100589 12-07-2023 Bronze
1100589 30-10-2023 Bronze
;
data B;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 07-02-2023 .
1100589 21-04-2023 .
1100589 11-07-2023 .
1100589 30-10-2023 Bronze
;
data expect;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 07-02-2023 Silver
1100589 21-04-2023 .
1100589 11-07-2023 Bronze
1100589 30-10-2023 Bronze
;
Now interleave A and B. When you see a record from A then remember the plan and date. When you see a record from B then test the plan and dates and if available update the missing plan with the old plan.
data want;
set a(in=in1) b(in=in2);
by Employee_Id Date ;
if in1 then do;
old_plan = plan ;
old_date = date ;
end;
format old_date date9.;
retain old_plan old_date;
if first.Employee_id and in2 then call missing(old_plan,old_date);
if in2 then do;
if missing(plan) and sum(old_date,7) >= date then plan=old_plan;
output;
end;
run;
Let's check if the result matches what you expected.
proc compare data=want compare=expect;
id employee_id date;
var plan;
run;
Result:
Dataset Created Modified NVar NObs WORK.WANT . . 5 5 WORK.EXPECT . . 3 5 Variables Summary Number of Variables in Common: 3. Number of Variables in WORK.WANT but not in WORK.EXPECT: 2. Number of ID Variables: 2. Number of VAR Statement Variables: 1. Observation Summary Observation Base Compare ID First Obs 1 1 Employee_Id=1100589 Date=01FEB2023 Last Obs 5 5 Employee_Id=1100589 Date=30OCT2023 Number of Observations in Common: 5. Total Number of Observations Read from WORK.WANT: 5. Total Number of Observations Read from WORK.EXPECT: 5. Number of Observations with Some Compared Variables Unequal: 0. Number of Observations with All Compared Variables Equal: 5. NOTE: No unequal values were found.
Should be simple if you interleave the dataset by ID and DATE and remember the date and the plan from the "A" dataset.
You will want the variable names to match so you can combine the two datasets. Let's convert your listings into actual datasets we can program with. (PS Displaying dates in either MDY or DMY order will confuse 50% of your audience.)
data A;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 02-02-2023 Silver
1100589 03-02-2023 Silver
1100589 04-02-2023 Silver
1100589 05-02-2023 Silver
1100589 06-02-2023 Silver
1100589 19-05-2023 Platinum
1100589 19-05-2023 Gold
1100589 23-05-2023 Bronze
1100589 25-05-2023 Bronze
1100589 27-05-2023 Bronze
1100589 06-07-2023 Gold
1100589 07-07-2023 Bronze
1100589 12-07-2023 Bronze
1100589 30-10-2023 Bronze
;
data B;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 07-02-2023 .
1100589 21-04-2023 .
1100589 11-07-2023 .
1100589 30-10-2023 Bronze
;
data expect;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 07-02-2023 Silver
1100589 21-04-2023 .
1100589 11-07-2023 Bronze
1100589 30-10-2023 Bronze
;
Now interleave A and B. When you see a record from A then remember the plan and date. When you see a record from B then test the plan and dates and if available update the missing plan with the old plan.
data want;
set a(in=in1) b(in=in2);
by Employee_Id Date ;
if in1 then do;
old_plan = plan ;
old_date = date ;
end;
format old_date date9.;
retain old_plan old_date;
if first.Employee_id and in2 then call missing(old_plan,old_date);
if in2 then do;
if missing(plan) and sum(old_date,7) >= date then plan=old_plan;
output;
end;
run;
Let's check if the result matches what you expected.
proc compare data=want compare=expect;
id employee_id date;
var plan;
run;
Result:
Dataset Created Modified NVar NObs WORK.WANT . . 5 5 WORK.EXPECT . . 3 5 Variables Summary Number of Variables in Common: 3. Number of Variables in WORK.WANT but not in WORK.EXPECT: 2. Number of ID Variables: 2. Number of VAR Statement Variables: 1. Observation Summary Observation Base Compare ID First Obs 1 1 Employee_Id=1100589 Date=01FEB2023 Last Obs 5 5 Employee_Id=1100589 Date=30OCT2023 Number of Observations in Common: 5. Total Number of Observations Read from WORK.WANT: 5. Total Number of Observations Read from WORK.EXPECT: 5. Number of Observations with Some Compared Variables Unequal: 0. Number of Observations with All Compared Variables Equal: 5. NOTE: No unequal values were found.
data A;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 02-02-2023 Silver
1100589 03-02-2023 Silver
1100589 04-02-2023 Silver
1100589 05-02-2023 Silver
1100589 06-02-2023 Silver
1100589 19-05-2023 Platinum
1100589 19-05-2023 Gold
1100589 23-05-2023 Bronze
1100589 25-05-2023 Bronze
1100589 27-05-2023 Bronze
1100589 06-07-2023 Gold
1100589 07-07-2023 Bronze
1100589 12-07-2023 Bronze
1100589 30-10-2023 Bronze
;
data B;
input Employee_Id Date :ddmmyy. Plan $ ;
format Date date9. ;
cards;
1100589 01-02-2023 Silver
1100589 07-02-2023 .
1100589 21-04-2023 .
1100589 11-07-2023 .
1100589 30-10-2023 Bronze
;
data want;
if _n_=1 then do;
if 0 then set A;
declare hash h(dataset:'A');
h.definekey('Employee_Id','Date');
h.definedata('Plan');
h.definedone();
end;
set B;
do k=Date to Date-7 by -1;
if h.find(key:Employee_Id,key:k)=0 then leave;
end;
drop k;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.