I have two tables, users table and leave table which stores all the information for the user who have captured leave.
I need help in adding people who did not apply leave in table Leave_new with 0 value under leave_days column
data users; input users $50.; datalines; Andrew Lup Dan Louis Mike Mora Farren Farrao Joseph Mia ; run; data leave; input calender_month leave_days leave_type_code username $50.; datalines; 202101 1 2 Andrew Lup 202101 1 2 Andrew Lup 202101 1 2 Sol Sebe 202001 1 2 Ant Joe 202101 1 2 Dan Louis 202101 1 2 Mike Mora 202101 1 2 Mike Mora 202101 1 2 Mike Mora 202101 1 2 Mike Mora ; proc sql; create table Leave_new as select distinct a.username, a.calender_month, sum(a.leave_days) as leave_days from have as a where a.username in (select users from users) group by 1,2 ;quit;
Data Want
username | calender_month | leave_days |
Andrew Lup | 202101 | 2 |
Dan Louis | 202101 | 1 |
Farren Farrao | 202101 | 0 |
Joseph Mia | 202101 | 0 |
Mike Mora | 202101 | 4 |
Not sure why in your have data there is also calendar_month 202001 but in your desired data there is only 202101.
Check below code. It shouldn't be hard to change in case you only want a table for the most recent calendar month in your data or you need to exclude users from your table that don't exist yet/anymore for an specific calendar month (would require from/to columns in your users table).
data users;
input users $50.;
datalines;
Andrew Lup
Dan Louis
Mike Mora
Farren Farrao
Joseph Mia
;
data leave;
input calender_month leave_days leave_type_code username $50.;
datalines;
202101 1 2 Andrew Lup
202101 1 2 Andrew Lup
202101 1 2 Sol Sebe
202001 1 2 Ant Joe
202101 1 2 Dan Louis
202001 1 2 Mike Mora
202001 1 2 Mike Mora
202001 1 2 Mike Mora
202001 1 2 Mike Mora
;
proc sql;
select
l2.username,
l2.calender_month,
coalesce(r2.leave_days,0) as leave_days
from
(
select
l1.users as username,
r1.calender_month
from
users l1,
(select distinct calender_month from leave) r1
) l2
left join
(
select
username,
calender_month,
sum(leave_days) as leave_days
from leave
group by username, calender_month
) r2
on l2.username=r2.username and l2.calender_month=r2.calender_month
;
quit;
Can a user have different calender_month in the leave table?
Not sure why in your have data there is also calendar_month 202001 but in your desired data there is only 202101.
Check below code. It shouldn't be hard to change in case you only want a table for the most recent calendar month in your data or you need to exclude users from your table that don't exist yet/anymore for an specific calendar month (would require from/to columns in your users table).
data users;
input users $50.;
datalines;
Andrew Lup
Dan Louis
Mike Mora
Farren Farrao
Joseph Mia
;
data leave;
input calender_month leave_days leave_type_code username $50.;
datalines;
202101 1 2 Andrew Lup
202101 1 2 Andrew Lup
202101 1 2 Sol Sebe
202001 1 2 Ant Joe
202101 1 2 Dan Louis
202001 1 2 Mike Mora
202001 1 2 Mike Mora
202001 1 2 Mike Mora
202001 1 2 Mike Mora
;
proc sql;
select
l2.username,
l2.calender_month,
coalesce(r2.leave_days,0) as leave_days
from
(
select
l1.users as username,
r1.calender_month
from
users l1,
(select distinct calender_month from leave) r1
) l2
left join
(
select
username,
calender_month,
sum(leave_days) as leave_days
from leave
group by username, calender_month
) r2
on l2.username=r2.username and l2.calender_month=r2.calender_month
;
quit;
@Solly7 wrote:
Hi, code is not working..sorry im still a learner..im just confused by alias names 'l2,l1,r1,r2' from the solution you have provided..and one more thing the calender month values is only 202101, i just made typing error..i only provided 202101 for example purposes
You've marked my previous post as solution so I assume you made it work now.
"code is not working" doesn't tell me much. What is not working? Does it throw an error or just not return the desired result.
To get a more "step by step" view what the SQL does you could also formulate it as below.
data users;
input users $50.;
datalines;
Andrew Lup
Dan Louis
Mike Mora
Farren Farrao
Joseph Mia
;
data leave;
input calender_month leave_days leave_type_code username $50.;
datalines;
202101 1 2 Andrew Lup
202101 1 2 Andrew Lup
202101 1 2 Sol Sebe
202101 1 2 Ant Joe
202101 1 2 Dan Louis
202101 1 2 Mike Mora
202101 1 2 Mike Mora
202101 1 2 Mike Mora
202101 1 2 Mike Mora
;
proc sql;
create view distinct_calendar_mt as
select distinct
calender_month
from leave
;
quit;
proc sql;
create view user_per_calender_mth as
select
l.users as username,
r.calender_month
from
users l,
distinct_calendar_mt r
;
quit;
proc sql;
create view sum_leave_days_per_user as
select
username,
calender_month,
sum(leave_days) as leave_days
from leave
group by username, calender_month
;
quit;
proc sql;
create table want as
select
l.username,
l.calender_month,
coalesce(r.leave_days,0) as leave_days
from
user_per_calender_mth l
left join
sum_leave_days_per_user r
on
l.username=r.username
and l.calender_month=r.calender_month
;
quit;
proc print data=want noobs;
run;
Think of a SQL view as encapsulated SQL code that only gets executed when the view gets used. In looking at it this way it's only the very last SQL (creating table WANT) that actually executes all the SQL code - and then if you would in this last SQL replace the view names with the SQL creating the views you'd end up pretty much with the SQL I've posted initially.
Splitting things up gives you the opportunity to look at the intermediary results (the views) and though makes it eventually easier for you to understand what's happening/the logic used.
And just as a side note: You could have above SQL also wrapped into a single Proc SQL; Quit; block (but that makes debugging harder).
proc sql;
create view distinct_calendar_mt as
select distinct
calender_month
from leave
;
create view user_per_calender_mth as
select
l.users as username,
r.calender_month
from
users l,
distinct_calendar_mt r
;
create view sum_leave_days_per_user as
select
username,
calender_month,
sum(leave_days) as leave_days
from leave
group by username, calender_month
;
create table want as
select
l.username,
l.calender_month,
coalesce(r.leave_days,0) as leave_days
from
user_per_calender_mth l
left join
sum_leave_days_per_user r
on
l.username=r.username
and l.calender_month=r.calender_month
;
quit;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.