Desktop productivity for business analysts and programmers

Adding a column from different dataset

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 89
Accepted Solution

Adding a column from different dataset

I have two slightly different datasets.

One (hourly):

YearMonthDayHour
201821912
201821913
201821914
201821915

Second (daily):

YearMonthDayHoliday
20182150
20182160
20182170
20182180
20182191
20182200

Now I want to add Holiday variable to the first one. The problem is that the first one is hourly so it has over 80000 obs and the second is daily so it has much less observations. How can I combine them so if it is holiday it would be '1' for all hours that day ?


Accepted Solutions
Solution
2 weeks ago
Super User
Posts: 13,920

Re: Adding a column from different dataset

If you are using common US holidays:

data example;
input Year Month Day Hour ;
   Holiday= not missing(holidayname(mdy(month,day,year)));
datalines;
2018 2 18 12 
2018 2 19 12 
2018 2 19 13 
2018 2 19 14 
2018 2 19 15 
2018 2 20 15 
;
run;

@novinosrin's solution likely needs to include year and month as well

 

proc sql;
   create table want as
   select a.*,b.holiday
   from hours a left join daily b
   on a.day=b.day
   and a.year=b.year
   and a.month=b.month
   ;
quit;

You may find it very helpful to actually create a date valued variable using the MDY function as there are many things that are doable with a date variable that are a serious pain to attempt with 3 variables, such as calculating days between values or using formats to create groups of records without have to add variables.

 

View solution in original post


All Replies
Super User
Posts: 2,049

Re: Adding a column from different dataset

Join using day variable

data hours;
input Year	Month	Day	Hour;
cards;
2018	2	19	12
2018	2	19	13
2018	2	19	14
2018	2	19	15
;

data daily;
input Year	Month	Day	Holiday;
cards;
2018	2	15	0
2018	2	16	0
2018	2	17	0
2018	2	18	0
2018	2	19	1
2018	2	20	0
;

proc sql;
create table want as
select a.*,b.holiday
from hours a left join daily b
on a.day=b.day;
quit;
Solution
2 weeks ago
Super User
Posts: 13,920

Re: Adding a column from different dataset

If you are using common US holidays:

data example;
input Year Month Day Hour ;
   Holiday= not missing(holidayname(mdy(month,day,year)));
datalines;
2018 2 18 12 
2018 2 19 12 
2018 2 19 13 
2018 2 19 14 
2018 2 19 15 
2018 2 20 15 
;
run;

@novinosrin's solution likely needs to include year and month as well

 

proc sql;
   create table want as
   select a.*,b.holiday
   from hours a left join daily b
   on a.day=b.day
   and a.year=b.year
   and a.month=b.month
   ;
quit;

You may find it very helpful to actually create a date valued variable using the MDY function as there are many things that are doable with a date variable that are a serious pain to attempt with 3 variables, such as calculating days between values or using formats to create groups of records without have to add variables.

 

Super User
Posts: 2,049

Re: Adding a column from different dataset

Good catch @ballardw Thank you. Morning Coffee wasn;t strong enough, will  have to claim my money back Smiley Embarassed

Super User
Posts: 13,920

Re: Adding a column from different dataset

Posted in reply to novinosrin

@novinosrin wrote:

Good catch @ballardw Thank you. Morning Coffee wasn;t strong enough, will  have to claim my money back Smiley Embarassed


It doesn't help when the example data is from a very small subset that could imply that only Feb 2018 is involved

Frequent Contributor
Posts: 89

Re: Adding a column from different dataset

Thank you, these are really helpful
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 5 replies
  • 102 views
  • 0 likes
  • 3 in conversation