Hi friends,
I am very new to the SAS world. While working with a dataset I have faced an issue. Hope you will help me. (I use SAS 9.4).
I have a dataset which includes the tweets of several companies during 2014-2018. I want to label each tweet based on the month and the week it was posted (i.e. a week and a month column). The way I want to do this is I want to start from January 1st 2014, and go seven days by seven days. so the tweets that were posted any time between January 2nd 2014 to January 8th 2014 (inclusive) would be month 1 and week 1. Below are my two questions:
1. How can I create these two columns?
2. Because the labeling is seven days by seven days, it will not completely overlap with calendar weeks. For example, the fourth 7-day period of the month January ends on January 29th. The next seven day period would be January 30th to February 5th (inclusive). I want this period to be labeled month 2, week 6. In other words, when a seven-day period includes days of more than one month, the labeling should be based on the later month.
The last thing is that for each given year, there should always be a week that starts from January 2nd.
I'd really appreciate if you could give me a pretty specific code I could use (with variable names made up). I have attached my desired output as an excel file.
Thank you so much again in advance.
Jan 1, 2014 is a Wednesday. Are you saying you only want weeks that start on Wednesdays and end on the following Tuesdays?
Thanks @mkeintz
I am not sure I understand what you mean by "only".
I apologize I made a mistake. the first week starts from January 2nd 2014 (This might be weird. The reason is I will want to match this dataset with another dataset later on, and this is how that dataset has defined month/week).
So, starting from January 2nd 2014, I want to define weeks of 2014 as starting from Thursday and ending on the following Wednesday. The only thing to mention, though, is that for each given year, there should always be a week that starts from January 2nd. I have attached my desired output as an excel file.
Thanks @Reeza
I really apologize, I had made some mistakes, in my original post, when explaining how weeks should be defined. I have corrected those. I have also attached to the original post the desired output as an excel file. Does that clarify things?
Thanks,
Attaching an Excel file doesn't help most people, since most people will not download Microsoft Office documents. Better to provide the data via SAS data step code.
Given your example, my original answer was correct, use the WEEK format (or you can use a WEEK function) you can shift the start date using shift index within the format. Or you can use the calculation I have above, changing it for each year.
I would probably recommend the calculation above.
data want;
set have;
yearData = year(date);
week = floor(date - mdy(1, 1, yearData)/7) + 1;
run;
Looks like you want to do something like this:
data test;
year=2014 to 2017;
end_of_week=mdy(1,1,year);
week=0;
do until(end_of_week > mdy(12,31,year));
month=month(end_of_week);
output;
week+1;
end_of_week+7;
end;
format end_of_week yymmdd10.;
run;
Have you taken a look at the default WEEK formats/functions and see if any of those meet your requirements?
WEEKU/WEEKW and WEEKW formats all exist and can be applied to a SAS date directly.
I don't think they will, because they factor in Year and week, but not month.
How do you want to handle weeks across years or are you ignoring the year component entirely?
Please provide some sample data and expected output that illustrates these cases.
Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...
@AlG wrote:
Hi friends,
I am very new to the SAS world. While working with a dataset I have faced an issue. Hope you will help me. (I use SAS 9.4).
I have a dataset which includes the tweets of several companies during 2014-2018. I want to label each tweet based on the month and the week it was posted (i.e. a week and a month column). The way I want to do this is I want to start from January 1st 2014, and go seven days by seven days. so the tweets that were posted any time between January 1st 2014 to January 7th 2014 (inclusive) would be month 1 and week 1. Below are my two questions:
1. How can I create these two columns?
2. Because the labeling is seven days by seven days, it will not completely overlap with calendar weeks. For example, the fourth 7-day period of the month January ends on January 28th. The next seven day period would be January 29th to February 4th (inclusive). I want this period to be labeled month 1, week 5. In other words, when a seven-day period includes days of more than one month, the labeling should be based on the earlier month.
I'd really appreciate if you could give me a pretty specific code I could use (with variable names made up).
Thank you so much again in advance.
This code replicates your desired output in the Excel file
data series;
do i='01Jan2014'd to '31Dec2018'd by 7;
year=year(i);
month=month(i);
week=week(i);
output;
end;
run;
proc print label data=series (obs=10);
label i='Week end day';
format i mmddyy10.;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.