BookmarkSubscribeRSS Feed
AlG
Quartz | Level 8 AlG
Quartz | Level 8

Hi friends,

I am very new to the SAS world. While working with a dataset I have faced an issue. Hope you will help me. (I use SAS 9.4).

 

I have a dataset which includes the tweets of several companies during 2014-2018. I want to label each tweet based on the month and the week it was posted (i.e. a week and a month column). The way I want to do this is I want to start from January 1st 2014, and go seven days by seven days. so the tweets that were posted any time between January 2nd 2014 to January 8th 2014 (inclusive) would be month 1 and week 1. Below are my two questions:

 

1. How can I create these two columns?

2. Because the labeling is seven days by seven days, it will not completely overlap with calendar weeks. For example, the fourth 7-day period of the month January ends on January 29th. The next seven day period would be January 30th to February 5th (inclusive). I want this period to be labeled month 2, week 6. In other words, when a seven-day period includes days of more than one month, the labeling should be based on the later month.

 

The last thing is that for each given year, there should always be a week that starts from January 2nd. 

 

I'd really appreciate if you could give me a pretty specific code I could use (with variable names made up). I have attached my desired output as an excel file.

Thank you so much again in advance.

 

 

9 REPLIES 9
mkeintz
PROC Star

Jan 1, 2014 is a Wednesday.  Are you saying you only want weeks that start on Wednesdays and end on the following Tuesdays?

 

 

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
AlG
Quartz | Level 8 AlG
Quartz | Level 8

Thanks @mkeintz 

I am not sure I understand what you mean by "only". 

I apologize I made a mistake. the first week starts from January 2nd 2014 (This might be weird. The reason is I will want to match this dataset with another dataset later on, and this is how that dataset has defined month/week).

So, starting from January 2nd 2014, I want to define weeks of 2014 as starting from Thursday and ending on the following Wednesday. The only thing to mention, though, is that for each given year, there should always be a week that starts from January 2nd. I have attached my desired output as an excel file.

Reeza
Super User
SAS dates are numbers so that's quite trivial. Take the date, subtract the starting date and divide by 7 to get the weekly numbers. But from your initial post your logic seems more complicated, which is why I'm asking for sample data and output. For example should the next week after Month 1, Week 5 (your example) should it be Month 2, Week 1 or Month 2, Week 6?

data demo;
do i='01Jan2014'd to '31Dec2015'd;
week = floor((i-'01Jan2014'd) / 7) + 1;
output;
end;
format i date9.;
run;
AlG
Quartz | Level 8 AlG
Quartz | Level 8

Thanks @Reeza 

I really apologize, I had made some mistakes, in my original post, when explaining how weeks should be defined. I have corrected those. I have also attached to the original post the desired output as an excel file. Does that clarify things?

 

Thanks,

 

PaigeMiller
Diamond | Level 26

Attaching an Excel file doesn't help most people, since most people will not download Microsoft Office documents. Better to provide the data via SAS data step code.

--
Paige Miller
Reeza
Super User

Given your example, my original answer was correct, use the WEEK format (or you can use a WEEK function) you can shift the start date using shift index within the format. Or you can use the calculation I have above, changing it for each year. 

 

I would probably recommend the calculation above. 

 

data want;
set have;
yearData = year(date);

week = floor(date - mdy(1, 1, yearData)/7) + 1;

run;
Tom
Super User Tom
Super User

Looks like you want to do something like this:

data test;
  year=2014 to 2017;
  end_of_week=mdy(1,1,year);
  week=0;
  do until(end_of_week > mdy(12,31,year));
    month=month(end_of_week);
    output;
    week+1;
    end_of_week+7;
  end;
  format end_of_week yymmdd10.;
run;
Reeza
Super User

Have you taken a look at the default WEEK formats/functions and see if any of those meet your requirements?

 

WEEKU/WEEKW and WEEKW formats all exist and can be applied to a SAS date directly. 

 

https://support.sas.com/documentation/cdl/en/leforinforref/64790/HTML/default/viewer.htm#p1rvwomt04z...

 

I don't think they will, because they factor in Year and week, but not month. 

How do you want to handle weeks across years or are you ignoring the year component entirely?

 

Please provide some sample data and expected output that illustrates these cases. 

 

Here are instructions on how to provide sample data as a data step:
https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 


@AlG wrote:

Hi friends,

I am very new to the SAS world. While working with a dataset I have faced an issue. Hope you will help me. (I use SAS 9.4).

 

I have a dataset which includes the tweets of several companies during 2014-2018. I want to label each tweet based on the month and the week it was posted (i.e. a week and a month column). The way I want to do this is I want to start from January 1st 2014, and go seven days by seven days. so the tweets that were posted any time between January 1st 2014 to January 7th 2014 (inclusive) would be month 1 and week 1. Below are my two questions:

 

1. How can I create these two columns?

2. Because the labeling is seven days by seven days, it will not completely overlap with calendar weeks. For example, the fourth 7-day period of the month January ends on January 28th. The next seven day period would be January 29th to February 4th (inclusive). I want this period to be labeled month 1, week 5. In other words, when a seven-day period includes days of more than one month, the labeling should be based on the earlier month.

I'd really appreciate if you could give me a pretty specific code I could use (with variable names made up).

Thank you so much again in advance.

 

 


 

 

 

 

ghosh
Barite | Level 11

This code replicates your desired output in the Excel file

 

Untitled.png

data series;
	do i='01Jan2014'd to '31Dec2018'd by 7;
		year=year(i);
		month=month(i);
		week=week(i);
		output;
	end;
run;

proc print label data=series (obs=10);
	label i='Week end day';	
	format i mmddyy10.;
run;

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 1684 views
  • 8 likes
  • 6 in conversation