*SAS Question: How do I create a new variable "Season" to categorize my month values by season as a character variable?
I can't seem to get the season to populate correctly, instead i recieve a sas number in the new variable no matter what format I choose or how I add the new season variable*;
*This code chunk works*;
*DISTRIBUTION BY MONTH;
DATA want;
SET have;
datepart=datepart(startdate);
format datepart date9.;
MonthYear = datepart(startdate);
format Monthyear MONYY.;
Month = datepart(startdate);
format Month MonName.;
run;
*Thus code chunk does not work*;
DATA want (KEEP= STARTDATE DATEPART MONTHYEAR MONTH SEASON);
SET have;
SEASON=put(MONTH, $6.);
IF SEASON= "DEC" OR "JAN" OR "FEB" THEN SEASON = "WINTER";
IF SEASON= "MAR" OR "APR" OR "MAY" THEN SEASON = "SPRING";
IF SEASON= "JUN" OR "JUL" OR "AUG" THEN SEASON = "SUMMER";
IF SEASON= "SEPT" OR "OCT" OR "NOV" THEN SEASON = "AUTUMN";
RUN;
_______________VARIABLE PROPERTIES
STARTDATE
PROPERTY VALUE
LABEL STARTDATE
NAME STARTDATE
LENGTH 8
TYPE NUMERIC
FORMAT DATETIME19.
DATEPART
PROPERTY VALUE
LABEL DATEPART
NAME DATEPART
LENGTH 8
TYPE NUMERIC
FORMAT DATE9.
MONTHYEAR
PROPERTY VALUE
LABEL MONTHYEAR
NAME MONTHYEAR
LENGTH 8
TYPE NUMERIC
FORMAT MONYY.
MONTH
PROPERTY VALUE
LABEL MONTH
NAME MONTH
LENGTH 8
TYPE NUMERIC
FORMAT MONNAME
SEASON
PROPERTY VALUE
LABEL SEASON
NAME SEASON
LENGTH 6
TYPE CHAR
FORMAT ---
*
Why are you trying to use a character format with the numeric variable month?
Why are you comparing the value of SEASON to the logical expression ("DEC" OR "JAN" OR "FEB" ) ?
The easiest way to convert a date to season is to take the QTR number of the next month. That way DEC-FEB will map to QTR #1.
Example code
data test;
input dt :datetime. ;
date1=datepart(dt);
date2=date1;
format dt datetime19. date1 date9. date2 monyy7. ;
month_num = month(date1);
length season $6 ;
select (qtr(intnx('month',date1,1)));
when (1) season='WINTER';
when (2) season='SPRING';
when (3) season='SUMMER';
when (4) season='AUTUMN';
end;
cards;
01JAN2020:01:02
01FEB2020:01:02
01MAR2020:01:02
01APR2020:01:02
01MAY2020:01:02
01JUN2020:01:02
01JUL2020:01:02
01AUG2020:01:02
01SEP2020:01:02
01OCT2020:01:02
01NOV2020:01:02
01DEC2020:01:02
;
Results
month_ Obs dt date1 date2 num season 1 01JAN2020:01:02:00 01JAN2020 JAN2020 1 WINTER 2 01FEB2020:01:02:00 01FEB2020 FEB2020 2 WINTER 3 01MAR2020:01:02:00 01MAR2020 MAR2020 3 SPRING 4 01APR2020:01:02:00 01APR2020 APR2020 4 SPRING 5 01MAY2020:01:02:00 01MAY2020 MAY2020 5 SPRING 6 01JUN2020:01:02:00 01JUN2020 JUN2020 6 SUMMER 7 01JUL2020:01:02:00 01JUL2020 JUL2020 7 SUMMER 8 01AUG2020:01:02:00 01AUG2020 AUG2020 8 SUMMER 9 01SEP2020:01:02:00 01SEP2020 SEP2020 9 AUTUMN 10 01OCT2020:01:02:00 01OCT2020 OCT2020 10 AUTUMN 11 01NOV2020:01:02:00 01NOV2020 NOV2020 11 AUTUMN 12 01DEC2020:01:02:00 01DEC2020 DEC2020 12 WINTER
The result of a logical expression like you have, ( A or B or C), is always going to be either TRUE (which SAS represent by the number 1) or FALSE ( the number 0) so it will most likely never be equal to the value of SEASON.
If you want to test if a variable is one of a series of values either put the OR operator between each of the individual comparison , (x=a or x=b or x=c), or use the IN operator, (x in (a b c)).
But you need to test the value that is stored in the variable, not the way the value is displayed when printed.
FORMATS convert values to text. When you attach them to a variable it changes how the values are displayed, but it does not change the value that is stored. So your month variable does not contain the string 'January'. It contains a number like 21,918 which represents some date in January.
Your IF statements are wrong, and SAS formats don’t change the underlying values. Instead, use the month numbers which is a bit easier.
If season= “Jan” or “Feb” is wrong, it should use IN or be
If season = 'Jan' or season='Feb'
This is better:
*extract month from the datetime variable startDate;
Month = month(datepart(startDate));
*categorize as seasons;
If month in (1, 2, 3) then Season = 'Winter';
Else if month in (4:6) then Season = 'Spring';
As mentioned by @Tom and @Reeza, learn to use SAS date values (which are days since 01JAN1960) and then compute time periods using SAS functions and formats, which rely on numeric values of SAS dates, rather than trying to create your own calculations using character strings. The MONTH function, working on valid SAS date values, which produces values 1 through 12 gets you there. SAS has done the hard work for you, so you don't have to.
@Deanna_Payne wrote:
*Sorry I made the adjustment to this code chunk to reflect the month format seen in the picture-- Still doesn't work *;
DATA WORK.CO_PATIENT_SPI_ALT3 (KEEP= STARTDATE DATEPART MONTHYEAR MONTH SEASON);
SET WORK.CO_PATIENT_SPI_ALT2;
SEASON=put(MONTH, $6.);
IF SEASON= "December" OR "January" OR "February" THEN SEASON = "WINTER";
IF SEASON= "March" OR "April" OR "May" THEN SEASON = "SPRING";
IF SEASON= "June" OR "July" OR "Aug" THEN SEASON = "SUMMER";
IF SEASON= "September" OR "October" OR "November" THEN SEASON = "AUTUMN";
RUN;
If your Month value is the same as you created with "Month = datepart(startdate);" then Month would be numeric value of some sort and Put (month,$6) would never contain any text that looks at all like a month name.
Plus Doesn't work because you have created Season to be 6 characters long with the $6. format. So can never be "December" (8 characters), "January" (7 characters), "February" (8 characters) and several other those values are also longer than the possible 6 characters.
Here's what you get with Season=put(month,$6.);
142 data junk; 143 startdate='15Jan2021:12:20:30'dt; 144 month=datepart(startdate); 145 season=put(month,$6.); WARNING: Variable month has already been defined as numeric. 146 put season=; 147 run; season=22295
That 22295 is what your Season variable values look like. So you never get a month name to use.
You should also pay attention to the Warning about Month as numeric. That is another clue that your should not be using the $6 format with the value
And did you even look at the output? You would see that Season looks like 5-digit numbers for a wide range of likely recent dates.
A generic comment:
Doesn't work is awful vague.
Are there errors in the log?: Post the code and log in a code box opened with the "<>" to maintain formatting of error messages.
No output? Post any log in a code box.
Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the "<>" icon or attached as text to show exactly what you have and that we can test code against.
If you want to see the text of a month from a date value you would use the MONNAME format. If you want to compare to the full name of a month then you would use
Season=put(month, Monname9.); (The reason the MONNAME format defaults to Monname9 is so all the characters of the month names in English appear)
And your code would have to include "August" , not "Aug".
Or use Monname3. and compare to "Dec" "Jan" "Feb" , the first three letters of the month.
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.