BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
EmilyAV
Calcite | Level 5

I have a very large dataset that significantly slows down my processing speed.

Therefore, before I do any work with the data, I'd like to reduce its size by creating a smaller dataset that only contains entries made after a specific date. The dates are already in MMDDYY8. format, but I can't figure out how to specify dates on or after that date (February 3, 2014 in this example) using the WHERE statement.

 

My code:

 

data small;

  set large;

   where date_variable >= '023014';

run;

 

Most of the examples I have found have used SQL or first adapted the date to an SAS informat, so I'm not sure whether the problem is the format I'm entering or something else. I've also tried to enter the date converted to the SAS day code without success. Thanks for your help!

 

Thanks very much,

Emily

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

SAS has two types of variables.  Fixed length character strings and floating point numbers.  

So the first thing you need to check is whether your variable is numeric or character.  Then if it is numeric whether it has a format attached that being used to specify how to display the floating point numbers as character strings.

 

If the variable has date values then it has the number of days since 1960 storing in it.  The format is what makes that number meaning full to humans.  So if the date is the third of February in the year 2014 then the value stored is the number 19,757.  If you display that number with the DATE9. format it will look like 03FEB2014.  If you display it with the MMDDYYN8. it will look like 02032014.  If you display it with MMDDYY10. format it will look like 02-03-2014.  If you display it with the format you mentioned, MMDDYY8., then it will look like 02-03-14.

 

If you want to type a date value into your program you can use a date literal.  That is a quoted string with the letter d appended.  The string inside the quotes needs to be something the DATE informat can interpret as a date.  So use one of these values:

'03FEB2014'd
'3-feb-2014'd
"03feb14"d

View solution in original post

8 REPLIES 8
PaigeMiller
Diamond | Level 26
where date_variable >= '03FEB2014'd;

Your words talk about FEB 3 but your example code looks for FEB 30 (which doesn't exist), I assume you really wanted FEB 3.

 

The format of the variable is irrelevant to creating the code. '03FEB2014'd will work regardless of the format.

--
Paige Miller
EmilyAV
Calcite | Level 5

You're right, thank you. I was also missing the lowercase 'd'. For my learning, why is that necessary if the dates are already SAS formatted? Thanks.

Reeza
Super User
Because you can have a variable as text or numeric. The d tells SAS to interpret it not as a constant or character value but as a date value.

Another option is to use the MDY() function.

where date_variable > mdy(2, 3, 2014)
PaigeMiller
Diamond | Level 26

Formats change the appearance to us humans. SAS represents the date internally as the number of days since 01JAN1960. So for example, 01JAN2020 is 21915 to SAS. Humans can type this as '01JAN2020'd, you don't have to know that it's 21915.

 

But the format can make it appear as 01JAN2020 or 2020-01-01 or many other appearances. Regardless of the appearance, SAS performs operations on the internal representation (01JAN2020 is represented as 21915) and so these are all equivalent

 

date_variable>21915
date_variable>'01JAN2020'd

because both are 21915 to SAS.

 

These will not work

 

date_variable>20200101
date_variable>2020-01-01

 because neither of these is 21915 to SAS.

--
Paige Miller
ballardw
Super User

@EmilyAV wrote:

You're right, thank you. I was also missing the lowercase 'd'. For my learning, why is that necessary if the dates are already SAS formatted? Thanks.


SAS uses a single data literal format, the Date9 (or date7),  such as 03Feb2021 because  010203 could be: Jan 2 2003(or 1903), 1 Feb 2003, or 2001 Feb 03. With the single format then the program does not have to attempt to guess which date layout is intended.

jimbarbour
Meteorite | Level 14

Hi, @EmilyAV ,

 

A Format typically only affects how you see the data.  How it is stored internal to SAS is still a number (the count of days since Jan 1, 1960).

 

Take a look at this little SAS dataset.  In the Application_Date column what you see is YYMMDDD10. formatted.  However, notice the little icon for that column.  It's a little calendar; in other words, it's a date.  It is not a character value of '2021-07-28'.  Look at the column to the left of it, Applicant_Name.  Notice the icon, a little cone with the letter "A" for alphanumeric (character) data.  The icons are not the same even though you see what looks like character data in the Application_Date.  

 

Think of a Format like a pair of sunglasses.  When you put on the glasses, yes, the world looks different (depending on the tint, prescription, polarization, etc.), but it's not that the world has actually changed.  Just what you see has changed.

jimbarbour_0-1627585917360.png

 

Jim

Tom
Super User Tom
Super User

SAS has two types of variables.  Fixed length character strings and floating point numbers.  

So the first thing you need to check is whether your variable is numeric or character.  Then if it is numeric whether it has a format attached that being used to specify how to display the floating point numbers as character strings.

 

If the variable has date values then it has the number of days since 1960 storing in it.  The format is what makes that number meaning full to humans.  So if the date is the third of February in the year 2014 then the value stored is the number 19,757.  If you display that number with the DATE9. format it will look like 03FEB2014.  If you display it with the MMDDYYN8. it will look like 02032014.  If you display it with MMDDYY10. format it will look like 02-03-2014.  If you display it with the format you mentioned, MMDDYY8., then it will look like 02-03-14.

 

If you want to type a date value into your program you can use a date literal.  That is a quoted string with the letter d appended.  The string inside the quotes needs to be something the DATE informat can interpret as a date.  So use one of these values:

'03FEB2014'd
'3-feb-2014'd
"03feb14"d

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 3295 views
  • 1 like
  • 6 in conversation