BookmarkSubscribeRSS Feed
EinarRoed
Pyrite | Level 9

Still being rather new to DI Studio, I'd appreciate advice on a problem.

In an Extract transformation, I want to extract all rows where Last_Updated (DATETIME20.) is between the first and last date of the previous month. In other words, it needs to extract everything from August based on a datetime20 variable. How do I go about doing this?

9 REPLIES 9
Astounding
PROC Star

There are two basic steps to take:

The DATEPART function extracts just the date portion of your datetime variable.

The INTNX function can retrieve the first day of the months based on a date value.

Although it's not necessary to cram everything into one statement, it can be done:

where ( intnx('MONTH', "&sysdate9"d, -1) <= datepart(Last_Updated) < intnx('MONTH', "&sysdate9"d, 0) );

Note that &sysdate9 does not change, but is measured once when your session begins.

Good luck.

mstergia
Calcite | Level 5

Hi Astounding; I used your code in my program and it worked (i.e. when I run my data for this year, I get January and February YTD Data.  When I use your code, I get just January data).  So, it works great!  As I`m learning, I would like to understand the last part of the code.  I guess as I read it the (intnx("month", today(),-1)<=calendar_dt) makes sense to me (but doesn`t work on it`s own).  I`d like to better understand why it works when the last part (<intnx("month",today(),0)) is added to the code.  Thank you

 

(intnx("month", today(),-1)<=calendar_dt<intnx("month",today(),0))

DanielSantos
Barite | Level 11

As always, there is more than one right way do this...

For a month/year period extraction I kinda like to use formatted dates and perform an alpha match (which may work also for less/great than comparisons).

where put(date(),monyy7.) eq put(LAST_UPDATED,dtmonyy7.)


Now, what happens here:

date() returns the system date which is converted to text using the date monyy7. format = 'SEP2012'

LAST_UPDATE is converted to text using the dtmonyy7. date/time format = 'SEP2012'

And a alpha match is performed.

Or, you could use the datetime() function instead (and use date/time dtmonyy7 format), which returns the system date an time:

where put(datetime(),dtmonyy7.) eq put(LAST_UPDATED,dtmonyy7.)

Or, if you're not after an exact match, using the date yymm7. format = '2012M09'

where put(date(),yymm7.) gt put(datepart(LAST_UPDATED),yymm7.)


Notice the use of the datepart function above to convert LAST_UPDATED as the used format requires strictly a date value (not date/time).

More on SAS formats here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a001263753.htm

More on PUT function here:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000199354.htm

Note: the use of date()/datetime() functions or &SYSDATE9 really depends on performance and how "accurate" you want the date to be. &SYSDATE9 evaluates at the beginning of the session, date()/datetime() at the moment the call is made to this functions.

Performance wise, the macro &SYSDATE9 should be "nicer" as it is evaluated once and then is part of the code as a constant value. The function is called each time the where is evaluated. Of course this could be meaningless or meaningful depending on the data's volume.

Cheers from Portugal.

Daniel Santos @ www.cgd.pt

DanielSantos
Barite | Level 11

Just noticed you wanted the previous month, not the current one.

As demonstrated previously you just need to use the INTNX function.

Just replace:

date()

with:

intnx('MONTH', date(), -1)

Sorry for that.

More on the INTNX function:

http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212700.htm

LinusH
Tourmaline | Level 20

Just want to underline the consequence of using function calls in a where clause.

As a general rule of thumb, they will prevent you from use index optimization, and will lead to a full table scan, which, of course, for large table will have a great impact on performance.

What I usually do, is to calculate a from and to datetime constants in a pre step (to the job or to the step), and store the as macro variable, in a datetime format. In the extract you will have:

last_updated between "&FromDatetime"dt and "&ToDatetime"dt

Data never sleeps
Patrick
Opal | Level 21

ANd to add to Linus very valuable statement: The code he used also allows the query to be pushed to the  database.

I normally try to avoid Precode so I often use %syfunc() instead for the same purpose as Linus outlines.

LinusH
Tourmaline | Level 20

Wow, I haven't considered that %sysfunc could be used for hiding "normal" function calls within regular SAS code.. Thanks for that tip!

Data never sleeps
Patrick
Opal | Level 21

not tested

intck('dtmonth', last_updated, datetime() ) = 1

or to not execute datetime() in every iteration

intck('dtmonth', last_updated, %sysfunc(datetime()) ) = 1

shivas
Pyrite | Level 9

Hi,

In extract transformation , where clause you can mention like this...

StartDate <= &last_updated < EndDate

and you can pass the last_updated date value in pre code(as this would be changing as per your requirement).

Thanks,

Shiva

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 9 replies
  • 4510 views
  • 6 likes
  • 7 in conversation