Still being rather new to DI Studio, I'd appreciate advice on a problem.
In an Extract transformation, I want to extract all rows where Last_Updated (DATETIME20.) is between the first and last date of the previous month. In other words, it needs to extract everything from August based on a datetime20 variable. How do I go about doing this?
There are two basic steps to take:
The DATEPART function extracts just the date portion of your datetime variable.
The INTNX function can retrieve the first day of the months based on a date value.
Although it's not necessary to cram everything into one statement, it can be done:
where ( intnx('MONTH', "&sysdate9"d, -1) <= datepart(Last_Updated) < intnx('MONTH', "&sysdate9"d, 0) );
Note that &sysdate9 does not change, but is measured once when your session begins.
Good luck.
Hi Astounding; I used your code in my program and it worked (i.e. when I run my data for this year, I get January and February YTD Data. When I use your code, I get just January data). So, it works great! As I`m learning, I would like to understand the last part of the code. I guess as I read it the (intnx("month", today(),-1)<=calendar_dt) makes sense to me (but doesn`t work on it`s own). I`d like to better understand why it works when the last part (<intnx("month",today(),0)) is added to the code. Thank you
(intnx("month", today(),-1)<=calendar_dt<intnx("month",today(),0))
As always, there is more than one right way do this...
For a month/year period extraction I kinda like to use formatted dates and perform an alpha match (which may work also for less/great than comparisons).
where put(date(),monyy7.) eq put(LAST_UPDATED,dtmonyy7.)
Now, what happens here:
date() returns the system date which is converted to text using the date monyy7. format = 'SEP2012'
LAST_UPDATE is converted to text using the dtmonyy7. date/time format = 'SEP2012'
And a alpha match is performed.
Or, you could use the datetime() function instead (and use date/time dtmonyy7 format), which returns the system date an time:
where put(datetime(),dtmonyy7.) eq put(LAST_UPDATED,dtmonyy7.)
Or, if you're not after an exact match, using the date yymm7. format = '2012M09'
where put(date(),yymm7.) gt put(datepart(LAST_UPDATED),yymm7.)
Notice the use of the datepart function above to convert LAST_UPDATED as the used format requires strictly a date value (not date/time).
More on SAS formats here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a001263753.htm
More on PUT function here:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000199354.htm
Note: the use of date()/datetime() functions or &SYSDATE9 really depends on performance and how "accurate" you want the date to be. &SYSDATE9 evaluates at the beginning of the session, date()/datetime() at the moment the call is made to this functions.
Performance wise, the macro &SYSDATE9 should be "nicer" as it is evaluated once and then is part of the code as a constant value. The function is called each time the where is evaluated. Of course this could be meaningless or meaningful depending on the data's volume.
Cheers from Portugal.
Daniel Santos @ www.cgd.pt
Just noticed you wanted the previous month, not the current one.
As demonstrated previously you just need to use the INTNX function.
Just replace:
date()
with:
intnx('MONTH', date(), -1)
Sorry for that.
More on the INTNX function:
http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000212700.htm
Just want to underline the consequence of using function calls in a where clause.
As a general rule of thumb, they will prevent you from use index optimization, and will lead to a full table scan, which, of course, for large table will have a great impact on performance.
What I usually do, is to calculate a from and to datetime constants in a pre step (to the job or to the step), and store the as macro variable, in a datetime format. In the extract you will have:
last_updated between "&FromDatetime"dt and "&ToDatetime"dt
ANd to add to Linus very valuable statement: The code he used also allows the query to be pushed to the database.
I normally try to avoid Precode so I often use %syfunc() instead for the same purpose as Linus outlines.
Wow, I haven't considered that %sysfunc could be used for hiding "normal" function calls within regular SAS code.. Thanks for that tip!
not tested
intck('dtmonth', last_updated, datetime() ) = 1
or to not execute datetime() in every iteration
intck('dtmonth', last_updated, %sysfunc(datetime()) ) = 1
Hi,
In extract transformation , where clause you can mention like this...
StartDate <= &last_updated < EndDate
and you can pass the last_updated date value in pre code(as this would be changing as per your requirement).
Thanks,
Shiva
Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.
Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.