Hi,
I was trying to create a SAS DI job where we have create below jobs :
1.Parameterised Job
2.Loop Job
In loop job where we are getting distinct year and month and using loop functionality to load it to final SAS datasets
Actually are loading the data based on below values to loop over it
first time
month year
NOV 2018
Jan 2018
DEC 2017
Actually i was looping over NOV 2018 and load into SAS dataset
and then Jan 2018 load,and then DEC 2017 dataset like wise.
so is there any way to loop not one month in year,but rather 2 or 3months to loop over it.
please help
In Theory i need to below steps,please help
1.Extract the minimum date value from existing column(date) from datasets and get the data from all data sets for example May month's data from year 2016
2.After extracting May month's data from all data sets then need to do join operation(take required columns)
3.Load it into target table for only may month's data
Proceed above steps same for month june till ... max(month)(19Apr2019) and load it
I don't fully get it what you have and what you need so let me make a few assumptions and then throw some ideas.
1. You have source tables with monthly data - something like month end account balances (1 row per account and month, but multiple months in the table).
2. You want to load some SCD type target table - so you need to load month by month.
3. This is a one off task. Going forward you'll just have a monthly process where you just need to pick the latest month.
If above is true:
I'd start with implementing the inner job. I'd define a job parameter (a prompt) with a default value of a date which I know exists in the source tables. Then implement all the code required to load this one month (using the parameter macro variable in your source data selections).
Once that works I'd change the default parameter to the next higher month's date. Then execute my job again. If this works and the data in target is as expected then my first round of unit testing for this inner job is done.
So now to the outer (loop) job: What you need to do here is implement logic which builds up the control table for the loop transformation (creating 1 row per month). I guess you just need to create lists of distinct month begin dates from your source tables (use intnx() to align source dates if you're dealing with a mixture of month begin and month end dates) to work out for which dates you've got all the data, then sort the dates an populate the loop control table. Define the loop to NOT execute in parallel so that the inner job then gets called in sequence month by month.
...and for ongoing processing: You should be able to use the inner job on its own with little to no change. You basically would just need to add a bit of logic at the start of the job (i.e. in the job pre-code section) which checks if the inner job has been called via the loop transformation or on its own. To do so you could look for DIS generated macro variables which you can use as indicators; else: once the first round of inner job unit testing is done set the default of the date parameter to missing and use this as indicator =set the date parameter to current month -1 if missing.
Hi @Patrick ,
Iam using below code in outer loop job in user written node
PROC SQL;
create table &_output. as
SELECT distinct year(Date) as year, month(Date) as month
FROM &SYSLAST. ;
QUIT;
from above code i am getting output as below
year month
2018 May
2018 NOV
2018 OCT
2019 JAN
based on the above output every time loop transformation will do looping on 2018 May and load the data to final target table.and next on 2018 Nov ....like this
This is currently happening in job.
but i want to do the looping over 2018 year ,May month and Nov month,Oct month(only 3 months to be taken in each loop operation and load the data to final table please ).
Would you please suggest is there any option we can do this please.Thanks
The loop transformation will execute the inner job exactly once per row in the loop control table (your &_output table I assume). For the example you've provided the loop will execute the inner job exactly 4 times.
How you define and populate the control table and what the inner job then does with the parameters you pass-in is totally up to you.
"but i want to do the looping over 2018 year ,May month and Nov month,Oct month(only 3 months to be taken in each loop operation and load the data to final table please )."
Sorry, but I don't understand your explanation. Where do these 3 months come from? Do you want a control table with start months (i.e. quarter) and then the inner loop to execute data quarter by quarter (=you would need to fix how you populate your control table)?
Happy to help but I'd need you to try and formulate what you have and what you want once more in different words to help me understand the problem.
Hi @Patrick ,
Basically we are creating one-off job where finally SAS dataset will be going to have some millions of records.
1. dragged and dropped the &SYSLAST table
node into job area
2.written below mentioned code in user-written node transformation where we will get distinct year and month
PROC SQL;
create table &_output. as
SELECT distinct year(Created_Date) as year, month(Created_Date) as month
FROM &SYSLAST. ;
QUIT;
and above output node will have only two columns (year and month )
and output will look similar as below
year month
2018 11
2018 10
2019 01
2.next Loop transformation is used (done the parameter mapping tab as year and month as mapped source column)
3.loop job added
4.loop end transformation added.please find the below screen shot and kindly ignore the error in screen shot please.
Screen Shot :
Patrik,currently this job will work as loop on year -2018 and month-Nov and load the final dataset and then take another record 2018 Oct and load to final SAS table and it goes till ...
actually only one month per each year we are looping over loading the data to dataset right now,
is it any other option that we can load instead of one month per each year,2 or 3 months like -loop over 2018 year under october and november months data to be loaded to final SAS table please to increase the efficiency.
please let me know if you need any clarification or any thing which i am not clear please.kindly suggest if you have any better approach which you feel please
Thanks for the details. This makes things much clearer.
If you implement logic for your inner job so that it can pick up 3 months per iteration then sure you can do it. You then just would need to generate a control table for the loop which provides the start year/date and you then have logic in your inner job which picks up the month/year you pass in as parameter plus the two following months - and of course you populate your control table with dates in 3 month steps.
...or you amend your selection logic in the inner job to work with date ranges - i.e. date between &startDT and &endDT - and you then amend your control table and the prompts for the inner job so that you can populate and pass such values.
Having said that:
If there is any chance that you can use the exactly same inner job for on-going monthly processing with almost no extra logic for the one off situation then I would prioritize re-usability over one-off job performance. So in such a case I'd go for monthly loops even if this takes a bit longer for the initial load. Using the same inner job also for ongoing processing would reduce implementation and testing effort and - sort of - add an additional round of "testing" before moving into BAU mode as the inner job will already get executed many times as part of the one off initial load.
Hi @Patrick ,
Thanks for the reply,would you please clarify on below points.
1.You mean to say you would chose rather that using loop option for 3 months under each year it is better to use only 1 month per each year please ?
2.Also sorry to trouble you,but i am not really understood where and how we can modify the existing loop job so that it will take 3months at a time while loading
About 1)
This is about cost and effort. If you can use a job for both initial load and ongoing BAU processing then that's certainly more cost effective as you don't have to design, develop and test two different jobs
About 2)
I don't know what your inner job does. All I'm saying is: Your inner job uses some SAS macro variables which get populated via the loop transformation and what's in the loop control table. Because you've got everything under control you can of course write code in your inner job which can process 3 months worth of data at a time - and you then can create and populate a control table which passes in values to the inner job as it requires it for such 3 months at a time processing.
Hi @Patrick ,
Would you please help on code to write in job which can process 3 months of data in each year if possible please
How would I be able to do that? I don't know anything about your source and target data.
On a general level: That's just about formulating your SQL where clause and/or join condition in a way that it selects 3 months.
It's getting late where I live so putting my computer to sleep for now.
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.