Hello Everyone,
Hope you all are doing great and keeping yourself safe amid Covid-19.
Please help me, I am trying to implement Incremental loading: Using below code but getting error.
INC_TABLE is having approx 10 lakh names.
Thanks & Regards,
Robin
You cannot change the maximum length of a macro variable, it is part of how the macro processor is designed.
What is it that you are actually trying to do? It kind of looks like you want to load data from one table but exclude ids that exist in another table. Why not just make a view that does the join and load from that?
proc sql;
create view to_load as
select * from VALIBLA.MAIN_TABLE
where name not in (select name from VALIBLA.INC_TABLE)
;
quit;
So now just load TO_LOAD into VA instead of MAIN_TABLE.
You are abusing the macro processor for something it's not meant to do. Create a view with a sub-select, and use that in proc imstat.
Instead of loading 10 lakh names into macro variable, why not load in a hash table and use lookup to perform this? I believe you can check if your data fits into hash table based on your memory and decide this.
My personal preference is to avoid "select * " followed by "not in".
Thanks Nihanta, KurtBremser, Tom, Smantha.
Appreciate your kind help.
My requirement is to Delete rows from Main_Table which are identified in INC_Table through column "Name"
As Main_Table is on LASR server, need to perform this task using PROC IMSTAT. And after which I have to append all records from INC_Table to Main_Table (Which is simple to do so).
Please suggest the way out:
1. If I can get alternate of macro variable where length limit is exceeded.
2. Also please suggest how same can be used in where clause under PROC IMSTAT. (how view is used if created) As syntax error blocks the task further.
Thanks & Regards,
Robin Sharma
After working through the documentation for IMSTAT, I suggest the following:
Add a variable named _where_ to your update dataset, and create a WORK table with it:
data INC_TABLE;
set VALIBLA.INC_TABLE;
_where_ = "name ='" !! strip(name) !! "'";
run;
Now use that in an UPDATE statement in IMSTAT:
proc imstat data=VALIBLA.MAIN_TABLE;
update data=INC_TABLE;
run;
quit;
I have no idea if that works, it's just what my interpretation of the documentation would let me try; I have no LASR server at hand, so I cannot test it.
Hi Kurt,
Thanks for your kind support.
I have to perform two steps:
1st Step: Delete records from Main_Table which is on LASR server and contains data for a BI report. Here records which need to be deleted must be identified from INC_Table through column "name".
Challenge:
1.Macro variable "KeyName" performing perfect for small size INC_Table but failing large INC_Table and giving error of limited length characters.
2. Sub queries do not work with where clause under PROC IMSTAT Procedure.
Since you remove and append the same keys (names), it is effectively an update. Unless your keys are not unique.
Yups, Its an update to accomplish Incremental loading. But Unique key "Name" has multiple entries based on financial year, amout...etc.
Scenarios is like,
Main_Table
Year Name Amount
2019 ABCDS6008K 500
INC_Table
Year Name Amount
2019 ABCDS6008K 500
2020 ABCDS6008K 1000
First, I would advise you to test my suggestion with some simple made-up data, so you can verify if my idea works at all.
Next, you need to expand the _where_ variable in the update dataset so that it identifies unique observations:
data INC_TABLE;
set VALIBLA.INC_TABLE;
length _where_ $200;
_where_ = "name = '" !! strip(name) !! "' and year = " !! put(year,z4.) !! " and amount = " !! put(amount,best.);
run;
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.