Hi ,
I would like to compare 2 attributes of two tables A and B. If the attributes match, I need to update table A.
How do I do that with SAS DI studio?
Thanks in advance,
Regards,
Sheeba Swaminathan
What do you mean by attribute?If you mean like table label or something, it sounds odd. There are no standard transform for this. You can have user written code in the pre step of a table loader that check table attributes. If mismatch abort the job.
Hi Linush,
I am sorry about that...It was a mistake...I meant column.
I would like to compare 2 Columns of two tables A and B. If the column values match, I need to update table A.
How do I do that with SAS DI studio?
Regards,
Sheeba Swaminathan
Which DIS version are you using? Starting with DIS4.4 there is a SQL Update transformation allowing you to update columns based on a subquery.
Hi Patrick,
I am using DIS 4.6. I will try out the update transformation
Regards,
Sheeba Swaminathan
Hi Sheeba
You must be very clear of how you want to load the table. Let me recap what I understand:
1. You have a source table with columns {groupid, element, pf}. These 3 columns do not uniquely identify a row (so can not be used as primary composite key).
2. You have a target table where you want to match your source data with the target over variable {element, pf}. If I'm correct with 1. then this is a many:many relationship.
3. You want generate a surrogate key based if there is no match over columns {element, pf}
That's kind of confusing.
- If you have column {groupid} in your source why can't you simply load it?
- A surrogate key is generated based on a unique composite business key - which you haven't specified yet. A surrogate key serves as primary key and it must be unique - so it can't be a "groupid".
- There is the concept of a retained key - but such a key is only used in the context of slowly changing dimensions.
How do you want to load your source data? All rows from source or only deltas. I'm only guessing here but I believe what you're after is to first load your data using a normal table loader with either update/insert - or a simple append if this is what you need to do.
If it's update/insert then you will need to determine the business key.
I'm not sure about the purpose of "groupid". It appears that a group is defined over {element, pf}. If you really need an extra variable for the group then just concatenate the 2 existing variables like:
groupid=cats(element,'|',pf); You can also use this to generate some hash value if you prefer this (look up the code generated in the SCD2 loader for the digest value if you want to go down this path).
Hope this is of some help
Patrick
Hi Patrick,
I am sorry about the disconnect here. I wanted to rephrase my question to make it more clear with examples
I have a table A having following columns
1) line_id (PK)
2)groupid
3) element
4) pf
5) scope
6)pid
All i wanted here is to update table A's groupid column based on matching attributes in table B.
The structure of Table B is
1) execkey(PK)
2)groupid
3)pf
4)scope
5)pid
The update condition here is
if A.element = "ABC", then update A.groupid = (Select groupId from B where A.pf = b.pf)
if A.element = "DEF", then update A.groupId = (Select groupId from B where A.scope = B.scope and A.pid=B.pid)
Regards,
Sheeba Swaminathan
I'm still not really getting the update rules here.
If you provide some sample input and desired output data, it might get clearer.
Hi Linush,
Please find the sample input and output data below.
I have a table A having following columns
Table A
line_id (PK) groupid element pf scope pid
1 ABC pf1 scope1 pid1
2 DEF pf1 scope1 pid1
3 XYZ pf2 scope1 pid2
4 XXX pf3 sope2 pid3
5 YYY pf4 scope2 pid4
Table B
groupid pf scope pid
201 pf1 scope1 pid1
202 pf1 scope1 pid1
201 pf2 scope1 pid2
pf3 scope2 pid3
203 pf4 scope2 pid4
The update condition here is
if A.element = "ABC", then update A.groupid = (Select groupId from B where A.pf = b.pf)
if A.element = "DEF", then update A.groupId = (Select groupId from B where A.scope = B.scope and A.pid=B.pid)
My aim is to update A comparing the values of A and B.
Initially Table A does not have group idd. After the transformation A would look like
ouptut Table A
line_id (PK) groupid element pf scope pid
1 ABC pf1 scope1 pid1
2 DEF pf1 scope1 pid1
3 XYZ pf2 scope1 pid2
4 ABC pf3 scope2 pid3
5 DEF pf4 scope2 pid4
1) for the first record A.element = ABC and A.pf= B.pf so I need to assign A.groupid=B.group id
2) for the second record A.element='DEC' then do;
if(A.scope=B.scope and A.pid=B.pid ) then A.groupid=B.groupid.
Your data does not support the logic you want to implement.
" if A.element = "ABC", then update A.groupid = (Select groupId from B where A.pf = b.pf)"
The sub-select "Select groupId from B where A.pf = b.pf" returns 2 matching rows from table B with 2 different groupid's (201, 202). So which groupid should it be?
You also want to apply different matching logic based on values of "Element". It appears that this is not table driven but would need the use of hard coded values in your code (which is "ugly"). Also you haven't told us what should happen with "element='XYZ' ".
But first: You need a relationship between your tables suitable for the logic you want to implement - or you have to amend your logic (eg. if there are multiple groupid's possible then take the highest value).
Hi Patrick,
Thanks for the reply.
Apologies for incorrect data.
.....group id can have only unique values or null values.....Also if the element is xyz the condition is different. Yes here i need to harcode the values.
Table A
line_id (PK) groupid element pf scope pid
1 ABC pf1 scope1 pid1
2 DEF pf1 scope1 pid1
3 XYZ pf2 scope1 pid2
4 ABC pf3 sope2 pid3
5 DEF pf4 scope2 pid4
Table B
groupid pf scope pid
201 pf1 scope1 pid1
202 pf1 scope1 pid1
203 pf2 scope1 pid2
pf3 scope2 pid3
204 pf4 scope2 pid4
The update condition here is
if A.element = "ABC", then update A.groupid = (Select groupId from B where A.pf = b.pf)
if A.element = "DEF", then update A.groupId = (Select groupId from B where A.scope = B.scope and A.pid=B.pid)
My aim is to update A comparing the values of A and B.
Initially Table A does not have group idd. After the transformation A would look like
ouptut Table A
line_id (PK) groupid element pf scope pid
1 ABC pf1 scope1 pid1
2 DEF pf1 scope1 pid1
3 XYZ pf2 scope1 pid2
4 ABC pf3 scope2 pid3
5 DEF pf4 scope2 pid4
1) for the first record A.element = ABC and A.pf= B.pf so I need to assign A.groupid=B.group id
2) for the second record A.element='DEC' then do;
if(A.scope=B.scope and A.pid=B.pid ) then A.groupid=B.groupid.
3) for the third record A.element='YZ' then do;
if(A.scope=B.scope ) then A.groupid=B.groupid.
4)
You're missing the point and the data you've posted does still not support your logic.
I suggest you post the 3rd result table with the values for "groupid" populated. May be when you try to pick the values from Table B you will realise where the issue is.
For the first row in Table A and using your logic there are 2 matching rows in Table B (all rows where b.pf='pf1) so there are 2 possible values for groupid. So which one would you pick to populate your 3rd table (the result after the update)?
Hi Patrick,
I got your point. I also want to specify that I need to update table A based on comparison....My result table is again going to be A. No new table needs to be created.
I am reposting the corrected sample data below.....
Table A
line_id (PK) groupid element pf scope pid
1 ABC pf1 scope1 pid1
2 DEF pf2 scope2 pid2
3 XYZ pf3 scope3 pid3
Table B
groupid pf scope pid element
201 pf1 scope1 pid1 ABC
202 pf2 scope2 pid2 DEF
203 pf3 scope3 pid3 XYZ
204 pf4 scope4 pid4 XYZ
) for the first record A.element = ABC and A.pf= B.pf so I need to assign A.groupid=B.group id
2) for the second record A.element='DEF' then do;
if(A.scope=B.scope and A.pid=B.pid ) then A.groupid=B.groupid.
3) for the third record A.element='XYZ' then do;
if(A.scope=B.scope ) then A.groupid=B.groupid.
Output TABLE A
line_id (PK) groupid element pf scope pid
1 201 ABC pf1 scope1 pid1
2 202 DEF pf2 scope2 pid2
3 203 XYZ pf3 scope3 pid3
O.K. - your logic is now going to work with exactly the sample data you've provided. Here the code:
data tableA;
length groupid 8;
if 0 then call missing(groupid);
input line_id (element pf scope pid) (:$8.);
datalines;
1 ABC pf1 scope1 pid1
2 DEF pf2 scope2 pid2
3 XYZ pf3 scope3 pid3
;
run;
data tableB;
input groupid (pf scope pid element) (:$8.);
datalines;
201 pf1 scope1 pid1 ABC
202 pf2 scope2 pid2 DEF
203 pf3 scope3 pid3 XYZ
204 pf4 scope4 pid4 XYZ
;
run;
proc sql;
update tableA as A
set groupid =
(select groupid from tableB as B
where
(A.element='ABC' and A.pf=B.pf)
or (A.element='DEF' and A.scope=B.scope and A.pid=B.pid )
or (A.element='XYZ' and A.scope=B.scope)
)
;
quit;
Now with DIS you can use the SQL Update transformation. I don't have access to DIS right now but it shouldn't be too hard to set it up in a way that it generates code as above in the SQL Update statement.
Thanks a lot Patrick.
I will try this out and post the result.
Regards,
Sheeba Swaminathan
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.