06-01-2016 05:34 AM - edited 06-02-2016 09:41 AM
Which building blocks (nodes) do I need, in which order, and configured in which way, to match data from two different data sources and extract a result with the correct data? Can you please describe the process in detail, from retrieving the data to preparing the end result for further use? (My end result should be in tables on MS SQL Server.)
I am having difficulty understanding how SAS Data Management Studio (DataFlux) works, and I cannot find any documentation online with good, straightforward examples.
06-02-2016 08:14 AM
I'm going to assume the "result with the correct data" you are looking for is the combined "best view" of the matched data from both sources. Here goes:
If you haven't tracked down good examples on the SAS support site, one good alternative source is the online proceedings from past SAS Global Forums.
06-02-2016 08:20 AM
Many thanks for your reply.
I have managed to do everything you described up to and including the clustering part.
Now I am at the point where my clusters have between a few hundred and a few thousand rows.
I would really appreciate it if you could give me a detailed explanation, with a simple example, of the next step: using the Surviving Record Identification node.
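To make clear what I expect from that step, here is a minimal Python sketch of the idea of survivorship outside of DataFlux. It is not how the Surviving Record Identification node is configured. The field names (`cluster_id`, `name`, `street`, `city`) and the rule "keep the row with the most filled-in fields" are only assumptions for illustration.

```python
from collections import defaultdict

def pick_survivors(rows, cluster_key="cluster_id", rank=None):
    """Return one surviving row per cluster as {cluster_id: row}.

    rows: list of dicts that already carry a cluster ID.
    rank: function mapping a row to a sortable score; the highest score wins.
    """
    if rank is None:
        # Assumed rule: prefer the row with the most non-empty fields.
        rank = lambda row: sum(1 for v in row.values() if v not in (None, ""))
    clusters = defaultdict(list)
    for row in rows:
        clusters[row[cluster_key]].append(row)
    # max() keeps the first row on ties, so input order is the tie-breaker.
    return {cid: max(members, key=rank) for cid, members in clusters.items()}

# Hypothetical sample data: two rows in cluster 1, one row in cluster 2.
rows = [
    {"cluster_id": 1, "name": "A. Smith", "street": "", "city": "Graz"},
    {"cluster_id": 1, "name": "Anna Smith", "street": "Main St 5", "city": "Graz"},
    {"cluster_id": 2, "name": "B. Jones", "street": "Elm Rd 2", "city": ""},
]
survivors = pick_survivors(rows)
```

A different rule (for example "most recent record" or "longest value") would only change the `rank` function.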
06-02-2016 08:55 AM - edited 06-02-2016 10:00 AM
In addition to my previous question, I would also like to ask for guidelines on matching the addresses from the Contract_Config table to the AddressMaster table.
The Contract_Config table contains 4 different types of addresses, each with its own set of fields (see the attachment). Each of those 4 types has to be matched against the AddressMaster table, which contains only one set of fields for the address definition (see the attachment).
I am at the stage where I have (a) defined the data connections, and (b) loaded the data with the Data Source nodes. How do I proceed, which nodes do I need to use, and how can I join the 4 different types of addresses from one table with AddressMaster?
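What I have in mind for the join is roughly the following, written as a plain Python sketch rather than as DataFlux nodes: stack the 4 address types into one common set of fields so that every row can be compared with AddressMaster in the same way. The type prefixes, the field names and `CONTRACT_ID` are hypothetical. The real names are in the attachment.

```python
# Hypothetical prefixes for the 4 address types and the shared target fields.
ADDRESS_TYPES = ["BILLING", "DELIVERY", "INSTALLATION", "CONTACT"]
COMMON_FIELDS = ["STREET", "ZIP", "CITY"]

def stack_addresses(contracts, id_field="CONTRACT_ID"):
    """Turn one wide row per contract into one row per (contract, address type)."""
    stacked = []
    for contract in contracts:
        for addr_type in ADDRESS_TYPES:
            # Map e.g. BILLING_STREET -> STREET so all types share one layout.
            record = {f: contract.get(f"{addr_type}_{f}", "") for f in COMMON_FIELDS}
            if not any(record.values()):
                continue  # this address type is empty for this contract
            record[id_field] = contract[id_field]
            record["ADDRESS_TYPE"] = addr_type  # remember where the row came from
            stacked.append(record)
    return stacked

# Hypothetical sample: one contract with two of the four address types filled in.
contracts = [{
    "CONTRACT_ID": 7,
    "BILLING_STREET": "Main St 5", "BILLING_ZIP": "8010", "BILLING_CITY": "Graz",
    "DELIVERY_STREET": "Elm Rd 2", "DELIVERY_ZIP": "1010", "DELIVERY_CITY": "Wien",
}]
stacked = stack_addresses(contracts)
```

Keeping the contract ID and the address type on each stacked row should make it possible to write the match result back to the right column later.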
NOTE: the data in the Contract_Config table is not clean. For example, the LINE_LINE_STREET field contains not only the street but also the house/door number. Does that mean parsing has to be applied before the Data Union step?
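To illustrate the kind of parsing I mean, here is a small Python sketch that splits such a combined value into street and house number. The pattern is only an assumption (street name first, then a number that may carry a letter or a door suffix such as "12a" or "12/3"). Real data will need more rules, and this is not the DataFlux parsing definition.

```python
import re

# Assumed layout: "<street name> <house number>", number starts with a digit.
STREET_RE = re.compile(
    r"^\s*(?P<street>.*?\D)\s*"
    r"(?P<number>\d+\s*[A-Za-z]?(?:\s*[/-]\s*\d+\w*)?)\s*$"
)

def split_street(value):
    """Return (street, house_number); house_number is '' if none is found."""
    match = STREET_RE.match(value or "")
    if not match:
        # No trailing number recognised: keep the whole value as the street.
        return (value or "").strip(), ""
    street = match.group("street").strip(" ,")
    number = match.group("number").replace(" ", "")
    return street, number

# Hypothetical sample values.
examples = [split_street(v) for v in ("Hauptstrasse 12a", "Main Street 12/3", "Ringweg")]
```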
EDIT_1: I have also attached a screenshot of the data job. Is this the correct approach?
I have also attached a screenshot of the Surviving Record Identification node. Can you please help me with its definition?
Looking forward to your reply.
06-02-2016 11:48 AM