DataFlux - Data Exploration: MAP vs REPORT/Field Match

New Contributor
Posts: 2

Hello everyone,

 

I found differences in the results between the "Map" tab and the "Report" tab after running a Data Exploration. I thought they had to be the same and would appreciate an explanation. The attached PDF contains screenshots.

Using the Virtual Lab, I opened DataFlux Management Studio (2.6, QKB 2.6) and did the following:

 

* Created repository "Basics Demos"

* Created a new Data Exploration Ch3D2_dfConglomerate_DataExploration

  - used dfConglomerate Gifts and dfConglomerate Grocery as source data (see page 1)

  - checked "Field Name Matching"; Locale: "English (US)"; Match def.: "Field Name"; Sensitivity: 85 (see page 1)

* Ran the Data Exploration

* In the "Map" tab, clicking the field "ID" (table Products; dfConglomerate Gifts) highlighted 9 green lines, meaning 9 similar fields were found (see page 2: Products/ID is selected by left click, Orders/Customer ID by mouse over). These fields are:

  - ID in Employees (dfConglomerate Gifts)

  - ID in Customers (dfConglomerate Gifts)

  - ID in MANUFACTURERS (dfConglomerate Grocery)

  - MANUFACTURER_ID in BREAKFAST_ITEMS (dfConglomerate Grocery)

  - ITEM_ID in BREAKFAST_ITEMS (dfConglomerate Grocery)

  - ID in BREAKFAST_ITEMS (dfConglomerate Grocery)

  - ID in Order Details (dfConglomerate Gifts)

  - EMPLOYEE ID in Orders (dfConglomerate Gifts)

  - CUSTOMER ID in Orders (dfConglomerate Gifts)

* In the "Report" tab, under the "Field Match" riser bar, clicking the field "ID" (table Products; dfConglomerate Gifts) listed only 6 similar fields (see page 3). All 6 are included in the list above. The 3 missing fields are:

  - CUSTOMER ID in Orders (dfConglomerate Gifts) (the one selected on page 2 by mouse over)

  - ITEM_ID in BREAKFAST_ITEMS (dfConglomerate Grocery)

  - MANUFACTURER_ID in BREAKFAST_ITEMS (dfConglomerate Grocery)
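
For readers without the PDF, the discrepancy can be restated as a plain set difference. This is only a minimal Python sketch, not DataFlux code: the names `map_matches` and `report_matches` are made up for illustration, and the six Report entries are inferred from the post (the nine Map fields minus the three reported missing), not copied from the Report tab itself.

```python
# (table, field) pairs highlighted on the "Map" tab for Products/ID,
# as listed in the post.
map_matches = {
    ("Employees", "ID"),
    ("Customers", "ID"),
    ("MANUFACTURERS", "ID"),
    ("BREAKFAST_ITEMS", "MANUFACTURER_ID"),
    ("BREAKFAST_ITEMS", "ITEM_ID"),
    ("BREAKFAST_ITEMS", "ID"),
    ("Order Details", "ID"),
    ("Orders", "EMPLOYEE ID"),
    ("Orders", "CUSTOMER ID"),
}

# Pairs shown under "Report" > "Field Match" for the same field.
# Assumption: inferred as the Map list minus the three missing fields.
report_matches = {
    ("Employees", "ID"),
    ("Customers", "ID"),
    ("MANUFACTURERS", "ID"),
    ("BREAKFAST_ITEMS", "ID"),
    ("Order Details", "ID"),
    ("Orders", "EMPLOYEE ID"),
}

# Fields the Map tab links to Products/ID but the Report tab does not list.
missing_from_report = map_matches - report_matches

print(len(map_matches), "on Map,", len(report_matches), "on Report")
for table, field in sorted(missing_from_report):
    print(f"missing from Report: {field} in {table}")
```

Every field missing from the Report has a compound name, although EMPLOYEE ID in Orders, also a compound name, does appear there.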

 

If I'm not mistaken, the number of similar fields should be the same in both cases, shouldn't it?

Thanks and best regards,

Lu

SAS Moderator
Posts: 30

Re: DataFlux - Data Exploration: MAP vs REPORT/Field Match

Luhan,

This is very observant, and it is something we are going to need to look into.

-theresa

SAS Employee
Posts: 25

Re: DataFlux - Data Exploration: MAP vs REPORT/Field Match

Hi Luhan,

 

I was able to replicate this with the latest Data Management Studio and QKB versions. We will investigate.

 

Thanks for reporting.

Audrey
