
Standardization failing on some rows - no error codes


Hello - 

 

On DataFlux DMS 2.7, I am trying to run Standardization -> U.S. Address Verification within a single data job. The job has a single input and a single output, both files (.xlsx input, .csv output). In the output file, I noticed that a number of rows were offset one column to the right. Upon investigation, these rows appear to have failed Standardization, but I'm not sure why.

 

A few example values are below:

 

1320 N BROADWAY - Suite #3
1320 N BROADWAY #1

2211 11TH AVE APT #B
2211 11TH AVE APT #A

 

I'm not sure if it's failing because of the # sign. I understand it is a reserved character for regex. If that is the problem, how do I fix it? If not, what's happening and how do I fix that? Thank you much!
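
To pin down exactly which output rows are shifted, one option is to count the fields in each row of the .csv output and compare against the header. Below is a minimal Python sketch of that check, run outside DMS. The function name `find_misaligned_rows`, the column names, and the city value in the sample are all hypothetical and only there for illustration; the real output file would be read in place of the `sample` string.

```python
import csv
import io

def find_misaligned_rows(csv_text, delimiter=","):
    """Return (row_number, field_count) for rows whose field count differs from the header."""
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    header = next(reader)
    expected = len(header)
    bad = []
    for row_number, row in enumerate(reader, start=2):  # the header is row 1
        if len(row) != expected:
            bad.append((row_number, len(row)))
    return bad

# Hypothetical sample: the last row has an extra empty field,
# which pushes every later value one column to the right.
sample = (
    "id,address,city\n"
    "1,2211 11TH AVE APT #A,SPRINGFIELD\n"
    "2,,1320 N BROADWAY #1,SPRINGFIELD\n"
)
print(find_misaligned_rows(sample))  # [(3, 4)]
```

This only flags rows with a different number of fields than the header. A row that is shifted but still has the expected field count would need a content-based check instead.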


Re: Standardization failing on some rows - no error codes

Hi,

 

Can you provide some more details?

 

  • Are you using the "Standardization" node or the "Address Verification (US/Canada)" node in DM Studio, or maybe both? Can you attach an image of your data job flow?
  • If you are using the "Standardization" node, what QKB locale and definition have you chosen to use?

 

If you are using the "Standardization" node, you can use the QKB customization editor (find it under Administration > Quality Knowledge Bases > QKB CI) to open the standardization definition and see which transformations are applied to sample input you provide. If it is indeed a regex problem, you can use the Regex Library Editor to modify the regex libraries used in the QKB standardization definition.


Ron
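
As a quick sanity check on the # question, the sample addresses can be run through a small regex test outside DM Studio. The sketch below uses Python's `re` module, where # is an ordinary literal character unless the verbose flag (`re.VERBOSE`) is set. This shows only how Python treats the character; whether the regex libraries in a given QKB definition behave the same way is an assumption to confirm in the Regex Library Editor. The pattern and the `extract_unit` helper are hypothetical.

```python
import re

# Sample values taken from the question above.
addresses = [
    "1320 N BROADWAY - Suite #3",
    "1320 N BROADWAY #1",
    "2211 11TH AVE APT #B",
    "2211 11TH AVE APT #A",
]

# Hypothetical pattern: a literal '#' followed by a unit identifier at the end of the string.
unit_pattern = re.compile(r"#\s*(\w+)$")

def extract_unit(address):
    """Return the unit identifier after '#', or None if the pattern does not match."""
    match = unit_pattern.search(address)
    return match.group(1) if match else None

units = [extract_unit(a) for a in addresses]
print(units)  # ['3', '1', 'B', 'A']
```

All four values match here with an unescaped #, so in this regex flavor the character alone would not cause a failure.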
