I am having some trouble cleaning these addresses. At first it was easy with a few IF statements, but as my data grows, a chain of IF statements is no longer efficient. I would love to hear your thoughts on how to clean this field.
data have;
  infile datalines truncover;
  input address $200.;
datalines;
100 ABC ST RM S02A
102 ABC STREET FLOOR 1
103 ABC ST APT 3 HOMELESS
1035 CD AVENUE FLOOR 2
108 SOMETHING ST # 2FL
115 VISA VISTA DR APT 212 APT 212
1155 LOOK AVENUE APT 205
12 BORED AVE APT 2
1214 TIRED STREET APT 428
127 HAPPY STREET FLOOR 2
1397 SOMEWHERE STREET FIRST FLOOR
142 SOMETHING ST APT 3
200 RAINBOW AVE UNIT 202
;
run;
I don't want any unit, floor, or apartment number in the clean address, so the address field should look like this:
What is your actual use case? If your address cleaning needs to scale up to enterprise-wide customer volumes, you would be better off using a tool built for this task, such as SAS Data Quality. On the other hand, if you are cleaning a few hundred addresses with a few transformation rules like the ones in your post, you are probably better off persevering with your current approach.
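If you do stay with the rule-based approach, one way to avoid a growing chain of IF statements is to put the rules into a single regular expression and apply it with PRXCHANGE. The following is only a minimal sketch under some assumptions of mine: the dataset names `have_sample` and `want`, the variable `clean_address`, and the keyword list (RM, APT, UNIT, FLOOR, FL, #, plus an optional ordinal such as FIRST before the keyword) are all hypothetical and taken only from the sample rows in the question. It removes everything from the first unit-type keyword to the end of the line, so trailing text such as HOMELESS or a repeated APT 212 goes with it.

```sas
/* A few sample rows copied from the question; the dataset name is hypothetical. */
data have_sample;
  infile datalines truncover;
  input address $200.;
datalines;
100 ABC ST RM S02A
103 ABC ST APT 3 HOMELESS
108 SOMETHING ST # 2FL
115 VISA VISTA DR APT 212 APT 212
1397 SOMEWHERE STREET FIRST FLOOR
200 RAINBOW AVE UNIT 202
;
run;

data want;
  set have_sample;
  length clean_address $200;
  /* Drop the first unit-type keyword (optionally preceded by an ordinal
     such as FIRST or 2ND) and everything after it. The keyword list is
     an assumption based on the sample data; extend it as new patterns
     appear. \b stops FL from matching inside words such as FLOWER. */
  clean_address = strip(prxchange(
    's/\s+((FIRST|SECOND|THIRD|\d+(ST|ND|RD|TH))\s+)?(#|(APT|UNIT|RM|FLOOR|FL)\b).*$//i',
    1, strip(address)));
run;

proc print data=want noobs;
  var address clean_address;
run;
```

Keeping the keywords in one pattern means a new case is one more alternative in the list rather than another IF branch. The trade-off is that a street whose name is itself one of the keywords (for example a hypothetical "12 UNIT ST") would be cut short, so it is worth reviewing the rows where `clean_address` differs from `address` before trusting the result. This sketch also does not standardize ST versus STREET.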