BookmarkSubscribeRSS Feed
kmc9731
Calcite | Level 5

The address_1, address_2 and address_3 are the raw data collected in hand and they summed up together to be the complete whole address. And now I have to like sort the whole address into the corresponding data columns like flat, floor, building, street and district. I have tried to use the if then statement but it was too troublesome to do so and took me long time to execute like the whole data set contains like thousands of items. Can array and do loop help to do the task?

 

Input:

ADDRESS_1

ADDRESS_2

ADDRESS_3

ROOM 11 23/F

61-63 YYY RD

District A

ROOM 10 1/F ABC BLDG

33-37 ZZZ STREET

District B

UNIT A1 G/F

QWE SQUARE

29 XYZ RD District A

ROOM 16 3/F WATER HOUSE

46 XXX ROAD

District C

 

Output:

floor

flat

building

street

district

23

11

 

61-63 YYY RD

A

1

10

ABC BLDG

33-37 ZZZ STREET

B

G

A1

QWE SQUARE

29 XYZ RD

A

3

16

WATER HOUSE

46 XXX ROAD

C

 

Thanks!

2 REPLIES 2
PeterClemmensen
Tourmaline | Level 20

This seems to be quite a messy task. Some if it is pretty straight forward. However, other parts are difficult to put into programming logic.

 

For example, how exactly is the street defined? Or building?

ballardw
Super User

See if this example gives you enough clues to get the floor, flat and building information.

If this works the district should be easy enough using the patterns here. (Hint: index looking for 'District')

 

This relies on the Floor information ALWAYS as the 3rd element, always has a / in the middle, room is always second item, that building is the bit after the floor information and spaces always separate them.

data example;
   address_1='ROOM 10 1/F ABC BLDG';
   length floor flat $ 5 ;
   floor= scan(scan(address_1,3,' '),1,'/');
   flat = scan(address_1,2,' ');
   building = substr(address_1, index(address_1,scan(address_1,4,' ')));
run;

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is ANOVA?

ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 2 replies
  • 1745 views
  • 1 like
  • 3 in conversation