Is it possible to provide an extract of the dataset you have? I think it might be preferable to have separate variables for number, street name, city, region, country etc. That would make searching and creating new variables more straightforward. At the moment I would be concerned that there is too much potential overlap between lines like if index(address,"GERMANY")>0 THEN countyname_1='Mainland Europe'; and if index(address,"NY")>0 THEN countyname_1='USA and Canada'; because "GERMANY" contains the substring "NY". If they were ordered as above (I know they aren't in your example), instances of 'Germany' could be allocated to 'USA and Canada'. Also, is it possible for these strings to appear in street names and so generate false matches? Maybe the data could be read in differently if the structure of the dataset does indeed turn out to be an issue?
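As a rough sketch of what I mean, and only under some assumptions: the sample addresses below are made up, I am assuming the parts of each address are comma-separated with the country last, and the variable names last_part, region, ny_word and ny_substr are just for illustration. It compares a whole-word search with FINDW against the substring search with INDEX, and assigns the region from the last component only, so a street name should not trigger a match.

```sas
/* Hypothetical sample data: these addresses are invented for illustration. */
data addresses;
    infile datalines truncover;
    input address $char60.;
    datalines;
12 HAUPTSTRASSE, BERLIN, GERMANY
350 FIFTH AVENUE, NEW YORK, NY, USA
8 GERMANY ROAD, ALBANY, NY, USA
;
run;

data regions;
    set addresses;
    length last_part $20 region $20;

    /* Assumption: the last comma-separated component is the country. */
    last_part = upcase(strip(scan(address, -1, ',')));

    /* Assign the region from the country component only. */
    select (last_part);
        when ('GERMANY')       region = 'Mainland Europe';
        when ('USA', 'CANADA') region = 'USA and Canada';
        otherwise              region = 'Unknown';
    end;

    /* Whole-word search: NY must be delimited by blanks or commas. */
    ny_word = findw(address, 'NY', ' ,') > 0;

    /* Substring search, as in the original code: also matches the NY in GERMANY. */
    ny_substr = index(address, 'NY') > 0;
run;

proc print data=regions;
run;
```

If your real data is not consistently delimited this will not work as written, which is why an extract would help.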