How to create multiple datasets for each zip code and get text clustering output

VISHALKAPASI — Wed, 13 Feb 2019 14:35:06 GMT

Hello Everyone..

I am using SAS Enterprise Miner 12.1 with text parsing and text filtering, and finally getting text clustering results. I am happy with the outcome: the unstructured Address data is nicely classified into appropriate clusters.

Now I want this intelligent clustering within each zip code. Typically one city has more than 50 zip codes (and there are multiple cities within the country).

The sample data lines are (For Example)

Address                              Customer_id  Zipcode
infotech park, andheri               1            400701
nearby nandkamal garden, juhu        2            400701
people colony, sion                  500          400701
industrial tower, turbhe             501          400702
government quarters, belapur         502          400702
saint international school, vashi    1000         400702

and so on

50 zip codes (having more than 500 customers within each zipcode)

So is it possible to create a loop in which SAS creates one dataset per zip code (50 zip codes would give 50 datasets)? Or, alternatively, can text clustering be run grouped by zip code? For each zip code we typically get 20-25 clusters, so the final output would have about 20*50, i.e. 1000 clusters, grouped by zip code.

This would be really helpful. I hope I have explained the problem clearly.

Regards

Vishal Kapasi

Re: How to create multiple datasets for each zip code and get text clustering output

Rick_SAS — Wed, 13 Feb 2019 15:27:22 GMT

Although I am not familiar with the software you are using, the "SAS Way" to handle this is to use BY-group processing to analyze each ZIP code. Check your documentation to see if it has an example that demonstrates the "BY statement", "BY processing", or "group processing".
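
For illustration, here is a minimal sketch of both ideas in Base SAS code (not the Enterprise Miner interface). It assumes a hypothetical table WORK.ADDRESSES with the columns Address, Customer_id and Zipcode from the question, and that Zipcode is numeric. The macro name split_by_zip and the output names ZIP_<zipcode> are made up for this example. Whether the text clustering step itself accepts a BY statement depends on the procedure or node being used, so that part is not shown.

```sas
/* 1. BY-group processing: sort once, then any procedure that        */
/*    supports a BY statement runs separately for each ZIP code.      */
/*    WORK.ADDRESSES is an assumed input table name.                  */
proc sort data=work.addresses out=work.addresses_sorted;
    by Zipcode;
run;

/* PROC PRINT is only a stand-in to show the BY pattern. */
proc print data=work.addresses_sorted;
    by Zipcode;
    var Customer_id Address;
run;

/* 2. Alternative: a macro loop that writes one dataset per ZIP code. */
/*    Assumes Zipcode is numeric; a character variable would need     */
/*    quotes around the value in the WHERE clause.                    */
%macro split_by_zip(data=work.addresses, var=Zipcode);
    %local i nzip;

    /* Store each distinct ZIP code in macro variables zip1, zip2, ... */
    /* The upper bound 9999 is arbitrary; only as many variables as    */
    /* there are distinct values get created.                          */
    proc sql noprint;
        select distinct &var
            into :zip1 - :zip9999
            from &data;
    quit;
    %let nzip = &sqlobs;   /* number of distinct ZIP codes found */

    /* Create WORK.ZIP_400701, WORK.ZIP_400702, and so on */
    %do i = 1 %to &nzip;
        data work.zip_&&zip&i;
            set &data;
            where &var = &&zip&i;
        run;
    %end;
%mend split_by_zip;

%split_by_zip()
```

The BY-group form is usually preferable when the analysis step supports it, since it keeps a single dataset and a single pass of code. The macro loop is a fallback for steps that can only take one dataset at a time.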
