Hello experts!
I have 2 datasets - a Customer dataset and a Retailer dataset.
Customer address information has a column named Custzip which has the Zipcode for each customer.
The retailer has many stores across US and the Retailer dataset has a column named StoreZip which has the Zipcode for each store.
For each customer, I need to compute the distance between the customer and all the stores of the retailer and then select the zipcode which has the least distance (the nearest store for that customer). I want to use the zipcitydistance function provided by SAS to compute distances.
For example, say CustZip looks like this:
ID | CustZip |
1 | 99212 |
2 | 92882 |
3 | 98058 |
4 | 8810 |
5 | 29901 |
6 | 92637 |
7 | 92336 |
8 | 92008 |
9 | 93730 |
10 | 90224 |
11 | 80015 |
12 | 94401 |
13 | 91776 |
14 | 10514 |
15 | 97221 |
16 | 85338 |
17 | 98119 |
18 | 93271 |
19 | 84004 |
20 | 60137 |
And StoreZip looks like this:
ID | StoreZip |
1 | 33146 |
2 | 95678 |
3 | 92101 |
4 | 95128 |
5 | 97086 |
6 | 60173 |
7 | 48083 |
8 | 55425 |
9 | 10601 |
10 | 11590 |
11 | 44124 |
12 | 97015 |
13 | 97204 |
14 | 19406 |
The final output needs to enhance the Custzip structure and add the columns: NearestStoreZip and Distance (as computed by zipcitydistance).
Please let me know if my query is not very clear. I am using SAS 9.4 and any help will be deeply appreciated.
Thanks!
Arko.
It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.