Hi,
I have a large dataset containing distances of different stores to a customer. My dataset looks something like this:
Customer_ID Retail_Store_1 Retail_Store_2 Retail_Store_3 Closest_Store
1234 0.56 0.78 3.1 Retail_Store_1
I am trying to split this dataset into 3 datasets for each retail store. I want to split by closest store and Retail_Store_# < 10 miles.
I used the obs=1000 to limit it but it is affecting my where clause. There are scenarios where the dataset is less than 1000 and I want to add more records to it. I want to check the remaining records in the original dataset and check again for the closest store and less than 10 miles until count reaches 1000 or there is no more records left.
Below is my code to search for the closest store:
%macro find_closest_store(county, numrows);
%let county = &county;
%let numrows = &numrows;
data closest_store;
set &county.data
array Retail_Store(&numrows) Retail_Store_1-Retail_Store_&numrows;
&county._min_value=min(of Retail_Event(*));
closest_store=vname(Retail_Store(whichn(&county._min_value, of Retail_Store(*))));
run;
%mend find_closest_store;
%find_closest_store(county1, 10);
My code for split (I had a macro but it was a long code so I just took a piece of it)
proc sql;
create table county_Retail_Store_1 as
select
from county_data (obs=1000)
where closest_store = 'Retail_Store_1' and Retail_Store_1 < 10;
quit;
However, this obs statement is throwing off the where clause and I am getting results for distances > 10 as well.
These are the steps that I want to take but I am running into issues.
Any help will be appreciated.
Many Thanks!
You report that the "obs=1000" option in this code
proc sql;
create table county_Retail_Store_1 as
select
from county_data (obs=1000)
where closest_store = 'Retail_Store_1' and Retail_Store_1 < 10;
quit;
is "throwing off the where clause and I am getting results for distances > 10 as well".
Are you saying that in the absence of the "obs=1000", you only get observations with Retail_Store_1<10, but in using the obs= parameter you start getting instances of Retail_Store_1>=10? I'll believe it when I see it.
In your particular example code, you don't have an variables listed in the select clause (or even "select *"). SQL would throw an error and stop processing the request. If you already had a table named County_retail_store_1, it would not be replaced. And if that prior table had distances over 10, they would still be there, wholly unrelated to any "obs=1000" parameter.
By the way, you could create all three datasets in one process, using a DATA step, as in:
data ret1 ret2 ret3;
set county_data;
if closest_store='Retail_Store_1' and retail_store_1<10 then output ret1; else
if closest_store='Retail_Store_2' and retail_store_1<10 then output ret2; else
if closest_store='Retail_Store_3' and retail_store_1<10 then output ret3;
run;
To replicate this via PROC SQL, you'd have to read the data three times, one for each result table. But the code above reads it only once to produce the same 3 "tables".
Edited correction - the 2nd and 3rd IF statement used the wrong variables in the "<10" comparison. It should be:
data ret1 ret2 ret3;
set county_data;
if closest_store='Retail_Store_1' and retail_store_1<10 then output ret1; else
if closest_store='Retail_Store_2' and retail_store_2<10 then output ret2; else
if closest_store='Retail_Store_3' and retail_store_3<10 then output ret3;
run;
Thanks for your response. I actually wrote that piece of code in a hurry. I had a * after select. I did see results for more than 10. I am not sure why it would give me that because when I remove the (obs=1000), I do get proper results.
Thanks a lot for your suggestion on data step instead proc sql, I will use that instead.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.