BookmarkSubscribeRSS Feed
ay02261
Calcite | Level 5

Hello everyone. Trying to figure out what code to use for the following logic. From the dataset created by the code below, I want to exclude rows where the value of dealer_state = the value of customer_state or states adjacent to the dealer_state. For example, if the dealer_state = 'OH', I would want to exclude rows where customer_state = 'OH','KY','IN','MI','WV' and 'PA'. I'm ok with hardcoding the exclusions, but don't know what that code would be. I'm using SAS 7.1 on Windows. Any suggestions? Thanks in advance!

 

proc sql;
create table New_Apps as select
a.application_id ,a.received_date, a.booking_date,a.market_value,
a.application_status,a.product_type,a.collateral_type, a.initial_decision_user_id,
b.dealer_state, c.customer_state, c.ssn

from ilos.application a
left join ilos.company_dealer b
on a.dealer_id = b.dealer_id
left join ilos.application_search c
on a.application_id = c.application_id
where year(datepart(received_date)) > 2014 and
/* if b.dealer_state = 'OH' and c.customer_state not in (OH);*/

;quit;

3 REPLIES 3
PaigeMiller
Diamond | Level 26
where year(datepart(received_date)) > 2014 and (
(b.dealer_state = 'OH' and c.customer_state not in ('OH','KY','IN','MI,'WV'))
or (b.dealer_state = 'IN' and c.customer_state not in ('IL','IN','MI','KY','OH'))
or (b.dealer_state = 'IL' and c.customer state not in (...)) or (...)
)
--
Paige Miller
ballardw
Super User

Just to keep from writing such ugly code I would be tempted to create a format containing the values of the exclusion list and then simplify the where clause a bit using a slightly different logic:

 

Proc format library=work;
value $exclude
'OH' = 'OH   KY   IN   MI  WV'
;
run;

data example;
	input st1 $ st2 $;
	if indexw(put(st1,$exclude.),st2) = 0;
datalines;
OH FL
OH KY
;

The  IN operator wouldn't allow the Put (var, fmt.) as a list builder so use indexw.

 

Then you move all of the ugly lists into the format.

 

 

mkeintz
PROC Star

Why not make an ancillary data set CHK with two variables: ST and NEIGHBORS, such as

data chk;
  input st $2.  neighbors :$&20. ;
datalines;
OH  OH KY IN MI WV
;

Then you can use it very simply in the SQL, as in:

 

...
   where received_date>= '01jan2015:00:00:00'dt and
   b.dealer_state=chk.state and findw(chk.neighbors,c.customer_state)=0
...

You might need to encompass the c.customer_state reference with a TRIM function if it is longer than 2 characters.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------