Solved: Appending datasets but must keep unique records by imb_code

buechler66 · Posted 04-04-2016 03:42 PM

Hi. I have the following macro that iterates multiple times identifying various records. Records are unique by IMB_CODE. With each iteration the identified data is appended into my FinalData dataset.

My problem is that the query can identify the same record (with the same IMB_CODE) in later iterations. When this happens I must have a way to only keep the first instance of the record in my FinalData dataset..

Any suggestions?

%macro BuildQueryData(analysis_desc= , rule= , rule_order= );

proc sql;
connect to oracle as db (user="me" password="mepw" path="medb");
create table MyRuleQuery as
select * from connection to db

( select '&analysis_desc' as RULE_NM, b.actual_dlvry_date as AD_DT, b.imb_code, &rule_order as rule_order
from ivprl.bi_spm_piece_iv_recon a, ivprl.bi_spm_piece_bids_recon b
where trunc(a.upload_create_dt) = trunc(sysdate)
and trunc(b.upload_create_dt) = trunc(sysdate)
and a.imb_code = b.imb_code
and a.imb_dlvry_zip_3 in(select imb_dlvry_zip_3 from ivprl.bi_spm_piece_bids_recon)
and a.ml_cl_code in(select ml_cl_code from ivprl.bi_spm_piece_bids_recon)
and a.ml_cat_code in(select ml_cat_code from ivprl.bi_spm_piece_bids_recon)
and a.imb_code not in(select imb_code from MyRuleQueryData where rule_order < 999.1)
and &rule);

disconnect from db;
quit;

proc append base=FinalData data = MyRuleQuery;
run;

%mend BuildQueryData;

data _null_;
set QueryRules;
call execute('%BuildQueryData(analysis_desc='||analysis_desc||' , rule='||rule||' , rule_order='||rule_order||');');
run;

jklaverstijn · Posted 04-04-2016 05:07 PM

It is quite customary to preprocess the data you want to append by filtering out the rows with a key that's already in the base dataset. A simple SQL query should do (untested):

proc sql;
  create table newrows as
  select * form MyRuleQuery where ID
     not in (select distinct IMB_CODE from FinalData);
quit;

proc append base=FinalData data = newrows;
run;

Alternatively you van ceate a unique index on the target table.

PROC DATASETS LIBRARY=yourlib; 
MODIFY Finaldata; 
INDEX CREATE IMB_CODE / UNIQUE NOMISS ;
quit;

This will reject rows that have a key that is already present. Personally I dislike the (ab)use of a unique index (or actually the associated integrity constraint) as an ETL tool, but is very effective. Apply the unique index just the same as a safety net.

Best of luck,

- Jan.

Hope this helps,

- Jan.

View solution in original post

LinusH · Posted 04-04-2016 04:30 PM

proc sort nodupkey

Data never sleeps

buechler66 · Posted 04-04-2016 05:07 PM

But I need to guarentee the first instance inserted into the FinalData dataset is the one that stays. I'm not sure NoDupKey guarentees the first record in is not eliminated.

jklaverstijn · Posted 04-04-2016 05:07 PM

It is quite customary to preprocess the data you want to append by filtering out the rows with a key that's already in the base dataset. A simple SQL query should do (untested):

proc sql;
  create table newrows as
  select * form MyRuleQuery where ID
     not in (select distinct IMB_CODE from FinalData);
quit;

proc append base=FinalData data = newrows;
run;

Alternatively you van ceate a unique index on the target table.

PROC DATASETS LIBRARY=yourlib; 
MODIFY Finaldata; 
INDEX CREATE IMB_CODE / UNIQUE NOMISS ;
quit;

This will reject rows that have a key that is already present. Personally I dislike the (ab)use of a unique index (or actually the associated integrity constraint) as an ETL tool, but is very effective. Apply the unique index just the same as a safety net.

Best of luck,

- Jan.

Hope this helps,

- Jan.

Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Re: Appending datasets but must keep unique records by imb_code

Registration is open

SAS Training: Just a Click Away