Hi ,
The data set UPDATE_EMAIL_ltst has c_id, email_id variables
The data set dcclnt has c_id , email_id and p_c_id variables
c_id, email_id are the key variables
for the below query ,I am getting previous row data for p_c_id instead the current record data.
Can any one help with this.
rsubmit;
data Email ;
set UPDATE_EMAIL_ltst ;
set dcclnt key=Clt_Email /unique;
if _error_=0 then
do;
delete;
end;
else do;
_error_=0;
output ;
end;
run;
endrsubmit;
Sample data, in the form of working data steps, please. Help us help you.
Hi,
The data sample is below,the inputs and the output
In Email Out put data set P_Email_id should be ideally xyz1@gmail.com ,where as it is abc@gmail.com
The code is below
rsubmit;
data a;
input C_ID $ EMAIL_ID $20.;
cards;
C1 abc@gmail.com
C2 xyz@gmail.com
;
run;
endrsubmit;
rsubmit;
data b(index=(Clt_Email=(C_ID email_id)));
input C_ID $ EMAIL_ID $5-18 P_EMAIL_ID $20-33
;
cards;
C1 abc@gmail.com abc@gmail.com
C11 abc1@gmail.com abc1@gmail.com
C2 xyz1@gmail.com xyz1@gmail.com
;
run;
endrsubmit;
rsubmit;
data ab ;
set a ;
set b key=Clt_Email /unique;
if _error_=0 then
do;
delete;
end;
else do;
_error_=0;
output ;
end;
run;
endrsubmit;
This creates your expected dataset without indexes:
data a;
input C_ID $ EMAIL_ID :$20.;
cards;
C1 abc@gmail.com
C2 xyz@gmail.com
;
data b;
input C_ID $ EMAIL_ID :$20. P_EMAIL_ID :$20.;
cards;
C1 abc@gmail.com abc@gmail.com
C11 abc1@gmail.com abc1@gmail.com
C2 xyz1@gmail.com xyz1@gmail.com
;
data ab;
set a;
if _n_ = 1
then do;
length _email_id P_EMAIL_ID $20;
declare hash b (dataset:"b (rename=(email_id=_email_id))");
b.definekey("c_id");
b.definedata("_email_id","p_email_id");
b.definedone();
call missing(_email_id,P_EMAIL_ID);
end;
if b.find() = 0 and _email_id = email_id then delete;
drop _:;
run;
Hi Kurt ,
Thanks for the solution ,the problem with this approach is memory failure .
How can we change the below index merge query so that it will serve the purpose.
rsubmit;
data ab ;
set a ;
set b key=Clt_Email /unique;
if _error_=0 then
do;
delete;
end;
else do;
_error_=0;
output ;
end;
run;
endrsubmit;
Hi Kurt,
Thanks for the solution given ,but when ran it gives the memory failure error.
How can we change the below index merge query so that it will serve the purpose,which is fast as well.
rsubmit;
data ab ;
set a ;
set b key=Clt_Email /unique;
if _error_=0 then
do;
delete;
end;
else do;
_error_=0;
output ;
end;
run;
endrsubmit;
OK, I figured this out without provision of sample data.
So your unsuccessful SET /key= yields a non-zero _ERROR_ . In such cases, you have chosen to reset _ERROR_ to zero, but you also need to set the retained P_C_ID to missing.
Format your code, this is illegible. There's an icon for this to use when pasting.
rsubmit; %* send to remote host;
data EMAIL ; %* create table EMAIL;
set UPDATE_EMAIL_LIST ; %* read table UPDATE_EMAIL_LIST ;
set DCCLNT key=CLTEMAIL /unique; %* read table DCCLNT using key CLTEMAIL ;
if _ERROR_=0 then do; %* if a record is found in DCCLNT ;
delete; %* delete (i.e. ignore) the record read in UPDATE_EMAIL_LIST ;
end;
else do; %* else ;
_ERROR_=0; %* reset _ERROR_ flag ;
output ; %* save the record read in UPDATE_EMAIL_LIST ;
end;
run;
endrsubmit;
So in effect: keep all records from UPDATE_EMAIL_LIST
not present in DCCLNT
If I read that code correctly, you want to delete observations where the key is found in dcclnt?
In which case, a hash solves the issue nicely without needing to rely on the indexes:
data Email ;
set UPDATE_EMAIL_ltst;
if _n_ =1
then do;
declare hash dcc (dataset:"dcclnt");
dcc.definekey("Clt_Email");
dcc.definedone();
end;
if dcc.find() ne 0;
run;
@Kurt_Bremser check() would be faster than find() here 🙂 .
Thanks Kurt ,
But I get the below error.
ERROR: Undeclared key symbol Clt_Email for hash object at line 1019 column 3.
ERROR: DATA STEP Component Object failure. Aborted during the EXECUTION phase.
My code as posted works. Just copy/paste and run it.
Adapt it to your variables, or post example data with the real column names.
Note that the hash keys only on the id and uses the email in a separate comparison.
Thanks Kurt,
I just used the same code as below.
The whole code is as below
rsubmit;
data UPDATE_EMAIL_ltst(keep=client_id email_id );
set UPDATE_EMAIL;
email_id=upcase(event_value);
by client_id last_update_ts;
if last.client_id;
run;
endrsubmit;
rsubmit;
data dcclnt(keep=client_id polisy_email_id email_id index=(Clt_Email=(client_id email_id)));
set data.dcclnt(rename=(clt_key=client_id) where=(valid_flag='1') );
email_id=upcase(email_id);
polisy_email_id=email_id;
run;
endrsubmit;
rsubmit;
data Email_t;
set UPDATE_EMAIL_ltst;
if _n_ =1
then do;
declare hash dcc (dataset:"dcclnt");
dcc.definekey("client_id");
dcc.definedone();
end;
if dcc.find() ne 0;
run;
endrsubmit;
The error is below:
ERROR: Hash object added 32505840 items when memory failure occurred.
FATAL: Insufficient memory to execute DATA step program. Aborted during the EXECUTION phase.
ERROR: The SAS System stopped processing this step because of insufficient memory.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.