hello,
there are few edits in the below code
I am stuck with something please help me out with this..
I have a table as-
wr_no rfa_no stk_ref status
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1
2 1 x fp
1 1 x gd
1 2
1 1 x gp
1 1 x gc
I want to create a flag as 0 for records having status 'nu' i.e for every 'nu' of stk_ref i want to flag 0 'fp' but not ('gd','gp','gc') with reference to my stk_ref ,wr_no and rfa_no.so basically here i want to flag 0 1st and 4th record
because for the same stk_ref i got 'nu' which has unique wr_no and unique rfa_no.there are blank stk_ref and status also for which the flag should be zero.
And the table contains 5billion rows.
Hi,
something like that?
DATA have;
input wr_no rfa_no stk_ref $ status $;
datalines;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
PROC SQL;
CREATE TABLE help AS
SELECT a.*,b.flag
FROM have a
LEFT JOIN
(SELECT *,1 AS flag
FROM have
WHERE status eq 'nu') b
ON a.wr_no eq b.wr_no
AND a.rfa_no eq b.rfa_no
AND a.stk_ref eq b.stk_ref
;
quit;
data want;
set help;
where flag ne 1;
drop flag;
run;Cheers
- Cheers -
Amazing!!thankyou so much..
just 1 que can you please explain me how for the other status='FP' it takes flag=1.?
- Cheers -
With a data step:
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
proc sort data=have;
by wr_no rfa_no stk_ref;
run;
data want;
merge
have (in=a)
have (in=b rename=(status=_status) where=(_status = 'nu'))
;
by wr_no rfa_no stk_ref;
if a and not b;
drop _status;
run;
Thankyou the code works fine.
Hi,
with 5 billions rows, hash may be a much faster than my SQL:
data _NULL_;
if 0 then set have;
if _N_ eq 1 then do;
declare hash hnu(dataset: 'work.have (where=(status eq "nu")) duplicate: "add"');
hnu.defineKey('wr_no','rfa_no','stk_ref');
hnu.defineDone();
declare hash w(ordered: 'ascending');
w.defineKey('wr_no','rfa_no','stk_ref');
w.defineData('wr_no','rfa_no','stk_ref','status');
w.defineDone();
end;
set have end=eof;
if hnu.find()>0;
if w.add()>=0;
if eof then rc=w.output(dataset: 'work.want');
run;
- Cheers -
@Oligolas wrote:
Hi,
with 5 billions rows, hash may be a much faster than my SQL:
Only if you have enough memory available:
assuming that 10 % of observations have status = 'nu', you get 26 * 500,000,000 ~ 12.4 Gbytes raw size for the hash object, without the hash tree itself.
Operations on datasets, OTOH, scale until you run out of disk space, which comes much later.
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
proc sql;
select *
from have
where catx('|',wr_no ,rfa_no ,stk_ref)
not in (select catx('|',wr_no ,rfa_no ,stk_ref) from have where status='nu');
quit;
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.