hello,
there are few edits in the below code
I am stuck with something please help me out with this..
I have a table as-
wr_no rfa_no stk_ref status
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1
2 1 x fp
1 1 x gd
1 2
1 1 x gp
1 1 x gc
I want to create a flag as 0 for records having status 'nu' i.e for every 'nu' of stk_ref i want to flag 0 'fp' but not ('gd','gp','gc') with reference to my stk_ref ,wr_no and rfa_no.so basically here i want to flag 0 1st and 4th record
because for the same stk_ref i got 'nu' which has unique wr_no and unique rfa_no.there are blank stk_ref and status also for which the flag should be zero.
And the table contains 5billion rows.
Hi,
something like that?
DATA have; input wr_no rfa_no stk_ref $ status $; datalines; 1 1 x fp 1 2 x fp 1 1 y fp 1 1 x nu 2 1 x fp ; run; PROC SQL; CREATE TABLE help AS SELECT a.*,b.flag FROM have a LEFT JOIN (SELECT *,1 AS flag FROM have WHERE status eq 'nu') b ON a.wr_no eq b.wr_no AND a.rfa_no eq b.rfa_no AND a.stk_ref eq b.stk_ref ; quit; data want; set help; where flag ne 1; drop flag; run;
Cheers
- Cheers -
Amazing!!thankyou so much..
just 1 que can you please explain me how for the other status='FP' it takes flag=1.?
- Cheers -
With a data step:
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
proc sort data=have;
by wr_no rfa_no stk_ref;
run;
data want;
merge
have (in=a)
have (in=b rename=(status=_status) where=(_status = 'nu'))
;
by wr_no rfa_no stk_ref;
if a and not b;
drop _status;
run;
Thankyou the code works fine.
Hi,
with 5 billions rows, hash may be a much faster than my SQL:
data _NULL_; if 0 then set have; if _N_ eq 1 then do; declare hash hnu(dataset: 'work.have (where=(status eq "nu")) duplicate: "add"'); hnu.defineKey('wr_no','rfa_no','stk_ref'); hnu.defineDone(); declare hash w(ordered: 'ascending'); w.defineKey('wr_no','rfa_no','stk_ref'); w.defineData('wr_no','rfa_no','stk_ref','status'); w.defineDone(); end; set have end=eof; if hnu.find()>0; if w.add()>=0; if eof then rc=w.output(dataset: 'work.want'); run;
- Cheers -
@Oligolas wrote:
Hi,
with 5 billions rows, hash may be a much faster than my SQL:
Only if you have enough memory available:
assuming that 10 % of observations have status = 'nu', you get 26 * 500,000,000 ~ 12.4 Gbytes raw size for the hash object, without the hash tree itself.
Operations on datasets, OTOH, scale until you run out of disk space, which comes much later.
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
proc sql;
select *
from have
where catx('|',wr_no ,rfa_no ,stk_ref)
not in (select catx('|',wr_no ,rfa_no ,stk_ref) from have where status='nu');
quit;
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.