hello,
there are few edits in the below code
I am stuck with something please help me out with this..
I have a table as-
wr_no rfa_no stk_ref status
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1
2 1 x fp
1 1 x gd
1 2
1 1 x gp
1 1 x gc
I want to create a flag as 0 for records having status 'nu' i.e for every 'nu' of stk_ref i want to flag 0 'fp' but not ('gd','gp','gc') with reference to my stk_ref ,wr_no and rfa_no.so basically here i want to flag 0 1st and 4th record
because for the same stk_ref i got 'nu' which has unique wr_no and unique rfa_no.there are blank stk_ref and status also for which the flag should be zero.
And the table contains 5billion rows.
Hi,
something like that?
DATA have; input wr_no rfa_no stk_ref $ status $; datalines; 1 1 x fp 1 2 x fp 1 1 y fp 1 1 x nu 2 1 x fp ; run; PROC SQL; CREATE TABLE help AS SELECT a.*,b.flag FROM have a LEFT JOIN (SELECT *,1 AS flag FROM have WHERE status eq 'nu') b ON a.wr_no eq b.wr_no AND a.rfa_no eq b.rfa_no AND a.stk_ref eq b.stk_ref ; quit; data want; set help; where flag ne 1; drop flag; run;
Cheers
- Cheers -
Amazing!!thankyou so much..
just 1 que can you please explain me how for the other status='FP' it takes flag=1.?
- Cheers -
With a data step:
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
proc sort data=have;
by wr_no rfa_no stk_ref;
run;
data want;
merge
have (in=a)
have (in=b rename=(status=_status) where=(_status = 'nu'))
;
by wr_no rfa_no stk_ref;
if a and not b;
drop _status;
run;
Thankyou the code works fine.
Hi,
with 5 billions rows, hash may be a much faster than my SQL:
data _NULL_; if 0 then set have; if _N_ eq 1 then do; declare hash hnu(dataset: 'work.have (where=(status eq "nu")) duplicate: "add"'); hnu.defineKey('wr_no','rfa_no','stk_ref'); hnu.defineDone(); declare hash w(ordered: 'ascending'); w.defineKey('wr_no','rfa_no','stk_ref'); w.defineData('wr_no','rfa_no','stk_ref','status'); w.defineDone(); end; set have end=eof; if hnu.find()>0; if w.add()>=0; if eof then rc=w.output(dataset: 'work.want'); run;
- Cheers -
@Oligolas wrote:
Hi,
with 5 billions rows, hash may be a much faster than my SQL:
Only if you have enough memory available:
assuming that 10 % of observations have status = 'nu', you get 26 * 500,000,000 ~ 12.4 Gbytes raw size for the hash object, without the hash tree itself.
Operations on datasets, OTOH, scale until you run out of disk space, which comes much later.
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;
proc sql;
select *
from have
where catx('|',wr_no ,rfa_no ,stk_ref)
not in (select catx('|',wr_no ,rfa_no ,stk_ref) from have where status='nu');
quit;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.