BookmarkSubscribeRSS Feed
nid197
Obsidian | Level 7

hello,

 

there are few edits in the below code 

 

I am stuck with something please help me out with this..

I have a table as-

 

wr_no    rfa_no   stk_ref     status

1               1          x             fp

1               2          x             fp

1               1          y             fp

1               1          x             nu

2               1

2               1          x             fp

1               1          x             gd

1               2

1               1          x             gp

1               1          x             gc

 

I want to create  a flag as 0 for records having status 'nu' i.e for every  'nu' of  stk_ref i want to flag 0  'fp' but not ('gd','gp','gc') with reference to my stk_ref ,wr_no and rfa_no.so basically here i want to flag 0 1st and 4th record 

because for the same stk_ref i got 'nu' which has unique  wr_no and unique rfa_no.there are  blank stk_ref and status also for which the flag should be zero.

 

And the table contains 5billion rows.

 

 

8 REPLIES 8
Oligolas
Barite | Level 11

Hi,

 

something like that?

DATA have;
input wr_no rfa_no stk_ref $ status $;
datalines;
1 1 x fp
1 2 x fp
1 1 y fp
1 1 x nu
2 1 x fp
;
run;

PROC SQL;
   CREATE TABLE help AS
      SELECT a.*,b.flag
      FROM have a
      LEFT JOIN
        (SELECT *,1 AS flag
         FROM have
         WHERE status eq 'nu') b 
      ON a.wr_no eq b.wr_no
      AND a.rfa_no eq b.rfa_no
      AND a.stk_ref eq b.stk_ref
   ;
quit;

data want;
   set help;
   where flag ne 1;
   drop flag;
run;

Cheers

 

________________________

- Cheers -

nid197
Obsidian | Level 7

Amazing!!thankyou so much..

just 1 que can you please explain me how for the other status='FP' it takes flag=1.?

 

Oligolas
Barite | Level 11
(SELECT *,1 AS flag FROM have WHERE status eq 'nu') b
is a subquery table where every row gets flag=1 and names it 'b' henceforth.

SELECT a.*,b.flag
this selects all columns from 'a' and the column 'flag' from 'b' (which value is 1 everywhere)

The join performed is done by comparing the wr_no, rft_no and stk_ref rows. For each match between table have(a) and the subquery(b), the value of the column flag is appended, that is appended with '1'

Hope this is clear.
There are many join explanations on the web.
Maybe you can refer to:
https://blog.codinghorror.com/a-visual-explanation-of-sql-joins/

Cheers
________________________

- Cheers -

Kurt_Bremser
Super User

With a data step:

data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1               1          x              fp
1               2          x              fp
1               1           y             fp
1               1           x             nu
2               1           x             fp
;
run;

proc sort data=have;
by wr_no rfa_no stk_ref;
run;

data want;
merge
  have (in=a)
  have (in=b rename=(status=_status) where=(_status = 'nu'))
;
by wr_no rfa_no stk_ref;
if a and not b;
drop _status;
run;
nid197
Obsidian | Level 7

Thankyou the code works fine.

Oligolas
Barite | Level 11

Hi,

 

with 5 billions rows, hash may be a much faster than my SQL:

data _NULL_;
   if 0 then set have;
   
   if _N_ eq 1 then do;
      declare hash hnu(dataset: 'work.have (where=(status eq "nu")) duplicate: "add"');
      hnu.defineKey('wr_no','rfa_no','stk_ref');
      hnu.defineDone();
      
      declare hash w(ordered: 'ascending');
      w.defineKey('wr_no','rfa_no','stk_ref');
      w.defineData('wr_no','rfa_no','stk_ref','status');
      w.defineDone();
   end;
   set have end=eof;

   if hnu.find()>0;
   if w.add()>=0;
   if eof then rc=w.output(dataset: 'work.want');
run;

 

________________________

- Cheers -

Kurt_Bremser
Super User

@Oligolas wrote:

Hi,

 

with 5 billions rows, hash may be a much faster than my SQL:

 

 


Only if you have enough memory available:

assuming that 10 % of observations have status = 'nu', you get 26 * 500,000,000 ~ 12.4 Gbytes raw size for the hash object, without the hash tree itself.

Operations on datasets, OTOH, scale until you run out of disk space, which comes much later.

Ksharp
Super User
data have;
input wr_no rfa_no stk_ref $ status $;
cards;
1               1          x              fp
1               2          x              fp
1               1           y             fp
1               1           x             nu
2               1           x             fp
;
run;
proc sql;
select *
 from have
  where catx('|',wr_no ,rfa_no ,stk_ref)
   not in (select catx('|',wr_no ,rfa_no ,stk_ref) from have where status='nu');
quit;

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1162 views
  • 6 likes
  • 4 in conversation