BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Mr_sassy_sug
Fluorite | Level 6

Hi Experts, 

I m looking to create the grp varaible for records where the rep value linked with num value. Grp should get the first num value to link the group. THANK YOU SO MUCH FOR YOUR VALUEBLE TIME ON THIS. 

 

Have this below dataset 

 

data xx;
length sub num rep $20;
sub = '001';num='1';rep = '';output;
sub = '001';num='2';rep = '';output;
sub = '001';num='3';rep = '002';output;
sub = '001';num='4';rep = '';output;
sub = '001';num='5';rep = '003';output;
sub = '001';num='6';rep = '';output;
sub = '002';num='1';rep = '';output;
sub = '002';num='2';rep = '';output;
sub = '002';num='3';rep = '';output;
sub = '002';num='4';rep = '001';output;
run;

subnumrep
0011 
0012 
0013002
0014 
0015003
0016 
0021 
0022 
0023 
0024001

 

and want the below dataset

subnumrepgrp
0011  
0012 grp002
0013002grp002
0014  
0015003grp002
0016  
0021 grp001
0022  
0023  
0024001grp001
1 ACCEPTED SOLUTION

Accepted Solutions
yabwon
Onyx | Level 15

something like this?

data xx;
length sub num rep $20;
sub = '001';num='1';rep = '';output;
sub = '001';num='2';rep = '';output;
sub = '001';num='3';rep = '002';output;
sub = '001';num='4';rep = '';output;
sub = '001';num='5';rep = '003';output;
sub = '001';num='6';rep = '';output;
sub = '002';num='1';rep = '';output;
sub = '002';num='2';rep = '';output;
sub = '002';num='3';rep = '';output;
sub = '002';num='4';rep = '001';output;
run;

data want;
  do _N_ = 1 by 1 until(last.sub);
    set xx;
    by sub;
    min=min(input(rep,best32.),min);
  end;

  do _N_ = 1 to _N_;
    set xx;
    if num = min OR rep NE " " then grp = "grp"!!put(min,z3.);
                               else grp = " ";
    output;
  end;
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



View solution in original post

4 REPLIES 4
yabwon
Onyx | Level 15

something like this?

data xx;
length sub num rep $20;
sub = '001';num='1';rep = '';output;
sub = '001';num='2';rep = '';output;
sub = '001';num='3';rep = '002';output;
sub = '001';num='4';rep = '';output;
sub = '001';num='5';rep = '003';output;
sub = '001';num='6';rep = '';output;
sub = '002';num='1';rep = '';output;
sub = '002';num='2';rep = '';output;
sub = '002';num='3';rep = '';output;
sub = '002';num='4';rep = '001';output;
run;

data want;
  do _N_ = 1 by 1 until(last.sub);
    set xx;
    by sub;
    min=min(input(rep,best32.),min);
  end;

  do _N_ = 1 to _N_;
    set xx;
    if num = min OR rep NE " " then grp = "grp"!!put(min,z3.);
                               else grp = " ";
    output;
  end;
run;
proc print;
run;

Bart

_______________
Polish SAS Users Group: www.polsug.com and communities.sas.com/polsug

"SAS Packages: the way to share" at SGF2020 Proceedings (the latest version), GitHub Repository, and YouTube Video.
Hands-on-Workshop: "Share your code with SAS Packages"
"My First SAS Package: A How-To" at SGF2021 Proceedings

SAS Ballot Ideas: one: SPF in SAS, two, and three
SAS Documentation



Tom
Super User Tom
Super User

I do not totally follow your example.

 

It kind of looks like you are using REP as a pointer to the observation with the matching value of NUM within the same SUB.  But you are doing some type of conversion since REP='002' is NOT the same string as NUM='2'.  Did you intend for NUM and REP to be numeric variables instead of character?

 

Also why does REP='003' get GRP='grp002'?  Why not 'grp003'?

 

Is your intent to treat the (NUM,REP) pairs as edges in a graph with the goal of find the connected sub-graphs?  If so then perhaps you want to use this %SUBNET() macro .  (Note you might need to enhance the macro to allow BY group processing.  Or split the data into separate datasets for each SUB.  Or convert your NUM and REP values into composite values that include the SUB id so they are unique.)

 

Example:

data have ;
  input sub $ num rep;
cards;
001 1 . 
001 2 . 
001 3 002
001 4 . 
001 5 003
001 6 . 
002 1 . 
002 2 . 
002 3 . 
002 4 001
;

%subnet(in=have,out=subnet,from=rep,to=num,subnet=subnet,directed=1);

proc sql;
create table want as 
  select distinct a.*,b.subnet as grp
  from have a left join subnet b
    on a.sub = b.sub and (a.num=b.num or a.num=b.rep )
  order by 1,2
;
quit;

Result

Obs    sub    num    rep    grp

  1    001     1      .      .
  2    001     2      .      2
  3    001     3      2      2
  4    001     4      .      .
  5    001     5      3      2
  6    001     6      .      .
  7    002     1      .      1
  8    002     2      .      .
  9    002     3      .      .
 10    002     4      1      1

 

Mr_sassy_sug
Fluorite | Level 6

Thank you so much, you got it right and it is also working based on the situation, really appreciate your help. 

Ksharp
Super User

I think this question is more complicated that you demostrator with this simple data.

I think it is searching a tree problem.

 

data xx;
length sub num rep $20;
sub = '001';num='1';rep = '';output;
sub = '001';num='2';rep = '';output;
sub = '001';num='3';rep = '002';output;
sub = '001';num='4';rep = '';output;
sub = '001';num='5';rep = '003';output;
sub = '001';num='6';rep = '';output;
sub = '002';num='1';rep = '';output;
sub = '002';num='2';rep = '';output;
sub = '002';num='3';rep = '';output;
sub = '002';num='4';rep = '001';output;
run;

data have;
set xx(where=(num is not missing and rep is not missing));
length from to $ 30;
from=catx('|',sub,input(rep,best.));
to=catx('|',sub,input(num,best.));
keep from to;
run;

data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node;
  output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 20);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full(where=(from is not missing and to is not missing))',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;


/*****Find the ancestor/top  one *******/
proc sql;
create table ancestor as
select node,max(grp) as grp from
(


select a.*,b.grp from want as a left join 
(
select from,'grp'||put(input(scan(from,-1,'|'),best.),z10.) as grp from have where from not in (select to from have)
) as b
on a.node=b.from


) group by household;
quit;

data final_want;
if _n_=1 then do;
 if 0 then set ancestor;
 declare hash h(dataset:'ancestor');
 h.definekey('node');
 h.definedata('grp');
 h.definedone();
end;
set xx;
length from to $ 30;
call missing(grp);
if not missing(rep)  then do;
 from=catx('|',sub,input(rep,best.));rc=h.find(key:from);
end;
if not missing(num) then do;
 to=catx('|',sub,input(num,best.));rc=h.find(key:to);
end;
drop from to rc node;
run;

Ksharp_0-1744702554479.png

 

sas-innovate-white.png

Special offer for SAS Communities members

Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.

 

View the full agenda.

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 274 views
  • 3 likes
  • 4 in conversation