BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
zana
Calcite | Level 5

Dear Arthur Information of my SAS is: SAS 9.1.3 Service Pack 4. Windows Version 6.1.7601 I see some error when i run your code, as [ERROR 68-185: The function CMISS is unknown, or cannot be accessed]. So thanks zana

Tom
Super User Tom
Super User

Basically you need to build out connected sub-graphs (or sub-networks) of your data.  You could use the SubGraphs macro posted above.  Here is another macro that uses SQL queries to assign nodes to subnets.

%macro subnet

(in=           /* Input table with pairs */

,out=          /* Output table of input with &SUBNET variable added */

,nodes=nodes   /* Output table of nodes with with subnet assigned */

,from=from     /* First variable in the pair */

,to=to         /* Second variable in the pair */

,subnet=subnet /* Name to use for SUBNET variable */

);

/*----------------------------------------------------------------------

SUBNET - Build connected subnets from pairs of nodes.

Start each new subnet by taking a node that is not currently assigned to

a subnet. Grow the subnet by adding nodes that are connected to the

current subnet and are not currently assigned a subnet.

----------------------------------------------------------------------*/

%local nodeid next getnext ;

%let nodeid=0;

%*----------------------------------------------------------------------

Put code to get unassigned node into a macro variable so it can be used

in two places in the program.

-----------------------------------------------------------------------;

%let getnext= select node into :next from &nodes where subnet=.;

proc sql noprint ;

*----------------------------------------------------------------------;

* Get list of all nodes ;

*----------------------------------------------------------------------;

  create table &nodes as

    select distinct . as subnet,&from as node from &in

    union

    select distinct . as subnet,&to as node from &in

  ;

*----------------------------------------------------------------------;

* Get next unassigned node ;

*----------------------------------------------------------------------;

  &getnext;

%do %while (&sqlobs) ;

*----------------------------------------------------------------------;

* Set subnet to next node id ;

*----------------------------------------------------------------------;

  %let nodeid=%eval(&nodeid+1);

  update &nodes set subnet=&nodeid where node=&next;

  %do %while (&sqlobs) ;

*----------------------------------------------------------------------;

* Get list of connect nodes for this subnet ;

*----------------------------------------------------------------------;

    create table new as

      select distinct a.&to as node

        from &in a, &nodes b, &nodes c

        where a.&from = b.node

          and a.&to = c.node

          and b.subnet = &nodeid

          and c.subnet = .

    ;

*----------------------------------------------------------------------;

* Set subnet to next node id ;

*----------------------------------------------------------------------;

    update &nodes set subnet=&nodeid

      where node in ( select node from new )

    ;

  %end;

*----------------------------------------------------------------------;

* Get next unassigned node ;

*----------------------------------------------------------------------;

  &getnext;

%end;

*----------------------------------------------------------------------;

* Create output dataset by adding subnet number. ;

*----------------------------------------------------------------------;

  create table &out as

    select distinct a.*,b.subnet as &subnet

      from &in a , &nodes b

      where a.&from = b.node

  ;

quit;

%mend subnet ;

Here is program to call this for your data.

data have ;

  input from to @@;

cards;

2 4 2 5 2 6 2 8 4 7 4 11 6 9 6 10 6 12 10 15 10 16 13 14 16 17

run;

options mprint;

%subnet(in=have,out=want);

proc sort data=nodes;

  by subnet node ;

run;

data _null_;

  set nodes ;

  by subnet ;

  if first.subnet then put subnet ': ' @ ;

  put node @ ;

  if last.subnet then put ;

run;

And the resulting display:

1 : 2 4 5 6 7 8 9 10 11 12 15 16 17

2 : 13 14

Ksharp
Super User

Tom,

With this dataset, I got a different result with mine.

2 4 2 5 2 6 2 8 4 7 4 11 6 9 6 10 6 12 10 15 10 16 13 14 16 17 99 4

Yours:

1 : 2 4 5 6 7 8 9 10 11 12 15 16 17

2 : 13 14

3 : 99

Mine:

1 : 2 4 5 6 7 8 9 10 11 12 15 16 17 99

2 : 13 14

Tom
Super User Tom
Super User

The part that grows the nets is assuming a directed graph.  You can either change the logic that gets the list to include links that flow in the opposite direction.  Or you can just expand the data to include both versions of the link.

data full;

  set have ;

  output;

  x=from; from=to; to=x; output;

run;

art297
Opal | Level 21

I don't have a way of testing whether all of the hash methods work in 9.1 but, since your data are numbers, you can replace the two functions in 's code that are only available since 9.2:

Both the following, and Tom's suggested code, produce the same result, but Ksharp's suggested use of the hash method runs quite a bit faster:

data have;

  infile cards delimiter='|';

  input subject1 subject2;

  cards;

        2    |    4 

        2    |    5 

        2    |    6 

        2    |    8

        4    |    7 

        4    |    11 

        6    |    9 

        6    |    10

        6    |    12

        10  |    15 

        10  |    16 

        13  |    14 

        16  |    17

;

proc sql noprint;

  select quote(strip(name)), name, count(name)

    into :list1 separated by ',',

         :list2 separated by ' '

         :n

      from dictionary.columns

        where libname='WORK' and

              memname='HAVE'

  ;

quit;

*options compress=yes;

data need(keep=&list2 household);

  declare hash ha(hashexp : 20,ordered : 'a');

  declare hiter hi('ha');

  ha.definekey('count');

  ha.definedata('count',&list1 );

  ha.definedone();

  declare hash _ha(hashexp: 20,ordered : 'a');

  _ha.definekey('key');

  _ha.definedone();

  do until(last);

    set have end=last;

    /*Remove obs which variable's are all missing firstly*/

    if /*c*/nmiss(of &list2) lt &n then do;

      count+1;

      ha.add();

    end;

  end;

  length key $ 40;

  array h{*} $ 40 &list2 ;

  /*copy the first obs from Hash Table HA into PDV*/

  _rc=hi.first();

  do while(_rc eq 0); *until the end of Hash Table HA;

    /*assign a unique cluster flag(i.e. household)*/

    household+1;

    do i=1 to &n;

      /*push not missing value of current obs into another Hash Table _HA*/

      if not missing(h{i}) then do;

        key=h{i};

        _ha.replace();

      end;

    end;

    /*start to run over Hash Table HA ,until can not find any more

    observation which is the same cluster with current observation*/

    do until(x=1);

      x=1;

      /*copy the first obs from Hash Table HA into PDV*/

      rc=hi.first();

      do while(rc=0);

        found=0;

        do j=1 to &n;

          /*find whether any one of value is included in the current obs*/

          key=h{j};

          rcc=_ha.check();

          if rcc =0 then found=1;

        end;

        if found then do;

          /*if any one of value is included,then push the obs which is copied from

          Hash Table HA into Hash Tables _HA,flag it the same cluster with the

          current obs and output it into dataset*/

          do k=1 to &n;

            if not missing(h{k}) then do;

              key=h{k};

              _ha.replace();

            end;

          end;

          output;

          x=0;

          _count=count;*keep this found obs's index;

        end;

        rc=hi.next();

        /*remove the found obs from Hash Table HA,since it has been seared*/

        if found then rx=ha.remove(key : _count);

      end;

    end;

    /*clear up all the index which is the same cluster with the current obs*/

    _ha.clear();

    /*copy the first obs from Hash Table HA into PDV*/

    _rc=hi.first();

  end;

run;

data want;

  set need;

  by household;

  length m $ 400;

  retain m;

  array x{*} $ &list2;

  do i=1 to &n ;

    if not find(m,strip(x{i})) and not missing(x{i}) then m=catx(' ',m,x{i});

  end;

  if last.household then do;

    output;

    call missing(m);

  end;

  keep household m;

run;

Ksharp
Super User

Not realize still use 9.1

data have;
infile cards delimiter='|';
input subject1 $  subject2 $ ;
cards;
        2    |    4  
        2    |    5  
        2    |    6  
        2    |    8
        4    |    7  
        4    |    11  
        6    |    9  
        6    |    10
        6    |    12
        10  |    15  
        10  |    16  
        13  |    14  
        16  |    17
;
run;
proc sort data=have ;by subject1;run;
proc transpose data=have out=test(drop=_name_) ;
by subject1;
var subject2;
run;
proc sql noprint;
select quote(strip(name)),name,count(name) into : list1 separated by ',' , : list2 separated by ' ' , : n
 from dictionary.columns where libname='WORK' and memname='TEST';
quit;
 %put &list1  &list2 &n;


options compress=yes;
data want(keep=&list2 household);
declare hash ha(hashexp : 16);
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('count',&list1 );
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();
do until(last);
set test end=last;
count+1;
ha.add();
end;
length key $ 40;
array h{*} $ 40 &list2 ;
_rc=hi.first();
do while(_rc eq 0); 
household+1;
do i=1 to &n;
if not missing(h{i}) then do; key=h{i}; _ha.replace();end;
end;
do until(x=1);
x=1;
rc=hi.first();
do while(rc=0);
found=0;
do j=1 to &n;
key=h{j};rcc=_ha.check();
if rcc =0 then found=1;
end;
if found then do;
do k=1 to &n;
if not missing(h{k}) then do;
key=h{k};_ha.replace();end;
end;
output;x=0; _count=count;
end;
rc=hi.next();
if found then rx=ha.remove(key : _count);
end;
end;
_ha.clear();
_rc=hi.first();
end;
run;
data final;
 set want;
 by household;
 length m $ 400;
 retain m;
 array x{*} $ &list2;
 do i=1 to &n ;
  if not indexw(m,strip(x{i})) and not missing(x{i}) then m=catx(' ',m,x{i});
 end;
 if last.household then do;output;call missing(m);end;
 keep household m;
run;

Xia Keshan

Message was edited by: xia keshan

zana
Calcite | Level 5

Dear Ksharp,so thanks.

As i said, i use V9.1. Unfortunately by running your coed had some error (e.g., Unknown method CLEAR). Maybe by adding more information (or maybe by replacing input data with other suitable), you can better understand me.

Input data:

a.JPG

expected output:

a1.JPG

Yours sincerely.

zana

Ksharp
Super User

I have no 9.1 to test my code. But you could try this one:

data have;
infile cards delimiter='|';
input subject1 $  subject2 $ ;
cards;
        2    |    4  
        2    |    5  
        2    |    6  
        2    |    8
        4    |    7  
        4    |    11  
        6    |    9  
        6    |    10
        6    |    12
        10  |    15  
        10  |    16  
        13  |    14  
        16  |    17
;
run;
proc sort data=have ;by subject1;run;
proc transpose data=have out=test(drop=_name_) ;
by subject1;
var subject2;
run;
proc sql noprint;
select quote(strip(name)),name,count(name) into : list1 separated by ',' , : list2 separated by ' ' , : n
 from dictionary.columns where libname='WORK' and memname='TEST';
quit;
 %put &list1  &list2 &n;



data want(keep=&list2 household);
declare hash ha(hashexp : 16);
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('count',&list1 );
ha.definedone();

do until(last);
set test end=last;
count+1;
ha.add();
end;
length key $ 40;
array h{*} $ 40 &list2 ;
_rc=hi.first();
do while(_rc eq 0); 

declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

household+1;
do i=1 to &n;
if not missing(h{i}) then do; key=h{i}; _ha.replace();end;
end;
do until(x=1);
x=1;
rc=hi.first();
do while(rc=0);
found=0;
do j=1 to &n;
key=h{j};rcc=_ha.check();
if rcc =0 then found=1;
end;
if found then do;
do k=1 to &n;
if not missing(h{k}) then do;
key=h{k};_ha.replace();end;
end;
output;x=0; _count=count;
end;
rc=hi.next();
if found then rx=ha.remove(key : _count);
end;
end;
_rc=hi.first();
end;
run;
data final;
 set want;
 by household;
 length m $ 400;
 retain m;
 array x{*} $ &list2;
 do i=1 to &n ;
  if not indexw(m,strip(x{i})) and not missing(x{i}) then m=catx(' ',m,x{i});
 end;
 if last.household then do;output;call missing(m);end;
 keep household m;
run;

Xia Keshan

zana
Calcite | Level 5

Thank You Ksharp. This is very helpful.

Ksharp
Super User

Thanks . Tom.

Under Tom's help, I finally got the faster code. Thanks again. TOM  Smiley Happy

data have;
infile cards delimiter='|';
/*from is parent, to is child*/
input from $  to $ ;
cards;
        2    |    4  
        2    |    5  
        2    |    6  
        2    |    8
        4    |    7  
        4    |    11  
        6    |    9  
        6    |    10
        6    |    12
        10  |    15  
        10  |    16  
        13  |    14  
        16  |    17
          99|2
          7|4
          100|20
;
run;
data full;
  set have end=last;
  if _n_ eq 1 then do;
   declare hash h();
    h.definekey('node');
     h.definedata('node');
     h.definedone();
  end;
  output;
  node=from; h.replace();
  from=to; to=node; output;
  node=from; h.replace();
  if last then h.output(dataset:'node');
  drop node;
run;


data want(keep=node household);
declare hash ha(ordered:'a');
declare hiter hi('ha');
ha.definekey('count');
ha.definedata('last');
ha.definedone();
declare hash _ha(hashexp: 16);
_ha.definekey('key');
_ha.definedone();

if 0 then set full;
declare hash from_to(dataset:'full',hashexp:20,multidata:'y');
 from_to.definekey('from');
 from_to.definedata('to');
 from_to.definedone();

if 0 then set node;
declare hash no(dataset:'node');
declare hiter hi_no('no');
 no.definekey('node');
 no.definedata('node');
 no.definedone();
 

do while(hi_no.next()=0);
 household+1; output;
 count=1;
 key=node;_ha.add();
 last=node;ha.add();
 rc=hi.first();
 do while(rc=0);
   from=last;rx=from_to.find();
   do while(rx=0);
     key=to;ry=_ha.check();
      if ry ne 0 then do;
       node=to;output;rr=no.remove(key:node);
       key=to;_ha.add();
       count+1;
       last=to;ha.add();
      end;
      rx=from_to.find_next();
   end;
   rc=hi.next();
end;
ha.clear();_ha.clear();
end;
stop;
run;

    
 
 

    
 
 

Xia Keshan

Message was edited by: xia keshan

zana
Calcite | Level 5

This question answered by TomKari ().

you are a magician Tom.


Hardly thanks for you, Ksharp and all other person that help me.


zana

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 25 replies
  • 1877 views
  • 6 likes
  • 7 in conversation