Quartz | Level 8

## How to omitt some redundant observations

Hi all,

For some network analysis, I want to check whether two different firms are connected by a mutual person.

For example in the following table, A and C are connected through person X.

Firm     Person

A          x

B          y

C          x

Up to know, I have written this code:

proc sql;

create table want as

select a.firm as firm_a, b.firm as firm_b

from have as a, have as b,

where

a.person=b.person and a.firm ~=b.firm;

quit;

My problem is that in this case, I will get two observations, although they are only one because SAS will match every observation twice so I will get:

Firm_a     Firm_b

A          C

C          A

Obviously they are not duplicate, but redundant anyways, and unfortunately I could not get rid of them easily. How can I remove these multiple observations? Should i alter my code or change it after the new table?

I will appreciate it if you can help me,

Thanks,

1 ACCEPTED SOLUTION

Accepted Solutions
Opal | Level 21

## Re: How to omitt some redundant observations

data have;

input Firm \$ Person \$;

datalines;

A          x

E          y

C          x

D          z

B          y

F          x

;

proc sql;

create table long as

select person, firm, count(distinct firm) as nbFirms

from have

group by person

having count(distinct firm) > 1

order by person, firm;

quit;

data want;

length firms \$200;

do until(last.person);

set long; by person;

firms = catx(",", firms, firm);

end;

keep person firms;

run;

proc print data=want noobs; run;

PG

PG
6 REPLIES 6
Opal | Level 21

## Re: How to omitt some redundant observations

Try instead to count the number of distinct firms that share the same person

proc sql;

create table want as

select person, firm, count(distinct firm) as nbFirms

from have

group by person

having count(distinct firm) > 1;

quit;

PG

PG
Quartz | Level 8

## Re: How to omitt some redundant observations

Thank you very much, PG.

But if I'm not wrong, in this way I won't get the pairs of firms and I'll get the firms individually. I guess for my further work it would be more convenient if I get the paids, like A-C, A-B, etc. Is it possible to change into that?

Thanks again,

Opal | Level 21

## Re: How to omitt some redundant observations

data have;

input Firm \$ Person \$;

datalines;

A          x

E          y

C          x

D          z

B          y

F          x

;

proc sql;

create table long as

select person, firm, count(distinct firm) as nbFirms

from have

group by person

having count(distinct firm) > 1

order by person, firm;

quit;

data want;

length firms \$200;

do until(last.person);

set long; by person;

firms = catx(",", firms, firm);

end;

keep person firms;

run;

proc print data=want noobs; run;

PG

PG
Quartz | Level 8

## Re: How to omitt some redundant observations

Thanks a lot, PG. That's perfect!

Onyx | Level 15

## Re: How to omitt some redundant observations

For a one-step solution, this code uses some undocumented feature (monotonic()) that some may frown upon:

data haVE;

input Firm\$     Person\$;

cards;

A          x

B          y

C          x

;

proc sql;

create table want (drop=pair) as

select a.firm as firm_a, b.firm as firm_b, ifc(a.firm<b.firm, cats(a.firm,b.firm), cats(b.firm,a.firm) )as pair

from have as a, have as b

where

1. a.person=b.person and a.firm ~=b.firm

group by pair

having monotonic()=min(monotonic());

quit;

You can always break it into 2-steps.

Haikuo

Super User

## Re: How to omitt some redundant observations

if I understand what you mean , why not :

call sortc(Firm_a ,Firm_b);

then

proc sort  nodupkey; by Firm_a     Firm_b;run;

Discussion stats
• 6 replies
• 942 views
• 3 likes
• 4 in conversation