Help using Base SAS procedures

How to omitt some redundant observations

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 75
Accepted Solution

How to omitt some redundant observations

Hi all,

For some network analysis, I want to check whether two different firms are connected by a mutual person.

For example in the following table, A and C are connected through person X.

Firm     Person    

A          x

B          y

C          x

Up to know, I have written this code:

proc sql;

create table want as

select a.firm as firm_a, b.firm as firm_b

from have as a, have as b,

where

a.person=b.person and a.firm ~=b.firm;

quit;

My problem is that in this case, I will get two observations, although they are only one because SAS will match every observation twice so I will get:

Firm_a     Firm_b

A          C

C          A

Obviously they are not duplicate, but redundant anyways, and unfortunately I could not get rid of them easily. How can I remove these multiple observations? Should i alter my code or change it after the new table?

I will appreciate it if you can help me,

Thanks,


Accepted Solutions
Solution
‎07-20-2014 09:18 PM
Respected Advisor
Posts: 4,753

Re: How to omitt some redundant observations

Just add a step:

data have;

input Firm $ Person $;

datalines;

A          x

E          y

C          x

D          z

B          y

F          x

;

proc sql;

create table long as

select person, firm, count(distinct firm) as nbFirms

from have

group by person

having count(distinct firm) > 1

order by person, firm;

quit;

data want;

length firms $200;

do until(last.person);

    set long; by person;

    firms = catx(",", firms, firm);

    end;

keep person firms;

run;

proc print data=want noobs; run;

PG

PG

View solution in original post


All Replies
Respected Advisor
Posts: 4,753

Re: How to omitt some redundant observations

Try instead to count the number of distinct firms that share the same person

proc sql;

create table want as

select person, firm, count(distinct firm) as nbFirms

from have

group by person

having count(distinct firm) > 1;

quit;

PG

PG
Frequent Contributor
Posts: 75

Re: How to omitt some redundant observations

Thank you very much, PG.

But if I'm not wrong, in this way I won't get the pairs of firms and I'll get the firms individually. I guess for my further work it would be more convenient if I get the paids, like A-C, A-B, etc. Is it possible to change into that?

Thanks again,

Solution
‎07-20-2014 09:18 PM
Respected Advisor
Posts: 4,753

Re: How to omitt some redundant observations

Just add a step:

data have;

input Firm $ Person $;

datalines;

A          x

E          y

C          x

D          z

B          y

F          x

;

proc sql;

create table long as

select person, firm, count(distinct firm) as nbFirms

from have

group by person

having count(distinct firm) > 1

order by person, firm;

quit;

data want;

length firms $200;

do until(last.person);

    set long; by person;

    firms = catx(",", firms, firm);

    end;

keep person firms;

run;

proc print data=want noobs; run;

PG

PG
Frequent Contributor
Posts: 75

Re: How to omitt some redundant observations

Thanks a lot, PG. That's perfect!

Respected Advisor
Posts: 3,147

Re: How to omitt some redundant observations

For a one-step solution, this code uses some undocumented feature (monotonic()) that some may frown upon:

data haVE;

input Firm$     Person$;

cards;

A          x

B          y

C          x

;

proc sql;

create table want (drop=pair) as

select a.firm as firm_a, b.firm as firm_b, ifc(a.firm<b.firm, cats(a.firm,b.firm), cats(b.firm,a.firm) )as pair

from have as a, have as b

where

  1. a.person=b.person and a.firm ~=b.firm

group by pair

having monotonic()=min(monotonic());

quit;


You can always break it into 2-steps.


Haikuo

Super User
Posts: 9,775

Re: How to omitt some redundant observations

if I understand what you mean , why not :

call sortc(Firm_a ,Firm_b);

then

proc sort  nodupkey; by Firm_a     Firm_b;run;

🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 302 views
  • 3 likes
  • 4 in conversation