BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
LoveSAS
Calcite | Level 5

My data structure is like this

ID         TYPE

A          K1

A                    K2

A                    K3

B                    K3

C                    K2

C                    K3

D                    K2

I want to match, for each unique ID, how many other unique ID's that have the same type. For example, the output for this data would be

ID                    NUMBEROFID

A            3                                     (A contains K1,K2,K3. Nobody else is K1 so we don't need to consider K1.

                                                      Because there are C and D,except A, that are K2. There are B and C,except A,

                                                      that are K3 => the unique members, hence, are=> B C D, return 3 )

B            2                                     (Because B has only one type K3, so the only member that has K3 except B are => A and C. So return 2  )

C            2                                    (C contains K2 and K3. The member that has K2, except C, are A,D. The member that has K3,

                                                      except C, are A,B => Member are A,B,D, return 3)

D            2                                    (Same reasoning => A,C, return 2)

Please help! Thank you in Advance Smiley Happy

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

You can do this with SQL by joining the table with itself.

data have ;

  input (id type) ($) @@;

cards;

A K1 A K2 A K3 B K3 C K2 C K3 D K2

;;;;

proc sql noprint ;

create table want as

  select a.id,count(distinct b.id) as numberofid

  from have a left join have b

  on a.type=b.type and a.id ^= b.id

  group by a.id

;

quit;

id=A numberofid=3

id=B numberofid=2

id=C numberofid=3

id=D numberofid=2

View solution in original post

7 REPLIES 7
Tom
Super User Tom
Super User

You can do this with SQL by joining the table with itself.

data have ;

  input (id type) ($) @@;

cards;

A K1 A K2 A K3 B K3 C K2 C K3 D K2

;;;;

proc sql noprint ;

create table want as

  select a.id,count(distinct b.id) as numberofid

  from have a left join have b

  on a.type=b.type and a.id ^= b.id

  group by a.id

;

quit;

id=A numberofid=3

id=B numberofid=2

id=C numberofid=3

id=D numberofid=2

art297
Opal | Level 21

Tom,

That's not what the OP was asking for, but close.  I think the OP was asking for a similar join, but based on the transposed table:

data have ;

  input (id type) ($) @@;

cards;

A K1 A K2 A K3 B K3 C K2 C K3 D K2

;;;;

proc transpose data=have

  out=need (drop=_:);

  var type;

  id type;

  by id;

run;

I'd be interested to see a sql solution that, given the above data, the correct count is achieved.

Tom
Super User Tom
Super User

Art - Not sure what you mean.  I got the numbers that match the english description. For id=C the number the OP listed 2, but if you read the text it should be 3.

- Tom

art297
Opal | Level 21

Tom, after reviewing your code and result I agree with you.

PGStats
Opal | Level 21

I had read the question as the calculation of a sort of group membership: how many poeple share how many characteristics with others in the group, as in the rephrasing:

data have ;
  input (id type) ($) @@;
cards;
Paul Smokes
Paul OwnsACar
Paul PlaysGolf
Henry PlaysGolf
Susan OwnsACar
Susan PlaysGolf
George OwnsACar
;;;;

/* An interpretation which fits Tom's proposed solution */

proc sql noprint ;
create table want as
  select a.id,count(distinct b.id) as fitsTheClub
  from have a left join have b
  on a.type=b.type and a.id ^= b.id
  group by a.id
;
quit;

PG

PG
LoveSAS
Calcite | Level 5

Thank you very much guys! Sorry for my typo for C case. It should be 3 as Tom suggested. 

Would it be possible for you to explain the code?

proc sql noprint ;

create table want as
  select a.id,count(distinct b.id) as numberofid
  from have a left join have b
  on a.type=b.type and a.id ^= b.id
  group by a.id
;
quit;

I would like to learn how you come up with these steps.

art297
Opal | Level 21

Tom's code is fairly easy to understand if you add in some of the extra variables that he didn't include in the file.  e.g., if you run:

proc sql noprint ;

create table want as

  select a.id,count(distinct b.id) as numberofid

   ,a.type,b.id as b_id

  from have a left join have b

  on a.type=b.type and a.id ^= b.id

  group by a.id

;

quit;

you would get the following file:

id    numberofid    type    b_id

A          3         K3      B

A          3         K2      C

A          3         K2      D

A          3         K3      C

A          3         K1

B          2         K3      C

B          2         K3      A

C          3         K2      A

C          3         K3      B

C          3         K2      D

C          3         K3      A

D          2         K2      A

D          2         K2      C

Tom used a left join so there MUST be at least one record for each ID/type combination in the original file.  However, since he excluded matches where a.id = b.id, the one non-match record for A/type K1 doesn't get an assignment for b.id.

All of the other a.id/type combination had matching b.ids, thus they appear in the b.id column.

Then, since he is using the count function on distinct b.id/type values, the missing value and duplicates dont get counted.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 891 views
  • 7 likes
  • 4 in conversation