BookmarkSubscribeRSS Feed
knighsson
Obsidian | Level 7

Hello,

 

I have two tables that both contain variables "participant sequence number" (value: 1-10000) and "participant group number" (value: 1-4). The sample of two tables are the same, so "participant sequence number" are same in two tables, but the methods of grouping are different, so their corresponding "participant group number" may be different between two tables.

 

I want to know how many participants that are included in the same group are in both tables? (for example, how many participants that are included in group 1 in the first table, and also included in group 1 in the second table) .

 

Could you tell me how to do this?

 

Thanks in advance.

6 REPLIES 6
sbxkoenk
SAS Super FREQ

Maybe a PROC FREQ?

 

Submit this as an example:

data have1;
 call streaminit(123);
 do participant_sequence_number = 1 to 10000;
  participant_group_number=floor(rand('uniform')*5);
  output;
 end;
run;
data have2;
 call streaminit(456);
 do participant_sequence_number = 1 to 10000;
  participant_group_number=floor(rand('uniform')*5);
  output;
 end;
run;
data want;
 merge have1(rename=(participant_group_number=participant_group_number1)) 
       have2(rename=(participant_group_number=participant_group_number2)); 
 by participant_sequence_number /* participant_group_number */;
run;

PROC FREQ data=want;
 tables participant_group_number1 * participant_group_number2;
run;
/* end of program */

Cheers,

Koen

ballardw
Super User

@sbxkoenk wrote:

Maybe a PROC FREQ?

 

Submit this as an example:

data have1;
 call streaminit(123);
 do participant_sequence_number = 1 to 10000;
  participant_group_number=floor(rand('uniform')*5);
  output;
 end;
run;
data have2;
 call streaminit(456);
 do participant_sequence_number = 1 to 10000;
  participant_group_number=floor(rand('uniform')*5);
  output;
 end;
run;
data want;
 merge have1(rename=(participant_group_number=participant_group_number1)) 
       have2(rename=(participant_group_number=participant_group_number2)); 
 by participant_sequence_number /* participant_group_number */;
run;

PROC FREQ data=want;
 tables participant_group_number1 * participant_group_number2;
run;
/* end of program */

Cheers,

Koen


I think the LIST option may make the output easier to read for the original question:

PROC FREQ data=want;
 tables participant_group_number1 * participant_group_number2/ list ;
run;
knighsson
Obsidian | Level 7
Thank you! this is helpful!
PGStats
Opal | Level 21

Or use SQL:

 

proc sql;
select 
    a.participant_group_number, 
    count(*) as nbInBothTables
from
    have1 as a inner join
    have2 as b 
        on  a.participant_sequence_number = b.participant_sequence_number and 
            a.participant_group_number = b.participant_group_number
group by a.participant_group_number;
quit;
 

Using simulated data from @sbxkoenk ,

PGStats_1-1617817532932.png

 

 

 

PG
knighsson
Obsidian | Level 7

Thank you! This is helpful!

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

Mastering the WHERE Clause in PROC SQL

SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 6 replies
  • 544 views
  • 2 likes
  • 4 in conversation