Hi,
I have the following code joining two tables by an unique identifier, in which a disease population is joined with a background population, creating approx 1000 matches per person in the disease population.
proc sql;
create table new as
select distinct case_id, id
from disease_pop inner join background_pop
on disease_pop.case_id = background_pop.id;
This makes a table that looks like this:
CASE_ID | ID |
123 | 345 |
123 | 567 |
123 | 891 |
234 | 902 |
234 | 903 |
234 | 904 |
Only there are approximately 1000 "IDs" for each CASE ID
So what I want to do next is reduce the number of matches, so instead of having 1000 IDs per Case ID I would like only 10 IDs per CASE ID, but I still want to have all CASE IDs in my new table.
Any suggestions?