BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
ha33
Obsidian | Level 7

Hi,


I have the following code joining two tables by an unique identifier, in which a disease population is joined with a background population, creating approx 1000 matches per person in the disease population.

 

proc sql;

create table new as

select distinct case_id, id

from disease_pop inner join background_pop

on disease_pop.case_id = background_pop.id;

 

This makes a table that looks like this:

 

CASE_IDID
123345
123567
123891
234902
234903
234904

 

Only there are approximately 1000 "IDs" for each CASE ID

 

So what I want to do next is reduce the number of matches, so instead of having 1000 IDs per Case ID I would like only 10 IDs per CASE ID, but I still want to have all CASE IDs in my new table.

 

Any suggestions?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ha33
Obsidian | Level 7

Hey so an update:

 

I ended up using PROC SURVEYSELECT and defining each patient as one stratum. Worked perfectly. 

 

 

View solution in original post

3 REPLIES 3
LinusH
Tourmaline | Level 20

And what is the rule for reducing no of id's?

There are several ways of  creating samples in SAS.

In a raw data step you could do BY case_id, start a counter, and do implicit OUTPUT as long the counter is <= 10.

Reset the counter for each new BY value (IF first.case_id THEN...).

Data never sleeps
ha33
Obsidian | Level 7

Thanks for you reply. There is no rule as everyone in the background population is matched by the same variables. I will attempt your suggestion. 

ha33
Obsidian | Level 7

Hey so an update:

 

I ended up using PROC SURVEYSELECT and defining each patient as one stratum. Worked perfectly. 

 

 

SAS INNOVATE 2024

Innovate_SAS_Blue.png

Registration is open! SAS is returning to Vegas for an AI and analytics experience like no other! Whether you're an executive, manager, end user or SAS partner, SAS Innovate is designed for everyone on your team. Register for just $495 by 12/31/2023.

If you are interested in speaking, there is still time to submit a session idea. More details are posted on the website. 

Register now!

Discussion stats
  • 3 replies
  • 2052 views
  • 0 likes
  • 2 in conversation