BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
JackieEvans85
Calcite | Level 5

Hello Community! I am working with a new data set that has numerous entries for each participant. I want to set a threshold for inclusion such that 10 visits must be present  for a participant to be included in my new data set.

After meeting this threshold of number of visits, I need to figure out how many participants met that visit number threshold.

It's a .cvs file I converted from an excel file if that makes any difference

I tried PROC FREQ but couldn't figure out how to do anything with those results. Probably the completely wrong approach but I'm learning SAS on the job so cut me some slack!

Thanks for any advice! 

Jackie E.

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Use SQL for this:

data have;

input Participant;

n = _n_;

datalines;

1000

1000

1000

1000

1000

1000

1000

1000

1000

1000

1001

1001

1001

1001

1001

1001

1001

1001

1001

1001

1001

1001

1002

1002

1002

1002

1002

1002

;

proc sql;

create table participants as

select * from have

group by participant

having count(*) >= 10

order by participant, n;

select * from participants;

quit;

PG

PG

View solution in original post

6 REPLIES 6
PGStats
Opal | Level 21

Showing some data (even fake) would make things more concrete. - PG

PG
JackieEvans85
Calcite | Level 5

Basically I need to work with only this column (sorry for the small amount of data, but it takes awhile to reblind the info) and the other columns don't matter for the analysis. I need to find if there are 10 entries for participant #1000 and if not, disregard the info for that participant. So on and so forth. So for this data, it would show that 1000 and 1001 both qualify but 1002 does not have enough entries.

Participant
1000
1000
1000
1000
1000
1000
1000
1000
1000
1000
1001
1001
1001
1001
1001
1001
1001
1001
1001
1001
1001
1001
1002
1002
1002
1002
1002
1002

Thanks!

Jackie E.

PGStats
Opal | Level 21

Use SQL for this:

data have;

input Participant;

n = _n_;

datalines;

1000

1000

1000

1000

1000

1000

1000

1000

1000

1000

1001

1001

1001

1001

1001

1001

1001

1001

1001

1001

1001

1001

1002

1002

1002

1002

1002

1002

;

proc sql;

create table participants as

select * from have

group by participant

having count(*) >= 10

order by participant, n;

select * from participants;

quit;

PG

PG
stat_sas
Ammonite | Level 13

proc sql;

select * from have

where participant in (select participant from have group by participant

having count(*)>=10);

quit;

JackieEvans85
Calcite | Level 5

Thank you both for the responses! After spending my lunch break researching (not too successfully) SQL I am wondering how to manipulate the data after using an SQL statement(Both of your responses are the first I've seen of SQL in action). I am more familiar with using IF-THEN-DO statements to manipulate Data.

So after the SQL statement you provided, which does manipulate the data how I want it too, is there any way to an another step or give this SQL statement an output name for storing and manipulating further?

Thanks!

Jackie E.

PGStats
Opal | Level 21

After executing my code, the resulting observations are in dataset Participants.

The select * from participants; statement can be removed as it is only there to provide a printed output.

- PG

PG

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1452 views
  • 6 likes
  • 3 in conversation