Hello, I have longitudinal data with multiple health records for multiple dates, per patient. I created a numerical variable for month to simplify (values: 1-12) and because it's for a time series.
For now I am trying to randomly select one patient record per month, so it's OK to have multiple records for a person as long as only one is chosen per month. I've been trying to figure out PROC SURVEYSELECT but cannot get it to work per patient/per month.
id pt_id month ...
1 XXY 1
2 XXY 2
3 XXY 1
4 ZZH 2
5 ZZH 2
6 KKJ 3
7 KKJ 4
8 KKJ 3
9 KKJ 5
10 KKJ 5
11 KKJ 6
Any suggestions?
Thanks
How large is your input dataset? How many patient*month (or patient*quarter) combinations are there?
What type of variable is QUARTER (or month)? If it is actually a date value with an attached format that displays only the quarter or year-quarter, then you might have many more combinations than you thought.
Also, how many of your patient*month combinations have only one observation already?
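A quick way to answer those questions might look like the sketch below. It assumes a dataset named have with variables pt_id and month, as in the sample data in the question; combo_counts, n_obs, n_combinations and n_singletons are made-up names for illustration.

```sas
/* Count rows per patient*month group (GROUP BY uses the raw, unformatted values) */
proc sql;
  create table combo_counts as
  select pt_id, month, count(*) as n_obs
  from have
  group by pt_id, month;

  /* Total number of groups, and how many groups already have a single row */
  select count(*) as n_combinations,
         sum(n_obs = 1) as n_singletons
  from combo_counts;
quit;

/* Show the type of MONTH and any format attached to it */
proc contents data=have;
run;
```

If n_combinations is far larger than expected, that would suggest MONTH (or QUARTER) is a formatted date rather than a plain 1-12 number.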
I would first try turning off the listing output of PROC SURVEYSELECT by adding the NOPRINT option. Perhaps your session is just locking up because you are generating pages and pages of output.
If you have a lot of patient*month combinations with only one observation already, the SAS LOG might get really large with the notes that PROC SURVEYSELECT generates when that happens.
If you are doing a simple random sample (SRS), then perhaps just do it yourself instead.
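proc sql;
/* Attach a uniform random number to every row (the SAS function is RAND) */
create view for_selection as
select *, rand('uniform') as rand
from udt.aim2b
order by patid, quarter, rand
;
quit;
/* Keep the first row of each patid*quarter group, i.e. the one with the smallest random number */
data aim2_random_b ;
set for_selection;
by patid quarter ;
if first.quarter;
run;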
Please show the code you've tried in PROC SURVEYSELECT.
data have;
input id pt_id $ month;
cards;
1 XXY 1
2 XXY 2
3 XXY 1
4 ZZH 2
5 ZZH 2
6 KKJ 3
7 KKJ 4
8 KKJ 3
9 KKJ 5
10 KKJ 5
11 KKJ 6
;;;;
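/* STRATA requires the input to be sorted by the strata variable */
proc sort data=have;
by month;
run;
/* Simple random sample of one record within each month */
proc surveyselect data=have method=srs sampsize=1 out=want;
strata month;
run;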
You need to remove the patid from the STRATA statement. Otherwise you're saying pick one per patient per month.
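For contrast, if one record per patient per month is what you actually want, both variables stay in the STRATA statement. This is only a sketch against the sample HAVE dataset above; have_sorted and want_per_patient are made-up names, and the SEED= value is arbitrary and only there to make the draw repeatable.

```sas
/* Sort by both strata variables so each pt_id*month group is contiguous */
proc sort data=have out=have_sorted;
  by pt_id month;
run;

/* One randomly chosen record per pt_id*month; NOPRINT suppresses the listing output */
proc surveyselect data=have_sorted method=srs sampsize=1
                  seed=12345 noprint out=want_per_patient;
  strata pt_id month;
run;
```

With the eleven sample rows there are seven pt_id*month combinations, so want_per_patient should end up with seven records.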
Just an issue with my memory. The function is RAND().