> I have a data set with 15 observation hours for each
> of my subjects.
Does the data set produced by the following code model your data?
[pre]
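proc plan seed=618029071;
   /* 10 subjects, one randomly chosen age group per subject, */
   /* 15 ordered hours per subject, and a random y1 per hour  */
   factors
      subjid   = 10 ordered
      agegroup = 1 of 3 random
      time     = 15 ordered
      y1       = 1 of 200
      / noprint;
   /* additional values y2 and y3 for the output data set */
   treatments y2=1 of 50 y3=1 of 30;
   output out=plan;
run;
quit;
proc print data=plan;
run;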
[/pre]
> I'm trying to use survey select to
> generate new data sets with decreasing numbers of
> hours, that I'm then comparing to the total hours.
> (i.e. what is the correlation between 14 and 15
> hours? 13 and 15? 12 and 15? etc.) I have 2 strata,
> basically age and subject. My problem is that I'd
> like to keep the same set of hours for each subject,
> and I can't figure out how to do that. For example,
> when I generate a set containing 3 hours out of the
> 15, if the hours for subject 1 are 8, 10, and 12 then
> I want the hours for subject 2 (and all others) to
> also be 8, 10 and 12. Is there any way to do this?
I don't think SURVEYSELECT is going to work well for this. SURVEYSELECT selects observations from data sets, but it sounds like you want to select levels of a variable (TIME) and apply the same levels to every subject. In your example, that amounts to:
[pre]where time in(8,10,12) [/pre]
There are a number of ways to select k of n values at random. One is the CALL RANPERK routine:
[pre]
CALL RANPERK Routine
Randomly permutes the values of the arguments, and returns a permutation of k out of n values [/pre]
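As a rough sketch of the CALL RANPERK route (not run against your real data): it assumes the PLAN data set created above, and the data set names PICK and SUB3 and the seed value are placeholders I made up. The idea is to draw 3 of the 15 hours once, then keep those same hours for every subject.
[pre]
/* Draw 3 of the 15 hours once; after the call, h1-h3 hold the selection. */
data pick(keep=time);
   array h[15] h1-h15 (1 2 3 4 5 6 7 8 9 10 11 12 13 14 15);
   seed = 618029071;               /* placeholder seed; must be a variable */
   call ranperk(seed, 3, of h1-h15);
   do i = 1 to 3;
      time = h[i];
      output;
   end;
run;

/* Keep only the selected hours, the same ones for every subject. */
proc sql;
   create table sub3 as
   select p.*
   from plan as p
   where p.time in (select time from pick)
   order by subjid, time;
quit;
[/pre]
Because the hours are drawn once and then applied to the whole data set, every subject ends up with the same set of hours.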
PROC PLAN can also do this, through the factor-selection syntax of the FACTORS statement, which I used above to create the sample data:
[pre]
name=m < OF n > < selection-type >

where

name
   is a valid SAS name. This gives the name of a factor in the design.
m
   is a positive integer that gives the number of values to be selected. If n is specified, the value of m must be less than or equal to n.
n
   is a positive integer that gives the number of values to be selected from.
[/pre]
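Here is a minimal sketch of the PROC PLAN route, again assuming the PLAN data set from the top of this post. The names PICK2, SUB3B, and the macro variable HOURS are placeholders of mine, and 3 of 15 is just the case from your example.
[pre]
/* Select 3 of the 15 hours at random (RANDOM is the default selection type). */
proc plan seed=618029071;
   factors time = 3 of 15 / noprint;
   output out=pick2;
run;
quit;

/* Put the selected hours into a macro variable as a comma-separated list. */
proc sql noprint;
   select time into :hours separated by ','
   from pick2;
quit;

/* Apply the same hours to every subject. */
data sub3b;
   set plan;
   where time in (&hours);
run;
[/pre]
Changing the 3 to 14, 13, 12, and so on would give the decreasing subsets you described, each one shared by all subjects.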
There are other methods; these are the ones I'm most familiar with.
If I am correct, which method is most appropriate depends on the output you want. You mentioned correlation. If you describe (with sample data) how the data should look in order to produce that analysis, it will help refine the solution.
Also, do you want to do this for all (n of m) subsets, and do you want replication, that is, several random subsets of the same size n?