Hello,
I would like to extract a sample with these fields: patient_id, size of nodule (3 classes), and diagnostic (cancer or benign) from a population of 1731 patients.
The sample will be drawn without replacement, and I need the final sample to have this distribution:
Size of nodule:
01:[4-10mm] => 37.5%
02:[10-20mm] => 37.5%
03:[20-30mm] => 25%
Diagnostic:
Cancer => 33.33%
Benign => 66.67%
The final sample will include 200 patients.
Thanks a lot for your help.
Hello,
I have no time to write an example (on simulated data) for you right now.
But you need:
- PROC SURVEYSELECT
- with STRATA statement
- with ALLOC=SAS-data-set option.
See here:
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_details20.htm
Good luck,
Koen
To extract a sample in SAS, you can use the SAMPLE statement in a PROC SURVEYSELECT procedure. Here is an example of how you might use it to extract your sample:
PROC SURVEYSELECT
SAMPLE=200
SEED=12345
OUT=sample_data
METHOD=SRS
NOPRINT;
STRATA SizeOfNodule
(01: 0.375, 02: 0.375, 03: 0.25)
DIAGNOSTIC
(Cancer: 0.3333, Benign: 0.6667);
RUN;
This will create a new dataset called sample_data that contains a sample of 200 patients, with the proportions of nodule sizes and diagnostic outcomes that you specified.
The SEED option allows you to specify a seed value for the random number generator, so that you can obtain the same sample each time you run the code. You can change the seed value to any integer that you like.
The METHOD=SRS option specifies that the sample will be selected using simple random sampling without replacement. This means that each patient has an equal probability of being selected, and that each patient can only be selected once.
The STRATA statement allows you to specify the proportions of the sample that should be allocated to each combination of nodule size and diagnostic outcome. In this case, the sample will be stratified by nodule size and diagnostic outcome, with the proportions that you specified for each stratum.
I hope this helps! Let me know if you have any other questions.
Thank you for your answers,
I tried this one but some errors appear:
PROC SURVEYSELECT
N=200
SEED=12345
OUT=sample_data
METHOD=SRS
NOPRINT;
STRATA strateDiam3
(Min: 0.375, Mid: 0.375, Max: 0.25)
ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, DESCENDING, NOTSORTED, _ALL_, _CHARACTER_, _CHAR_,
_NUMERIC_.
ERROR 76-322: Syntax error, statement will be ignored.
FinalDiagnosis
(Cancer: 0.3333, Benign: 0.6667);
RUN;
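The log errors above are consistent with the STRATA statement accepting only variable names (plus options after a slash), so the proportions in parentheses cannot be parsed. A minimal sketch of the approach suggested earlier in the thread (STRATA with the ALLOC=SAS-data-set option) might look like the following. Several things here are assumptions rather than facts from the thread: the input data set name patients is hypothetical; the strata variables are assumed to be character, with the values Min/Mid/Max and Cancer/Benign; the two target distributions are combined by multiplying the marginal shares, which treats nodule size and diagnosis as independent in the sample; and every stratum is assumed to contain enough patients for sampling without replacement.

```sas
/* Sort the population by the strata variables.                      */
/* "patients" is a hypothetical name for the 1731-patient data set.  */
proc sort data=patients out=patients_sorted;
   by strateDiam3 FinalDiagnosis;
run;

/* Allocation data set: one row per stratum, with the proportion in  */
/* a variable named _ALLOC_. The lengths below are assumptions and   */
/* should match the lengths of the same variables in patients.       */
data alloc;
   length strateDiam3 $3 FinalDiagnosis $6;
   do strateDiam3 = 'Min', 'Mid', 'Max';
      do FinalDiagnosis = 'Cancer', 'Benign';
         size_share = ifn(strateDiam3 = 'Max', 0.25, 0.375);
         diag_share = ifn(FinalDiagnosis = 'Cancer', 1/3, 2/3);
         /* Joint share under the independence assumption */
         _ALLOC_ = size_share * diag_share;
         output;
      end;
   end;
   drop size_share diag_share;
run;

/* Put the strata in the same order as in the sorted population. */
proc sort data=alloc;
   by strateDiam3 FinalDiagnosis;
run;

/* Stratified simple random sample of 200 without replacement. */
proc surveyselect data=patients_sorted
                  method=srs
                  n=200
                  seed=12345
                  out=sample_data;
   strata strateDiam3 FinalDiagnosis / alloc=alloc;
run;

/* Check the realized marginal distributions. */
proc freq data=sample_data;
   tables strateDiam3 FinalDiagnosis;
run;
```

With these shares, a total of 200 corresponds to 25 and 50 patients (Cancer, Benign) in each of the Min and Mid classes, and to about 16.67 and 33.33 in the Max class. The last two are not whole numbers, so the realized counts will be rounded and the diagnostic split will be close to, but not exactly, 33.33% / 66.67%. The PROC FREQ step is there to check the result.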