SAS Programming

DATA Step, Macro, Functions and more
PSIOT2
Calcite | Level 5

Hello,

I would like to extract a sample with these fields: patient_id, size of nodule (3 classes), and diagnosis (cancer or benign), from a population of 1731 patients.

The sample will be drawn without replacement, and I need the final sample to have the following distribution:

size of nodule:

01:[4-10mm] => 37.5%

02:[10-20mm] => 37.5%

03:[20-30mm] => 25%

Diagnostic:

Cancer => 33.33%

Benign => 66.67%

The final sample will include 200 patients.

Thanks a lot for your help.

sbxkoenk
SAS Super FREQ

Hello,

 

No time to write an example (on simulated data) for you right now.

 

But you need :

See here :
https://go.documentation.sas.com/doc/en/pgmsascdc/9.4_3.5/statug/statug_surveyselect_details20.htm

 

Good luck,

Koen

webart999ARM
Quartz | Level 8

To extract a sample in SAS, you can use PROC SURVEYSELECT. Here is an example of how you might use it to extract your sample:

 

PROC SURVEYSELECT 
    SAMPLE=200
    SEED=12345 
    OUT=sample_data
    METHOD=SRS
    NOPRINT;
    STRATA SizeOfNodule
        (01: 0.375, 02: 0.375, 03: 0.25)
        DIAGNOSTIC
        (Cancer: 0.3333, Benign: 0.6667);
RUN;

This will create a new dataset called sample_data that contains a sample of 200 patients, with the proportions of nodule sizes and diagnostic outcomes that you specified.

The SEED option allows you to specify a seed value for the random number generator, so that you can obtain the same sample each time you run the code. You can change the seed value to any integer that you like.

The METHOD=SRS option specifies that the sample will be selected using simple random sampling without replacement. This means that each patient has an equal probability of being selected, and that each patient can only be selected once.

The STRATA statement allows you to specify the proportions of the sample that should be allocated to each combination of nodule size and diagnostic outcome. In this case, the sample will be stratified by nodule size and diagnostic outcome, with the proportions that you specified for each stratum.

I hope this helps! Let me know if you have any other questions.

PSIOT2
Calcite | Level 5

Thank you for your answers,

I tried this one but some errors appear:

 

PROC SURVEYSELECT
N=200
SEED=12345
OUT=sample_data
METHOD=SRS
NOPRINT;
STRATA strateDiam3
(Min: 0.375, Mid: 0.375, Max: 0.25)

ERROR 22-322: Syntax error, expecting one of the following: a name, ;, -, /, :, DESCENDING, NOTSORTED, _ALL_, _CHARACTER_, _CHAR_,
_NUMERIC_.
ERROR 76-322: Syntax error, statement will be ignored.


FinalDiagnosis

(Cancer: 0.3333, Benign: 0.6667);
RUN;
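
A note on the log above: the STRATA statement accepts only variable names (plus options after a slash), so the parenthesized proportions trigger ERROR 22-322. One possible way to get a fixed distribution is to stratify on both variables and pass one sample size per stratum through SAMPSIZE= (alias N=). The sketch below is not a tested solution for the real data. It simulates a stand-in population, and it assumes the two target distributions are independent, so each of the six cell sizes is 200 x size share x diagnosis share, rounded: 25 and 50 for Min, 25 and 50 for Mid, 17 and 33 for Max. That gives 75/75/50 by size and 67/133 by diagnosis. The data set name work.patients, the simulated class frequencies, and the values Min/Mid/Max and Cancer/Benign (taken from the code above) are assumptions.

```sas
/* Hypothetical stand-in for the real population of 1731 patients. */
/* Replace this step with the actual data set.                     */
data work.patients;
    call streaminit(2023);
    length strateDiam3 $3 FinalDiagnosis $6;
    do patient_id = 1 to 1731;
        u = rand('uniform');
        if u < 0.40 then strateDiam3 = 'Min';
        else if u < 0.75 then strateDiam3 = 'Mid';
        else strateDiam3 = 'Max';
        if rand('uniform') < 0.40 then FinalDiagnosis = 'Cancer';
        else FinalDiagnosis = 'Benign';
        output;
    end;
    drop u;
run;

/* The input must be sorted by the STRATA variables. */
proc sort data=work.patients out=work.patients_sorted;
    by strateDiam3 FinalDiagnosis;
run;

/* One sample size per stratum, listed in sorted stratum order:   */
/* Max/Benign Max/Cancer Mid/Benign Mid/Cancer Min/Benign Min/Cancer */
/* These counts assume size and diagnosis are independent.        */
proc surveyselect data=work.patients_sorted
                  out=work.sample_data
                  method=srs
                  sampsize=(33 17 50 25 50 25)
                  seed=12345
                  noprint;
    strata strateDiam3 FinalDiagnosis;
run;

/* Check the realized counts in the sample. */
proc freq data=work.sample_data;
    tables strateDiam3*FinalDiagnosis / norow nocol nopercent;
run;
```

The values in SAMPSIZE=( ) must follow the sorted order of the strata, and each stratum must contain at least as many patients as requested; otherwise the procedure stops with an error. If the joint distribution of size and diagnosis should be something other than independent, only the six numbers change.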
