Hi,
I'd like to request some assistance with the following programming query.
I have an input dataset (please see attachment); a small sample is provided below. It consists of one row per person. Each row belongs to a specific campaign cell and is given a model score (values from 1 to 10).
Cell | Individual | ModelScore |
100 | Person1 | 1 |
100 | Person2 | 2 |
100 | Person3 | 3 |
100 | Person4 | 4 |
100 | Person5 | 5 |
100 | Person6 | 6 |
100 | Person7 | 7 |
100 | Person8 | 8 |
100 | Person9 | 9 |
100 | Person10 | 10 |
100 | Person11 | 1 |
100 | Person12 | 2 |
101 | Person13 | 3 |
101 | Person14 | 4 |
101 | Person15 | 5 |
101 | Person16 | 6 |
101 | Person17 | 7 |
102 | Person18 | 8 |
102 | Person19 | 9 |
102 | Person20 | 10 |
For each value of Cell, I would like to perform a random stratified sample (strata variable: ModelScore) to split the data into 20 roughly equal-sized groups. For example, the subset of rows that belong to cell 100 will be split into 20 groups, stratified by ModelScore. I would like this to be repeated for each value of Cell in the input dataset.
I was thinking of using PROC SURVEYSELECT with the GROUPS= option, but wasn't sure whether there is a better approach.
Many thanks,
Hoa
Accepted Solutions
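What do you mean by "using the ModelScore variable"? Do you want to group the rows from smallest to largest score within each cell? If so:
/* data must be sorted by cell for the BY statement */
proc rank data=have groups=20 out=want;
by cell;
var modelscore;
ranks rank; /* new variable holding the group number */
run;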
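If the goal is instead a random split that is balanced on ModelScore, one possible approach is sketched below. It is only an illustration under stated assumptions, not the accepted answer's method: the input is assumed to be a dataset named have with the variables Cell and ModelScore from the question, and the names keyed, want, GroupID, _rand, _seq and the seed value are hypothetical. Rows are shuffled within each Cell and ModelScore combination and then dealt out to groups 1 to 20 in turn, so each score level is spread across the groups as evenly as the row counts allow. Note that PROC RANK with GROUPS= places tied values in the same group, so with only 10 distinct scores it cannot yield 20 distinct groups.

```sas
/* Step 1: attach a random sort key to every row.           */
/* The seed is arbitrary; fixing it makes the split repeatable. */
data keyed;
    set have;
    if _n_ = 1 then call streaminit(12345);
    _rand = rand('uniform');
run;

/* Step 2: order rows by cell, then score, then random key,  */
/* which shuffles rows within each Cell x ModelScore stratum. */
proc sort data=keyed;
    by Cell ModelScore _rand;
run;

/* Step 3: deal rows to groups 1..20 in turn within each cell. */
data want;
    set keyed;
    by Cell;
    if first.Cell then _seq = 0;      /* restart the counter per cell */
    _seq + 1;                         /* sum statement: value is retained */
    GroupID = mod(_seq - 1, 20) + 1;  /* cycles 1, 2, ..., 20, 1, ... */
    drop _rand _seq;
run;
```

A cell with fewer than 20 rows will fill only the first few groups (cell 102 in the sample would populate groups 1 to 3), so group sizes are only roughly equal when each cell is large relative to 20.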