BookmarkSubscribeRSS Feed
Singham20
Obsidian | Level 7

Hi-

I am struggling stratifying my sample that has a predefined bound in SAS. 

For example:

Say I have a dataset named "PopSamp" with many sample points and different prices like below:

Sample Points     Price

          A                 .87

          B                  1.97

          C                  .27

          D                  2.27

          E                   4.97

          F                 .17

          G                  2.42

          H                  .03

           I                  5.99

          J                   3.69

 

Now suppose I have a dataset named "StrataBounds" with my strata upper bounds as follows:

Strata          Upper Bound

    1                     0.42

    2                     1.50

    3                      3.99

    4                   999999

 

May someone help me figure out how I can write a code that will use my "PopSamp" dataset and add a variable strata using the "StrataBounds" dataset?  This would be done by seeing in what strata the price is found.

 

My final dataset should look like this:

Sample Points     Price             Strata

          A                 .87                 2

          B                  1.97              3

          C                  .27                1

          D                  2.27               3

          E                   4.97              4

          F                 .17                  1

          G                  2.42               3

          H                  .03                 1

           I                  5.99                4

          J                   3.69               3

 

Thank you!

5 REPLIES 5
mkeintz
PROC Star

What have you tried so far?   While you can do this in a single statement, why not start out with a series of

    IF .. THEN ... ;

    ELSE IF .. THEN ..

statements?

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
Singham20
Obsidian | Level 7

This would work great if I had just one survey to do however, I have a lot of surveys to do and many of them have different number of strata.  I will need to write a proc sql I think.

mkeintz
PROC Star

Then show us a dataset of strata boundaries, along with corresponding survey id's.

 

No doubt this can be done in PROC SQL, but it can also be done in a DATA step.  To show how, we need to know what the boundaries dataset looks like.  Preferably presented as a SAS dataset, with INPUT statements and boundary values.

--------------------------
The hash OUTPUT method will overwrite a SAS data set, but not append. That can be costly. Consider voting for Add a HASH object method which would append a hash object to an existing SAS data set

Would enabling PROC SORT to simultaneously output multiple datasets be useful? Then vote for
Allow PROC SORT to output multiple datasets

--------------------------
ballardw
Super User

If the "strata" was in the sample at the time of collection (it should have been by the way) where did it go?

 

Does your "strata" variable as resolved need to be numeric or character?

If character works you can use Proc format to add them assuming you know your actual bounds:

 

 

 

proc format library=work;
value Strata_one
 0- .42      = '1'
 .42< - 1.50 = '2'
 1.50<- 3.99 = '3'
 3.99<- 999999='4';
 run;

 data have;
    set want;
    strata = put(price,strata_one.);
run;

This is more efficient if any of the "strata" in different sets use the same rule. Create as many as needed.

 

Formats can be made from data sets relatively easily for this if the bounds are in a data set.

s_lassen
Meteorite | Level 14

@Singham20:

One possibility is to use PROC SQL to create a SELECT statement (or rather, the WHEN and OTHERWISE statements that go inside it) in a macro variable:

data PopSamp;
  input Point $ Price;
cards;
A  .87
B 1.97
C  .27
D 2.27
E 4.97
F  .17
G 2.42
H  .03
I 5.99
J 3.69
;run;

data StrataBounds;
  input stratum upper_bound;
cards;
1 0.42
2 1.50
3 3.99
4 999999
;run;
proc sql noprint; select case when upper_bound<999999 then cats('when(price<',upper_bound,') stratum=',stratum) else cats('otherwise stratum=',stratum) end into :select_stmt separated by ';' from StrataBounds order by upper_bound; quit; data want; set PopSamp; select; &select_stmt; end; run;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 833 views
  • 0 likes
  • 4 in conversation