
Splitting data into three groups

New Contributor
Posts: 2

I have a variable, size, with more than 1,000 values. I want to create a group variable (low, medium, high) based on the size values, so that the data is divided into three groups: the top third is high, the middle third is medium, and the bottom third is low.

The example data below has 9 values. The top 3 are in the high group, the middle 3 are in the medium group, and the lowest 3 are in the low group. My goal is to create this group variable.

Size   group
10     low
20     low
50     low
50     medium
70     medium
80     medium
95     high
99     high
99     high


Accepted Solutions
Solution
‎07-15-2017 08:53 PM
Super User
Posts: 17,818

Re: Splitting data into three groups

Are those the only values you'll see in the dataset, or does the variable take a range of values?

One quick way is to use PROC RANK with GROUPS=3, which splits the data by percentiles: the bottom third of the values falls in the lowest group and the top third in the highest group. PROC RANK codes the groups as 0, 1, and 2, and you can relabel them as low, medium, and high if desired.

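/* GROUPS=3 splits size into thirds: rank_size is 0 (lowest), 1, or 2 (highest) */
proc rank data=have out=want groups=3;
    var size;          /* variable to rank */
    ranks rank_size;   /* new variable that holds the group number */
run;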
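
If you want the text labels rather than 0/1/2, one possible follow-up is sketched below. It is only an illustration: the sample data copies the nine values from the question, and the format name sizegrp and the data set name ranked are made-up names, not anything required by PROC RANK.

```sas
/* Sample data: the nine values from the question */
data have;
    input size;
    datalines;
10
20
50
50
70
80
95
99
99
;
run;

/* Assign each row to a third: 0 = lowest, 2 = highest */
proc rank data=have out=ranked groups=3;
    var size;
    ranks rank_size;
run;

/* Map the numeric group codes to labels (sizegrp is an assumed name) */
proc format;
    value sizegrp
        0 = 'low'
        1 = 'medium'
        2 = 'high';
run;

/* Store the label as a character variable next to the rank */
data want;
    set ranked;
    length group $6;
    group = put(rank_size, sizegrp.);
run;

proc print data=want noobs;
run;
```

Note that tied values get the same rank, so the two rows with size 50 land in the same group. The groups are therefore not guaranteed to have exactly the same number of rows, unlike the hand-built example in the question.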



All Replies

New Contributor
Posts: 2

Re: Splitting data into three groups

Thanks. The variable does take a range of values.
I think this will work for my requirement.