Isabel
Calcite | Level 5

Hi,

 

I need help :) I have to distribute observations randomly and evenly across a set of entities, based on the amount of a given variable.

I have a set of records that must be assigned to specific entities based on certain percentages.

 

Example (basetable):

ID  COUNT  AMOUNT
1   1      300
2   1      150
3   2      500
4   1      200
5   3      700

 

Entities and percentages:

ENTITY  PERCENTAGE
A       20%
B       70%
C       10%

 

How can I assign the COUNT and AMOUNT values to entities A, B and C in proportion to their percentages, as consistently and fairly as possible?

 

Could you kindly point me to some examples?

 

Thank you so much, Isabel

5 REPLIES
Ksharp
Super User
Your post is ambiguous. What output would you like to see?
Isabel
Calcite | Level 5

Hi,

 

For example, the targets would be:
A 20% COUNT: 1.6 AMOUNT: 370
B 70% COUNT: 5.6 AMOUNT: 1295
C 10% COUNT: 0.8 AMOUNT: 185
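
The targets above are each percentage applied to the column totals (COUNT sums to 8, AMOUNT sums to 1850). A minimal Python sketch of that arithmetic follows; the variable names are illustrative assumptions, not part of the original post.

```python
# Base table from the original post: (ID, COUNT, AMOUNT)
records = [(1, 1, 300), (2, 1, 150), (3, 2, 500), (4, 1, 200), (5, 3, 700)]
# Entity shares from the original post, written as fractions
shares = {"A": 0.20, "B": 0.70, "C": 0.10}

total_count = sum(c for _, c, _ in records)    # 8
total_amount = sum(a for _, _, a in records)   # 1850

# Ideal (fractional) target per entity: share times column total
targets = {
    entity: (round(p * total_count, 2), round(p * total_amount, 2))
    for entity, p in shares.items()
}

for entity, (cnt, amt) in targets.items():
    print(entity, "COUNT:", cnt, "AMOUNT:", amt)
```

Because records cannot be split, these fractional targets can only be approximated by any whole-record assignment.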
 
So one possible assignment is:
ID  COUNT  AMOUNT
1   1      300  => A
2   1      150  => A
3   2      500  => B
4   1      200  => C
5   3      700  => B

A => COUNT: 2 AMOUNT: 450
B => COUNT: 5 AMOUNT: 1200
C => COUNT: 1 AMOUNT: 200
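
To check how close such an assignment comes to the requested percentages, the per-entity totals can be recomputed and expressed as shares of the grand totals. This is only a sketch in Python; the dictionary layout is an assumption made for illustration.

```python
from collections import defaultdict

# Base table from the original post: ID -> (COUNT, AMOUNT)
records = {1: (1, 300), 2: (1, 150), 3: (2, 500), 4: (1, 200), 5: (3, 700)}
# Assignment proposed in the post: ID -> entity
assignment = {1: "A", 2: "A", 3: "B", 4: "C", 5: "B"}

# Accumulate [COUNT, AMOUNT] per entity
totals = defaultdict(lambda: [0, 0])
for rec_id, entity in assignment.items():
    count, amount = records[rec_id]
    totals[entity][0] += count
    totals[entity][1] += amount

grand_count = sum(c for c, _ in records.values())
grand_amount = sum(a for _, a in records.values())

for entity in sorted(totals):
    count, amount = totals[entity]
    print(entity, count, amount,
          f"{count / grand_count:.1%}", f"{amount / grand_amount:.1%}")
```

With this assignment A receives 2 of 8 counts (25%) against a 20% target, so the fit is approximate rather than exact.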
Norman21
Lapis Lazuli | Level 10

Hi Isabel.

 

Is there a rule that determines how records are assigned to groups A, B and C? Or do you want the assignment to be random, ending up with the proportions indicated (20% of observations in group A, etc.)?

 

If the latter, then AnnMaria might have the solution: http://www.thejuliagroup.com/blog/?p=2599

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation
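
If the assignment is meant to be purely random, one way to picture it is a weighted draw per record. The Python sketch below is an illustration only and is not taken from the linked blog post; the seed and variable names are arbitrary assumptions.

```python
import random

record_ids = [1, 2, 3, 4, 5]
entities = ["A", "B", "C"]
weights = [20, 70, 10]          # percentages from the original post

rng = random.Random(2024)       # arbitrary seed, only for repeatability
# Draw one entity per record, with probability proportional to its weight
drawn = rng.choices(entities, weights=weights, k=len(record_ids))
assignment = dict(zip(record_ids, drawn))

print(assignment)
```

With only five records the realized proportions can be far from 20/70/10, and a plain random draw ignores COUNT and AMOUNT entirely, which is why the question of a rule matters.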

Ksharp
Super User
Oh God. I understand something now. It is more like a SAS/OR problem. Is there an objective function? I believe there are many combinations that would satisfy your requirement. Do you want to minimize the number in each group, or minimize the difference between the groups?
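
To make the objective-function idea concrete, here is a small Python sketch (not SAS/OR) that tries every possible assignment of the five records and keeps the one with the smallest score. The objective used, the sum of squared gaps between achieved and target shares for both COUNT and AMOUNT, is an assumption chosen for illustration; the original poster did not specify one.

```python
from itertools import product

# Base table from the original post: (ID, COUNT, AMOUNT)
records = [(1, 1, 300), (2, 1, 150), (3, 2, 500), (4, 1, 200), (5, 3, 700)]
shares = {"A": 0.20, "B": 0.70, "C": 0.10}
entities = list(shares)

total_count = sum(c for _, c, _ in records)
total_amount = sum(a for _, _, a in records)

def objective(labels):
    """Assumed objective: squared gaps between achieved and target shares."""
    score = 0.0
    for e in entities:
        cnt = sum(c for (_, c, _), lab in zip(records, labels) if lab == e)
        amt = sum(a for (_, _, a), lab in zip(records, labels) if lab == e)
        score += (cnt / total_count - shares[e]) ** 2
        score += (amt / total_amount - shares[e]) ** 2
    return score

# Exhaustive search over all 3**5 = 243 labelings
best = min(product(entities, repeat=len(records)), key=objective)

for (rec_id, _, _), lab in zip(records, best):
    print(rec_id, "=>", lab)
print("objective:", round(objective(best), 5))
```

Exhaustive search is only practical for very small tables, since the number of labelings grows as 3 to the power of the record count. For real data an optimization solver would be the more realistic route, and a different objective would generally give a different answer.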
Reeza
Super User

PROC SURVEYSELECT is generally used for sample selection, and you can specify proportions, but I can't see how A/B/C tie back to your original data. Otherwise it seems a bit like three samples just stacked together.
