Isabel
Calcite | Level 5

Hi,

 

I need help :) I have to distribute observations randomly and evenly, based on the amount of a given variable.

I have a set of records that must be assigned to specific entities based on certain percentages.

 

Example (basetable):

ID  COUNT  AMOUNT
1   1      300
2   1      150
3   2      500
4   1      200
5   3      700

 

Entities and percentages:

ENTITY  PERCENTAGE
A       20%
B       70%
C       10%

 

How can I assign the COUNT and AMOUNT values to entities A, B and C in line with their percentages, as consistently and fairly as possible?

 

Could you kindly point me to some examples?

 

Thank you so much, Isabel

5 REPLIES 5
Ksharp
Super User
Your post is ambiguous. What output would you like to see?
Isabel
Calcite | Level 5

Hi,

 

For example, the target share of the totals (COUNT 8, AMOUNT 1850) for each entity:
A 20% COUNT: 1.6 AMOUNT: 370
B 70% COUNT: 5.6 AMOUNT: 1295
C 10% COUNT: 0.8 AMOUNT: 185
 
So one possible assignment would be:
ID  COUNT AMOUNT
1   1        300  => A 
2   1        150  => A
3   2        500  => B
4   1        200  => C 
5   3        700  => B
 
A => COUNT: 2 AMOUNT: 450
B => COUNT: 5 AMOUNT: 1200
C => COUNT: 1 AMOUNT: 200
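
The target figures in the example above can be reproduced with a few lines of arithmetic. This is only a sketch in Python, not SAS; the names `records`, `percent` and `targets` are illustrative and do not come from the thread.

```python
# Base table from the question: (ID, COUNT, AMOUNT)
records = [(1, 1, 300), (2, 1, 150), (3, 2, 500), (4, 1, 200), (5, 3, 700)]

# Percentage per entity, kept as whole numbers to avoid rounding noise
percent = {"A": 20, "B": 70, "C": 10}

total_count = sum(count for _, count, _ in records)      # 8
total_amount = sum(amount for _, _, amount in records)   # 1850

# Target (COUNT, AMOUNT) for each entity = its percentage of the totals
targets = {
    entity: (total_count * pct / 100, total_amount * pct / 100)
    for entity, pct in percent.items()
}

for entity, (count_target, amount_target) in targets.items():
    print(entity, count_target, amount_target)
```

Because whole records cannot be split, any real assignment can only approximate these targets, as the A/B/C totals above show.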
Norman21
Lapis Lazuli | Level 10

Hi Isabel.

 

Is there a rule that determines the assignment of data to groups A, B and C? Or do you want this to be random, ending up with the proportions indicated (20% of observations in group A, etc.)?

 

If the latter, then AnnMaria might have the solution: http://www.thejuliagroup.com/blog/?p=2599

Norman.
SAS 9.4 (TS1M6) X64_10PRO WIN 10.0.17763 Workstation
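
If a purely random assignment is acceptable, one simple reading of it is to draw an entity for each record with probability equal to its percentage. The sketch below is Python rather than SAS, and it is not taken from the linked blog post; the seed and variable names are assumptions made for illustration. Note that with this approach the proportions hold only in expectation, so a small table such as the five-record example can land far from 20/70/10.

```python
import random

record_ids = [1, 2, 3, 4, 5]
entities = ["A", "B", "C"]
weights = [20, 70, 10]  # percentages from the question

# Fixed seed (arbitrary choice) so the draw is repeatable
rng = random.Random(42)

# Draw one entity per record, weighted by the percentages
assignment = {
    record_id: rng.choices(entities, weights=weights, k=1)[0]
    for record_id in record_ids
}

for record_id, entity in assignment.items():
    print(record_id, "=>", entity)
```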

Ksharp
Super User
Oh God. I understand now. This looks more like a SAS/OR problem. Is there an objective function? I believe many combinations would satisfy your requirement. Do you want to minimize the number in each GROUP, or minimize the difference for each GROUP?
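
To make the objective-function question concrete, here is a small sketch in Python (not SAS/OR) that tries every possible assignment of the five records and keeps the one closest to the targets. The objective used here is an assumption, not something Isabel specified: the sum over entities of the absolute deviation from the target, with AMOUNT and COUNT each scaled by their totals. The function name `best_assignment` is hypothetical. Exhaustive search is only practical for tiny tables, since the number of combinations is the number of entities raised to the number of records.

```python
from itertools import product

# Base table from the question: (ID, COUNT, AMOUNT)
records = [(1, 1, 300), (2, 1, 150), (3, 2, 500), (4, 1, 200), (5, 3, 700)]
percent = {"A": 20, "B": 70, "C": 10}


def best_assignment(records, percent):
    """Exhaustively search all assignments; return (mapping, cost) of the best."""
    total_count = sum(count for _, count, _ in records)
    total_amount = sum(amount for _, _, amount in records)
    entities = list(percent)

    best_combo, best_cost = None, float("inf")
    for combo in product(entities, repeat=len(records)):
        cost = 0.0
        for entity in entities:
            # Totals actually assigned to this entity under the candidate combo
            count = sum(c for (_, c, _), who in zip(records, combo) if who == entity)
            amount = sum(a for (_, _, a), who in zip(records, combo) if who == entity)
            # Assumed objective: scaled absolute deviation from the targets
            cost += abs(count - total_count * percent[entity] / 100) / total_count
            cost += abs(amount - total_amount * percent[entity] / 100) / total_amount
        if cost < best_cost:
            best_combo, best_cost = combo, cost

    mapping = {rec_id: who for (rec_id, _, _), who in zip(records, best_combo)}
    return mapping, best_cost


mapping, cost = best_assignment(records, percent)
print(mapping, round(cost, 4))
```

A different objective (for example, AMOUNT only, or squared deviations) can pick a different winner, which is why the choice of objective matters.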
Reeza
Super User

PROC SURVEYSELECT is generally used for sample selection, and you can specify proportions, but I can't see how A/B/C tie back to your original data. Otherwise it seems a bit like three samples just stacked together.
