librasantosh
Obsidian | Level 7
Data generated using random number with Normal distribution (mean=5 std=2) as dataset D1 which has 100 elements.
Data generated using random number with Normal distribution (mean=3 std=3) as dataset D2 which has 80 elements.
Based on D1 and D2, data generated with Normal distribution at 30% Normal (mean=5, std=2)+70% Normal (mean=3, std=3) as D3 which has 120 elements 
 
How to estimate proportion of D1 and D2 in dataset D3?
librasantosh
Obsidian | Level 7

 

Data generated using random number with Normal distribution (mean=5 std=2) as dataset D1 which has 100 elements.
Data generated using random number with Normal distribution (mean=3 std=3) as dataset D2 which has 80 elements.
Based on D1 and D2, data generated with Normal distribution at 30% Normal (mean=5, std=2)+70% Normal (mean=3, std=3) as D3 which has 120 elements

How to estimate proportion of D1 and D2 in dataset D3?

If you have any ideas, please let me know.

StatDave
SAS Super FREQ

Sounds like a job for PROC FMM. Given data set D3 with response values in Y:

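/* Fit a two-component normal mixture to Y; the estimated mixing probabilities are the proportions */
proc fmm data=D3;
   model y = / k=2;
run;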
Reeza
Super User

Two ways pop to mind:

 

1. Simulation

2. Theory: work it out from the joint probability distribution.

 

The correct method is likely what you’ve been taught in the course. If it’s theoretical, check your textbook; that isn’t a SAS question.

If it is simulation, what do you have so far?  

 

I also feel like the question may be missing something... Is this the word-for-word question from your assignment, or have you paraphrased it?

 


@librasantosh wrote:
Data generated using random number with Normal distribution (mean=5 std=2) as dataset D1 which has 100 elements.
Data generated using random number with Normal distribution (mean=3 std=3) as dataset D2 which has 80 elements.
Based on D1 and D2, data generated with Normal distribution at 30% Normal (mean=5, std=2)+70% Normal (mean=3, std=3) as D3 which has 120 elements 
 
How to estimate proportion of D1 and D2 in dataset D3?

PS asking the same question multiple times is unhelpful. 

PGStats
Opal | Level 21

Proc FMM (Finite Mixture Models) does this kind of estimation.

PG
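
For anyone who wants to try the PROC FMM suggestion end to end, here is a minimal sketch. It assumes D3 has a single response variable named y, and it simulates a stand-in for D3 from the stated 30%/70% mixture; the seed and the data step are illustrative assumptions, not something posted in this thread.

```sas
/* Assumption: stand-in for D3, 120 draws from 0.3*N(5,2) + 0.7*N(3,3) */
data D3(keep=y);
   call streaminit(4321);              /* arbitrary seed, for reproducibility only */
   do i = 1 to 120;
      if rand("Bern", 0.3) then y = rand("Normal", 5, 2);
      else y = rand("Normal", 3, 3);
      output;
   end;
run;

/* Two-component normal mixture with an intercept-only model */
proc fmm data=D3;
   model y = / dist=normal k=2;
run;
```

The mixing probabilities in the output are the estimated proportions of the two components. With only 120 observations and components that overlap this much, the estimates may be fairly far from 0.3 and 0.7.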
librasantosh
Obsidian | Level 7

thanks for reply

Rick_SAS
SAS Super FREQ

Not clear if you want to do this in SAS or if it is a theoretical question....

 

If in SAS, it sounds like you are simulating a random sample from a mixture of normal distributions. However, your question seems to imply that the data set that you want is a random mixture of SAMPLES, where the samples are obtained beforehand.

 

Anyway, if you read the article, you will see that the general technique is 

1. Generate random Bernoulli variate:

    b = rand("Bern", 0.3);

2. Use the 0/1 value to determine if you should choose a random sample from the first or second distribution.
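
The two steps above can be sketched as a single DATA step. This version draws directly from the two normal distributions rather than from previously stored samples; the data set name, variable names, and seed are assumptions made for the illustration.

```sas
/* Sketch: 120 draws from the mixture 0.3*N(5,2) + 0.7*N(3,3) */
data D3(keep=b y);
   call streaminit(1);                 /* arbitrary seed */
   do i = 1 to 120;
      b = rand("Bern", 0.3);           /* step 1: b=1 with probability 0.3 */
      if b = 1 then y = rand("Normal", 5, 2);   /* step 2: first distribution */
      else y = rand("Normal", 3, 3);            /*         second distribution */
      output;
   end;
run;
```

Keeping b in the output makes it possible to tabulate afterwards how many observations actually came from each component.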

 

I think what I would do is use PROC SURVEYSELECT (or, easier, PROC IML) to sample 120 elements from each data set with replacement. Then merge the results and use the above technique to get your simulated sample:
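
A minimal sketch of this resampling approach follows. It assumes D1 stores its values in a variable x1 and D2 in a variable x2. Those names, the setup steps that create D1 and D2, and all seeds are hypothetical and only there to make the example self-contained.

```sas
/* Hypothetical setup: the pre-existing samples D1 (100 obs) and D2 (80 obs) */
data D1(keep=x1);
   call streaminit(12345);
   do i = 1 to 100;
      x1 = rand("Normal", 5, 2);
      output;
   end;
run;

data D2(keep=x2);
   call streaminit(54321);
   do i = 1 to 80;
      x2 = rand("Normal", 3, 3);
      output;
   end;
run;

/* Draw 120 values from each data set with replacement;
   OUTHITS writes one output row per selection */
proc surveyselect data=D1 out=S1(keep=x1) method=urs sampsize=120
                  outhits seed=1 noprint;
run;

proc surveyselect data=D2 out=S2(keep=x2) method=urs sampsize=120
                  outhits seed=2 noprint;
run;

/* One-to-one merge, then a Bernoulli(0.3) draw chooses the source of each row */
data D3(keep=y fromD1);
   call streaminit(2024);
   merge S1 S2;
   fromD1 = rand("Bern", 0.3);
   if fromD1 then y = x1;              /* value resampled from D1 */
   else y = x2;                        /* value resampled from D2 */
run;
```

Running PROC FREQ on fromD1 afterwards would show the realized split between the two sources, which varies from run to run around 30%/70%.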

 

 

If this is a theoretical question, the answer is to look at the expected value of the proportions. You expect 0.3*120 = 36 observations from D1 and 0.7*120 = 84 observations from D2. Use those values and the sizes of D1 and D2 to answer the question.
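
The expected counts above can be checked with a short DATA step. This is only a sketch of the arithmetic, and the variable names are made up for the illustration.

```sas
/* Expected number of D3 observations from each source */
data _null_;
   n3 = 120;                           /* size of D3 */
   p1 = 0.3;                           /* mixing proportion for D1 */
   n_from_D1 = p1 * n3;                /* expected count from D1 */
   n_from_D2 = (1 - p1) * n3;          /* expected count from D2 */
   put n_from_D1= n_from_D2=;          /* values are written to the SAS log */
run;
```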

Gunther
Fluorite | Level 6

Congratulations, Rick_SAS, on all your badges!

Add me as a friend?

Rick_SAS
SAS Super FREQ

You can add me as a friend. That enables you to get notified (if you wish) on my activity, such as when I answer a question. 

Gunther
Fluorite | Level 6

Thank you Rick, you're really the best!

 
