12-13-2017 07:46 PM - last edited on 12-15-2017 06:04 AM by Rick_SAS

Dataset D1 has 100 elements, generated as random numbers from a Normal distribution (mean=5, std=2).

Dataset D2 has 80 elements, generated as random numbers from a Normal distribution (mean=3, std=3).

Based on D1 and D2, dataset D3 has 120 elements, generated as a mixture of 30% Normal (mean=5, std=2) + 70% Normal (mean=3, std=3).

How can I estimate the proportion of D1 and D2 in dataset D3?

Posted in reply to librasantosh

12-13-2017 09:28 PM

Two ways come to mind:

1. Simulation

2. Theory: look at the joint probability distribution.

The correct method is likely the one you've been taught in the course. If it's theoretical, check your text; this isn't a SAS question.

If it is simulation, what do you have so far?

I also feel like the question may be missing something. Is this the word-for-word question from your assignment, or have you paraphrased it?

PS: Asking the same question multiple times is unhelpful.

Posted in reply to librasantosh

12-13-2017 11:13 PM - edited 12-13-2017 11:13 PM

**Proc FMM** (Finite Mixture Models) does this kind of estimation.

PG

Posted in reply to PGStats

12-14-2017 08:14 AM

Thanks for the reply.

Posted in reply to librasantosh

12-14-2017 10:50 AM

It is not clear whether you want to do this in SAS or whether it is a theoretical question.

If in SAS, it sounds like you are simulating a random sample from a mixture of normal distributions. However, your question seems to imply that the data set that you want is a random mixture of SAMPLES, where the samples are obtained beforehand.

Anyway, if you read the article, you will see that the general technique is

1. Generate a random Bernoulli variate:

b = rand("Bern", 0.3);

2. Use the 0/1 value to determine whether to draw from the first or the second distribution.
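
The two steps above can be sketched in a DATA step. This is only a minimal sketch, assuming you draw directly from the two normal distributions rather than from previously generated samples. The data set name D3, the variable names b and y, and the seed are illustrative choices, not part of the original question.

```sas
/* Sketch: simulate 120 draws from 0.3*N(5,2) + 0.7*N(3,3).        */
/* The names D3, b, y and the seed 12345 are illustrative choices. */
data D3;
   call streaminit(12345);            /* fix the seed for reproducibility */
   do i = 1 to 120;
      b = rand("Bernoulli", 0.3);     /* 1 with probability 0.3 */
      if b = 1 then y = rand("Normal", 5, 2);   /* first component  */
      else          y = rand("Normal", 3, 3);   /* second component */
      output;
   end;
   drop i;
run;

/* Observed share of each component in this particular sample */
proc freq data=D3;
   tables b;
run;
```

Because b is kept in the output data set, PROC FREQ shows how many observations came from each component. The share varies around 30% from one simulated sample to the next.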

I think what I would do is use PROC SURVEYSELECT (or, easier, PROC IML) to sample 120 elements from each data set with replacement. Then merge the results and use the above technique to get your simulated sample.
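
One possible way to carry out the PROC SURVEYSELECT variant is sketched below. It is an illustration under stated assumptions, not necessarily the code the poster had in mind: D1 and D2 are assumed to hold a single variable x, and the names S1, S2, x1, x2, b, y and all seeds are hypothetical.

```sas
/* Sketch: build D3 by resampling previously generated samples D1 and D2. */
/* Assumed layout: D1 (100 obs) and D2 (80 obs) each hold one variable x. */
/* S1, S2, x1, x2, b, y and the seeds are illustrative names and values.  */
data D1(keep=x) D2(keep=x);
   call streaminit(1);
   do i = 1 to 100; x = rand("Normal", 5, 2); output D1; end;
   do i = 1 to 80;  x = rand("Normal", 3, 3); output D2; end;
run;

/* Draw 120 observations with replacement from each data set */
proc surveyselect data=D1 out=S1 method=urs n=120 outhits seed=11 noprint;
run;
proc surveyselect data=D2 out=S2 method=urs n=120 outhits seed=22 noprint;
run;

/* One-to-one merge, then take x1 with probability 0.3, otherwise x2 */
data D3;
   if _n_ = 1 then call streaminit(33);
   merge S1(keep=x rename=(x=x1)) S2(keep=x rename=(x=x2));
   b = rand("Bernoulli", 0.3);
   if b = 1 then y = x1;
   else y = x2;
   keep b y;
run;
```

METHOD=URS requests sampling with replacement, and OUTHITS writes one row per selection, so S1 and S2 each have 120 rows and line up in the one-to-one merge.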

If this is a theoretical question, the answer is to look at the expected value of the proportions. You expect 0.3*120 = 36 observations from D1 and 0.7*120 = 84 observations from D2. Use those values and the sizes of D1 and D2 to answer the question.
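
As a short worked calculation, assume each of the 120 observations is assigned to the first component independently with probability 0.3 (the Bernoulli scheme described earlier in this reply). The symbols $N_1$ and $N_2$ for the two component counts are notation introduced here for illustration.

```latex
N_1 \sim \mathrm{Binomial}(120,\ 0.3), \qquad N_2 = 120 - N_1
\\
\mathbb{E}[N_1] = 120 \times 0.3 = 36, \qquad \mathbb{E}[N_2] = 120 \times 0.7 = 84
\\
\mathrm{SD}(N_1) = \sqrt{120 \times 0.3 \times 0.7} = \sqrt{25.2} \approx 5.02
```

Under that assumption, the count from the first component in any one simulated D3 typically lands within about 5 of the expected 36.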

Posted in reply to Rick_SAS

12-15-2017 05:34 AM

Congratulations, Rick_SAS, on all your badges!

Add me as a friend?

Posted in reply to Gunther

12-15-2017 06:03 AM

You can add me as a friend. That enables you to get notified (if you wish) on my activity, such as when I answer a question.

Posted in reply to Rick_SAS

12-15-2017 06:04 AM

Thank you, Rick, you're really the best!

Posted in reply to librasantosh

12-13-2017 03:48 PM - last edited on 12-14-2017 10:24 AM by AnnaBrown

Dataset D1 has 100 elements, generated as random numbers from a Normal distribution (mean=5, std=2).

Dataset D2 has 80 elements, generated as random numbers from a Normal distribution (mean=3, std=3).

Based on D1 and D2, dataset D3 has 120 elements, generated as a mixture of 30% Normal (mean=5, std=2) + 70% Normal (mean=3, std=3).

How can I estimate the proportion of D1 and D2 in dataset D3?

If you have any ideas, please let me know.

Posted in reply to librasantosh

12-19-2017 10:52 AM

Sounds like a job for PROC FMM. Given data set D3 with response values in Y:

```
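/* Fit a two-component normal mixture to the response Y in data set D3. */
/* The estimated mixing probabilities in the output are the proportions. */
proc fmm data=D3;
   model y = / k=2;   /* intercept-only model; K=2 requests two components */
run;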
```