Posted 07-09-2018 01:28 PM
(725 views)

I have data with some categorical variables that contain missing values. I need variance estimates based on leaving the i-th observation out within each group of a categorical variable. (So in the end, the number of leave-one-out estimates would equal the number of observations in each group, and I would have k variance estimates for the k groups of a categorical variable.)

Is using PROC SURVEYFREQ with a STRATA statement and the VARMETHOD=JACKKNIFE option a correct way to do this? Or do I need to write a loop like this: https://blogs.sas.com/content/iml/2017/06/21/jackknife-estimate-standard-error-sas.html ?

Also, if I need to write a loop like that for k groups, how would I construct the PROC IML code?

Thank you in advance! I would greatly appreciate any help.

Accepted Solutions

I don't fully understand what you are saying, but I guess I don't have to. Your question seems to be "how can I compute k variances in SAS/IML where each one is computed by leaving out one of the k groups."

Attached is a program that I hope will demonstrate the programming technique, even if I am misinterpreting some of the details. For background, I suggest you read my article about the "UNIQUE-LOC" technique.

```
/* Simulate 10 groups of 50 observations; each group has its own std dev */
data Have;
call streaminit(1234);
do Group = 1 to 10;
   StdDev = round(1 + rand("Uniform"), 0.05);   /* Group-specific Std Dev */
   do i = 1 to 50;
      x = rand("Normal", 0, StdDev);
      output;
   end;
end;
run;

proc iml;
use Have;
read all var {Group x};
close Have;

OverallVar = var(x);                /* statistic on the full data */
u = unique(Group);                  /* sorted group labels (row vector) */
k = ncol(u);                        /* number of groups */
variance = j(1, k, .);
do i = 1 to k;
   idx = loc( Group ^= u[i] );      /* omit the i_th group */
   variance[i] = var( x[idx] );     /* compute statistic on remaining k-1 groups */
end;
print variance;

/* jackknife-style estimate: (k-1)/k times the sum of squared
   deviations from the full-data statistic */
est = (k-1)/k * ssq( OverallVar - variance );
print est;
quit;
```

Replies

I'm confused. Do you have a reference for what you are trying to do? Or can you provide data and explain what you are attempting?

SURVEYFREQ is used to estimate proportions. For example, you can use PROC SURVEYFREQ to estimate that your population is 40% white, 40% black, 10% Hispanic, and 10% Asian, and to get standard errors for those estimates. Is that what you want?
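
As a rough illustration of that use, the sketch below requests jackknife standard errors for the estimated proportions of one categorical variable. The data set `Survey` and the variables `Region` and `Race` are hypothetical names chosen for the example; they do not come from this thread.

```
proc surveyfreq data=Survey varmethod=jackknife;
   strata Region;   /* hypothetical stratification variable */
   tables Race;     /* estimated proportions with jackknife standard errors */
run;
```

This estimates proportions and their standard errors; it does not return a separate leave-one-out variance for a continuous variable within each group.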

Usually, "jackknife" refers to estimates that leave out one observation. However, you seem to imply that you want to leave out an entire level of a categorical variable such as dropping the "Hispanic" level for a RACE variable. I am not familiar with that method. There are cross-validation techniques that leave out a portion of the data, but that is a different method than the jackknife.
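
For reference, the usual delete-one jackknife can be written as follows, where $\hat{\theta}_{(i)}$ denotes the statistic computed with the i-th of $n$ observations removed. The notation is introduced here for illustration and is not taken from the thread.

$$
\bar{\theta}_{(\cdot)} = \frac{1}{n}\sum_{i=1}^{n}\hat{\theta}_{(i)},
\qquad
\widehat{\mathrm{Var}}_{\mathrm{jack}}(\hat{\theta}) = \frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\theta}_{(i)} - \bar{\theta}_{(\cdot)}\right)^{2}
$$

The jackknife standard error is the square root of this quantity.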

-Edited reply-

Data would consist of 50 observations from each group. Then, I would like to calculate a statistic of the observations within each group while excluding one observation from the group at a time. Then, in the end, I would be able to calculate a variance estimate for each group.

Hm, I see. Then I don't think I should use SURVEYFREQ. Sorry, I can't provide data, but I think any data should work. I think I should use code like https://blogs.sas.com/content/iml/2017/06/21/jackknife-estimate-standard-error-sas.html but do the leave-one-out within each group instead.
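
A minimal sketch of that within-group idea, assuming the `Have` data set from the accepted solution (variables `Group` and `x`) and assuming the statistic of interest is the sample variance. The names `stat`, `jackVar`, and `GroupID` are illustrative only, and this is one possible reading of the request rather than a confirmed answer from the thread.

```
proc iml;
use Have;
read all var {Group x};
close Have;

u = unique(Group);                 /* sorted group labels (row vector) */
k = ncol(u);                       /* number of groups */
jackVar = j(k, 1, .);              /* one jackknife variance estimate per group */
do g = 1 to k;
   y = x[ loc(Group = u[g]) ];     /* observations in the g_th group */
   n = nrow(y);
   stat = j(n, 1, .);              /* n leave-one-out statistics for this group */
   do i = 1 to n;
      stat[i] = var( y[ setdif(1:n, i) ] );   /* omit the i_th observation */
   end;
   /* (n-1)/n times the sum of squared deviations from the mean replicate */
   jackVar[g] = (n-1)/n * ssq( stat - mean(stat) );
end;
GroupID = colvec(u);
print GroupID jackVar;
quit;
```

To use a different statistic, replace the call to `var` inside the inner loop. Each group produces n leave-one-out values and one variance estimate, so k groups give k estimates.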
