An Idea Exchange for SAS software and services

Comments
by Super User
on ‎09-06-2017 04:45 PM

I am curious how you get so many weights of 0.

by New Contributor mitcheem
on ‎09-07-2017 09:25 AM

The data are from the Medical Expenditure Panel Survey (MEPS), which is nationally representative; the main person weight has few zero values.

The survey also includes supplemental surveys on sub-populations, for instance persons with diabetes. For that analysis, anyone who did not take the Diabetes Care Survey has a secondary weight of 0.

by SAS Employee SAS_Rob
on ‎09-11-2017 08:01 AM

In this situation you should use the NOMCAR option on the PROC SURVEYFREQ statement, which will automatically treat the missing/zero weights as a separate domain and thus use them to derive correct standard errors.

by New Contributor mitcheem
on ‎09-11-2017 09:20 AM

@SAS_Rob, I can't get NOMCAR to make a difference. Here's an example using 2015 data from MEPS (the SAS transport file can be downloaded here):

/* Option 1: default version -- 33250 observations dropped */
proc surveyfreq data = h181; 
	FORMAT DSEY1553 diab_eye. ;
	STRATA VARSTR;
	CLUSTER VARPSU;
	WEIGHT DIABW15F; 
	TABLES DSEY1553 / row;
run;

/* Option 2: Using nomcar -- no difference in SEs */
proc surveyfreq data = h181 nomcar; 
	FORMAT DSEY1553 diab_eye. ;
	STRATA VARSTR;
	CLUSTER VARPSU;
	WEIGHT DIABW15F; 
	TABLES DSEY1553 / row;
run;

/* Option 3: Changing weight by hand */
data alt; 
	set h181;
	if DIABW15F = 0 then do;
		DIABW15F = 1;
		domain = 2;
	end;
	else do;
		domain = 1;
	end;
run;

proc surveyfreq data = alt; 
	FORMAT DSEY1553 diab_eye. ;
	STRATA VARSTR;
	CLUSTER VARPSU;
	WEIGHT DIABW15F; 
	TABLES domain*DSEY1553 / row;
run;

Options 1 and 2 both drop 33,250 observations before analysis, and both give a row standard error of 1.2847 for 'Eye exam in past year'.

Option 3, which alters the weights in the dataset, doesn't drop any observations and gives a row standard error of 1.2927.

by SAS Employee SAS_Rob
on ‎09-11-2017 09:45 AM

I think there may be more going on here. When you add those observations in manually, you are also introducing 30 additional clusters that would not have been there otherwise (see the Data Summary table). This also changes the variance calculation, not just because of the addition of those 30 clusters, but also because of the following note:

NOTE: There is at least one stratum that contains only a single cluster for the table of DSEY1553. Single-cluster strata are not included in the variance estimates.

I think that the alternate code might actually be giving you incorrect standard errors.

by New Contributor mitcheem
on ‎09-11-2017 02:54 PM

Thanks for the suggestions, @SAS_Rob, but I don't think that's the issue. The problem is that over 30,000 observations are dropped from the dataset before SURVEYFREQ runs, so some PSUs and strata are dropped as well, and those are needed for correct SE estimation. This is similar to the reason we have to use the DOMAIN statement instead of a 'WHERE = ' subset: all observations must stay in the dataset in order to calculate correct SEs.
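One way to check how much of the design is lost when the zero-weight observations are dropped is to count strata and PSUs with and without them. This is only a sketch: it reuses the dataset and variable names from the example above (h181, VARSTR, VARPSU, DIABW15F) and assumes PSU codes repeat across strata, so a PSU is identified by the stratum/PSU pair.

```sas
/* Sketch: compare design counts before and after zero weights are dropped. */
/* Assumes VARPSU codes are only unique within a stratum.                   */
proc sql;
	title 'All observations';
	select count(distinct VARSTR) as n_strata,
	       count(distinct catx('_', VARSTR, VARPSU)) as n_psu
	from h181;

	title 'Observations with a positive diabetes weight';
	select count(distinct VARSTR) as n_strata,
	       count(distinct catx('_', VARSTR, VARPSU)) as n_psu
	from h181
	where DIABW15F > 0;
quit;
title;
```

If the two sets of counts differ, the variance for Options 1 and 2 is being computed on a reduced design.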

But since you brought it up, what's the best way to deal with lonely PSUs (aka 'single-cluster strata') in SAS?

by SAS Employee SAS_Rob
on ‎09-15-2017 11:18 AM

There is no one best way to deal with singleton strata, which is why SAS doesn't do anything automatically. The most common approach is to collapse them into other similar strata prior to running the procedure.
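As a rough sketch of the collapsing approach, using the dataset and variables from the example earlier in the thread: the rule below, which folds each singleton stratum into the preceding stratum in sort order, is purely a placeholder assumption, since which strata count as "similar" depends on the survey design. The names psu_counts, collapse_map, h181_collapsed, new_str and new_psu are hypothetical, and VARSTR is assumed to be numeric.

```sas
/* Count distinct PSUs per stratum among the observations actually analysed */
proc sql;
	create table psu_counts as
	select VARSTR, count(distinct VARPSU) as n_psu
	from h181
	where DIABW15F > 0
	group by VARSTR
	order by VARSTR;
quit;

/* Placeholder rule (assumption): merge a singleton stratum into the      */
/* preceding stratum. A singleton in the first row is left unchanged.     */
data collapse_map;
	set psu_counts;
	retain prev_str;
	if n_psu = 1 and not missing(prev_str) then new_str = prev_str;
	else new_str = VARSTR;
	prev_str = new_str;
run;

/* Attach the collapsed stratum and build a PSU id that stays unique      */
/* after strata are merged (PSU codes may repeat across strata).          */
proc sql;
	create table h181_collapsed as
	select a.*,
	       coalesce(b.new_str, a.VARSTR) as new_str,
	       catx('_', a.VARSTR, a.VARPSU) as new_psu
	from h181 as a
	left join collapse_map as b
	on a.VARSTR = b.VARSTR;
quit;

proc surveyfreq data = h181_collapsed;
	FORMAT DSEY1553 diab_eye. ;
	STRATA new_str;
	CLUSTER new_psu;
	WEIGHT DIABW15F;
	TABLES DSEY1553 / row;
run;
```

Strata that contain only zero-weight observations do not appear in collapse_map, so they keep their original stratum code.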
