PROC SURVEYFREQ Issue

Contributor
Posts: 24

Hi,

I'm trying to get the percentages of women that had a screening mammogram by county using Behavioral Risk Factor Surveillance System (BRFSS, a large survey performed by the Centers for Disease Control and Prevention) data. This is the code I'm trying to use:

proc surveyfreq data=tmp;
   strata _ststr;
   cluster _psu;
   weight _llcpwt;
   tables fips*mam / or;
run;

I'm confident the strata, cluster, and weight statements are correct. The issue seems to be in the tables statement, specifically with using the county in the cross tabulation. "fips" is the county code (e.g. 22001, 22003, etc.; there are 64 different counties in the state of interest), and "mam" indicates whether the woman had a mammogram (1=yes, 2=no, 9=unknown).

Can someone tell me why this code never seems to finish (it just keeps running)? And how could I do this analysis?

Amanda


All Replies
Super User
Posts: 10,857

Re: PROC SURVEYFREQ Issue

How long is "just keeps running"?

How many records are in the data set?

With my data (Idaho, 44 counties, roughly 6000 records in a single year), that takes about 3 minutes to run without the OR option. So expect yours to take a bit longer.

You might check, but I believe the NOMCAR option is currently preferred for BRFSS analysis.

Note that CDC has not provided a standard method for doing small area estimates, i.e. county, in general from BRFSS data. The results are generally best restricted for use at the level the sample was stratified.
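A minimal sketch of how the original call might look with NOMCAR added, offered as an illustration rather than a recommended BRFSS method. It assumes the same data set and design variables as in the question. The ROW and CL options are my additions, not something discussed above: ROW requests row percentages (the percentage in each mammogram category within each county) and CL requests confidence limits.

```sas
/* Sketch only: same data set and design variables as in the question. */
/* NOMCAR goes on the PROC statement and applies to Taylor series      */
/* variance estimation, which is the default method.                   */
proc surveyfreq data=tmp nomcar;
   strata _ststr;
   cluster _psu;
   weight _llcpwt;
   /* ROW and CL are assumptions added for illustration */
   tables fips*mam / row cl;
run;
```

The caveat about county-level estimates still applies to whatever this produces.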

Contributor
Posts: 24

Re: PROC SURVEYFREQ Issue

After I let it run for over an hour, I stopped it because I assumed something was wrong.

There are 9,068 records in the dataset.

Based on how quickly your data ran, it seems I must be doing something wrong.

I read I should use NOMCAR to get better SEs or CIs (can't remember which).

Super User
Posts: 10,857

Re: PROC SURVEYFREQ Issue

Try the code without OR. Also, is the data local to your machine or on a network? Sometimes local data runs faster.
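A rough sketch of both suggestions together, under some assumptions: the libref `netlib` and its path are hypothetical placeholders for wherever the data actually lives, and the WORK library is assumed to sit on a local disk. As far as I know, the OR option only produces odds ratios and relative risks for 2x2 tables, so with 64 counties and three response codes it should not add anything useful here.

```sas
/* Hypothetical libref and path; replace with the real network location. */
libname netlib "\\server\share\brfss";

/* Write detailed real-time and CPU-time notes for each step to the log. */
options fullstimer;

/* Copy the data set to WORK; the log shows how long the copy itself takes. */
data work.tmp_local;
   set netlib.tmp;
run;

/* Same analysis as in the question, run on the local copy and without OR. */
proc surveyfreq data=work.tmp_local;
   strata _ststr;
   cluster _psu;
   weight _llcpwt;
   tables fips*mam;
run;
```

Comparing the real time and CPU time in the log for the copy step and the PROC step should show whether the delay comes from reading the data or from the computation.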

Contributor
Posts: 24

Re: PROC SURVEYFREQ Issue

Will do. I'm going to run it overnight and see what happens. It's on a network. I can move it to my computer.

Super User
Posts: 10,857

Re: PROC SURVEYFREQ Issue

It might be informative to see just how long it takes to copy to your computer. I've had days where copying a 2 MB file on our network took close to an hour. Running any analysis in that environment would have been practically impossible.

Contributor
Posts: 24

Re: PROC SURVEYFREQ Issue

Thanks for the warning!

Solution
‎08-14-2015 09:29 AM
Contributor
Posts: 24

Re: PROC SURVEYFREQ Issue

It just took way longer than I expected. I let it run all night and it worked.

Super User
Posts: 10,857

Re: PROC SURVEYFREQ Issue

My office is waiting, not quite with bated breath due to duration expectations, for CDC to finish their standardized approach to dealing with county / small area estimates from BRFSS data. The bits we've seen so far do not make me think that the time or code will be as simple as a direct estimate such as you just performed. It may be time to make noises about improving hardware (possibly faster disks, more memory) and to spend some time optimizing SAS options for throughput.

Another option is SAS-callable SUDAAN, which may run a bit faster. At least it may be worth a test comparison.
