Hi,
I'm trying to get the percentages of women that had a screening mammogram by county using Behavioral Risk Factor Surveillance System (BRFSS, a large survey performed by the Centers for Disease Control and Prevention) data. This is the code I'm trying to use:
proc surveyfreq data=tmp;
strata _ststr;
cluster _psu;
weight _llcpwt;
tables fips*mam/or;
run;
I'm confident the strata, cluster, and weight statements are correct. The issue seems to be in the tables statement, specifically with using the county in the cross-tabulation. "fips" is the county code (e.g. 22001, 22003, etc.; there are 64 counties in the state of interest), and "mam" indicates whether the woman had a mammogram (1=yes, 2=no, 9=unknown).
Can someone tell me why this code never finishes (it just keeps running)? And how could I do this analysis?
Amanda
How long is "just keeps running"?
How many records are in the data set?
With my data (Idaho, 44 counties, roughly 6,000 records in a single year), the same analysis takes about 3 minutes to run without the OR option. With more counties and more records, expect yours to take a bit longer.
You might check but I believe the NOMCAR option is currently preferred for use with BRFSS analysis.
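For reference, NOMCAR is specified on the PROC SURVEYFREQ statement itself rather than on the TABLES statement. A minimal sketch, reusing the data set and variable names from the original post, and assuming the default Taylor series variance method (which is where NOMCAR applies):

```sas
/* Sketch only: same design statements as the original post.          */
/* NOMCAR treats missing values as not missing completely at random   */
/* when computing Taylor series variance estimates.                   */
proc surveyfreq data=tmp nomcar;
  strata _ststr;
  cluster _psu;
  weight _llcpwt;
  tables fips*mam / row;   /* ROW: percent of each mam level within county */
run;
```

NOMCAR only matters when the analysis variables contain missing values, so whether it changes the standard errors here depends on how "unknown" (mam=9) is coded in tmp.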
Note that CDC has not provided a standard method for producing small area (i.e. county-level) estimates from BRFSS data in general. Results are generally best restricted to the level at which the sample was stratified.
After I let it run for over an hour, I stopped it because I assumed something was wrong.
There are 9,068 records in the dataset.
Based on how quickly your data ran, it seems I must not be doing something correctly.
I read I should use NOMCAR to get better SEs or CIs (can't remember which).
Try the code without OR. Also, is the data local to your machine or on a network? Sometimes local data runs faster.
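A sketch of what that could look like, using the names from the original post. The ROW option is added here on the assumption that the goal is the percentage of each mam response within each county. In PROC SURVEYFREQ the OR option is documented for 2x2 tables, so it would not apply to a 64-county by 3-level table in any case:

```sas
/* Sketch only: original design statements, OR removed. */
proc surveyfreq data=tmp;
  strata _ststr;
  cluster _psu;
  weight _llcpwt;
  /* ROW requests row percentages, i.e. the share of yes/no/unknown */
  /* responses within each county (fips).                            */
  tables fips*mam / row;
run;
```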
Will do. I'm going to run it overnight and see what happens. It's on a network. I can move it to my computer.
It might be informative to see just how long it takes to copy to your computer. I've had days where copying a 2mb file on our network took close to an hour. Running any analysis in that environment would have been practically impossible.
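One way to check this from inside SAS is to copy the data set into the WORK library with FULLSTIMER turned on and read the real time reported in the log. This is a sketch only; the libref netlib and the network path are hypothetical placeholders, not taken from the original post:

```sas
/* FULLSTIMER writes detailed timing (real time, CPU time) to the log. */
options fullstimer;

/* Hypothetical network location of the BRFSS data set. */
libname netlib "\\server\share\brfss";

/* Copy to the local WORK library; the log shows how long the read took. */
data work.tmp_local;
  set netlib.tmp;
run;
```

If the copy itself is slow, pointing the analysis at work.tmp_local instead of the network copy should show how much of the run time is network I/O.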
Thanks for the warning!
It just took way longer than I expected. I let it run all night and it worked.
My office is waiting, not quite with bated breath given how long we expect it to take, for CDC to finish their standardized approach to county / small area estimates from BRFSS data. The bits we've seen so far do not make me think that the run time or the code will be as simple as a direct estimate such as the one you just performed. It may be time to make noises about improving hardware (possibly faster disks and more memory) and to spend some time tuning SAS options for throughput.
Another option: SAS-callable SUDAAN may run a bit faster. At the least, it may be worth a test comparison.