mayasak
Quartz | Level 8
I have two data sets that are geocoded to the "small area" level, one covering 2003-2007 and the other covering 2008-2012.
The two sets were geocoded in different ways. I'm trying to concatenate them and see whether death rates differ between the two periods.
I have this code:
data mort03_07;
set data.mortsarea99_09_newrace;
if NMRes=1;
if 2003<=year<=2007;
sarea134=sarea133;
if sarea133=100 then sarea134=99;

geo=2; *not geocoded;
if 1<=sarea134<=108 then geo=1; *geocoded;

run;
data mort08_12;
set final.death99_13geo_14ungeo_ibis_std;
if NMRes=1;
if 2008<=year<=2012;

geo=2; *not geocoded;
if 1<=sarea134<=108 then geo=1; *geocoded;

run;

proc summary data=mort03_07;
var x geo;
class fipscode;
output out=numgeo1 sum(geo)=numgeo sum(x)=totnum;
run;

proc summary data=mort08_12;
var x geo;
class fipscode;
output out=numgeo2 sum(geo)=numgeo sum(x)=totnum;
run;

data numge01;
set numgeo1;
period=1;
run;
data numge02;
set numgeo2;
period=2;
run;

data numgeo;
set numge01 numge02;
geopct=numgeo/totnum;
run;

proc print data=numgeo1; title 'geocoded, period1';
proc print data=numgeo2; title 'geocoded, period2';
proc print data=numgeo; title 'geocoded, both periods';
run;
 
When I run proc print data=numgeo; title 'geocoded, both periods';
I get geopct > 1 (because numgeo > totnum).
I'm not sure what I'm doing wrong here.
Thank you,
Ruzeina
1 ACCEPTED SOLUTION

Accepted Solutions
FreelanceReinh
Jade | Level 19

Hi @mayasak,

 

Without sample data it's a bit hard to say, but my first guess is that the coding of the variable GEO might not be ideal: if NUMGEO is meant to be the number of geocoded items, the code for "not geocoded" should be 0, not 2. Otherwise SUM(GEO) adds 2 for every record that is not geocoded, so NUMGEO is likely to be too large, leading to incorrectly large values of GEOPCT, possibly GEOPCT>1, as you've observed.
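
To illustrate the point, here is a minimal sketch on made-up data, not the poster's actual data sets. It assumes that X is a per-record counter equal to 1 (the thread does not say how X is defined) and uses a hypothetical toy data set named TOY. With GEO coded 1/2, a county with n1 geocoded and n2 non-geocoded records gives SUM(GEO) = n1 + 2*n2, which exceeds SUM(X) = n1 + n2 whenever n2 > 0. With GEO coded 1/0, SUM(GEO) = n1 and the ratio stays between 0 and 1.

```sas
/* Hypothetical toy data: one row per death record.            */
/* Assumption: x = 1 on every record, so sum(x) counts records. */
data toy;
   input fipscode sarea134;
   x = 1;
   /* A SAS comparison evaluates to 1 (true) or 0 (false), */
   /* so geo = 1 if geocoded and 0 if not geocoded.         */
   geo = (1 <= sarea134 <= 108);
   datalines;
1 5
1 999
1 20
3 999
3 44
;
run;

/* NWAY keeps only the per-county rows (no overall total row). */
proc summary data=toy nway;
   class fipscode;
   var x geo;
   output out=numgeo_toy sum(geo)=numgeo sum(x)=totnum;
run;

data numgeo_toy;
   set numgeo_toy;
   geopct = numgeo / totnum;   /* share of geocoded records, 0 to 1 */
run;

proc print data=numgeo_toy; title 'geocoded share by county (toy data)';
run;
```

In this toy example county 1 has 2 of 3 records geocoded and county 3 has 1 of 2, so GEOPCT cannot exceed 1. The same 1/0 assignment could replace the two-statement GEO coding in both data steps of the original post.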

mayasak
Quartz | Level 8

Yes, you're right. We're dealing with counts here. Thanks a lot. I just have another question, if you don't mind. As I said, I need to see whether death rates in small areas differ because of the different geocoding in the two data sets (period 1, years 2003-2007, and period 2, years 2008-2012). Do you have any thoughts on how I can do that? So far I've calculated the percentage of geocoded data in each county (not small area) and run ANOVA tests with "period" and "sarea" as independent variables and "cause of death" as the dependent variable (I couldn't include the interaction terms because of 0 degrees of freedom for error).

Thanks

geocode3.PNG
FreelanceReinh
Jade | Level 19

As this is a completely different question, it will be better if you open a new thread for it. To do this, you should select a different forum within the SAS Support Communities: Analytics --> SAS Statistical Procedures.


There you will attract a more targeted audience. Also, it will be helpful to describe your data a little more (types of variables and their meaning). I am not familiar with geocoding and its implications for epidemiological research questions.
