Teketo
Calcite | Level 5

Hello,

I have got three spatial datasets.

  1. Polygon shape file of a country x (including all administrative regions)
  2. Demographic and Health Survey (DHS) data (population characteristics and cluster coordinates)
  3. Health facility data (health facilities characteristics and facilities coordinate)

I have been trying to merge these datasets in SAS, but I am having trouble merging the population data (DHS) with the health facility data. Both the health facility sample and the population sample cover the whole country.

 

The DHS data has 622 clusters (one coordinate pair per cluster), with an average of 23 individuals interviewed per cluster (14,300 people in total). The second dataset contains 1,020 surveyed health facilities, each with its geographic coordinates.

 

The merge should use both the geographic coordinates (1,020 health facilities vs. 622 clusters), linking each DHS cluster to its nearest health facility, and the region (both datasets were collected in the same 11 regions).

 

The merge I want is not only by minimum distance between clusters and health facilities; it must also respect the regional administrative boundaries. In other words, no nearest-distance match may cross a regional boundary.

 

During the merge, how do I handle the multiple observations per cluster in the population dataset (14,300 people in 622 clusters)? There are also more health facility locations (1,020) than clusters (622).

 

How can I merge the DHS data with the health facility data? How do I handle the attribute data (on average, 23 individuals' records per cluster) when combining it with a single health facility record?

 

Here are sample elements of the two data sets.

Data x; *Health facility dataset;

Set a;

Keep LAT LONG REGION FACTYPE Q102_04 GR1 GR2 FA1 FA2 FR1 FR2 FR3;

Run;

 

Data y; *Population dataset;

Set b;

Keep V001 LAT_DHS LONG_DHS V002 V012 REGION V190 V218 M14_1 V501 V313M;

Run;

 

Kind regards

Teketo

Reeza
Super User

This isn't really a merge; it's a nearest-neighbor search, which is a different type of analysis.

 

GEODIST will calculate the distances, but those are straight-line (geodetic) distances, not driving distances. SAS VA can work with driving distances, and you may want to use ArcGIS or QGIS (free) to find the nearest location by driving distance.
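If straight-line distance is acceptable, the GEODIST approach could be sketched roughly as below. This is only an illustration under several assumptions: the toy rows and coordinates are made up, FAC_ID is a hypothetical facility key created here (the posted KEEP list has no facility identifier), V001 is taken to be the cluster number, and REGION is assumed to be coded identically in both datasets. Because the nearest facility is a property of the cluster rather than of the individual, the search runs on one row per cluster, and the result is then attached to every individual in that cluster.

```sas
/* Toy stand-ins for the poster's datasets x (facilities) and y (individuals). */
/* All values are invented for illustration only.                             */
data x;
  input LAT LONG REGION FACTYPE;
  FAC_ID = _n_;  /* hypothetical facility key, not in the original data */
  datalines;
9.03 38.74 1 1
9.10 38.80 1 2
7.05 38.48 2 1
;
run;

data y;
  input V001 V002 LAT_DHS LONG_DHS REGION V012;
  datalines;
1 1 9.05 38.75 1 25
1 2 9.05 38.75 1 31
2 1 7.10 38.50 2 19
;
run;

proc sql;
  /* Step 1: reduce the individual-level data to one row per cluster. */
  create table clusters as
  select distinct V001, REGION, LAT_DHS, LONG_DHS
  from y;

  /* Step 2: pair each cluster only with facilities in the same region, */
  /* so no match can cross a regional boundary. 'K' returns kilometers. */
  create table pairs as
  select c.V001, c.REGION, f.FAC_ID, f.FACTYPE,
         geodist(c.LAT_DHS, c.LONG_DHS, f.LAT, f.LONG, 'K') as DIST_KM
  from clusters as c inner join x as f
    on c.REGION = f.REGION;
quit;

/* Step 3: keep the closest facility for each cluster.          */
/* If two facilities tie, whichever sorts first is kept.        */
proc sort data=pairs;
  by V001 DIST_KM;
run;

data nearest;
  set pairs;
  by V001;
  if first.V001;
run;

/* Step 4: attach the nearest facility to every individual in the cluster. */
proc sql;
  create table y_linked as
  select i.*, n.FAC_ID, n.FACTYPE, n.DIST_KM
  from y as i left join nearest as n
    on i.V001 = n.V001;
quit;
```

With 622 clusters and 1,020 facilities the within-region pair table stays small, so a plain join like this should be manageable. Any other facility variables (for example GR1 or FA1) could be carried along by adding them to the SELECT lists in steps 2 and 4.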

 




 
