How to merge two data sets on a variable with missing values.

Reply
N/A
Posts: 1

How to merge two data sets on a variable with missing values.

I have two data sets that I pulled from the CDC tracking website. They are both in Excel format. The first data set has heart attack data for counties in the United States, that is, one column lists US counties and the next column lists the incidence of heart attacks for the corresponding county. The second data set lists the county in one column and a corresponding measurement of ambient particulate matter 2.5 in the second column. The problem I'm having is that if a county has no data, then the county does not appear on the spreadsheet so the two spreadsheets do not have the same number of rows. In other words, I can't simply copy and paste the columns on one spreadsheet. Is there a way to match the data by county and copy it on the same data set?

My goal is to do a linear regression plotting heart attack vs. particulate matter 2.5.

Super User
Super User
Posts: 6,851

Re: How to merge two data sets on a variable with missing values.

Just merge the two data sets.

data want ;

  merge one two ;

  by COUNTY ;

run;

PROC Star
Posts: 7,439

Re: How to merge two data sets on a variable with missing values.

Or, if the missing data is in file two, and you want to ignore those counties that have missing data:

data want ;

  merge one two (in=in2);

  by COUNTY ;

  if in2;

run;

Ask a Question
Discussion stats
  • 2 replies
  • 152 views
  • 0 likes
  • 3 in conversation