11-26-2014 04:43 PM
I have two data sets that I pulled from the CDC tracking website. They are both in Excel format. The first data set has heart attack data for counties in the United States, that is, one column lists US counties and the next column lists the incidence of heart attacks for the corresponding county. The second data set lists the county in one column and a corresponding measurement of ambient particulate matter 2.5 in the second column. The problem I'm having is that if a county has no data, then the county does not appear on the spreadsheet so the two spreadsheets do not have the same number of rows. In other words, I can't simply copy and paste the columns on one spreadsheet. Is there a way to match the data by county and copy it on the same data set?
My goal is to do a linear regression plotting heart attack vs. particulate matter 2.5.
11-26-2014 05:55 PM
Or, if the missing data is in file two, and you want to ignore those counties that have missing data:
data want ;
merge one two (in=in2);
by COUNTY ;