Hello. I want to merge two data sets.
I have two sets of data as follows:
1) Incidence | 2)Pop | |||||
Country | Year | Number | Country | Year | Pop | |
USA | 2000 | 20 | USA | 1998 | 120 | |
USA | 2001 | 25 | USA | 1999 | 122 | |
USA | 2002 | 30 | USA | 2000 | 123 | |
USA | 2003 | 35 | USA | 2001 | 124 | |
USA | 2004 | 36 | USA | 2002 | 125 | |
USA | 2003 | 126 | ||||
USA | 2004 | 127 | ||||
USA | 2005 | 128 | ||||
USA | 2006 | 129 | ||||
USA | 2007 | 130 | ||||
USA | 2008 | 131 | ||||
USA | 2009 | 132 |
I want to merge as follows:
Country | Year | Number | Pop |
USA | 1998 | 120 | |
USA | 1999 | 122 | |
USA | 2000 | 20 | 123 |
USA | 2001 | 25 | 124 |
USA | 2002 | 30 | 125 |
USA | 2003 | 35 | 126 |
USA | 2004 | 36 | 127 |
USA | 2005 | 128 | |
USA | 2006 | 129 | |
USA | 2007 | 130 | |
USA | 2008 | 131 | |
USA | 2009 | 132 |
I tried like:
data NEW;
Merge incidence pop;
by country year;
run;
But I have different result than I want like:
Country | Year | Number | Pop |
USA | 2000 | 20 | |
USA | 2001 | 25 | |
USA | 2002 | 30 | |
USA | 2003 | 35 | |
USA | 2004 | 36 | |
USA | 1998 | 120 | |
USA | 1999 | 122 | |
USA | 2000 | 123 | |
USA | 2001 | 124 | |
USA | 2002 | 125 | |
USA | 2003 | 126 | |
USA | 2004 | 127 | |
USA | 2005 | 128 | |
USA | 2006 | 129 | |
USA | 2007 | 130 | |
USA | 2008 | 131 | |
USA | 2009 | 132 |
Please help me whats wrong with it? All variable types are same in two data set.
Thanks
Epid
data Incidence;
input Country $ Year Number;
datalines;
USA 2000 20
USA 2001 25
USA 2002 30
USA 2003 35
USA 2004 36
;
data Pop;
input Country $ Year Pop;
datalines;
USA 1998 120
USA 1999 122
USA 2000 123
USA 2001 124
USA 2002 125
USA 2003 126
USA 2004 127
USA 2005 128
USA 2006 129
USA 2007 130
USA 2008 131
USA 2009 132
;
data NEW;
Merge incidence pop;
by country year;
run;
Seems fine to me?
Please post your log.
Your code has to work if the datasets are like you posted them:
data incidence;
input country $ year number;
cards;
USA 2000 20
USA 2001 25
USA 2002 30
USA 2003 35
USA 2004 36
;
run;
data pop;
input country $ year pop;
cards;
USA 1998 120
USA 1999 122
USA 2000 123
USA 2001 124
USA 2002 125
USA 2003 126
USA 2004 127
USA 2005 128
USA 2006 129
USA 2007 130
USA 2008 131
USA 2009 132
;
run;
data new;
merge incidence pop;
by country year;
run;
proc print data=new noobs;
run;
Result:
country year number pop USA 1998 . 120 USA 1999 . 122 USA 2000 20 123 USA 2001 25 124 USA 2002 30 125 USA 2003 35 126 USA 2004 36 127 USA 2005 . 128 USA 2006 . 129 USA 2007 . 130 USA 2008 . 131 USA 2009 . 132
Hello KurtBremser, Thanks.
Log is
887 data merged;
888 merge incidence pop;
889 by country year;
890 run;
NOTE: There were 2370 observations read from the data set WORK.INCIDENCE.
NOTE: There were 4876 observations read from the data set WORK.POP.
NOTE: The data set WORK.MERGED has 7246 observations and 44 variables.
NOTE: DATA statement used (Total process time):
real time 0.01 seconds
cpu time 0.01 seconds.
I only extracted few line of my data as an example for question but I have two sets of data which I imported from excel file with 2370 and 4876 observations. I am just guessing the problem must be with format of year. Once SAS log said ERROR: Variable Year has been defined as both character and numeric then I tried to fix it by using data incidence;
data incidence;
set incidence(rename=(year=yearnum));
Year = put(Yearnum, 7.);
drop yearnum;
run;
data pop;
set pop(rename=(year=yearnum));
Year = put(Yearnum, 7.);
drop yearnum;
run;
After that I didnt receive any error message but still I cant get my wanted result.
Thanks again,
The problem was with spreadsheet. I fixed the problem Thanks
data Incidence;
input Country $ Year Number;
datalines;
USA 2000 20
USA 2001 25
USA 2002 30
USA 2003 35
USA 2004 36
;
data Pop;
input Country $ Year Pop;
datalines;
USA 1998 120
USA 1999 122
USA 2000 123
USA 2001 124
USA 2002 125
USA 2003 126
USA 2004 127
USA 2005 128
USA 2006 129
USA 2007 130
USA 2008 131
USA 2009 132
;
data NEW;
Merge incidence pop;
by country year;
run;
Seems fine to me?
Thanks draycut,
Actually I only extracted few line of my data as an example for question but I have two sets of data with 2370 and 4876 observations which I imported from excel file . I am just guessing the problem must be with format of year. Once SAS log said ERROR: Variable Year has been defined as both character and numeric then I fixed it and I didnt receive any error message again. But I still cant have wanted result.
Thanks again
Thanks. The problem was spreadsheet. I fixed the problem.
I would bet a short stack of nickels that the code you actually ran to generate that output looked more like:
data NEW; SET incidence pop; run;
to generate that output.
Perhaps a previous version of the data made with SET was not replaced when you attempted the merge.
Dear All who kindly replied me,
I fixed my problem with my spreadsheet although I was not sure how I can fix it directly in SAS by using SAS code.
Thanks for valuable comments and I will definitely back to you with another problems as I am new user of SAS.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.