SAS Data Integration Studio, DataFlux Data Management Studio, SAS/ACCESS, SAS Data Loader for Hadoop and others

Combining datasets

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Combining datasets

Hello everyone i'm working on a project to combine two datasets which I successfully imported into sas separately. 

The first dataset is census data for an entire state by county and so each county is listed with its corresponding population estimate. 

The second dataset is data that lists the number of teachers per county with many other variables. I am trying to merge these two datasets but using the merge statement is not matching them correctly. Both datasets are sorted by county. I also tried proc SQL and nothing. Any ideas? below is my syntax. 

Im running version 9.4

 

Failed merge:

data CM;
merge One Two;
by cou;
run;   

 

Failed proc sql: 

Proc sql;

create table cmt as 

select * from One a

join Two b 

on a.cou=b.cou; 

quit; 

 

Thank you! 

 


Accepted Solutions
Solution
2 weeks ago
New Contributor
Posts: 3

Re: Combining datasets

Posted in reply to Missmichelle

Hello ,

a)Did you write appropriate type of join in the proc sql..

that might help the case...

Like ....  'inner' or 'left' or 'right' .....

 

For eg. below..

b) Are the datatype and length of county in both datsaets same ? 

 

 

Proc sql;

create table cmt as 

select * from One a

inner join Two

on a.cou=b.cou;quit;

View solution in original post


All Replies
Super User
Posts: 23,224

Re: Combining datasets

Posted in reply to Missmichelle

The second dataset is data that lists the number of teachers per county with many other variables. I am trying to merge these two datasets but using the merge statement is not matching them correctly.

 

What does not matching them correctly mean? Can you provide examples of records where you would like to see a match and it doesn't match? Matches are case sensitive so 'sas' is not the same as 'SAS'. 

Super User
Posts: 9,867

Re: Combining datasets

Posted in reply to Missmichelle

Your codes seem to be OK. Please post some example data against which we can test code (see my footnotes for how to do it).

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Super User
Posts: 13,283

Re: Combining datasets

Posted in reply to Missmichelle

"Failed" is awful vague.

Are there errors in the log?: Post the code and log in a code box opened with the {i} to maintain formatting of error messages.

No output? Post any log in a code box.

Unexpected output? Provide input data in the form of data step code pasted into a code box, the actual results and the expected results. Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

PROC Star
Posts: 1,286

Re: Combining datasets

Posted in reply to Missmichelle

1. I suggest you stick with PROC SQL for this kind of merge. I think you'll find it conceptually simpler.

 

2. From your description, it sounds like you should have one record per county id in each file. If you have more in one file, or much worse both files, you're going to get confusing results. You might want to check that.

 

Tom

Solution
2 weeks ago
New Contributor
Posts: 3

Re: Combining datasets

Posted in reply to Missmichelle

Hello ,

a)Did you write appropriate type of join in the proc sql..

that might help the case...

Like ....  'inner' or 'left' or 'right' .....

 

For eg. below..

b) Are the datatype and length of county in both datsaets same ? 

 

 

Proc sql;

create table cmt as 

select * from One a

inner join Two

on a.cou=b.cou;quit;

New Contributor
Posts: 2

Re: Combining datasets

Posted in reply to NeerajS1104

Thank you so so much!

New Contributor
Posts: 3

Re: Combining datasets

Posted in reply to Missmichelle

Couple of thoughts.

First make sure each county is unique.  Many Counties have the same name in different states.  For example, there is an Orange County, TX and Orange County, CA.

 

You may need to get the variable State into your Teacher dataset and use both State and County to preform the merge.

 

proc sql;

create table cmt as 

select * from One a

FULL join Two b 

on a.cou=b.cou and a.STATE = b.STATE; 

quit; 

 

You will get a message about variables that are in both datasets.  You might want to consider doing a COALESCE() on those variables that are in both for example cou and state.

 

Carry

 

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 7 replies
  • 376 views
  • 0 likes
  • 7 in conversation