DATA Step, Macro, Functions and more

Creating a list from two; some IDs match between the two lists and some don't

Reply
Regular Contributor
Posts: 192

Creating a list from two; some IDs match between the two lists and some don't

G'd a y.  I need to create a single list from two separate lists. One list has >10,000 observations; the other has a couple thousand.

Both lists have 7 digit character ID#s (I will list as single digit ID# below to make it easier to understand) . Some IDs match and some don't between the two lists.

I'd like to keep the ID, Name, Phys Street, PhysCity, PhysZip, and Email for the final list.    How would I get a final list where each ID is listed once?

List One

ID     SchoolName   PhysStreet PhysCity PhsZip   Email 

1

2

3

4

5

List Two

ID     SchoolName   PhysStreet PhysCity PhsZip   Email

2

4

5

9

8

Any help you can give is much appreciated!

Trusted Advisor
Posts: 1,204

Re: Creating a list from two; some IDs match between the two lists and some don't

proc sort data=one;
by id;
run;

proc sort data=two;
by id;
run;

data want;
update one two;
by id;
run;

Regular Contributor
Posts: 192

Re: Creating a list from two; some IDs match between the two lists and some don't

Appreciate your response! Correct me if I'm wrong, but I thought the update statetment updates values in dataset one with the values in dataset two that match but doesn't add the observations from dataset two into dataset one that don't match?

Final List

ID     SchoolName   PhysStreet PhysCity PhsZip   Email

1

2

3

4

5

9

8

Trusted Advisor
Posts: 1,204

Re: Creating a list from two; some IDs match between the two lists and some don't

It does both. Observations in list one will be updated across the variables with the values of list two based on common ids and new ids from list two will also be added to list one to make it final list. I am assuming that ids are unique in both list one and list two.

Regular Contributor
Posts: 192

Re: Creating a list from two; some IDs match between the two lists and some don't

Appreciate you helping me understand the UPDATE statement better!!

Super User
Posts: 17,748

Re: Creating a list from two; some IDs match between the two lists and some don't

If the IDS match between lists, will the corresponding information, ie school name, street address automatically match as well?

If not, how do you want to deal with that? If using the second file is fine the solution above is appropriate. If you want the first file, flip the data sets on the update statement. If its some other logic you'll need different code.

Regular Contributor
Posts: 192

Re: Creating a list from two; some IDs match between the two lists and some don't

Good point.  The addresses most likely will match between the two lists. How can you tell if they don't match (i.e, keep the ones in the final list that have the same ID but different addresses)?

Super User
Posts: 17,748

Re: Creating a list from two; some IDs match between the two lists and some don't

If this is manually entered data you'll also have issue with the address, where one person spells out street and the other uses st.

I would append and take a look and see if the multiples have differences first to determine what needs to be done to clean the data. The check dataset below will be all ID's with multiple records.

data temp;

set one two indsname=source;

DSET=source;

run;

proc sort data=temp; by id schoolname phystreet;

run;

data check;

set temp;

by id;

if first.id ne last.id;

run;

Regular Contributor
Posts: 192

Re: Creating a list from two; some IDs match between the two lists and some don't

Great -- appreciate you helping understand how to check the dataset!!

Super User
Posts: 9,662

Re: Creating a list from two; some IDs match between the two lists and some don't

If I understood what you mean.

data a;
 set sashelp.class;
 id+1;
run;
data b;
 set sashelp.class end=last;
 id+1; output;
 if last then do;id+1;name='Arthur.T'; output;end;
run;

data once;
 set a b;
 by id;
if first.id and last.id ;
run;

Xia Keshan

Regular Contributor
Posts: 192

Re: Creating a list from two; some IDs match between the two lists and some don't

Ksharp wrote:

If I understood what you mean.

data a;
 set sashelp.class;
 id+1;
run;
data b;
 set sashelp.class end=last;
 id+1; output;
 if last then do;id+1;name='Arthur.T'; output;end;
run;

data once;
 set a b;
 by id;
if first.id and last.id ;
run;

Xia Keshan

What does the following code refer to or do?  name='Arthur.T';

Super User
Posts: 17,748

Re: Creating a list from two; some IDs match between the two lists and some don't

It adds a record in to data set B, with the name set as Arthur.T, this record would not be data set A.

It's a way of simulating your data.

Super User
Posts: 9,662

Re: Creating a list from two; some IDs match between the two lists and some don't

Thanks. Reeza .   Yes. I just make a dummy table to simulate your data .

Reeza , You gonna surpass Arthur.T to be number one of sas user list . Congratulations !

Xia Keshan

Ask a Question
Discussion stats
  • 12 replies
  • 378 views
  • 0 likes
  • 4 in conversation