Creating a list from two; some IDs match between the two lists and some don't

jcis7 · Posted 07-17-2014 02:54 PM

G'day. I need to create a single list from two separate lists. One list has >10,000 observations; the other has a couple thousand.

Both lists have 7-digit character IDs (I list them as single-digit IDs below to make the example easier to follow). Some IDs match between the two lists and some don't.

I'd like to keep the ID, Name, PhysStreet, PhysCity, PhysZip, and Email in the final list. How would I get a final list where each ID is listed once?

List One

ID SchoolName PhysStreet PhysCity PhsZip Email

1

2

3

4

5

List Two

ID SchoolName PhysStreet PhysCity PhsZip Email

2

4

5

9

8

Any help you can give is much appreciated!

stat_sas · Posted 07-17-2014 03:18 PM

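/* UPDATE requires both data sets to be sorted by the BY variable */
proc sort data=one;
by id;
run;

proc sort data=two;
by id;
run;

/* one is the master and two is the transaction data set: matching IDs
   take the non-missing values from two; IDs found only in two are added */
data want;
update one two;
by id;
run;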

jcis7 · Posted 07-17-2014 03:29 PM

Appreciate your response! Correct me if I'm wrong, but I thought the UPDATE statement updates values in dataset one with the matching values in dataset two, but doesn't add the observations from dataset two that have no match in dataset one?

Final List

ID SchoolName PhysStreet PhysCity PhsZip Email

1

2

3

4

5

9

8

stat_sas · Posted 07-17-2014 03:38 PM

It does both. Observations in list one are updated with the values from list two wherever the IDs match, and IDs that appear only in list two are added as new observations, giving the final list. I am assuming that IDs are unique within both list one and list two.
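A quick way to check this behaviour is a small test with made-up rows. The sketch below assumes invented IDs, school names, and e-mail addresses; only the data set names one, two, and want come from the code earlier in the thread. It also illustrates that, by default, a missing value in the second (transaction) data set does not overwrite a non-missing value in the first.

```sas
/* Hypothetical sample data: IDs 2 and 4 are in both lists, ID 9 only in two */
data one;
  infile datalines dsd truncover;
  length id $7 schoolname $20 email $30;
  input id schoolname email;
  datalines;
1,Alpha School,alpha@example.org
2,Beta School,beta@example.org
4,Delta School,delta@example.org
;
run;

data two;
  infile datalines dsd truncover;
  length id $7 schoolname $20 email $30;
  input id schoolname email;
  datalines;
2,Beta Academy,
4,Delta School,new-delta@example.org
9,Iota School,iota@example.org
;
run;

/* Both samples are already in ID order, so no PROC SORT is needed here */
data want;
  update one two;
  by id;
run;

proc print data=want noobs;
run;
```

With these sample rows, want should hold one row each for IDs 1, 2, 4, and 9. ID 2 picks up the new school name but keeps its e-mail from one, because the e-mail is blank in two. If blanks in the second list should overwrite, the UPDATEMODE=NOMISSINGCHECK option on the UPDATE statement changes that default.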

jcis7 · Posted 07-17-2014 04:54 PM

Appreciate you helping me understand the UPDATE statement better!!

Reeza · Posted 07-17-2014 04:11 PM

If the IDs match between lists, will the corresponding information, i.e. school name and street address, automatically match as well?

If not, how do you want to deal with that? If taking the values from the second file is fine, the solution above is appropriate. If you want the values from the first file, flip the data sets on the UPDATE statement. If it's some other logic, you'll need different code.
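To make the "flip the data sets" suggestion concrete, here is a minimal sketch. The sample rows and the output names want_two_wins and want_one_wins are hypothetical; ID 2 is given a different street in each list so the effect is visible.

```sas
/* Hypothetical samples: ID 2 has a different street in each list */
data one;
  length id $7 physstreet $30;
  id='1'; physstreet='10 Main Street'; output;
  id='2'; physstreet='20 Oak Street';  output;
run;

data two;
  length id $7 physstreet $30;
  id='2'; physstreet='25 Oak Street';  output;
  id='9'; physstreet='90 Elm Street';  output;
run;

/* Values from list two win when an ID is in both lists */
data want_two_wins;
  update one two;
  by id;
run;

/* Data sets flipped: values from list one win instead */
data want_one_wins;
  update two one;
  by id;
run;
```

Both outputs contain IDs 1, 2, and 9. They differ only in which street ID 2 ends up with: 25 Oak Street in want_two_wins and 20 Oak Street in want_one_wins.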

jcis7 · Posted 07-17-2014 04:50 PM

Good point. The addresses most likely will match between the two lists. How can you tell if they don't match (i.e., keep the ones in the final list that have the same ID but different addresses)?

Reeza · Posted 07-17-2014 04:56 PM

If this is manually entered data you'll also have issues with the address, where one person spells out "street" and the other uses "st".

I would append the two lists and first look at whether the multiples have differences, to determine what needs to be done to clean the data. The check dataset below will contain all IDs with multiple records.

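/* Stack both lists and record which data set each row came from */
data temp;
  set one two indsname=source;
  DSET=source;
run;

proc sort data=temp;
  by id schoolname physstreet;
run;

/* Keep every row whose ID occurs more than once */
data check;
  set temp;
  by id;
  if not (first.id and last.id);
run;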
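Another way to see which matching IDs disagree is to put the two addresses side by side and compare them. This is only a sketch: the sample rows, the output name addr_diff, and the renamed variables (street1, street2, city1, city2) are made up for illustration, and it assumes IDs are unique within each list. The comparison ignores case and repeated blanks, but it will still flag "Street" versus "St", as mentioned above.

```sas
/* Hypothetical samples, already in ID order */
data one;
  length id $7 physstreet $30 physcity $20;
  id='2'; physstreet='20 Oak Street';  physcity='Springfield'; output;
  id='4'; physstreet='40 Pine Street'; physcity='Shelbyville'; output;
  id='5'; physstreet='50 Elm Street';  physcity='Springfield'; output;
run;

data two;
  length id $7 physstreet $30 physcity $20;
  id='2'; physstreet='20 OAK  STREET'; physcity='Springfield'; output;
  id='4'; physstreet='40 Pine St';     physcity='Shelbyville'; output;
  id='9'; physstreet='90 Ash Street';  physcity='Springfield'; output;
run;

data addr_diff;
  merge one (in=in1 keep=id physstreet physcity
             rename=(physstreet=street1 physcity=city1))
        two (in=in2 keep=id physstreet physcity
             rename=(physstreet=street2 physcity=city2));
  by id;
  /* Only IDs present in both lists can disagree */
  if in1 and in2;
  /* Keep the row when street or city differs, ignoring case and repeated blanks */
  if upcase(compbl(street1)) ne upcase(compbl(street2))
     or upcase(compbl(city1)) ne upcase(compbl(city2));
run;
```

With these rows, only ID 4 lands in addr_diff: ID 2 differs only in case and spacing, while IDs 5 and 9 each appear in just one list.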

jcis7 · Posted 07-17-2014 07:15 PM

Great -- appreciate you helping me understand how to check the dataset!!

Ksharp · Posted 07-18-2014 12:30 PM

If I understand what you mean:

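/* a: sashelp.class with a sequential ID added */
data a;
 set sashelp.class;
 id+1;
run;

/* b: the same rows plus one extra record that has no match in a */
data b;
 set sashelp.class end=last;
 id+1; output;
 if last then do; id+1; name='Arthur.T'; output; end;
run;

/* Interleave by ID and keep the IDs that occur exactly once across a and b */
data once;
 set a b;
 by id;
 if first.id and last.id;
run;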

Xia Keshan
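A variation on the code above, offered as a sketch rather than as what Ksharp intended: if the goal is a final list in which every ID appears once, whether it is in one list or both, the subsetting condition can be changed to first.id. The output name each_once is hypothetical; a and b are rebuilt here so the example runs on its own.

```sas
/* Same simulated data as above */
data a;
  set sashelp.class;
  id + 1;
run;

data b;
  set sashelp.class end=last;
  id + 1;
  output;
  if last then do;
    id + 1;
    name = 'Arthur.T';
    output;
  end;
run;

/* Interleave by ID and keep the first row for every ID.
   For IDs found in both, the row from a is kept because a is listed first. */
data each_once;
  set a b;
  by id;
  if first.id;
run;
```

With this simulated data, each_once should contain every ID from 1 through 20, whereas once contains only the extra Arthur.T row, since that is the only ID not shared by a and b.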

jcis7 · Posted 07-22-2014 05:53 PM

What does the following code do? name='Arthur.T';

Reeza · Posted 07-22-2014 08:24 PM

It adds a record to data set B with the name set to Arthur.T; this record would not be in data set A.

It's a way of simulating your data.

Ksharp · Posted 07-23-2014 08:16 AM

Thanks, Reeza. Yes, I just made a dummy table to simulate your data.

Reeza, you're going to surpass Arthur.T to be number one on the SAS user list. Congratulations!

Xia Keshan
