BookmarkSubscribeRSS Feed
jcis7
Pyrite | Level 9

G'd a y.  I need to create a single list from two separate lists. One list has >10,000 observations; the other has a couple thousand.

Both lists have 7 digit character ID#s (I will list as single digit ID# below to make it easier to understand) . Some IDs match and some don't between the two lists.

I'd like to keep the ID, Name, Phys Street, PhysCity, PhysZip, and Email for the final list.    How would I get a final list where each ID is listed once?

List One

ID     SchoolName   PhysStreet PhysCity PhsZip   Email 

1

2

3

4

5

List Two

ID     SchoolName   PhysStreet PhysCity PhsZip   Email

2

4

5

9

8

Any help you can give is much appreciated!

12 REPLIES 12
stat_sas
Ammonite | Level 13

proc sort data=one;
by id;
run;

proc sort data=two;
by id;
run;

data want;
update one two;
by id;
run;

jcis7
Pyrite | Level 9

Appreciate your response! Correct me if I'm wrong, but I thought the update statetment updates values in dataset one with the values in dataset two that match but doesn't add the observations from dataset two into dataset one that don't match?

Final List

ID     SchoolName   PhysStreet PhysCity PhsZip   Email

1

2

3

4

5

9

8

stat_sas
Ammonite | Level 13

It does both. Observations in list one will be updated across the variables with the values of list two based on common ids and new ids from list two will also be added to list one to make it final list. I am assuming that ids are unique in both list one and list two.

jcis7
Pyrite | Level 9

Appreciate you helping me understand the UPDATE statement better!!

Reeza
Super User

If the IDS match between lists, will the corresponding information, ie school name, street address automatically match as well?

If not, how do you want to deal with that? If using the second file is fine the solution above is appropriate. If you want the first file, flip the data sets on the update statement. If its some other logic you'll need different code.

jcis7
Pyrite | Level 9

Good point.  The addresses most likely will match between the two lists. How can you tell if they don't match (i.e, keep the ones in the final list that have the same ID but different addresses)?

Reeza
Super User

If this is manually entered data you'll also have issue with the address, where one person spells out street and the other uses st.

I would append and take a look and see if the multiples have differences first to determine what needs to be done to clean the data. The check dataset below will be all ID's with multiple records.

data temp;

set one two indsname=source;

DSET=source;

run;

proc sort data=temp; by id schoolname phystreet;

run;

data check;

set temp;

by id;

if first.id ne last.id;

run;

jcis7
Pyrite | Level 9

Great -- appreciate you helping understand how to check the dataset!!

Ksharp
Super User

If I understood what you mean.

data a;
 set sashelp.class;
 id+1;
run;
data b;
 set sashelp.class end=last;
 id+1; output;
 if last then do;id+1;name='Arthur.T'; output;end;
run;

data once;
 set a b;
 by id;
if first.id and last.id ;
run;

Xia Keshan

jcis7
Pyrite | Level 9

Ksharp wrote:

If I understood what you mean.

data a;
 set sashelp.class;
 id+1;
run;
data b;
 set sashelp.class end=last;
 id+1; output;
 if last then do;id+1;name='Arthur.T'; output;end;
run;

data once;
 set a b;
 by id;
if first.id and last.id ;
run;

Xia Keshan

What does the following code refer to or do?  name='Arthur.T';

Reeza
Super User

It adds a record in to data set B, with the name set as Arthur.T, this record would not be data set A.

It's a way of simulating your data.

Ksharp
Super User

Thanks. Reeza .   Yes. I just make a dummy table to simulate your data .

Reeza , You gonna surpass Arthur.T to be number one of sas user list . Congratulations !

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 12 replies
  • 1459 views
  • 0 likes
  • 4 in conversation