Yo anybody out there . . . I am a new SAS EG user. I have taken Base SAS courses as well as EG, but alas, I am back here at my workstation and have an issue. I have a data set with 2 million records. My pal in another department has given me a separate data set that lists 19 thousand of those very same records that should not be in my data set. So I have to remove the 19K, which all have unique receipt numbers. Should I create a SAS program to do this, or is there something in EG that does it? If I write a SAS program, do I subset the data and then delete? Any ideas . . . ?
It's unclear whether you have 'external' files or SAS files. If you have SAS files and the two have identical columns, you should be able to combine the two files and remove any duplicates. If, however, the two SAS files do not share all the same columns, you will need to identify a set of "key" variables, combine the two files on those keys, and then remove the duplicates.
If you have 'external' files, you will need to import the two files first and then perform the process described above.
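As a rough sketch of the "key" variable approach, assuming the receipt number is the key: the data set names work.all_records and work.to_remove and the column receipt_no below are hypothetical placeholders, so substitute your own. This keeps every row of the big data set whose receipt number does not appear in the removal list.

```sas
proc sql;
  /* work.all_records, work.to_remove and receipt_no are assumed names */
  create table work.cleaned as
  select *
  from work.all_records
  /* keep only rows whose key is absent from the removal list */
  where receipt_no not in (select receipt_no from work.to_remove);
quit;
```

Because the comparison uses only the key, this should work even when the two data sets do not have the same columns.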
Chris' solution is definitely the best point-and-click one I know of. FYI, if you wanted to write some code instead, there is a set operator EXCEPT that's part of SQL and is designed to do exactly what you describe: rows in one data set that are not in the other.
proc sql;
  create table work.modified as
  select id, name from work.large
  except
  select id, name from work.small;
quit;
The new table would contain the columns ID and NAME, and only unique rows from the first data set (work.large) not found in the second data set (work.small).
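If you would rather stay in the DATA step, one possible alternative is a MERGE with IN= flags. This is only a sketch, assuming ID alone identifies a record in both work.large and work.small; the intermediate names work.large_sorted and work.small_ids are made up for illustration. Unlike EXCEPT, it keeps all columns from the large data set and does not collapse duplicate rows.

```sas
/* both inputs must be sorted by the key before a BY-group merge */
proc sort data=work.large out=work.large_sorted;
  by id;
run;

/* keep only the key from the removal list, one row per id */
proc sort data=work.small(keep=id) out=work.small_ids nodupkey;
  by id;
run;

data work.modified;
  merge work.large_sorted(in=in_large) work.small_ids(in=in_small);
  by id;
  /* keep rows that are in the large data set but not in the removal list */
  if in_large and not in_small;
run;
```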