06-19-2013 12:41 PM
I have two datasets:
Set A: Michael, Rahul, Dalton, Barb....................1020 observations
Set B: Disouza, Michael, Glory, Victoria, Daniel, Chip...............200 observations
Set A: Michael, rahul, dalton, barb....................1020 observations
Set B: Disouza, Glory, Victoria, Daniel, Chip...........200 or fewer observations
I would like Set B to contain only names that do not appear in Set A; that is, if a name from Set A is repeated in Set B, I have to remove it from Set B. At the end, Set B should have 200 or fewer observations, never more.
What I did: I sorted the two files by name and then merged them by first name. Then I used the NODUPKEY and DUPOUT= options to separate the repeated observations. But I couldn't get the Set B I wanted: my new Set B contains values from Set A too.
Any kind of help would be greatly appreciated.
06-19-2013 01:07 PM
You need to be more clear in your question.
Is your data in columns or rows?
Post a small example of what you have and what you want and any code you've tried and WHY it didn't work.
How does case affect your data? Is Rahul the same as rahul? SAS comparisons are case sensitive.
From what you have my suggestion would be a proc sql with a where not in.
proc sql;
create table want as
select * from b   /* keep rows of Set B ... */
where name not in (select name from a);   /* ... whose name does not appear in Set A */
quit;
You could also try a datastep merge and use something like:
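data want;
merge have1(in=a) have2(in=b); by name;   /* have1 = Set A, have2 = Set B, both sorted by name */
if b and not a;   /* keep names found only in have2 */
run;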
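Since character comparisons are case sensitive, a variant of the SQL approach that normalizes case and trims stray blanks before comparing may be safer. This is only a sketch: it assumes the two sheets have already been imported as datasets named a and b, each with a character variable called name, and the output name b_unique is made up for illustration.

```sas
/* Assumption: datasets A and B already exist in WORK, each with a   */
/* character variable NAME. B_UNIQUE is a hypothetical output name.  */
proc sql;
  create table b_unique as
  select *
  from b
  /* Compare upper-cased, blank-trimmed names so that "Rahul" and "rahul" match */
  where upcase(strip(name)) not in
        (select upcase(strip(name)) from a);
quit;
```

The rows kept are exactly those of b, with all their original variables, so the result can never have more observations than b started with.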
06-19-2013 01:26 PM
My data is in Excel sheets. Each sheet has 50 variables. The two Excel sheets have only the variable name in common. The rest of the variables are company, work phone, personal phone, marital status, and so on.
And case is not an issue in the data: all names start with a capital letter and continue in lowercase. I am trying your code, and if it doesn't work I will come back in about 10 minutes with a sample dataset.
Thank you for your prompt response.