BookmarkSubscribeRSS Feed
buddha_d
Pyrite | Level 9

Dear community,

             I am trying to delete a companyid (compid) if one of the employee last and first name matches with another employee last and first name in two different dataset. Of course, these are sample datasets. Even if an employee first and last name matches in both datasets then I want to delete that company id (compid field), there are hundreds of employees in real data. I have sample code which I tried and I am getting error. Please give me any suggestions. 

 

/* sample dataset1 birthday*/
data birthday;
   input compid birthdat date7.
         lastname $18.
         firstnam $15.
         phone $4.;
		 format birthdat mmddyy10.;
         datalines;
459287 05JAN39 RODRIGUES        JUAN           5879
127845 25DEC48 MEDER            VLADIMIR       6231
254896 06APR54 TAYLOR-HUNYADI   ITO            0231
;
run;

/*Sample dataset2 tempbirthday*/
proc sql;
create table WORK.TEMPBIRTHDAY( bufsize=65536 )
  (
   compid num,
   birthdat num format=MMDDYY10.,
   lastname char(18),
   firstnam char(15),
   phone char(4)
  );
insert into WORK.TEMPBIRTHDAY
      values(459287,'04MAY98'd, 'NISHIMATSU-LYNCH','RICHARD', '6778')
      values(765112,'04MAY98'd, 'SMITH','ROBERT', '')
      values(219776,'15APR98'd, 'PASTORELLI','ZORA','')
      values(245233,'10APR98'd, 'ALI','SADIQ','')
      values(245234,'10APR98'd, 'MEHAILESCU','NADIA','')
      values(326721,'01MAY98'd, 'CALHOUN','WILLIS','')
	  values(459287,'05JAN39'd, 'RODRIGUES', 'JUAN', '5879')
	  values(127845,'25DEC48'd, 'MEDER', 'VLADIMIR', '6231')
	  values(254896,'06APR54'd, 'TAYLOR-HUNYADI', 'ITO', '0231')
;

quit;

/*sort both datasets by compid */
proc sort data=birthday ;
	by compid;
proc sort data=tempbirthday (rename=( birthdat=birthdat1 lastname=lastname1 firstnam=firstnam1 phone=phone1));
	by compid;
run;

/* delete the employees who is from same compid (companyid) */
data total1;
	merge birthday (in=ina) tempbirthday(in=inb);
	by compid;
	if ina and inb then do;
	delete compid;
	end;
run;

Thanks in advance 

5 REPLIES 5
Patrick
Opal | Level 21

@buddha_d 

The DELETE statement deletes a row of data not just a single variable. If this is what you want then just remove the variable name from the delete statement and your code will work.

You can't delete a variable (the column) from a specific row (observation). You can set the variable in a specific row ("the cell") to missing or change it but not delete a "cell". SAS tables are rectangular.

buddha_d
Pyrite | Level 9

Yes I agree Patrick. In the sample data case, I want to remove company id (compid) '459287' as it meets the criteria of present in both datasets. Rest of the data after merging, could be the same. The output dataset should look like this

buddha_d_0-1629000386173.png

Thanks,

 

Patrick
Opal | Level 21

Given the required logic you describe not sure why compid 127845 would remain in your desired result.

Below the code for the logic as I understand it.

data total1;
  merge 
    birthday (in=ina)
    tempbirthday(in=inb);
  by compid;
  if ina and inb and lastname=lastname1 and firstnam=firstnam1 then delete;
run;
buddha_d
Pyrite | Level 9

Patrick, you are correct. Compid 127845 shouldn't be there in the dataset. In fact, all the compids 459287, 127845, 254896 shouldn't be seen in the output dataset as there is at least one matching observation (compid on both datasets). The highlighted record should be deleted as well (though the first and last names are different, I don't want to see if the two different employees work for the same company)

buddha_d_0-1629003334734.png

which will be accomplished by using this code

 


/* delete the employees who is from same compid */
data total1;
merge
birthday (in=ina)
tempbirthday(in=inb);
by compid;
if not ina and inb ;
run;

Tom
Super User Tom
Super User

(NOT INA AND INB) is different than NOT (INA and INB).

This first just finds the observations that exist in B that are not in A.

The second also includes the cases that exist in A that are not in B.

 

hackathon24-white-horiz.png

2025 SAS Hackathon: There is still time!

Good news: We've extended SAS Hackathon registration until Sept. 12, so you still have time to be part of our biggest event yet – our five-year anniversary!

Register Now

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 5 replies
  • 1631 views
  • 0 likes
  • 3 in conversation