I have some firm data with more half a million firms. I found some firms with the same name have different id numbers in different years. I comfirmed by firm address that the firm with the same name but with different id numbers is actually the same firm.
I wonder how I can change the id number to be the same for each unique firm. Thanks! The sample is as follows:
year | id | name |
1996 | 600535930 | ABC international |
1997 | 600535930 | ABC international |
1998 | 600535930 | ABC international |
1999 | 730135548 | ABC international |
2000 | 730135548 | ABC international |
2001 | 730135548 | ABC international |
2002 | 730135548 | ABC international |
2003 | 745995778 | ABC international |
2004 | 745995778 | ABC international |
2005 | 745995778 | ABC international |
2006 | 745995778 | ABC international |
2007 | 745995778 | ABC international |
2008 | 745995778 | ABC international |
2009 | 745995778 | ABC international |
2010 | 745995778 | ABC international |
1996 | 730415910 | TBL data inquiry |
1997 | 730415910 | TBL data inquiry |
1998 | 737993908 | TBL data inquiry |
1999 | 737993908 | TBL data inquiry |
2000 | 730415910 | TBL data inquiry |
2001 | 730415910 | TBL data inquiry |
2002 | 730415910 | TBL data inquiry |
2003 | 730415910 | TBL data inquiry |
2004 | 730415910 | TBL data inquiry |
2005 | 730415910 | TBL data inquiry |
2006 | 730415910 | TBL data inquiry |
2007 | 737993908 | TBL data inquiry |
2008 | 737993908 | TBL data inquiry |
2009 | 737993908 | TBL data inquiry |
2010 | 737993908 | TBL data inquiry |
Hi,
if you are confirmed with the ID number then :
the easiest and straight way is to use if then statement with data step.
if name ='ABC international' then id = 'XXXXXXXXXX' ;
this would make the unique id for ABC international.
if not satisfied with this..... we have lots of options to change to unique.
just let us know.
Regards
Uma Shanker Saini
Thanks Uma!
But actually, some names also have some slight differences, mainly due to misspelling. I have changed the names into the same if they share the same id number. My problem is how to change the remaining observations with some differences in both id and name.
Hi,
I assumed first ID(that is ID of 1996 in this case) will be same to all different Firms...
proc sort data = have;
by name;
run;
data temp(keep = id name);
set have;
by name;
if first.name then output;
run;
data want(drop = id);
retain year id unique_id name;
merge have temp(rename = (id = unique_id));
by name;
run;
-Urvish
Agree with Urwish's code for your concern. Try it
Thanks Urwish.
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.