I have some firm data with more half a million firms. I found some firms with the same name have different id numbers in different years. I comfirmed by firm address that the firm with the same name but with different id numbers is actually the same firm.
I wonder how I can change the id number to be the same for each unique firm. Thanks! The sample is as follows:
year | id | name |
1996 | 600535930 | ABC international |
1997 | 600535930 | ABC international |
1998 | 600535930 | ABC international |
1999 | 730135548 | ABC international |
2000 | 730135548 | ABC international |
2001 | 730135548 | ABC international |
2002 | 730135548 | ABC international |
2003 | 745995778 | ABC international |
2004 | 745995778 | ABC international |
2005 | 745995778 | ABC international |
2006 | 745995778 | ABC international |
2007 | 745995778 | ABC international |
2008 | 745995778 | ABC international |
2009 | 745995778 | ABC international |
2010 | 745995778 | ABC international |
1996 | 730415910 | TBL data inquiry |
1997 | 730415910 | TBL data inquiry |
1998 | 737993908 | TBL data inquiry |
1999 | 737993908 | TBL data inquiry |
2000 | 730415910 | TBL data inquiry |
2001 | 730415910 | TBL data inquiry |
2002 | 730415910 | TBL data inquiry |
2003 | 730415910 | TBL data inquiry |
2004 | 730415910 | TBL data inquiry |
2005 | 730415910 | TBL data inquiry |
2006 | 730415910 | TBL data inquiry |
2007 | 737993908 | TBL data inquiry |
2008 | 737993908 | TBL data inquiry |
2009 | 737993908 | TBL data inquiry |
2010 | 737993908 | TBL data inquiry |
Hi,
if you are confirmed with the ID number then :
the easiest and straight way is to use if then statement with data step.
if name ='ABC international' then id = 'XXXXXXXXXX' ;
this would make the unique id for ABC international.
if not satisfied with this..... we have lots of options to change to unique.
just let us know.
Regards
Uma Shanker Saini
Thanks Uma!
But actually, some names also have some slight differences, mainly due to misspelling. I have changed the names into the same if they share the same id number. My problem is how to change the remaining observations with some differences in both id and name.
Hi,
I assumed first ID(that is ID of 1996 in this case) will be same to all different Firms...
proc sort data = have;
by name;
run;
data temp(keep = id name);
set have;
by name;
if first.name then output;
run;
data want(drop = id);
retain year id unique_id name;
merge have temp(rename = (id = unique_id));
by name;
run;
-Urvish
Agree with Urwish's code for your concern. Try it
Thanks Urwish.
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.