hkim3677
Calcite | Level 5

Hi,

 

I need to identify observations where a NAME value maps to more than one NAME2 value (NAME -> NAME2).

Here is a sample of my data:

 

Data test;
/* :$16. reads up to 16 characters; a bare $ would truncate the names to 8 */
input ID NAME :$16. NAME2 :$16. Year;
datalines;
1 Highland HighlandBell 2008
1 Highland HighlandBell 2009
1 Highland HighlandBell 2010
1 Highland HighlandCorp 2008
1 Highland HighlandCorp 2009
1 Highland HighlandCorp 2010
1 Highland HighlandMalt 2008
1 Highland HighlandMalt 2009
1 Highland HighlandMalt 2010
2 HillBrosINC HillBrosINC 2011
2 HillBrosINC HillBrosINC 2012
3 HitachiLTD HitachLTD 2008
;
run;

 

So I want to add one column to the dataset, like this:

 

Data test;
/* :$16. reads up to 16 characters; a bare $ would truncate the names to 8 */
input ID NAME :$16. NAME2 :$16. Year Want;
datalines;
1 Highland HighlandBell 2008 1
1 Highland HighlandBell 2009 1
1 Highland HighlandBell 2010 1
1 Highland HighlandCorp 2008 1
1 Highland HighlandCorp 2009 1
1 Highland HighlandCorp 2010 1
1 Highland HighlandMalt 2008 1
1 Highland HighlandMalt 2009 1
1 Highland HighlandMalt 2010 1
2 HillBrosINC HillBrosINC 2011 0
2 HillBrosINC HillBrosINC 2012 0
3 HitachiLTD HitachLTD 2008 0
;
run;

 

Thank you!

Accepted Solution
PGStats
Opal | Level 21

Then you just need a simple query:

 

Data test;
input ID NAME :$16. NAME2 :$16. Year;
datalines;
1 Highland HighlandBell 2008
1 Highland HighlandBell 2009
1 Highland HighlandBell 2010
1 Highland HighlandCorp 2008
1 Highland HighlandCorp 2009
1 Highland HighlandCorp 2010
1 Highland HighlandMalt 2008
1 Highland HighlandMalt 2009
1 Highland HighlandMalt 2010
2 HillBrosINC HillBrosINC 2011
2 HillBrosINC HillBrosINC 2012
3 HitachiLTD HitachLTD 2008
;
proc sql;
create table test2 as
select *, count(distinct name2) > 1 as want
from test
group by name;
quit;
PG
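
The query above mixes detail columns (`*`) with a summary function under `group by`, so PROC SQL remerges the per-name count back onto every row and writes a note about remerging to the log. For readers who prefer that step spelled out, here is a minimal sketch of the same logic as an explicit join. It assumes the `test` dataset created above; the table name `test2_join`, the alias `c`, and the column `n_name2` are illustrative names that do not appear in the original post.

```sas
proc sql;
  create table test2_join as
  select t.*,
         (c.n_name2 > 1) as want          /* 1 if the name has several NAME2 values, else 0 */
  from test as t
       inner join
       (select name,
               count(distinct name2) as n_name2   /* distinct NAME2 values per NAME */
        from test
        group by name) as c
       on t.name = c.name
  order by t.id, t.name, t.name2, t.year;
quit;
```

The inline view computes one row per name, and the join copies that count to each detail row, which is what the remerge does implicitly.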


8 REPLIES 8
Reeza
Super User

Are there rules for that column? How are you calculating it?

hkim3677
Calcite | Level 5

The Want column indicates whether a "Name" value has more than one matching value in "Name2".

PGStats
Opal | Level 21

What is the role of year?

PG
hkim3677
Calcite | Level 5
Actually, there is no rule for Year. The years are only there to show that the matches repeat across years.
novinosrin
Tourmaline | Level 20
Data test;
input ID NAME :$30. NAME2 :$30. Year;
datalines;
1 Highland HighlandBell 2008
1 Highland HighlandBell 2009
1 Highland HighlandBell 2010
1 Highland HighlandCorp 2008
1 Highland HighlandCorp 2009
1 Highland HighlandCorp 2010
1 Highland HighlandMalt 2008
1 Highland HighlandMalt 2009
1 Highland HighlandMalt 2010
2 HillBrosINC HillBrosINC 2011
2 HillBrosINC HillBrosINC 2012
3 HitachiLTD HitachLTD 2008
;
run;

/* Double DoW loop; expects TEST in id, name, name2 order */
data want;
/* Pass 1: count the distinct NAME2 values in the current ID/NAME group */
f=0;
do until(last.name);
set test;
by id name name2;
/* first.name2 marks each new NAME2 value without looking back across groups as lag() would */
if first.name2 then f+1;
end;
/* Pass 2: re-read the same group and write the flag on every row */
do until(last.name);
set test;
by id name name2;
Want=f>1;
output;
end;
drop f;
run;
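
The `by id name name2;` statement in the DATA step above expects `test` to arrive in that order. The sample rows already do. For real data a sort beforehand is a reasonable precaution, sketched here under the assumption that overwriting `test` in place is acceptable:

```sas
/* Put TEST in the order the BY statement expects */
proc sort data=test;
  by id name name2;
run;
```

If the original row order matters, writing the sorted copy to a separate dataset with the `out=` option and reading that in the DATA step would avoid changing `test`.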
ballardw
Super User

Is the variable assigned a value of 1 because name and name2 are different or because you have more than one value of name2 for each value of name? Those are two different rules that would result in the shown desired result.

 

And for completeness' sake, what would the result be for this row if it were part of your example data?

1 Highland Highland 2008
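
To make the difference between the two rules concrete, here is a small sketch that computes both flags side by side. The dataset `test_extra`, the table `rules`, and the columns `differs` and `multi` are hypothetical names chosen for illustration. The rows are a reduced version of the example data plus the extra row above.

```sas
data test_extra;
  input ID NAME :$16. NAME2 :$16. Year;
  datalines;
1 Highland Highland 2008
1 Highland HighlandBell 2008
2 HillBrosINC HillBrosINC 2011
3 HitachiLTD HitachLTD 2008
;
run;

proc sql;
  create table rules as
  select *,
         (name ne name2) as differs,              /* rule A: NAME and NAME2 differ on this row */
         (count(distinct name2) > 1) as multi     /* rule B: NAME has more than one NAME2 value */
  from test_extra
  group by name;
quit;
```

On the extra row, rule A gives 0 because the two names are equal, while rule B gives 1 because Highland also maps to HighlandBell. The HitachiLTD row goes the other way: the spellings differ, so rule A gives 1, but there is only one NAME2 value, so rule B gives 0.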

Ksharp
Super User
Data test;
input ID NAME :$16. NAME2 :$16. Year;
datalines;
1 Highland HighlandBell 2008
1 Highland HighlandBell 2009
1 Highland HighlandBell 2010
1 Highland HighlandCorp 2008
1 Highland HighlandCorp 2009
1 Highland HighlandCorp 2010
1 Highland HighlandMalt 2008
1 Highland HighlandMalt 2009
1 Highland HighlandMalt 2010
2 HillBrosINC HillBrosINC 2011
2 HillBrosINC HillBrosINC 2012
3 HitachiLTD HitachLTD 2008
;
proc sql;
create table test2 as
select *,case when count(distinct catx(' ',name,name2)) ne 1 then 1
         else 0 end as want
from test
group by name;
quit;
