BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Rohit_1990
Calcite | Level 5

HI Everyone,

 

I have a dataset  AA  as follows:

 

Column1 column2

1 A

2 A

3 A

4 B

5 B

6 C

1 C

8 D

1 D

 

NOW I have Another dataset  BB as follows:

 

Column1

1

2

3

4

5

6

8

 

So i need to represent BB as follows:

 

Column1 column2

1 D

2 D

3 D

4 B

5 B

6 D

8 D

 

So basically I am looking across for a Common value in  different group in table AA.

So in table AA value 1 is common among group A, C, D

 So I assign MAX of group(A,C,D) i.e. D in this case to 1.

 

But if one value in group is updated to new value hence remaining values should also be updated

to the same value in the group.

 

hence value 2,3,6 are also assigned to D.

 

Thanks in advance for your help

 

Regards,

Rohit

 

 

 

 

 

 

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

Any good solution to the first data set should work on this new data as well. My SAS/OR based code above does, of course Smiley Very Happy .

PG

View solution in original post

8 REPLIES 8
novinosrin
Tourmaline | Level 20

Hi @Rohit_1990  Nice and interesting question. OK, can you please clarify whether the sample is good representative of your real i.e the data what's been posted is what it is ?

The reason I am asking is my immediate thought went to sort by col1 col2 and the look up. This approach is only possible if you confirm the above otherwise explain what your real may have

Rohit_1990
Calcite | Level 5
Hi @novinosrin thanks for your response.

The sample is ok .

Basically I am trying to merge different groups is one based on common
column1 value and assign all values of the common group the max value of
column 2

I hope it helps you with the solution.
novinosrin
Tourmaline | Level 20

Hi @Rohit_1990   I trust your clarifications hold good. See if the following helps. Please fee free to come back to us should you need more help/clarifications

 


data have;
input Col1 col2 $;
cards;
1 A
2 A
3 A
4 B
5 B
6 C
1 C
8 D
1 D
;

data have1;
input Col1;
cards;
1
2
3
4
5
6
8
;

proc sql;
create table temp as
select *, max(col2) as max_col2
from have 
where col1 in (select col1 from have1)
group by col1
order by col1,col2;
quit;


data want;
if _n_=1 then do;
   dcl hash H () ;
   h.definekey  ("col2") ;
   h.definedata ("max_col2") ;
   h.definedone () ;
   dcl hiter hh('h');
 end;
do until(last.col1);
set temp;
by col1;
if _n_=1 then rc=h.add();
else do;
if h.find()= 0 then f=max_col2;else rc=h.add();
end;
end;
if _n_>1 and not missing(f) then max_col2=f;
drop rc col2 f;
run;
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @Rohit_1990 

 

This was an interesting problem. I think it can be solved in many ways, and I tried an SQL approach knowing that it is not considered as elegant as a smart data step, but it seems to work and is easy to explain.

 

First find highest group per value and use a cartesian join to get all value combinations pr. group. Then join highest group in for all value combinations, and last get the highest group per value.

 

data aa;
	input c1 c2$;
	datalines;
1 A
2 A
3 A
4 B
5 B
6 C
1 C
8 D
1 D
;
run;
data bb;
	input c1 ;
	datalines;
1
2
3
4
5
6
8
;
run;

* highest group per value;
proc sql;
	create table max_group_value as 
		select distinct 
			c1, 
			max(c2) as c2
		from aa
		group by c1;
quit;

* all combinations of values pr. group;
proc sql;
	create table group_all_values as
		select 
			a.c1, 
			b.c1 as c1x, 
			a.c2
		from aa as a, aa as b
		where a.c2 = b.c2;
quit;

* highest group per combination;
proc sql;
	create table high_group_value as
		select 
			a.c1, 
			a.c1x, 
			a.c2, 
			b.c2 as c2x
		from group_all_values as a, max_group_value as b
		where group_all_values.c1x = max_group_value.c1;
quit;

* highest group per value;
proc sql;
	create table want as
		select distinct 
			c1, 
			max(c2x) as c2
		from high_group_value
		group by c1;
quit;

PGStats
Opal | Level 21

If you have access to SAS/OR then you can frame this problem as finding the connected components in a graph:

 

data aa;
	input col1$ col2$;
	datalines;
1 A
2 A
3 A
4 B
5 B
6 C
1 C
8 D
1 D
;

proc optnet links=aa direction=undirected out_nodes=cc;
data_links_var from=col1 to=col2;
concomp;
run;

proc sort data=cc; by concomp descending node; run;

data dd;
do until(last.concomp);
    set cc; by concomp;
    if first.concomp then col2 = node;
    if anydigit(first(node)) then do;
        col1 = node;
        output;
        end;
    end;
keep col1 col2;
run;
PG
Rohit_1990
Calcite | Level 5

Hi All,

 

Thanks a lot !!!!

 

Well the solution provided by each of you is solving my raised concern.

but  I seek one more help from all of you.

 

in the given dataset below:

 

Col1 Col2

1 A

2 A

3 B

2 B

4 C

1 C

6 D

4 D

7 D

 

So here group A is related to group B on common value 2

Group C  is related to Group A with common element being 1

likewise Group D with Group C on common value 4.

 

So as we can see all groups are interrelated with each other based on some common element.

 

 

So I need to group all groups in one single group as follows:

 

Col1 Col2

1 D

2 D

3 D

4 D

6 D

7 D

 

 

So now my final dataset has all related groups elements in one single group.

 

Please help me on this.

 

Regards and Thanks again

Rohit

 

 

 

 

 

 

 

 

 

 

 

 

PGStats
Opal | Level 21

Any good solution to the first data set should work on this new data as well. My SAS/OR based code above does, of course Smiley Very Happy .

PG
Rohit_1990
Calcite | Level 5
Hi ,
Somehow the third part of code forming dataset DD is not getting correctly populated.

Column c2 is not getting populated.

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 1005 views
  • 0 likes
  • 4 in conversation