Solved: Re: Update value in table

Rohit_1990 · Posted 02-20-2019 10:14 AM

HI Everyone,

I have a dataset AA as follows:

Column1 column2

1 A

2 A

3 A

4 B

5 B

6 C

1 C

8 D

1 D

NOW I have Another dataset BB as follows:

Column1

1

2

3

4

5

6

8

So i need to represent BB as follows:

Column1 column2

1 D

2 D

3 D

4 B

5 B

6 D

8 D

So basically I am looking across for a Common value in different group in table AA.

So in table AA value 1 is common among group A, C, D

So I assign MAX of group(A,C,D) i.e. D in this case to 1.

But if one value in group is updated to new value hence remaining values should also be updated

to the same value in the group.

hence value 2,3,6 are also assigned to D.

Thanks in advance for your help

Regards,

Rohit

PGStats · Posted 02-20-2019 11:18 PM

Any good solution to the first data set should work on this new data as well. My SAS/OR based code above does, of course .

PG

View solution in original post

novinosrin · Posted 02-20-2019 10:27 AM

Hi @Rohit_1990 Nice and interesting question. OK, can you please clarify whether the sample is good representative of your real i.e the data what's been posted is what it is ?

The reason I am asking is my immediate thought went to sort by col1 col2 and the look up. This approach is only possible if you confirm the above otherwise explain what your real may have

Rohit_1990 · Posted 02-20-2019 10:56 AM

Hi @novinosrin thanks for your response.

The sample is ok .

Basically I am trying to merge different groups is one based on common
column1 value and assign all values of the common group the max value of
column 2

I hope it helps you with the solution.

novinosrin · Posted 02-20-2019 12:58 PM

Hi @Rohit_1990 I trust your clarifications hold good. See if the following helps. Please fee free to come back to us should you need more help/clarifications


data have;
input Col1 col2 $;
cards;
1 A
2 A
3 A
4 B
5 B
6 C
1 C
8 D
1 D
;

data have1;
input Col1;
cards;
1
2
3
4
5
6
8
;

proc sql;
create table temp as
select *, max(col2) as max_col2
from have 
where col1 in (select col1 from have1)
group by col1
order by col1,col2;
quit;


data want;
if _n_=1 then do;
   dcl hash H () ;
   h.definekey  ("col2") ;
   h.definedata ("max_col2") ;
   h.definedone () ;
   dcl hiter hh('h');
 end;
do until(last.col1);
set temp;
by col1;
if _n_=1 then rc=h.add();
else do;
if h.find()= 0 then f=max_col2;else rc=h.add();
end;
end;
if _n_>1 and not missing(f) then max_col2=f;
drop rc col2 f;
run;

ErikLund_Jensen · Posted 02-20-2019 12:57 PM

Hi @Rohit_1990

This was an interesting problem. I think it can be solved in many ways, and I tried an SQL approach knowing that it is not considered as elegant as a smart data step, but it seems to work and is easy to explain.

First find highest group per value and use a cartesian join to get all value combinations pr. group. Then join highest group in for all value combinations, and last get the highest group per value.

data aa;
	input c1 c2$;
	datalines;
1 A
2 A
3 A
4 B
5 B
6 C
1 C
8 D
1 D
;
run;
data bb;
	input c1 ;
	datalines;
1
2
3
4
5
6
8
;
run;

* highest group per value;
proc sql;
	create table max_group_value as 
		select distinct 
			c1, 
			max(c2) as c2
		from aa
		group by c1;
quit;

* all combinations of values pr. group;
proc sql;
	create table group_all_values as
		select 
			a.c1, 
			b.c1 as c1x, 
			a.c2
		from aa as a, aa as b
		where a.c2 = b.c2;
quit;

* highest group per combination;
proc sql;
	create table high_group_value as
		select 
			a.c1, 
			a.c1x, 
			a.c2, 
			b.c2 as c2x
		from group_all_values as a, max_group_value as b
		where group_all_values.c1x = max_group_value.c1;
quit;

* highest group per value;
proc sql;
	create table want as
		select distinct 
			c1, 
			max(c2x) as c2
		from high_group_value
		group by c1;
quit;

PGStats · Posted 02-20-2019 03:01 PM

If you have access to SAS/OR then you can frame this problem as finding the connected components in a graph:

data aa;
	input col1$ col2$;
	datalines;
1 A
2 A
3 A
4 B
5 B
6 C
1 C
8 D
1 D
;

proc optnet links=aa direction=undirected out_nodes=cc;
data_links_var from=col1 to=col2;
concomp;
run;

proc sort data=cc; by concomp descending node; run;

data dd;
do until(last.concomp);
    set cc; by concomp;
    if first.concomp then col2 = node;
    if anydigit(first(node)) then do;
        col1 = node;
        output;
        end;
    end;
keep col1 col2;
run;

PG

Rohit_1990 · Posted 02-20-2019 09:47 PM

Hi All,

Thanks a lot !!!!

Well the solution provided by each of you is solving my raised concern.

but I seek one more help from all of you.

in the given dataset below:

Col1 Col2

1 A

2 A

3 B

2 B

4 C

1 C

6 D

4 D

7 D

So here group A is related to group B on common value 2

Group C is related to Group A with common element being 1

likewise Group D with Group C on common value 4.

So as we can see all groups are interrelated with each other based on some common element.

So I need to group all groups in one single group as follows:

Col1 Col2

1 D

2 D

3 D

4 D

6 D

7 D

So now my final dataset has all related groups elements in one single group.

Please help me on this.

Regards and Thanks again

Rohit

PGStats · Posted 02-20-2019 11:18 PM

Any good solution to the first data set should work on this new data as well. My SAS/OR based code above does, of course .

PG

Rohit_1990 · Posted 02-21-2019 12:50 AM

Hi ,
Somehow the third part of code forming dataset DD is not getting correctly populated.

Column c2 is not getting populated.

Registration is open

SAS Training: Just a Click Away