Solved: Re: vertical concatenation based on grouping variables

kyath_sas · Posted 06-28-2018 03:22 PM

Hi guys, I have the table below and need to concatenate the values of b based on the first two variables.

Please suggest an approach.

data have;
input sno dnum b$;
datalines;
1 2 xy
1 3 zk
2 2 df
2 3 hj
2 3 gk
3 2 mn
3 3 jj
3 4 ll
3 5 kk
4 2 hh
4 3 ji
4 4 hl
4 4 jk
4 5 kl
;

run;

The output should look like this:

1	xy,zk
2	df,hj
2	df,gk
3	mn,jj,ll,kk
4	hh,ji,hl,kl
4	hh,ji,jk,kl

novinosrin · Posted 06-28-2018 05:48 PM

Here is an easy solution to the first question you posted at the top. If you are comfortable with hash objects, let us know; we can have some fun with those and make it much shorter. For now, since I do not know your comfort level, I am going with a simple approach:

data have;
input sno dnum  b$;
datalines;
1 2 xy
1 3 zk
2 2 df
2 3 hj
2 3 gk
3 2 mn
3 3 jj
3 4 ll
3 5 kk
4 2 hh
4 3 ji
4 4 hl
4 4 jk
4 5 kl
;

run;


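/* Step 1: one row per sno/dnum; repeated dnum values spread into COL1, COL2, ... */
proc transpose data=have out=temp;
  by sno dnum;
  var b;
run;

/* Step 2: fill each missing COLn with the first non-missing value in its row */
data temp2;
  set temp;
  array c(*) col:;
  do _n_=1 to dim(c);
    if missing(c(_n_)) then c(_n_)=coalescec(of c(*));
  end;
  keep sno col:;
run;

/* Step 3: transpose back so each row holds one combination across dnum */
proc transpose data=temp2 out=temp3;
  by sno;
  var col:;
run;

/* Step 4: join the values in each row into one comma-separated string */
data want;
  set temp3;
  by col: notsorted;
  length want $50;
  want=catx(',',of col:);
  keep sno want;
run;

/* Step 5: drop the duplicate rows produced by the fill in step 2 */
proc sort data=want out=final_want nodupkey;
  by sno want;
run;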
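
One way to check the result against the output requested in the question is to key in those six rows and compare them with final_want. This is only a sketch: the dataset name expected is made up for illustration, and the rows are sorted first because final_want comes out ordered by sno and want rather than in the order shown in the question.

```sas
/* Expected rows, copied from the original question (dataset name is hypothetical) */
data expected;
  infile datalines dlm='|' truncover;
  length sno 8 want $50;
  input sno want $;
  datalines;
1|xy,zk
2|df,hj
2|df,gk
3|mn,jj,ll,kk
4|hh,ji,hl,kl
4|hh,ji,jk,kl
;

/* Put the expected rows in the same order as final_want */
proc sort data=expected;
  by sno want;
run;

/* Row-by-row comparison of the two datasets */
proc compare base=expected compare=final_want;
run;
```

If the two datasets agree, the PROC COMPARE report should list no unequal values.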


Reeza · Posted 06-28-2018 03:33 PM

Here's one way:

https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a

Not the easiest but very dynamic.

Another approach uses CATX to append each item and LAST. to export on the last record.

Untested. Uncomment the second-to-last line to keep only the last record for each sno.

data want;
  set have;
  by sno;
  length combined $200;
  retain combined;
  /* reset the running list at the start of each sno group */
  if first.sno then call missing(combined);

  combined = catx(', ', strip(combined), b);

  *if last.sno then output;
run;
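
The BY statement in the step above assumes the rows arrive grouped by sno, which the posted sample already is. For input in arbitrary order, sorting first would be one way to satisfy that. A minimal sketch, with have_sorted as a hypothetical output name:

```sas
/* Sort so that BY-group processing sees each sno as one contiguous block */
proc sort data=have out=have_sorted;
  by sno dnum;
run;
```

The DATA step would then read have_sorted in place of have.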

kyath_sas · Posted 06-28-2018 04:24 PM

sno	primarygroup1	primarygroup2	subgroup	subgroup	grouping logic
1	2
1		3			2,3

2	2
2		3			2,3
2		3			2,3

3	2
3		3
3			4
3				5	2,3,4,5

4	2
4		3
4			4
4			4		2,3,4,5
				5	2,3,4,5

Reeza · Posted 06-28-2018 04:41 PM

Can you add some context around that blurb of data?


kyath_sas · Posted 06-29-2018 01:02 AM

Thank you novinosrin:-)
