Hi guys, I have the table below and need to concatenate the values of b, grouped by the first two variables (sno and dnum).
Please suggest an approach.
data have;
input sno dnum b$;
datalines;
1 2 xy
1 3 zk
2 2 df
2 3 hj
2 3 gk
3 2 mn
3 3 jj
3 4 ll
3 5 kk
4 2 hh
4 3 ji
4 4 hl
4 4 jk
4 5 kl
;
run;
The desired output looks like this:
1 | xy,zk |
2 | df,hj |
2 | df,gk |
3 | mn,jj,ll,kk |
4 | hh,ji,hl,kl |
4 | hh,ji,jk,kl |
Here is an easy solution to the question you posted at the top. If you are comfortable with hash objects, let us know; we could have some fun with those and make it much shorter. For now, since I do not know your comfort level, I am going with a simple approach:
data have;
input sno dnum b$;
datalines;
1 2 xy
1 3 zk
2 2 df
2 3 hj
2 3 gk
3 2 mn
3 3 jj
3 4 ll
3 5 kk
4 2 hh
4 3 ji
4 4 hl
4 4 jk
4 5 kl
;
run;
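/* One row per sno/dnum; repeated values of b spread across col1, col2, ... */
proc transpose data=have out=temp;
by sno dnum;
var b;
run;
/* Fill each empty col with the first non-missing value in the row */
data temp2;
set temp;
array c(*) col:;
do _n_=1 to dim(c);
if missing(c(_n_)) then c(_n_)=coalescec(of c(*));
end;
keep sno col:;
run;
/* Flip back: one row per sno and combination, one column per dnum */
proc transpose data=temp2 out=temp3;
by sno ;
var col:;
run;
/* Join the columns into a single comma-separated string */
data want;
set temp3;
by col: notsorted;
length want $50;
want=catx(',',of col:);
keep sno want;
run;
/* Drop the repeated strings produced for groups with no duplicates */
proc sort data=want out=final_want nodupkey;
by sno want;
run;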
Here's one way:
https://gist.github.com/statgeek/d583cfa992bf56da51d435165b07e96a
Not the easiest but very dynamic.
Another approach uses CATX to append each item and LAST. to output only on the last record of each group.
Untested; you can uncomment the second-to-last line to keep only the last record per sno.
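data want;
set have;
by sno;
length combined $200;
retain combined;
/* Start a fresh string at the first record of each sno */
if first.sno then call missing(combined);
/* CATX strips blanks itself, so no extra STRIP is needed */
combined = catx(', ', combined, b);
*if last.sno then output;
run;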
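Note that the running-string step above collapses each sno into a single row, so with the last line uncommented sno 2 would come out as "df, hj, gk" rather than the two rows asked for in the question. If the split rows are needed in a single pass, the following is a rough, untested sketch along the same lines. It is not from any of the posts above: the dataset name want_alt, the macro variable maxdup, and the limit of 10 repeats per sno/dnum pair are all assumptions made for illustration, and it assumes have is already sorted by sno and dnum.

```sas
/* Sketch only. Assumes HAVE is sorted by sno dnum and that no          */
/* sno/dnum pair repeats more than &maxdup times (hypothetical limit).  */
%let maxdup = 10;

data want_alt;
  set have;
  by sno dnum;
  array vals[&maxdup]  $8  _temporary_;  /* values of b within the current dnum */
  array combo[&maxdup] $50 _temporary_;  /* one running string per output row   */
  retain m maxm;                         /* m: count in dnum, maxm: max per sno */
  length want $50;

  if first.sno then do;
    call missing(of combo[*]);
    maxm = 0;
  end;
  if first.dnum then do;
    call missing(of vals[*]);
    m = 0;
  end;

  /* Collect every b value that shares this sno/dnum pair */
  m + 1;
  vals[m] = b;

  /* At the end of a dnum group, extend each running string with its  */
  /* k-th value, falling back to the first value when there is none   */
  if last.dnum then do;
    maxm = max(maxm, m);
    do k = 1 to &maxdup;
      combo[k] = catx(',', combo[k], coalescec(vals[k], vals[1]));
    end;
  end;

  /* Write one row per combination once the sno group is complete */
  if last.sno then do k = 1 to maxm;
    want = combo[k];
    output;
  end;
  keep sno want;
run;
```

The fallback to the first value mirrors the COALESCEC step in the transpose solution earlier in the thread. Unlike that solution, this sketch does not remove repeated strings, so a final PROC SORT with NODUPKEY may still be wanted if two combinations can turn out identical.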
sno | primarygroup1 | primarygroup2 | subgroup | subgroup | grouping logic |
1 | 2 | | | | |
1 | 3 | 2,3 | | | |
2 | 2 | | | | |
2 | 3 | 2,3 | | | |
2 | 3 | 2,3 | | | |
3 | 2 | | | | |
3 | 3 | | | | |
3 | 4 | | | | |
3 | 5 | 2,3,4,5 | | | |
4 | 2 | | | | |
4 | 3 | | | | |
4 | 4 | | | | |
4 | 4 | 2,3,4,5 | | | |
5 | 2,3,4,5 |
Can you add some context around that blurb of data?