Solved: How to concatenate Observations by ID of a variable

Schen · Posted 12-27-2017 10:17 PM

Hello SAS Users,

Thank you for attending to my question.

I need your help in the following

I have a dataset that has the following data

ID	Cycle	Name
11	1	aa
11	1	aa
11	1	bb
11	2	aa
11	2	ss
22	1	dd
22	1	df
22	2	vv
22	2	aa
22	3	cc
33	1	df
33	2	vv
33	3	ss
33	3	cc
33	4	vv
33	4	ss
44	1	cc
44	2	vv
44	2	df
44	2	bb
44	3	aa
44	4	df

I hope to get a output like below

ID	Cycle 1	Cycle 2	Cycle 3	Cycle 4
11	aa-bb	aa-ss	.	.
22	dd-df	vv-aa	cc	.
33	df	vv	ss-cc	vv-ss
44	cc	vv-df-bb	aa	df

I want to know if we can get an output like this and some of ID have more than 4 Cycles so I am hoping if the code is flexible with that too.

Please Help.

Thank you for your time and help.

I hope you have a wonderful day.

novinosrin · Posted 12-27-2017 11:34 PM

Also, to be on the safe side, to remove duplicate names you could use the following instead of proc sort duprecs as follows:

data __have;
set have;
by id cycle name notsorted;
if first.name ;
run;

and then use __have as source with the code i gave you. This saves your order exactly as your sample

View solution in original post

novinosrin · Posted 12-27-2017 10:47 PM

See if this helps;

data have;

input ID Cycle Name $;

datalines;

11 1 aa

11 1 bb

11 2 aa

11 2 ss

22 1 dd

22 1 df

22 2 vv

22 2 aa

22 3 cc

33 1 df

33 2 vv

33 3 ss

33 3 cc

33 4 vv

33 4 ss

44 1 cc

44 2 vv

44 2 df

44 2 bb

44 3 aa

44 4 df

;

proc sql;

select max(cnt) into : array_limit

from

(select id, count(id) as cnt

from have

group by id);

quit;

%put &array_limit;

data want;

set have;

by id cycle;

array cycles(&array_limit) $;

retain cycles temp;

if first.id then call missing(of cycles(*));

if first.cycle and last.cycle then cycles(cycle)=name;

else if first.cycle and not last.cycle then temp=name;

else temp=catx('-',temp,name);

if not first.cycle and last.cycle then cycles(cycle)=temp;

if last.id;

drop temp cycle name;

run;

Schen · Posted 12-27-2017 11:06 PM

Hello @novinosrin

Thank you for taking time and helping.

Your solution to the problem worked to the most extent. Thank you for that. One other requirement was the repetitions of name variable was not dealt with. One observation may be repeated multiple times in the same variable and it should not be repetated while creating the new cycles variable.

Is that possible?

Please let me know.

Thank you for your help.

novinosrin · Posted 12-27-2017 11:11 PM

Oh do you mean you are referring to this?:

11 1 aa

Schen · Posted 12-27-2017 11:12 PM

Yes @novinosrin. The newly created variable should not have repetitions of same name.

novinosrin · Posted 12-27-2017 11:16 PM

Remove the duplications and use _have

proc sort data=have out=_have noduprecs;
by id cycle name;
run;

Use _have as your source with the existing code

Also for array_limit, i think you can use a simpler one as follows

proc sql;
select count(distinct cycle) into :array_limit
from have;
quit;

Schen · Posted 12-27-2017 11:33 PM

Hi @novinosrin,

I Tried running it and it doesnot seem to work as expected. Sorry.

novinosrin · Posted 12-27-2017 11:34 PM

Also, to be on the safe side, to remove duplicate names you could use the following instead of proc sort duprecs as follows:

data __have;
set have;
by id cycle name notsorted;
if first.name ;
run;

and then use __have as source with the code i gave you. This saves your order exactly as your sample

novinosrin · Posted 12-27-2017 11:39 PM

I tested with your sample, and here is my screenshot(attached)

Schen · Posted 12-27-2017 11:48 PM

The new code you provided worked more efficiently. Thank you very much. You saved my day.

novinosrin · Posted 12-27-2017 11:56 PM

Thank you for keeping the conversation steadfast back and forth. I really appreciate it. I will watch this thread with open eyes if you have any problems, so feel free to reach out anytime. Cheers!

Schen · Posted 12-28-2017 12:04 AM

Thank you for the help. If this is asking for more.

I was wondering if you can suggest an efficient way to scan through cycles1 - cycles8 and note for changes,

For Ex. if ID 11 has the same occurance under cycles1 - cycles4 then a New variable "Change" should be = 1 and if it changes twice then Change = 2 and so. on....

Can you please suggest an efficient way.

novinosrin · Posted 12-28-2017 12:10 AM

May be something like this? :

data check

set your_input;

array grp(*) cycles1-cycles8;

call missing(change);

do n=2 to dim(grp);

if grp(n) ne grp(n-1) then change+1;

end;

run;

EDIT: Mind you, since not all id's have values for all cycles, the check being true to missing value can cause a problem

so to counter this: we need to run the loop for the only those variables that have values in sequence, so here you go-

Please notice the change in the loop:

data check

set your_input;

array grp(*) cycles1-cycles8;

call missing(change);

k=cmiss(of grp(*));

do n=2 to dim(grp)-k;

if grp(n) ne grp(n-1) then change+1;

end;

drop k;

run;

Schen · Posted 12-28-2017 12:17 AM

That was the best way. Thank you very much.

Schen · Posted 12-28-2017 12:19 AM

I will check to make sure of any errors thank you.

How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Re: How to concatenate Observations by ID of a variable

Registration is open