Hello SAS Users,
Thank you for attending to my question.
I need your help in the following
I have a dataset that has the following data
ID | Cycle | Name |
11 | 1 | aa |
11 | 1 | aa |
11 | 1 | bb |
11 | 2 | aa |
11 | 2 | ss |
22 | 1 | dd |
22 | 1 | df |
22 | 2 | vv |
22 | 2 | aa |
22 | 3 | cc |
33 | 1 | df |
33 | 2 | vv |
33 | 3 | ss |
33 | 3 | cc |
33 | 4 | vv |
33 | 4 | ss |
44 | 1 | cc |
44 | 2 | vv |
44 | 2 | df |
44 | 2 | bb |
44 | 3 | aa |
44 | 4 | df |
I hope to get a output like below
ID | Cycle 1 | Cycle 2 | Cycle 3 | Cycle 4 |
11 | aa-bb | aa-ss | . | . |
22 | dd-df | vv-aa | cc | . |
33 | df | vv | ss-cc | vv-ss |
44 | cc | vv-df-bb | aa | df |
I want to know if we can get an output like this and some of ID have more than 4 Cycles so I am hoping if the code is flexible with that too.
Please Help.
Thank you for your time and help.
I hope you have a wonderful day.
Also, to be on the safe side, to remove duplicate names you could use the following instead of proc sort duprecs as follows:
data __have;
set have;
by id cycle name notsorted;
if first.name ;
run;
and then use __have as source with the code i gave you. This saves your order exactly as your sample
See if this helps;
data have;
input ID Cycle Name $;
datalines;
11 1 aa
11 1 aa
11 1 bb
11 2 aa
11 2 ss
22 1 dd
22 1 df
22 2 vv
22 2 aa
22 3 cc
33 1 df
33 2 vv
33 3 ss
33 3 cc
33 4 vv
33 4 ss
44 1 cc
44 2 vv
44 2 df
44 2 bb
44 3 aa
44 4 df
;
proc sql;
select max(cnt) into : array_limit
from
(select id, count(id) as cnt
from have
group by id);
quit;
%put &array_limit;
data want;
set have;
by id cycle;
array cycles(&array_limit) $;
retain cycles temp;
if first.id then call missing(of cycles(*));
if first.cycle and last.cycle then cycles(cycle)=name;
else if first.cycle and not last.cycle then temp=name;
else temp=catx('-',temp,name);
if not first.cycle and last.cycle then cycles(cycle)=temp;
if last.id;
drop temp cycle name;
run;
Hello @novinosrin
Thank you for taking time and helping.
Your solution to the problem worked to the most extent. Thank you for that. One other requirement was the repetitions of name variable was not dealt with. One observation may be repeated multiple times in the same variable and it should not be repetated while creating the new cycles variable.
Is that possible?
Please let me know.
Thank you for your help.
Oh do you mean you are referring to this?:
11 1 aa
11 1 aa
Remove the duplications and use _have
proc sort data=have out=_have noduprecs;
by id cycle name;
run;
Use _have as your source with the existing code
Also for array_limit, i think you can use a simpler one as follows
proc sql;
select count(distinct cycle) into :array_limit
from have;
quit;
Also, to be on the safe side, to remove duplicate names you could use the following instead of proc sort duprecs as follows:
data __have;
set have;
by id cycle name notsorted;
if first.name ;
run;
and then use __have as source with the code i gave you. This saves your order exactly as your sample
I tested with your sample, and here is my screenshot(attached)
Thank you for keeping the conversation steadfast back and forth. I really appreciate it. I will watch this thread with open eyes if you have any problems, so feel free to reach out anytime. Cheers!
Thank you for the help. If this is asking for more.
I was wondering if you can suggest an efficient way to scan through cycles1 - cycles8 and note for changes,
For Ex. if ID 11 has the same occurance under cycles1 - cycles4 then a New variable "Change" should be = 1 and if it changes twice then Change = 2 and so. on....
Can you please suggest an efficient way.
May be something like this? :
data check
set your_input;
array grp(*) cycles1-cycles8;
call missing(change);
do n=2 to dim(grp);
if grp(n) ne grp(n-1) then change+1;
end;
run;
EDIT: Mind you, since not all id's have values for all cycles, the check being true to missing value can cause a problem
so to counter this: we need to run the loop for the only those variables that have values in sequence, so here you go-
Please notice the change in the loop:
data check
set your_input;
array grp(*) cycles1-cycles8;
call missing(change);
k=cmiss(of grp(*));
do n=2 to dim(grp)-k;
if grp(n) ne grp(n-1) then change+1;
end;
drop k;
run;
That was the best way. Thank you very much.
Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.
Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.
Find more tutorials on the SAS Users YouTube channel.