BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Schen
Fluorite | Level 6

Hello SAS Users,

 

Thank you for attending to my question.

 

I need your help in the following

 

I have a dataset that has the following data

 

IDCycleName
111aa
111aa
111bb
112aa
112ss
221dd
221df
222vv
222aa
223cc
331df
332vv
333ss
333cc
334vv
334ss
441cc
442vv
442df
442bb
443aa
444df

 

I hope to get a output like below

IDCycle 1Cycle 2Cycle 3Cycle 4
11aa-bbaa-ss..
22dd-dfvv-aacc.
33dfvvss-ccvv-ss
44ccvv-df-bbaadf

 I want to know if we can get an output like this  and some of ID have more than 4 Cycles so I am hoping if the code is flexible with that too.

 

Please Help.

 

Thank you for your time and help.

 

I hope you have a wonderful day.

1 ACCEPTED SOLUTION

Accepted Solutions
novinosrin
Tourmaline | Level 20

Also, to be on the safe side, to remove duplicate names you could use the following instead of proc sort duprecs as follows:

 

data __have;
set have;
by id cycle name notsorted;
if first.name ;
run;

 

and then use __have as source with the code i gave you. This saves your order exactly as your sample

View solution in original post

14 REPLIES 14
novinosrin
Tourmaline | Level 20

See if this helps;

 

data have;

input ID   Cycle Name $;

datalines;

11   1    aa

11   1    aa

11   1    bb

11   2    aa

11   2    ss

22   1    dd

22   1    df

22   2    vv

22   2    aa

22   3    cc

33   1    df

33   2    vv

33   3    ss

33   3    cc

33   4    vv

33   4    ss

44   1    cc

44   2    vv

44   2    df

44   2    bb

44   3    aa

44   4    df

;

 

proc sql;

select max(cnt)  into : array_limit

from

(select id, count(id) as cnt

from have

group by id);

quit;

 

%put &array_limit;

 

data want;

set have;

by id cycle;

array cycles(&array_limit) $;

retain cycles temp;

if first.id then call missing(of cycles(*));

if first.cycle and last.cycle then cycles(cycle)=name;

else if first.cycle and not last.cycle then temp=name;

else temp=catx('-',temp,name);

if not first.cycle and last.cycle then cycles(cycle)=temp;

if last.id;

drop temp cycle name;

run;

 

Schen
Fluorite | Level 6

Hello @novinosrin

 

Thank you for taking time and helping.

 

Your solution to the problem worked to the most extent. Thank you for that. One other requirement was the repetitions of name variable was not dealt with. One observation may be repeated multiple times in the same variable and it should not be repetated while creating the new cycles variable. 

Is that possible?

 

Please let me know.

Thank you for your help.

novinosrin
Tourmaline | Level 20

Oh do you mean you are referring to this?:

 

11   1    aa

11   1    aa

Schen
Fluorite | Level 6
Yes @novinosrin. The newly created variable should not have repetitions of same name.
novinosrin
Tourmaline | Level 20

Remove the duplications and use _have 

 

proc sort data=have out=_have noduprecs;
by id cycle name;
run;

 

Use _have as your source with the existing code

 

Also for array_limit, i think you can use a simpler one as follows


proc sql;
select count(distinct cycle) into :array_limit
from have;
quit;

Schen
Fluorite | Level 6
Hi @novinosrin,

I Tried running it and it doesnot seem to work as expected. Sorry.
novinosrin
Tourmaline | Level 20

Also, to be on the safe side, to remove duplicate names you could use the following instead of proc sort duprecs as follows:

 

data __have;
set have;
by id cycle name notsorted;
if first.name ;
run;

 

and then use __have as source with the code i gave you. This saves your order exactly as your sample

novinosrin
Tourmaline | Level 20

I tested with your sample, and here is my screenshot(attached)

Schen
Fluorite | Level 6
The new code you provided worked more efficiently. Thank you very much. You saved my day.
novinosrin
Tourmaline | Level 20

Thank you for keeping the conversation steadfast back and forth. I really appreciate it. I will watch this thread with open eyes if you have any problems, so feel free to reach out anytime. Cheers!

Schen
Fluorite | Level 6

Thank you for the help. If this is asking for more.

 

I was wondering if you can suggest an efficient way to scan through cycles1 - cycles8 and note for changes, 

 

For Ex. if ID 11 has the same occurance under cycles1 - cycles4 then a New variable "Change" should be = 1 and if it changes twice then Change = 2 and so. on....

 

Can you please suggest an efficient way.

novinosrin
Tourmaline | Level 20

May be something like this? :

 

data check

set your_input;

array grp(*) cycles1-cycles8;

call missing(change);

do n=2 to dim(grp);

 if grp(n) ne grp(n-1) then change+1;

end;

run;

 

EDIT: Mind you, since not all id's have values for all cycles, the check being true to missing value can cause a problem 

 

 so to counter this: we need to run the loop for the only those variables that have values in sequence, so here you go-

 

Please notice the change in the loop:

 

data check

set your_input;

array grp(*) cycles1-cycles8;

call missing(change);

k=cmiss(of grp(*));

do n=2 to dim(grp)-k;

 if grp(n) ne grp(n-1) then change+1;

end;

drop k;

run;

 

 

Schen
Fluorite | Level 6

That was the best way. Thank you very much.

Schen
Fluorite | Level 6
I will check to make sure of any errors thank you.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 14 replies
  • 1631 views
  • 4 likes
  • 2 in conversation