New Contributor
Posts: 4

# Can I use arrays to capture "length" of multiple sequences of variables?

[ Edited ]

Dear All

I'm using SAS university edition. I'll outline my problem by first outlining my data.  I've used the array command to create a variable for each week (in 2008) displaying whether an observation lives in place A or B.

As my dataset is confidentiel I've created a quick look-a-like-version here.

As you see from the picture each observation can possibly move back and forth.

I'm wondering whether it's possible, using the array command, to create new variables for each sequence respectively and in that variable for the specific sequence report the length of the sequence.  In the example above there would at most be 3 new variables (lets call them sequence1, sequence2 and sequence3). For the first observation sequence1 = 2, sequence2 = 3 and sequence3= 2. For the second observation  sequence1 = 8, sequence2 = 0 and sequence3= 0.

So far i've only succeed in capturing the length of the first sequence by the following code

data newdata (compress=binary);
set olddata;
array weeks (52) \$2;

endB=.;
periodB=.;

if indicatorB = 1 then /* where indicatorB shows whether or not B is present within any of the weeks variables */
do i=startB to 52 by 1 UNTIL(spell(i) ne 'B'); /* where startB is a variable telling me in what week we first observe B (between 1-52) */
if weeks(i) in ('B') then endB = i;
end;
periodB=endB-startB+1;
run;

Hope you can help me update my code so  I can capture the different lengths of multiple sequences

Kind regards

Marius

Super User
Posts: 2,061

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

[ Edited ]

HI @mariusmorthorst  Good afternoon, If you could post the sample of what you HAVE and what you WANT as plain text, a lazy person like me would be interested to copy paste the data on to my sas software and do some program/testing. Thank you!

New Contributor
Posts: 4

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

[ Edited ]

hi @novinosrin

Is the following sufficient

What i have

 ID week 1 week 2 week 3 week 4 week 5 week 6 week 7 week 8 1 A A B B B A A A 2 B B B B B B B B 3 B B A A A A A A 4 A A A A B B B B

What i want
 ID week 1 week 2 week 3 week 4 week 5 week 6 week 7 week 8 sequence1 sequence2 sequence3 number of sequence 1 A A B B B A A A 2 3 3 3 2 B B B B B B B B 8 0 0 1 3 B B A A A A A A 2 6 0 2 4 A A A A B B B B 4 4 0 2

Super User
Posts: 2,061

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

[ Edited ]
``````data have;
input ID	(week1 	week2 	week3 	week4	week5	week6	week7	week8) (\$);
datalines;
1	A	A	B	B	B	A	A	A
2	B	B	B	B	B	B	B	B
3	B	B	A	A	A	A	A	A
4	A	A	A	A	B	B	B	B
;

data want;
set have;
array t(*) week:;
array s(*) seq1-seq3 ;
i=0;
_count=1;
do _n_=2 to dim(t);
if t(_n_)=t(_n_-1) then _count+1;
else  do;i=i+1;s(i)=_count;_count=1;continue;end;
if _n_=dim(t) then do;i=i+1;s(i)=_count;end;
end;
number_of_sequence=n(of s(*));
drop _: i;
run;``````

Yes you can use arrays. I like your thinking and at the same time I concur with other et al's recommendations too. Have fun and welcome to SAS forum. Cheers!

Super User
Posts: 24,003

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

I would recommend transposing your data to a long format, then the problem is trivial.

use the approach outlined here:

https://stats.idre.ucla.edu/sas/faq/how-can-i-create-an-enumeration-variable-by-groups/

With the modification of the NOTSORTED option on your BY statement so that the order is correct.

``````by id week value NOTSORTED;

if first.value then seq+1;
``````

You can easily do this with arrays as well, but it's harder because you'll have to account for almost all weeks, so you can create seq1-seq8 to hold the values but I see issues with using this later on.

Untested because no data

``````data want;
set have;
array week(8) week1-week8;
array seq(8);

index=1;
do i=2 to dim(week);
if week(i)=week(i-1) then seq(index)=seq(index)+1;
else do;
index+1;
seq(index) = 1;
end;
end;
run;``````
Posts: 5,624

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

[ Edited ]

This is very easy to do with SAS once your data is formatted properly. The proper format is long as opposed to wide, with columns ID, Week, and Place. With that format the sequence lengths will be given by :

``````proc sort data=have; by ID week; run;

data want;
do seqLength = 1 by 1 until(last.place);
set have; by id place notsorted;
end;
run;
``````
PG
Super User
Posts: 6,933

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

How many sequences do you want?  All?  Just the first 3?  It's possible that the number ranges from 1 to 52 for one ID.

Wouldn't sequence3 equal 3 (not 2) for the first ID?

New Contributor
Posts: 4

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

I would like as many sequences as the observation with the most changes has. As i have 52 weeks in the original dataset, this would in theory be the maximum. The data i put up was only a small part used to illustrate the problem at hand.

And yes your right for the first observation the second sequence should equal 3

Super User
Posts: 13,939

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

[ Edited ]

@mariusmorthorst wrote:

Dear All

I'm using SAS university edition. I'll outline my problem by first outlining my data.  I've used the array command to create a variable for each week (in 2008) displaying whether an observation lives in place A or B.

I'm wondering whether it's possible, using the array command, to create new variables for each sequence respectively and in that variable for the specific sequence report the length of the sequence.  In the example above there would at most be 3 new variables (lets call them sequence1, sequence2 and sequence3). For the first observation sequence1 = 2, sequence2 = 3 and sequence3= 2. For the second observation  sequence1 = 8, sequence2 = 0 and sequence3= 0.

What rule assigns values to those sequence variables? It isn't obvious to me. Why is sequence 3=2 for the first record, I might think 3 was more appropriate for the shown data.

And with SAS 9.4 I get an error from spell(I).

If you are counting sequential recurrence of the same value are you allowing for up to 52 sequence variables? (Assuming that you only have 52 weeks (365 or 366 days in a year yields 52 and a fraction weeks so 53 may be needed).

Super User
Posts: 10,849

## Re: Can I use arrays to capture "length" of multiple sequences of variables?

``````data have;
infile cards expandtabs;
input ID	(week1 	week2 	week3 	week4	week5	week6	week7	week8) (\$);
datalines;
1	A	A	B	B	B	A	A	A
2	B	B	B	B	B	B	B	B
3	B	B	A	A	A	A	A	A
4	A	A	A	A	B	B	B	B
;
proc transpose data=have out=temp;
by id;
var week:;
run;

proc summary data=temp;
by id col1 notsorted;
output out=temp1 ;
run;

proc transpose data=temp1 out=temp2 prefix=seq;
by id;
var _freq_;
run;

data temp3;
set temp2;
total=n(of seq:);
drop _name_;
run;

data want;
merge have temp3;
by id;
run;
proc print;run;``````
Discussion stats
• 9 replies
• 164 views
• 0 likes
• 7 in conversation