Hi everyone,
I keep getting errors when I run the following program. Dataset two has 5 variables (year and var1-var4) and 5 observations. Dataset one, which I am trying to create, should have 20 observations with the variables year, gender, color and team. I want to split var1-var4 across the two genders and two colors, and do the same for each of the five years (2016-2012). But every time I run this program I get an error that the array subscript is out of range. I feel like I am making a logic mistake somewhere, but I am not able to figure out where.
data one;
set two;
array tm (*) $ var1 var2 var3 var4;
do year=2016 to 2012 by -1;
do gender="Male","Female";
do color="Red","Green";
team=tm(year);
output;
end;
end;
end;
run;
What I want:
dataset one:
year gender color team
2016 male red var1 value
2016 male green var2 value
2016 female red var3 value
2016 female green var4 value
and so on.
Any input is much appreciated. Thanks!
This:
array tm (*) $ var1 var2 var3 var4;
do year=2016 to 2012 by -1;
The loop has five iterations (2016, 2015, 2014, 2013, 2012), but the array has only four elements, var1-var4. Hence you get an out-of-range error.
If you can post some sample test data in the form of a data step, I can supply some appropriate code to process it. At a guess, I would say that a counter inside the year do loop that goes from 1 to 4 would work.
Thanks RW9. So my dataset two looks like this:
data two;
input year var1 $ var2 $ var3 $ var4 $;
cards;
2016 john jack sarah susan
2015 ben bill britney anne
2014 chris peter monica robin
2013 jeff matt christie christine
2012 mike david dia diane
;
run;
dataset one should look like this:
year gender color team
2016 male red john
2016 male green jack
2016 female red sarah
2016 female green susan
Please let me know if I should provide more information. Thanks again.
Here you go:
data two;
  input year var1 $ var2 $ var3 $ var4 $;
cards;
2016 john jack sarah susan
2015 ben bill britney anne
2014 chris peter monica robin
2013 jeff matt christie christine
2012 mike david dia diane
;
run;

data want (keep=year gender color team);
  set two;
  length gender color $20;
  array var{4};
  i=1;
  do gender="male","female";
    do color="red","green";
      team=var{i};
      output;
      i=i+1;
    end;
  end;
run;
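One thing worth checking, offered as a sketch rather than a correction of the step above: list input with a bare $ stores character values with a default length of 8, so a 9-character name such as "christine" would be cut to "christin". The variant below declares the lengths first, and also derives gender and color from the array index instead of keeping a manual counter. The lengths ($20, $6) and the array name tm are assumptions chosen for illustration, and the length statement moves var1-var4 ahead of year in the variable order of dataset two.

```sas
/* Sketch only: the $20 and $6 lengths are assumptions, not from the thread */
data two;
  length var1-var4 $20;   /* declared before INPUT so names longer than 8 are kept */
  input year var1 $ var2 $ var3 $ var4 $;
cards;
2016 john jack sarah susan
2015 ben bill britney anne
2014 chris peter monica robin
2013 jeff matt christie christine
2012 mike david dia diane
;
run;

data want (keep=year gender color team);
  set two;
  length gender color $6 team $20;
  array tm {4} var1-var4;   /* var1-var4 already exist as character variables */
  do i = 1 to dim(tm);
    /* elements 1-2 are male, 3-4 are female; odd index is red, even is green */
    gender = ifc(i <= 2, "male", "female");
    color  = ifc(mod(i, 2) = 1, "red", "green");
    team   = tm{i};
    output;                 /* one observation per array element */
  end;
run;
```

Looping from 1 to dim(tm) keeps the subscript inside the array bounds by construction, which is the property the original program was missing.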
Thank you so much RW9. But if I could bother you just a little bit more, I don't understand how the same year is repeated 4 times. This is exactly what I want, but I am not clear how it is happening. Also, shouldn't I have to specify that the array is a character array? I apologize if I am asking too many questions or if my questions are too basic.
No probs. On your two questions:
The array statement is a reference to variables in the dataset; if they do not exist, SAS needs to create them. In this instance, however, the variables var1-var4 already exist and have their own properties, so we only need the reference to them. If they did not exist, you would need to supply values or properties.
The year is not populated 4 times. What actually happens is that the data is written out 4 times, because of the output statement inside the two do loops. As we do not change the year value, it is the same at each output call. Year only changes on the next pass through the data step, i.e. when the set statement is encountered again and reads the next observation.
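To see the effect of output in isolation, here is a minimal sketch that does not use the thread's data at all. The dataset name demo and the variable copy are made up for illustration.

```sas
/* Sketch: one data step iteration, four explicit OUTPUT calls */
data demo;
  year = 2016;        /* assigned once and never changed inside the loop */
  do copy = 1 to 4;
    output;           /* writes the current values of year and copy */
  end;
run;

proc print data=demo noobs;
run;
```

The step should produce four observations, each with year equal to 2016 and copy running from 1 to 4. The same thing happens in the solution, except that the set statement supplies a new year at the start of each pass through the data step.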
Thank you so much for taking the time to explain. I really appreciate all your help.
Why not just read it in that way?
data want ;
input year @ ;
length gender color name $20;
do gender="male","female";
do color="red","green";
input name @ ;
output;
end;
end;
cards;
2016 john jack sarah susan
2015 ben bill britney anne
2014 chris peter monica robin
2013 jeff matt christie christine
2012 mike david dia diane
;
Besides the number of elements in the array, you have to consider how you will refer to them. You are using this reference:
tm(year)
When YEAR is 2016, that refers to the 2016th element of the array ... a far cry from your intention.
The easiest syntax for the array definition ... one that would let you refer to tm(year) ... would be:
array tm {2012:2016} $ var1-var5;
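Now SAS will expect numbers in the range of 2012 to 2016, to refer to the five elements of the array. Note that those bounds imply five elements: dataset two only has var1-var4, so SAS would create var5 as a new, empty character variable.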
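As a self-contained illustration of custom array bounds, here is a sketch on made-up data where each year really does have its own column. The dataset scores and the variables id, y2012-y2016 and value are hypothetical and not part of the original question.

```sas
/* Sketch: hypothetical wide data with one numeric column per year */
data scores;
  input id y2012 y2013 y2014 y2015 y2016;
cards;
1 10 20 30 40 50
2 11 21 31 41 51
;
run;

data long (keep=id year value);
  set scores;
  array yr {2012:2016} y2012-y2016;       /* subscripts run from 2012 to 2016 */
  do year = lbound(yr) to hbound(yr);     /* stay inside the declared bounds */
    value = yr{year};                     /* yr{2013} refers to y2013, and so on */
    output;
  end;
run;
```

Using lbound and hbound instead of hard-coded years keeps the loop in step with the array definition if the bounds ever change.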
Thanks Astounding. I can see that I need to be more careful when writing the programs.
The reason for the message is straightforward: the array TM is indexed 1 to 4, not 2012 to 2016. An array entry like tm(2016) does not exist, so you get the error message.
Apart from that, you have another problem: your array has 4 elements, but you are trying to loop through 5 (indexes 2012, 2013, 2014, 2015 and 2016).
So you need to remove a year from your loop, or come up with another variable for the last (or first?) year. Beyond that, the simple (but not much used) solution is to give your array a different index range, e.g.:
data one;
set two;
array tm(2012:2016) $ var1-var5;
Now you can fetch an array element like tm(2013).