How do I use Array to organize dataset?

Occasional Contributor
Posts: 14

I am trying to take a raw data set and format it into a chart. I have no problem with the _n_ statements, but the ARRAY, DO loop (i), and IF/THEN statements seem to be going wrong. Here is an example that I have been working from:

 

data example33;

input bp1 - bp12;

 

if _n_ < 4 then biofeed = 'P';

else biofeed = 'A';

 

if _n_ in (1, 4) then drug = 'X';

else if _n_ in (2, 5) then drug = 'Y';

else drug = 'Z';

 

array ex33[*] _numeric_;

do i = 1 to dim(ex33);

if i > 6 then diet = 'Y';

else diet = 'N';

bp = ex33[i];

cell = drug||biofeed||diet;

output;

end;

drop i bp1 - bp12;

datalines;

170 175 165 180 160 158 161 173 157 152 181 190

186 194 201 215 219 209 164 166 159 182 187 174

180 187 199 170 204 194 162 184 183 156 180 173

173 194 197 190 176 198 164 190 169 164 176 175

189 194 217 206 199 195 171 173 196 199 180 203

202 228 190 206 224 204 205 199 170 160 179 179

;

proc print data=example33;

title 'Tabulation of data from EXAMPLE33 data set';

run;

 

 

Unfortunately, when I translate it to my data, it does not turn out the same. Here is the program I have developed:

 

options nonumber nodate;

data HW4_F17; 

input number1 - number8;

 

if _n_ < 7 then Profession = 'Physician';

if _n_ > 12 then Profession = 'Engineer';

else Profession = 'Professor';

 

if _n_ in (1, 2, 7, 8, 13, 14) then Tenure = 'Rent';

if _n_ in (5, 6, 11, 12, 17, 18)then Tenure = 'Lease';

else Tenure = 'Own';

 

if _n_ in (1, 3, 5, 7, 9, 11, 13, 15, 17) then Response = 'Yes';

else Response = 'No';

 

array HW4[*];

do i = 1 to dim(HW4);

if i > 4 then Housingtype = 'Townhomes';

else Housingtype = 'Condominiums';

number = HW4[i];

output;

drop i;

end;

 

 

array HW4_F17[*] _numeric_;

do i = 1 to dim(HW4_F17);

if i < 2 then agegroup = 'Less than 30';

if i = 5 then agegroup = 'Less than 30';

if i = 2 then agegroup = '31 to 45';

if i = 6 then agegroup = '31 to 45';

if i = 3 then agegroup = '46 to 60';

if i = 7 then agegroup = '46 to 60';

if i = 4 then agegroup = '60 plus';

if i > 7 then agegroup = '60 plus';

number = HW4_F17[i];

output;

end;

drop i number1 - number8;

 

datalines;

18 15 6 8 34 10 2 3

15 13 9 10 28 4 6 9

5 3 1 2 56 56 35 45

1 1 1 3 12 21 8 15

16 18 6 17 37 11 3 22

12 10 3 9 23 9 1 14

17 10 15 13 29 3 7 8

34 17 19 23 44 13 16 15

2 0 3 1 23 52 49 25

3 2 0 2 9 31 51 19

15 18 6 8 32 11 3 7

14 9 3 10 23 11 1 5

30 23 21 18 22 13 11 10

25 19 40 30 25 16 12 16

8 5 1 6 54 191 102 95

4 2 2 3 19 76 61 54

10 21 8 11 29 10 4 11

11 9 4 7 21 9 2 13

;

 

proc print data=HW4_F17;

title 'Tabulation of data from HW4_F17 data set';

run;

 

Unfortunately when I run it, the array statements don't actually read the column. Basically, I am supposed to get something to look like the chart at the bottom of the Excel sheet I am attaching. 

 

Thank you!


Accepted Solutions
Solution
‎10-13-2017 06:52 PM
Super User
Posts: 11,558

Re: How do I use Array to organize dataset?

If the purpose of this exercise is to learn how to use arrays, it is a poor example. The restriction to reading the data from datalines is a tad odd, but one approach is to recognize that the data rows come in a fixed order, so all of those _n_ tests are not needed. It helps to understand that INPUT is an executable statement and can be run conditionally. The second thing I would use is that the fixed order of the rows lets DO loops over specific index values read the rows in order.

The example below only uses the array to generate the housingtype and agegroup variables. The drop statement drops all of the "number" variables as read.

I have left turning agegroup into the text you want to you. You could do that by working with the "do agegroup" index values like the others, with an If/then/else comparison, a format, a Select/when, and likely a few other things that would work.

 

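data HW4_F17; 
length Profession $ 9 Housingtype $ 11 tenure $ 5 Response $ 3;
/* the two arrays group the eight values read from each data line */
array c condo1-condo4;
array t town1 - town4;
/* the loop order matches the order of the data lines */
do profession = 'Physician','Professor','Engineer';
   do tenure ='Rent','Own','Lease';
      do response='Yes','No';
         /* each pass through here reads the next data line */
         input condo1-condo4 Town1-Town4;
         /* write one row per age group for each housing type */
         do agegroup=1 to 4;
            Housingtype='Condominium';
            Number = c[agegroup];
            output;
         end;
         do agegroup=1 to 4;
            Housingtype='Townhomes';
            Number = t[agegroup];
            output;
         end;
      end;
   end;
end;
/* the colon drops every variable whose name starts with the prefix */
drop condo: town:;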
datalines;
18 15 6 8 34 10 2 3
15 13 9 10 28 4 6 9
5 3 1 2 56 56 35 45
1 1 1 3 12 21 8 15
16 18 6 17 37 11 3 22
12 10 3 9 23 9 1 14
17 10 15 13 29 3 7 8
34 17 19 23 44 13 16 15
2 0 3 1 23 52 49 25
3 2 0 2 9 31 51 19
15 18 6 8 32 11 3 7
14 9 3 10 23 11 1 5
30 23 21 18 22 13 11 10
25 19 40 30 25 16 12 16
8 5 1 6 54 191 102 95
4 2 2 3 19 76 61 54
10 21 8 11 29 10 4 11
11 9 4 7 21 9 2 13
;
run;

The order of variables in a data set from left to right is normally based on the order in which they occur in the program code, so the Length statement puts Housingtype second even though it is not assigned until after reading the numeric values.
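The point that INPUT is an executable statement can be seen in isolation. The following is only a minimal sketch with made-up data; the names demo_input, group, x and y are hypothetical and not part of the assignment. Each pass through the DO loop reaches the INPUT statement once, so each pass reads the next data line.

```sas
/* Hypothetical sketch: INPUT runs each time the statement is reached */
data demo_input;
   do group = 'A', 'B';   /* one pass per listed value */
      input x y;          /* reads the next data line on every pass */
      output;             /* writes one row per line read */
   end;
datalines;
1 2
3 4
;
run;

proc print data=demo_input;
run;
```

With these two data lines the step should write two rows, one for group A and one for group B. When the step starts over and INPUT finds no more lines, the DATA step simply stops.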

 


All Replies

Occasional Contributor
Posts: 14

Re: How do I use Array to organize dataset?

Thank you so much! I think it was designed to teach Array statements, but everything I read just confused me more. I appreciate it!

Occasional Contributor
Posts: 14

Re: How do I use Array to organize dataset?

I'm sorry, I probably sound really dumb but I am not computer savvy and have only been using SAS for 2 months in an all online class. I am familiar with some of the if/then and format statements but not as much the "do" loops. I presume if I wanted to use the format it would be something like:

 

proc format;
value agegroup 1='>30'
2='31-45'
3='46-60'
4='60+';

 

or an if/then statement would be:

 

if i=1 then agegroup='>30';

if i=2 then agegroup='31-45';

if i=3 then agegroup='46-60';

if i=4 then agegroup='60+';

 

Can you tell me where I am going wrong? 

Again, thank you.

 

Super User
Posts: 20,224

Re: How do I use Array to organize dataset?

One recommendation - comment your code. It makes it really hard to reuse your code if you never comment it.

 


proc format;
value agegroup 1='>30'
2='31-45'
3='46-60'
4='60+';

 

This creates a format that maps a number to a group - it does not use it anywhere, but only creates it and makes it available for use. 

You would use it via the PUT function. 

 

age_group = put(i, agegroup.);
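Putting the two pieces together, here is a minimal sketch of defining the format and then applying it with the PUT function. The data set name demo_fmt and the loop that generates i are assumptions for illustration only, and the labels follow the 'Less than 30' wording used in the first program of this thread.

```sas
/* Define the format once; this step creates no data */
proc format;
   value agegroup 1 = 'Less than 30'
                  2 = '31 to 45'
                  3 = '46 to 60'
                  4 = '60 plus';
run;

/* Hypothetical data step that applies the format */
data demo_fmt;
   do i = 1 to 4;
      age_group = put(i, agegroup.);   /* character variable holding the label */
      output;
   end;
run;

proc print data=demo_fmt;
run;
```

Note that PUT returns a character value, so age_group here is a new character variable separate from the numeric i.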

 

Your If is correct, but I would change it to IF / ELSE IF. This way, once it finds the correct value, it doesn't keep checking. 

 

 

if i=1 then agegroup='>30';
else if i=2 then agegroup='31-45';
else if i=3 then agegroup='46-60';
else if i=4 then agegroup='60+';
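One thing to watch with this pattern: SAS fixes the length of a character variable the first time it appears in the step, so without a LENGTH statement agegroup would get a length of 3 from '>30' and the longer labels would be cut off. A minimal sketch, assuming a hypothetical data set demo_len and a loop that supplies i:

```sas
/* Hypothetical sketch: declare the length before the first assignment */
data demo_len;
   length agegroup $ 5;   /* long enough for '31-45' and '46-60' */
   do i = 1 to 4;
      if i=1 then agegroup='>30';
      else if i=2 then agegroup='31-45';
      else if i=3 then agegroup='46-60';
      else if i=4 then agegroup='60+';
      output;
   end;
run;

proc print data=demo_len;
run;
```

The same applies to any character variable that is assigned labels of different lengths, which is why the accepted solution starts with a LENGTH statement.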
 
Occasional Contributor
Posts: 14

Re: How do I use Array to organize dataset?

This is what I came up with but now it just seems to be showing me another variable 'i'

data HW4_F17; 
length Profession $ 9 Housingtype $ 11 tenure $ 5 Response $ 3;
array c condo1-condo4;
array t town1 - town4;
do profession = 'Physician','Professor','Engineer';
   do tenure ='Rent','Own','Lease';
      do response='Yes','No';
         input condo1-condo4 Town1-Town4;
         do agegroup=1 to 4;
            Housingtype='Condominium';
            Number = c[agegroup];
            output;
         end;
         do agegroup=1 to 4;
            Housingtype='Townhomes';
            Number = t[agegroup];
            output;
         end;
		 do agegroup=1 to 4;
		 	if i=1 then agegroup='>30';
			else if i=2 then agegroup='31-45';
			else if i=3 then agegroup='46-60';
			else if i=4 then agegroup='60+';
		end;
      end;
   end;
end;
drop condo: town:;		
datalines;
18 15 6 8 34 10 2 3
15 13 9 10 28 4 6 9
5 3 1 2 56 56 35 45
1 1 1 3 12 21 8 15
16 18 6 17 37 11 3 22
12 10 3 9 23 9 1 14
17 10 15 13 29 3 7 8
34 17 19 23 44 13 16 15
2 0 3 1 23 52 49 25
3 2 0 2 9 31 51 19
15 18 6 8 32 11 3 7
14 9 3 10 23 11 1 5
30 23 21 18 22 13 11 10
25 19 40 30 25 16 12 16
8 5 1 6 54 191 102 95
4 2 2 3 19 76 61 54
10 21 8 11 29 10 4 11
11 9 4 7 21 9 2 13
;	

proc print data=HW4_F17;
	title 'Tabulation of data from HW4_F17 data set';
run;
Super User
Posts: 20,224

Re: How do I use Array to organize dataset?

You have no INPUT or SET statement so your procedure has no data. 

 

I highly suggest separating your steps. First read in your data. Then process it. Then display it. One baby step at a time. 
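As a rough illustration of the read, process, display split, here is a sketch that uses only the first four values of the first two data lines. The data set names raw and long_form are hypothetical.

```sas
/* Step 1: read the raw values, nothing else */
data raw;
   input condo1-condo4;
datalines;
18 15 6 8
15 13 9 10
;
run;

/* Step 2: process, turning each column into its own row */
data long_form;
   set raw;
   array c[*] condo1-condo4;
   do agegroup = 1 to dim(c);
      number = c[agegroup];
      output;
   end;
   drop condo1-condo4;
run;

/* Step 3: display */
proc print data=long_form;
run;
```

Because each step writes its own data set, you can print raw on its own and confirm the read worked before moving on to the reshaping.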

 

I also think you're not understanding how SAS processes data. It goes through it one line at a time. Arrays in SAS are not the same as arrays in other languages. 
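To make that concrete: a SAS array is a temporary grouping of variables within the current observation, not a separate data structure. Below is a minimal sketch using the first three values of two data lines from the EXAMPLE33 program; the names demo_array and bp_vals are hypothetical.

```sas
/* Hypothetical sketch: the array is another way to name bp1-bp3 */
data demo_array;
   input bp1-bp3;
   array bp_vals[*] bp1-bp3;   /* bp_vals[1] refers to bp1, and so on */
   do i = 1 to dim(bp_vals);
      bp = bp_vals[i];         /* copy one element into a single column */
      output;                  /* one output row per array element */
   end;
   keep i bp;
datalines;
170 175 165
186 194 201
;
run;

proc print data=demo_array;
run;
```

The step runs once per data line, and the loop runs three times within each run, so the two lines here should produce six rows.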

 

Your original question has also been answered; I would suggest starting new questions for new topics. 


But, back to my original suggestion. When programming, break it down into the steps that need to be done and then figure out how to do each step in order.

 

 

Super User
Posts: 20,224

Re: How do I use Array to organize dataset?

Out of curiosity, where did you find this example?

 

In general, I'm not a fan of hard coding data like this and prefer to bring it in other ways when possible. This is very dependent on your current data and can never be used anywhere else...

Super User
Posts: 11,558

Re: How do I use Array to organize dataset?

I absolutely agree with @Reeza that hard coding steps to read data is suboptimal in most uses.

But with the apparent requirement to remove some of the data and use inline data (the datalines block) the artificial restrictions make it the quick-and-dirtiest solution to a very small problem.

 

I will say that with _n_ used in as many places as you had it, there were many chances of pointing at the wrong lines.
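One such spot in the first program of this thread: the second IF on _n_ starts a new test rather than continuing the first, so its ELSE also runs for rows 1 to 6 and overwrites 'Physician' with 'Professor'. Chaining the tests with ELSE IF avoids that. A scaled-down sketch with six made-up rows and a hypothetical data set name demo_n:

```sas
/* Hypothetical sketch: one IF / ELSE IF chain, so only one branch runs per row */
data demo_n;
   length Profession $ 9;
   input number1;
   if _n_ < 3 then Profession = 'Physician';        /* rows 1-2 */
   else if _n_ > 4 then Profession = 'Engineer';    /* rows 5-6 */
   else Profession = 'Professor';                   /* rows 3-4 */
datalines;
1
2
3
4
5
6
;
run;

proc print data=demo_n;
run;
```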

Occasional Contributor
Posts: 14

Re: How do I use Array to organize dataset?

This was an assignment for my introduction to research class that I very obviously got wrong.

☑ This topic is solved.
