How can I combine the two data steps below into one? When I run this program, I get these errors:
ERROR: Undeclared array referenced: minimum_distance.
ERROR: Variable minimum_distance has not been declared as an array.
data summary1 (drop=i);
set summary;
If distance1 > 0 or distance2 > 0 or distance3 > 0 or distance4 > 0 or distance5 > 0 or distance6 > 0 or distance7 > 0 or distance8 > 0 or
distance9 > 0 or distance10 > 0;
array minimum_distance{3};
do i=1 to 3;
minimum_distance{i}=smallest(i,distance1,distance2,distance3,distance4,distance5,distance6,distance7,distance8,distance9,distance10);
end;
run;
data summary2 (drop=i);
set summary1;
array nearest_distance{3};
do i=1 to 3;
nearest_distance{i}=whichn(minimum_distance{i},distance1-distance10);
new_nearest_distance{i}=put(nearest_distance{i} Nearest_branch.);
end;
run;
Also, can we use a format statement within a do loop?
a) A format statement is valid for the whole data step. Like length or label statements, it is not really "executed": its attributes are set before the data step logic starts.
b) An array is a temporary construct internal to the data step. The array definition itself is not written to a data set, so the second data step does not know it. Only the variables behind the array (minimum_distance1-minimum_distance3) are kept. To use their values as an array again, you need to declare the array over those variable names:
array minimum_distance{3} minimum_distance1-minimum_distance3;
Repeat that array statement in the second data step to reassign the array.
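Since both loops run over the same index, the two steps can probably be merged into one, which avoids redeclaring the array at all. The following is only a sketch, not tested against your data. It assumes the Nearest_branch. format has already been created with proc format, and the character length of 20 for new_nearest_distance is my assumption. Note that whichn needs the "of" keyword before a variable list (distance1-distance10 without "of" is a subtraction) and that put needs a comma before the format.

```sas
data summary2 (drop=i);
  set summary;
  array distance{10} distance1-distance10;
  array minimum_distance{3};
  array nearest_distance{3};
  /* character array for the branch names; length 20 is an assumption */
  array new_nearest_distance{3} $ 20;

  /* keep only rows where at least one distance is positive */
  if max(of distance{*}) > 0;

  do i = 1 to 3;
    /* i-th smallest of the ten distances */
    minimum_distance{i} = smallest(i, of distance{*});
    /* position (1-10) of the first distance equal to that value */
    nearest_distance{i} = whichn(minimum_distance{i}, of distance{*});
    /* branch name looked up through the user-defined format */
    new_nearest_distance{i} = put(nearest_distance{i}, Nearest_branch.);
  end;
run;
```

One caveat: whichn returns the first matching position, so if two branches are at exactly the same distance, both entries will point to the same branch.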
a. I got the following error when I used a format statement in the do loop.
18 array nearest_distance{3};
19 array minimum_distance{3};
20 *array new_nearest_distance{3};
21 do i=1 to 3;
22 nearest_distance{i}=whichn(minimum_distance{i},distance1,distance2,distance3,distance4,distance5,distance6,distance7,distance8,distance9,distance10);
23 format nearest_distance{i} Nearest_branch.;
_
85
76
ERROR 85-322: Expecting a format name.
ERROR 76-322: Syntax error, statement will be ignored.
My user defined format is,
proc format;
value Nearest_branch
1='Carrollton'
2='Cedar_Hill'
3='Garland'
4='Houston_Jones'
5='Houston_Oak'
6='Irving'
7='Mesquite'
8='One_Arts_Plaza'
9='R1_Tower'
10='Southside';
run;
b. Please also suggest how to do the following with arrays; I'm not sure how to accomplish it with the 'geodist' function.
distance1=geodist(N_LAT1,N_LONG1,LAT1,LONG1,'m');
distance2=geodist(N_LAT1,N_LONG1,LAT2,LONG2,'m');
distance3=geodist(N_LAT1,N_LONG1,LAT3,LONG3,'m');
distance4=geodist(N_LAT1,N_LONG1,LAT4,LONG4,'m');
distance5=geodist(N_LAT1,N_LONG1,LAT5,LONG5,'m');
distance6=geodist(N_LAT1,N_LONG1,LAT6,LONG6,'m');
distance7=geodist(N_LAT1,N_LONG1,LAT7,LONG7,'m');
distance8=geodist(N_LAT1,N_LONG1,LAT8,LONG8,'m');
distance9=geodist(N_LAT1,N_LONG1,LAT9,LONG9,'m');
distance10=geodist(N_LAT1,N_LONG1,LAT10,LONG10,'m');
As I said, a format is assigned while the data step is being compiled BEFORE execution. You cannot assign a format dynamically, so format xxxx{i} someformat.; cannot work. Keep in mind that i has no value when the data step is compiled.
You can only assign formats to the elements of the array, like
format minimum_distance1-minimum_distance3 someformat.;
In the format statement you can reference the variable names directly. By default, the variables behind an array are just the array name followed by a number, so array temp{4} refers to temp1 temp2 temp3 temp4. Put the format statement outside the loop:
format temp1-temp4 best.;
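Applied to the step from the question, that could look roughly like the sketch below. It assumes summary1 still contains minimum_distance1-minimum_distance3 and distance1-distance10, and that the Nearest_branch. format exists. The numeric position stays in nearest_distance1-nearest_distance3; the format only changes how it is displayed.

```sas
data summary2 (drop=i);
  set summary1;
  array minimum_distance{3};
  array nearest_distance{3};

  /* compile-time attribute: list the variables, not array elements */
  format nearest_distance1-nearest_distance3 Nearest_branch.;

  do i = 1 to 3;
    /* "of" makes distance1-distance10 a variable list, not a subtraction */
    nearest_distance{i} = whichn(minimum_distance{i}, of distance1-distance10);
  end;
run;
```

With this approach no separate character variable is needed unless you want the branch name stored as text.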
For your second point replace the number with the array and loop:
data want;
set have;
array distance{10};
array lat{10};
array long{10};
do I=1 to 10;
distance{I}=geodist(N_LAT1,N_LONG1,LAT{I},LONG{I},'m');
end;
run;