Hi friends, I have a question to ask. When one runs an array job, the array's dimension-size value must be consecutive, like from 5-9 (5,6,7,8,9). I have a data set for schools with subgroups. which is not consecutive. Its value is 1,4,5,6,7,8,9,13,14,25, in long format. In order for the array to run,I have to create another variable, called group, to re-assign value to be consecutive, like, 1,2,3,4,5,6,7,8,9,10, (when subgroup=1, group=1, subgroup=4, group=2, subgroup=5, group=3, etc. Otherwise my array won't run properly. This program is to turn data file from long to wide, to control the output, to avoid using a transpose, which is done so in the below codes. Problem: After my data set was processed, I have to rename my output columns back to subgroup values, which is additional work and tedious. My question: is there some way that one can use original data' s value (nonconsecutive) to run an array and get the job done? *Note I have a data step to create a new var (group) before I can run my data step with an array; *Also the file is sorted by sch_id and group; Here is my program that is tested and working: DATA want_wide ; *create a new data file name to be in wide format; SET have_long; *read in the existing data file in long format; By sch_id ; *read in data by the order of sch_id-original data each school has 10 rows (or less) of data; KEEP sch_id EM_ELA_statusCY1 - EM_ELA_statusCY10; *keep output data's these column only; RETAIN sch_id EM_ELA_statusCY1 - EM_ELA_statusCY10 ; *keep output data's column in this order; ARRAY emstatus (1:10) $ EM_ELA_statusCY1 - EM_ELA_statusCY10 ; *process data by group value, from 1 to 10; *based on sh_id, do the data step by the order of group id value; IF first.sch_id THEN DO; DO i = 1 to 10 ; emstatus ( i ) = "" ; *output data is character type; *array's order is based on i's value; *above defaulted array index output to be missing (" "), when it has data,then it will be replaced; END;*finish array execution; END;* finish the data round beginning with sch_id do loop; emstatus ( group ) = EM_ELA_statusCY ; *this is the column to be transposed; * this will assign output new column name (EM subject=ELA) combined with group value for each new column; IF last.sch_id THEN OUTPUT ;*when reached the last sch_id, output data; RUN; *It worked, got the output as desired-long data set become wide, one school per row with 10 new columns; * After this I have to rename the 10 columns back to the subgroup value, say EM_ELA_statusCY8=EM_ELA_statusCY13, EM_ELA_statusCY9= EM_ELA_statusCY14 EM_ELA_statusCY10=EM_ELA_statusCY25,etc.; *Since I have 40 such columns, renaming them can be time consuming, the example only show 10 of them, there are three other sets of data, each set will produce 10 columns; Any idea to go around so I do not need to create a new group var, and after the job rename my output columns back to subgroups' value? Thank you. Curious Peter
... View more