
FreelanceReinh · 03-22-2016

Hello @LinderVogel,

What strikes me first is that dataset M.CHAMPS contains two sets of 28 variables each. These would be easier to handle if the dataset had a vertical structure ("long" rather than "wide" format). But let's assume that you can't change M.CHAMPS. So, this dataset contains at least 57 variables (the above 2*28 plus ACTUALWEIGHTKG).

With your code, M.CHAMPS2 will have (at least) 202 variables, many of which are either exact copies or only slight modifications of other variables. The most striking case is FREQWK1-FREQWK28, which are copies of the elements of array CMP_ALL_FREQ. This is what I would call inefficient.

Do you really need all these new variables, or are you only aiming at the aggregated variables used in the PROC MEANS step at the end? If you need only the latter, you could perform the calculations without creating most of the new arrays (all in one DO loop, as ballardw pointed out). The data step could then shrink from 40 to fewer than 25 lines of code. Arrays METAB_WEIGHT and CAL_EXPEN_WK would also be candidates for _temporary_ arrays in this case. (Alternatively, the constants in METAB_WEIGHT could be stored in a separate dataset.) Dataset M.CHAMPS2 would then have only 61 variables (plus variables not shown from M.CHAMPS).
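To make the single-loop idea concrete, here is a minimal sketch rather than a rewrite of your actual step. Only M.CHAMPS, ACTUALWEIGHTKG and the array names CMP_ALL_FREQ, METAB_WEIGHT and CAL_EXPEN_WK come from this thread. Everything else is an assumption made for illustration: the input variable names ACT_FREQ1-ACT_FREQ28 and ACT_DUR1-ACT_DUR28, the array name CMP_ALL_DUR, the placeholder constants (all set to 1), the formula inside the loop, the aggregate TOTAL_CAL_EXPEN_WK and the output dataset WORK.CHAMPS2_SKETCH.

```sas
/* Sketch only: input names, constants and the formula are placeholders. */
data work.champs2_sketch;              /* hypothetical output dataset     */
  set m.champs;

  /* The two sets of 28 input variables (element names are hypothetical). */
  array cmp_all_freq{28} act_freq1-act_freq28;
  array cmp_all_dur{28}  act_dur1-act_dur28;

  /* _TEMPORARY_ arrays create no variables in the output dataset.        */
  array metab_weight{28} _temporary_ (28*1);  /* placeholder constants    */
  array cal_expen_wk{28} _temporary_;         /* per-activity work values */

  total_cal_expen_wk = 0;                     /* hypothetical aggregate   */

  do i = 1 to 28;
    /* Placeholder formula: substitute the calculation from your step.    */
    cal_expen_wk{i} = metab_weight{i} * cmp_all_dur{i}
                      * cmp_all_freq{i} * actualweightkg;
    /* SUM() ignores missing arguments, so one missing activity does not  */
    /* make the whole total missing.                                      */
    total_cal_expen_wk = sum(total_cal_expen_wk, cal_expen_wk{i});
  end;

  drop i;
run;
```

Because both helper arrays are _temporary_, this sketch adds just one variable (TOTAL_CAL_EXPEN_WK) to those read from M.CHAMPS. Each aggregate you need for the PROC MEANS step would be accumulated inside the same loop in the same way.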


Code efficiency
