BookmarkSubscribeRSS Feed
LinderVogel
Calcite | Level 5

Hello everyone, I would like you to take a look at this code and tell me if you consider it to be an effiecent way to write out the variables and fomulas. Thanks.

 



DATA M.CHAMPS2;
SET M.CHAMPS;

ARRAY CMP_ALL {*} CHAMPS7HOURS CHAMPS9HOURS CHAMPS10HOURS CHAMPS14HOURSNEW CHAMPS15HOURSNEW CHAMPS16HOURS CHAMPS19HOURS CHAMPS20HOURS CHAMPS21HOURS CHAMPS22HOURS CHAMPS23HOURS CHAMPS24HOURS CHAMPS25HOURS CHAMPS26HOURS CHAMPS27HOURS CHAMPS28HOURS CHAMPS29HOURS CHAMPS30HOURS CHAMPS31HOURS CHAMPS32HOURS CHAMPS33HOURSNEW CHAMPS34HOURS CHAMPS35HOURS CHAMPS36HOURS CHAMPS37HOURS CHAMPS38HOURS CHAMPS39HOURS CHAMPS40HOURS;

ARRAY CMP_ALL_FREQ {*} CHAMPS7TIMES CHAMPS9TIMES CHAMPS10TIMES CHAMPS14TIMESNEW CHAMPS15TIMESNEW CHAMPS16TIMES CHAMPS19TIMES CHAMPS20TIMES CHAMPS21TIMES CHAMPS22TIMES CHAMPS23TIMES CHAMPS24TIMES CHAMPS25TIMES CHAMPS26TIMES CHAMPS27TIMES CHAMPS28TIMES CHAMPS29TIMES CHAMPS30TIMES CHAMPS31TIMES CHAMPS32TIMES CHAMPS33TIMESNEW CHAMPS34TIMES CHAMPS35TIMES CHAMPS36TIMES CHAMPS37TIMES CHAMPS38TIMES CHAMPS39TIMES CHAMPS40TIMES;

DO I=1 TO 28;
SELECT(CMP_ALL(I));
WHEN (.) CMP_ALL(I)=0;
WHEN (1) CMP_ALL(I)=.5;
WHEN (2) CMP_ALL(I)=1.75;
WHEN (3) CMP_ALL(I)=3.75;
WHEN (4) CMP_ALL(I)=5.75;
WHEN (5) CMP_ALL(I)=7.75;
WHEN (6) CMP_ALL(I)=9.75;
OTHERWISE CMP_ALL(I)=9.75;
END;
END;

ARRAY METAB_WEIGHT{28} (4.5 3.0 2.0 6.0 4.0 4.5 3.0 2.5 4.0 2.25 3.0 7.0 6.0 3.5 2.5 2.5 4.0 5.0 3.0 5.0 3.0 2.0 2.0 3.5 4.5 3.0 2.5 5.0);
ARRAY CAL_EXPEN_WK {28};
DO I=1 TO 28;
CAL_EXPEN_WK(I)=CMP_ALL(I)*METAB_WEIGHT(I)*3.5*60*ACTUALWEIGHTKG/200;
END;
CAL_EXPEN_WKTOTAL= SUM(OF CAL_EXPEN_WK1-CAL_EXPEN_WK28);


ARRAY FREQWK {28};
DO I=1 TO 28;
FREQWK(I)=CMP_ALL_FREQ(I);
END;
FREQWKTOTAL=SUM(OF FREQWK1-FREQWK28);

ARRAY CAL_MOD {28};
DO I=1 TO 28;
IF METAB_WEIGHT[I]>=3 THEN CAL_MOD(I)=CAL_EXPEN_WK(I);
ELSE CAL_MOD(I)=0;
END;
CAL_MODTOT= SUM(OF CAL_MOD1-CAL_MOD28);

ARRAY CAL_MODFREQ {28};
DO I=1 TO 28;
IF METAB_WEIGHT[I]>=3 THEN CAL_MODFREQ(I)=FREQWK(I);
ELSE CAL_MODFREQ(I)=0;
END;
CAL_MODFREQTOT= SUM(OF CAL_MODFREQ1-CAL_MODFREQ28);
RUN;

PROC MEANS DATA=M.CHAMPS2 N MEAN STD SKEW;
VAR CAL_EXPEN_WKTOTAL FREQWKTOTAL CAL_MODTOT CAL_MODFREQTOT;
RUN;

4 REPLIES 4
ballardw
Super User

"Effencieny" could mean a number of things: run time, coding time and maintenance, disk space, other resources perhaps.

 

The only thing I see that would be obviously inefficient is that you have multiple loop of the Do i = 1 to 28; variety.

If I were worried about run time I might have them all in one loop and then calculate all the summaries after that loop but there isn't likely to be a noticeable difference in run time. Removing the 8 lines of code doesn't make the code noticeably smaller. One loop would be slightly easier to maintain if you add or remove items from the arrays. I might switch to use: Do i = 1 to dim(<any of the arrays of the same size>); instead of the literal 28.

Astounding
PROC Star

Here's a line that can be eliminated:

 

WHEN (6) CMP_ALL(I)=9.75;

 

The OTHERWISE condition would take care of it.  That being said, that's a tiny change.  You might not even be able to measure the difference. 

 

Here is a more constructive idea.  You can can cut down on the number of loops, and increase program readability at the same time.  Put all your array statements up front.  Then consider this variation:

 

DO I=1 TO 28;
   FREQWK(I)=CMP_ALL_FREQ(I);

   IF METAB_WEIGHT[I]>=3 THEN DO;

       CAL_MOD(I)=CAL_EXPEN_WK(I);

       CAL_MODFREQ(I)=FREQWK(I);

   END;

   ELSE DO;

      CAL_MOD(I)=0;

      CAL_MODFREQ(I)=0;

   END;
END;
FREQWKTOTAL=SUM(OF FREQWK1-FREQWK28);
CAL_MODTOT= SUM(OF CAL_MOD1-CAL_MOD28);
CAL_MODFREQTOT= SUM(OF CAL_MODFREQ1-CAL_MODFREQ28);

 

 EDITED:  In retrospect, this is essentially the same as ballardw's suggestion ... just spelling out the coding changes.

FreelanceReinh
Jade | Level 19

Hello @LinderVogel,

 

What strikes me first is that in dataset M.CHAMPS there are two sets of 28 variables each, which could be more easily dealt with if the dataset had a vertical structure ("long" rather than "wide" format).

 

But let's assume that you can't change M.CHAMPS. So, this dataset contains at least 57 variables (the above 2*28 plus ACTUALWEIGHTKG). With your code M.CHAMPS2 will have (at least) 202 variables, many of which are either exact copies or only slight modifications of other variables. The case of FREQWK1-FREQWK28, which are copies of the elements of array CMP_ALL_FREQ, is most striking. This is what I would call inefficient.

 

Do you really need all these new variables or are you only aiming at the aggregated variables used in the PROC MEANS step at the end?

 

If you needed only the latter, you could perform the calculations without creating most of the new arrays (all in one DO loop, as ballardw pointed out). Thus, the data step could shrink from 40 to less than 25 lines of code. Also, arrays METAB_WEIGHT and CAL_EXPEN_WK could be candidates for _temporary_ arrays in this case. (Alternatively, the constants in METAB_WEIGHT could be stored in a separate dataset.) Dataset M.CHAMPS2 would then have only 61 variables (plus variables not shown from M.CHAMPS).

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to connect to databases in SAS Viya

Need to connect to databases in SAS Viya? SAS’ David Ghan shows you two methods – via SAS/ACCESS LIBNAME and SAS Data Connector SASLIBS – in this video.

Find more tutorials on the SAS Users YouTube channel.

Discussion stats
  • 4 replies
  • 1026 views
  • 0 likes
  • 5 in conversation