BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
PierreYvesILY
Pyrite | Level 9

hi,

 

I use a long programm to download, process, and export data, on a monthly basis. I want to use this programm several times to cover several months, one after the other, and export a combined file gathering all monthly data.

 

For instance, I now have a result : month n / structure / data 1..x

Target is:

month n / structure / data 1..x

month n+1 / structure / data 1..x

...

month n+i / structure / data 1..x,   i being the number of selected months.

 

How can I achieve this?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

@PierreYvesILY wrote:

hello,

 

thanks for your answer.

 

1. The names of the variables with &Jahr11. and so forth are created exactly for the purpose of copying the results in separate columns in Excel afterwards.

2. I have to calculate evolutions in % between both years, so I build yearly variables to do so :  (Var&Jahr11.- Var &Jahr12.)=Delta &Jahr11. in %

If you can suggest a way to calculate the Delta with only one variable + an additional variable Jahr, I'd be pleased to use it.

It would simplify the code a LOT.

 

I don't understand what you mean by: proc summary and a data step with array code 

 

thanks and have a nice day

PY


Would have to see some example INPUT data to see what is possible.

 

Proc summary (or means) will group data and do sums (or other statistics) into the same variable with less fuss than SQL.

proc summary data=  Liste_teil_5 nway;
   class STUFE MARKTREGION_BT NL_BT NLBEZ_BT MBRBEZ_BT;
   var  ORB_Anzahl_Akt_&Jahr11.  ORB_TVOL_Akt_&Jahr11.
        ORB_Anzahl_Akt_&Jahr11.    /* continue with all the 
                                   variables that need sums*/
   ;
   output out =temp  sum= ;
run;

If it happens that you need every single one of your variables that start with ORB to be summed the above could reduce the VAR statement to something like:

 

   var ORB: ;

as proc summary/ means can use variable lists.

proc summary data=  Liste_teil_5 nway;
   class STUFE MARKTREGION_BT NL_BT NLBEZ_BT MBRBEZ_BT;
   var  ORB: ngs: ;  
   output out =temp  sum= ;
run;

The above would sum ALL numeric variables whose names start with ORB and NGS and place the sum in a variable of the same name in the output set.

 

Then a data step similar to:

data want;
   set temp;
   /* calculate the rates*/
   array n <list the numerator variables>;
   array d <list the denominator variables in corresponding order>;
   array r <list the results variables of the rate calculations>;
   /* this do loop */
   do i= 1 to dim(n);
      r[i] = n[i] / d[i];
   end;
   /* would replace all of these */
   ORB_Anzahl_VZK_&Jahr11. =  ORB_Anzahl_Akt_&Jahr11./Anzahl_VZK_&Jahr11.;

   format orb: NGS: commax17.;
run;

  I wasn't about to copy out all of your variable names.

The ability to use variable lists in data step code can make things much simpler. If you need to process columns that appear in order you can use a list with two dashes in an array definition (or the var statement in Proc Summary / means).

Example: if the variables ABC ABD ABQ BDD BDE are adjacent and I want to process them in some way in that order I could use an array definition of : array v   ABC -- BDE;

Also not the variable list in the Format statement much simpler. Or if I had a lot of variables to assign a format I might be tempted to use Proc Datasets after the set is created to assign the format instead of having so many format commax17. scattered through the code.

View solution in original post

18 REPLIES 18
PaigeMiller
Diamond | Level 26

Why would month N have to be processed before month N+1? Can't you just process all the months together and then export one file?

--
Paige Miller
PierreYvesILY
Pyrite | Level 9

No, it would be a cumulative File as a result, which I already process also.

The program selects data for a certain time period, and then for each variable the data are summed, or other calculations take place.

 

The programm has 3 main blocs where the data downloaded form diffrent sources are processed this way. So, I have for each block to select the data of the 1st month, then process, then store the results. Afterwards, I want to do the same for the next month.

After all this, I need to merge all tables obtained and process the export.

 

My goal is to edit i monthly data stands for each structure of the company.

i.e: if i=6 months => 6 lines, and each line having the result of each variable for each month.

 

I'm not familiar with SAS or programming, I first started to learn SAS programming in August last year.

 

Thanks for your help

PaigeMiller
Diamond | Level 26

Nothing you have said, if I am understanding you properly, makes it necessary to perform the step separaetly for each month. SAS has many ways to do one big analysis, BY MONTH, so you don't have the run the code many times; you run it once and let SAS break it up.

--
Paige Miller
PierreYvesILY
Pyrite | Level 9

I've no idea how to do this with the data I have (month has to be added to the data), and break it correctly down afterwards.

 

For instance, I have this step, that has to be run for each month and for each level of the company structure (the levels are quoted in the group option below):

 

/* Aggregation und Quoten-Neu Berechnung für FIL */

proc sql;

create table FIL2 as

select STUFE ,

 

MARKTREGION_BT,

NL_BT,

NLBEZ_BT,

MBRBEZ_BT,

/* '' as FILHB_BT length=5, */

FILBEZ_BT length=35,

sum(ORB_Anzahl_Akt_&Jahr11.) as ORB_Anzahl_Akt_&Jahr11. format commax17.,

sum(ORB_TVOL_Akt_&Jahr11.) as ORB_TVOL_Akt_&Jahr11. format commax17.,

sum(ORB_Anzahl_Akt_&Jahr11.) / sum(Anzahl_VZK_&Jahr11.) as ORB_Anzahl_VZK_&Jahr11. format commax17.1,

sum(ORB_Anzahl_Gen_&Jahr11.) as ORB_Anzahl_Gen_&Jahr11. format commax17.,

sum(ORB_TVOL_Gen_&Jahr11.) as ORB_TVOL_Gen_&Jahr11. format commax17.,

sum(ORB_Anzahl_Gen_&Jahr11.) / sum(ORB_Anzahl_Akt_&Jahr11.) as ORB_GEN_Quote_&Jahr11. format commax17.3,

(case

when (sum(ORB_kein_Fino_&Jahr11.) = 0 and sum(ORB_Anzahl_Akt_&Jahr11.) not in (0,.)) then 1

when (sum(ORB_kein_Fino_&Jahr11.) = . and sum(ORB_Anzahl_Akt_&Jahr11.) not in (0,.)) then 1

else (1 - (sum(ORB_kein_Fino_&Jahr11.) / sum(ORB_Anzahl_Akt_&Jahr11.)))

end)

as ORB_FINO_QUOTE_&Jahr11. format commax17.3,

sum(NGS_Anzahl_&Jahr11.) as NGS_Anzahl_&Jahr11. format commax17.,

sum(NGS_NGV_&Jahr11.) as NGS_NGV_&Jahr11. format commax17.,

sum(NGS_NGV_&Jahr11.) / sum(NGS_Anzahl_&Jahr11.) as NGS_TICKET_AVG_&Jahr11. format commax17.1,

sum(NGS_RSV_Anz_&Jahr11.) / sum(NGS_Anzahl_&Jahr11.) as RSV_QUOTE_&Jahr11. format commax17.3,

sum(NGS_Anzahl_&Jahr11.) / sum(ORB_Anzahl_Gen_&Jahr11.) as NGS_Auszahlungsquote_&Jahr11. format commax17.3,

sum(Anzahl_VZK_&Jahr11.) as Anzahl_VZK_&Jahr11. format commax17.2,

sum(ORB_kein_Fino_&Jahr11.) as ORB_kein_Fino_&Jahr11. format commax17.,

sum(NGS_RSV_Anz_&Jahr11.) as NGS_RSV_Anz_&Jahr11. format commax17.,

sum(ORB_Anzahl_Akt_Online_&Jahr11.) as ORB_Anzahl_Akt_Online_&Jahr11. format commax17.,

sum(ORB_TVOL_Akt_Online_&Jahr11.) as ORB_TVOL_Akt_Online_&Jahr11. format commax17.,

sum(ORB_Anzahl_Gen_Online_&Jahr11.) as ORB_Anzahl_Gen_Online_&Jahr11. format commax17.,

sum(ORB_TVOL_Gen_Online_&Jahr11.) as ORB_TVOL_Gen_Online_&Jahr11. format commax17.,

sum(NGS_Anzahl_Online_&Jahr11.) as NGS_Anzahl_Online_&Jahr11. format commax17.,

sum(NGS_NGV_Online_&Jahr11.) as NGS_NGV_Online_&Jahr11. format commax17.,

sum(NGS_Anzahl_Online_&Jahr11.) / sum(ORB_Anzahl_Gen_Online_&Jahr11.) as NGS_Auszahlungsquote_Online_&Jahr11. format commax17.3,

sum(NGS_Anzahl_&Jahr11.) / sum(Anzahl_VZK_&Jahr11.) as NGS_Anzahl_VZK_&Jahr11. format commax17.1,

sum(ORB_Anzahl_Gruen_&Jahr11.) / sum(ORB_Anzahl_Gen_&Jahr11.) as ORB_GRUEN_Quote_&Jahr11. format commax17.3,

sum(ORB_Anzahl_Akt_&Jahr12.) as ORB_Anzahl_Akt_&Jahr12. format commax17.,

sum(ORB_TVOL_Akt_&Jahr12.) as ORB_TVOL_Akt_&Jahr12. format commax17.,

sum(ORB_Anzahl_Akt_&Jahr12.) / sum(Anzahl_VZK_&Jahr12.) as ORB_Anzahl_VZK_&Jahr12. format commax17.1,

sum(ORB_Anzahl_Gen_&Jahr12.) as ORB_Anzahl_Gen_&Jahr12. format commax17.,

sum(ORB_TVOL_Gen_&Jahr12.) as ORB_TVOL_Gen_&Jahr12. format commax17.,

sum(ORB_Anzahl_Gen_&Jahr12.) / sum(ORB_Anzahl_Akt_&Jahr12.) as ORB_GEN_Quote_&Jahr12. format commax17.3,

(case

when (sum(ORB_kein_Fino_&Jahr12.) = 0 and sum(ORB_Anzahl_Akt_&Jahr12.) not in (0,.)) then 1

when (sum(ORB_kein_Fino_&Jahr12.) = . and sum(ORB_Anzahl_Akt_&Jahr12.) not in (0,.)) then 1

else (1 - (sum(ORB_kein_Fino_&Jahr12.) / sum(ORB_Anzahl_Akt_&Jahr12.)))

end)

as ORB_FINO_QUOTE_&Jahr12. format commax17.3,

sum(NGS_Anzahl_&Jahr12.) as NGS_Anzahl_&Jahr12. format commax17.,

sum(NGS_NGV_&Jahr12.) as NGS_NGV_&Jahr12. format commax17.,

sum(NGS_NGV_&Jahr12.) / sum(NGS_Anzahl_&Jahr12.) as NGS_TICKET_AVG_&Jahr12. format commax17.1,

sum(NGS_RSV_Anz_&Jahr12.) / sum(NGS_Anzahl_&Jahr12.) as RSV_QUOTE_&Jahr12. format commax17.3,

sum(NGS_Anzahl_&Jahr12.) / sum(ORB_Anzahl_Gen_&Jahr12.) as NGS_Auszahlungsquote_&Jahr12. format commax17.3,

sum(Anzahl_VZK_&Jahr12.) as Anzahl_VZK_&Jahr12. format commax17.2,

sum(ORB_kein_Fino_&Jahr12.) as ORB_kein_Fino_&Jahr12. format commax17.,

sum(NGS_RSV_Anz_&Jahr12.) as NGS_RSV_Anz_&Jahr12. format commax17.,

sum(ORB_Anzahl_Akt_Online_&Jahr12.) as ORB_Anzahl_Akt_Online_&Jahr12. format commax17.,

sum(ORB_TVOL_Akt_Online_&Jahr12.) as ORB_TVOL_Akt_Online_&Jahr12. format commax17.,

sum(ORB_Anzahl_Gen_Online_&Jahr12.) as ORB_Anzahl_Gen_Online_&Jahr12. format commax17.,

sum(ORB_TVOL_Gen_Online_&Jahr12.) as ORB_TVOL_Gen_Online_&Jahr12. format commax17.,

sum(NGS_Anzahl_Online_&Jahr12.) as NGS_Anzahl_Online_&Jahr12. format commax17.,

sum(NGS_NGV_Online_&Jahr12.) as NGS_NGV_Online_&Jahr12. format commax17.,

sum(NGS_Anzahl_Online_&Jahr12.) / sum(ORB_Anzahl_Gen_Online_&Jahr12.) as NGS_Auszahlungsquote_Online_&Jahr12. format commax17.3,

sum(NGS_Anzahl_&Jahr12.) / sum(Anzahl_VZK_&Jahr12.) as NGS_Anzahl_VZK_&Jahr12. format commax17.1,

sum(ORB_Anzahl_Gruen_&Jahr12.) / sum(ORB_Anzahl_Gen_&Jahr12.) as ORB_GRUEN_Quote_&Jahr12. format commax17.3

from Liste_teil_5

group by

 

STUFE,

MARKTREGION_BT,

NL_BT,

NLBEZ_BT,

MBRBEZ_BT,

FILBEZ_BT;

quit;

 

in the next step, I take the next level:

/* Aggregation und Quoten-Neu Berechnung für MBR */

proc sql;

create table MBR as

select 'MBR' as STUFE ,

 

MARKTREGION_BT,

NL_BT,

NLBEZ_BT,

MBRBEZ_BT,

/* '' as FILHB_BT length=5, */

'' as FILBEZ_BT length=35,

sum(ORB_Anzahl_Akt_&Jahr11.) as ORB_Anzahl_Akt_&Jahr11. format commax17.,

sum(ORB_TVOL_Akt_&Jahr11.) as ORB_TVOL_Akt_&Jahr11. format commax17.,

sum(ORB_Anzahl_Akt_&Jahr11.) / sum(Anzahl_VZK_&Jahr11.) as ORB_Anzahl_VZK_&Jahr11. format commax17.1,

sum(ORB_Anzahl_Gen_&Jahr11.) as ORB_Anzahl_Gen_&Jahr11. format commax17.,

sum(ORB_TVOL_Gen_&Jahr11.) as ORB_TVOL_Gen_&Jahr11. format commax17.,

sum(ORB_Anzahl_Gen_&Jahr11.) / sum(ORB_Anzahl_Akt_&Jahr11.) as ORB_GEN_Quote_&Jahr11. format commax17.3,

(case

when (sum(ORB_kein_Fino_&Jahr11.) = 0 and sum(ORB_Anzahl_Akt_&Jahr11.) not in (0,.)) then 1

when (sum(ORB_kein_Fino_&Jahr11.) = . and sum(ORB_Anzahl_Akt_&Jahr11.) not in (0,.)) then 1

else (1 - (sum(ORB_kein_Fino_&Jahr11.) / sum(ORB_Anzahl_Akt_&Jahr11.)))

end)

as ORB_FINO_QUOTE_&Jahr11. format commax17.3,

sum(NGS_Anzahl_&Jahr11.) as NGS_Anzahl_&Jahr11. format commax17.,

sum(NGS_NGV_&Jahr11.) as NGS_NGV_&Jahr11. format commax17.,

sum(NGS_NGV_&Jahr11.) / sum(NGS_Anzahl_&Jahr11.) as NGS_TICKET_AVG_&Jahr11. format commax17.1,

sum(NGS_RSV_Anz_&Jahr11.) / sum(NGS_Anzahl_&Jahr11.) as RSV_QUOTE_&Jahr11. format commax17.3,

sum(NGS_Anzahl_&Jahr11.) / sum(ORB_Anzahl_Gen_&Jahr11.) as NGS_Auszahlungsquote_&Jahr11. format commax17.3,

sum(Anzahl_VZK_&Jahr11.) as Anzahl_VZK_&Jahr11. format commax17.2,

sum(ORB_kein_Fino_&Jahr11.) as ORB_kein_Fino_&Jahr11. format commax17.,

sum(NGS_RSV_Anz_&Jahr11.) as NGS_RSV_Anz_&Jahr11. format commax17.,

sum(ORB_Anzahl_Akt_Online_&Jahr11.) as ORB_Anzahl_Akt_Online_&Jahr11. format commax17.,

sum(ORB_TVOL_Akt_Online_&Jahr11.) as ORB_TVOL_Akt_Online_&Jahr11. format commax17.,

sum(ORB_Anzahl_Gen_Online_&Jahr11.) as ORB_Anzahl_Gen_Online_&Jahr11. format commax17.,

sum(ORB_TVOL_Gen_Online_&Jahr11.) as ORB_TVOL_Gen_Online_&Jahr11. format commax17.,

sum(NGS_Anzahl_Online_&Jahr11.) as NGS_Anzahl_Online_&Jahr11. format commax17.,

sum(NGS_NGV_Online_&Jahr11.) as NGS_NGV_Online_&Jahr11. format commax17.,

sum(NGS_Anzahl_Online_&Jahr11.) / sum(ORB_Anzahl_Gen_Online_&Jahr11.) as NGS_Auszahlungsquote_Online_&Jahr11. format commax17.3,

sum(NGS_Anzahl_&Jahr11.) / sum(Anzahl_VZK_&Jahr11.) as NGS_Anzahl_VZK_&Jahr11. format commax17.1,

sum(ORB_Anzahl_Gruen_&Jahr11.) / sum(ORB_Anzahl_Gen_&Jahr11.) as ORB_GRUEN_Quote_&Jahr11. format commax17.3,

sum(ORB_Anzahl_Akt_&Jahr12.) as ORB_Anzahl_Akt_&Jahr12. format commax17.,

sum(ORB_TVOL_Akt_&Jahr12.) as ORB_TVOL_Akt_&Jahr12. format commax17.,

sum(ORB_Anzahl_Akt_&Jahr12.) / sum(Anzahl_VZK_&Jahr12.) as ORB_Anzahl_VZK_&Jahr12. format commax17.1,

sum(ORB_Anzahl_Gen_&Jahr12.) as ORB_Anzahl_Gen_&Jahr12. format commax17.,

sum(ORB_TVOL_Gen_&Jahr12.) as ORB_TVOL_Gen_&Jahr12. format commax17.,

sum(ORB_Anzahl_Gen_&Jahr12.) / sum(ORB_Anzahl_Akt_&Jahr12.) as ORB_GEN_Quote_&Jahr12. format commax17.3,

(case

when (sum(ORB_kein_Fino_&Jahr12.) = 0 and sum(ORB_Anzahl_Akt_&Jahr12.) not in (0,.)) then 1

when (sum(ORB_kein_Fino_&Jahr12.) = . and sum(ORB_Anzahl_Akt_&Jahr12.) not in (0,.)) then 1

else (1 - (sum(ORB_kein_Fino_&Jahr12.) / sum(ORB_Anzahl_Akt_&Jahr12.)))

end)

as ORB_FINO_QUOTE_&Jahr12. format commax17.3,

sum(NGS_Anzahl_&Jahr12.) as NGS_Anzahl_&Jahr12. format commax17.,

sum(NGS_NGV_&Jahr12.) as NGS_NGV_&Jahr12. format commax17.,

sum(NGS_NGV_&Jahr12.) / sum(NGS_Anzahl_&Jahr12.) as NGS_TICKET_AVG_&Jahr12. format commax17.1,

sum(NGS_RSV_Anz_&Jahr12.) / sum(NGS_Anzahl_&Jahr12.) as RSV_QUOTE_&Jahr12. format commax17.3,

sum(NGS_Anzahl_&Jahr12.) / sum(ORB_Anzahl_Gen_&Jahr12.) as NGS_Auszahlungsquote_&Jahr12. format commax17.3,

sum(Anzahl_VZK_&Jahr12.) as Anzahl_VZK_&Jahr12. format commax17.2,

sum(ORB_kein_Fino_&Jahr12.) as ORB_kein_Fino_&Jahr12. format commax17.,

sum(NGS_RSV_Anz_&Jahr12.) as NGS_RSV_Anz_&Jahr12. format commax17.,

sum(ORB_Anzahl_Akt_Online_&Jahr12.) as ORB_Anzahl_Akt_Online_&Jahr12. format commax17.,

sum(ORB_TVOL_Akt_Online_&Jahr12.) as ORB_TVOL_Akt_Online_&Jahr12. format commax17.,

sum(ORB_Anzahl_Gen_Online_&Jahr12.) as ORB_Anzahl_Gen_Online_&Jahr12. format commax17.,

sum(ORB_TVOL_Gen_Online_&Jahr12.) as ORB_TVOL_Gen_Online_&Jahr12. format commax17.,

sum(NGS_Anzahl_Online_&Jahr12.) as NGS_Anzahl_Online_&Jahr12. format commax17.,

sum(NGS_NGV_Online_&Jahr12.) as NGS_NGV_Online_&Jahr12. format commax17.,

sum(NGS_Anzahl_Online_&Jahr12.) / sum(ORB_Anzahl_Gen_Online_&Jahr12.) as NGS_Auszahlungsquote_Online_&Jahr12. format commax17.3,

sum(NGS_Anzahl_&Jahr12.) / sum(Anzahl_VZK_&Jahr12.) as NGS_Anzahl_VZK_&Jahr12. format commax17.1,

sum(ORB_Anzahl_Gruen_&Jahr12.) / sum(ORB_Anzahl_Gen_&Jahr12.) as ORB_GRUEN_Quote_&Jahr12. format commax17.3

from Liste_teil_5

group by

 

STUFE,

MARKTREGION_BT,

NL_BT,

NLBEZ_BT,

MBRBEZ_BT;

quit;

 

there is no month / time variable, as all data refer to a selected time period.

 

Thus, I don't know how to do.

 

 

 

 

PaigeMiller
Diamond | Level 26

What is the difference between the two different blocks of code you show?

--
Paige Miller
PierreYvesILY
Pyrite | Level 9

hi Paige,

 

the difference lies is the level FILBEZ_BT:

1. Block: 

select STUFE ,

MARKTREGION_BT,

NL_BT,

NLBEZ_BT,

MBRBEZ_BT,

/* '' as FILHB_BT length=5, */

FILBEZ_BT length=35,

 

2nd Block:

select 'MBR' as STUFE ,

MARKTREGION_BT,

NL_BT,

NLBEZ_BT,

MBRBEZ_BT,

/* '' as FILHB_BT length=5, */

'' as FILBEZ_BT length=35,

 

the 1st block calculates all values on the FILBEZ_BT level ; the second one on the MBRBEZ_BT level.

In a further step, the 2 tables are pu together, so that both levels are present in the final data table that will be exported.

 

This programm generates a reporting that is used by the final users in excel and they want to select different levels of the company structure.

PaigeMiller
Diamond | Level 26

@PierreYvesILY wrote:

hi Paige,

 

the 1st block calculates all values on the FILBEZ_BT level ; the second one on the MBRBEZ_BT level.

In a further step, the 2 tables are pu together, so that both levels are present in the final data table that will be exported.

 


I don't know what these two levels are. I also don't see how this relates to your original problem where you wanted to run the program over different months.

--
Paige Miller
PierreYvesILY
Pyrite | Level 9

to the levels: FILBEZ_BT is the most detailed level, MBRBEZ_BT is a gathering of several FILBEZ_BT.

 

I'll post an image of the desired result data file to make it clear.

koyelghosh
Lapis Lazuli | Level 10

@PierreYvesILY  Writing Pseudo-code/algorithm will help the readers. Somebody, who is not familiar with your code requirement, may find very hard to traverse through it and understand. The Pseudo-code/algorithm will help you as well and who knows may be you can solve the problem on your own. One thing I can say that currently this is a very long continuous code but may be I am wrong and the length suits the complexity of the problem. Pseudo-code/algorithm will help regardless but will mean more work from your side.

ballardw
Super User

Having many variables with a common base name differentiated by a suffix such as ORB_Anzahl_Akt_&Jahr11.

often indicates that perhaps the data should have a separate variable, possibly named JAHR that would be used as BY variable to get calculations for each level of the variable for those calculations.

 

I suspect proc summary and a data step with array code would be much cleaner .

PierreYvesILY
Pyrite | Level 9

hello,

 

thanks for your answer.

 

1. The names of the variables with &Jahr11. and so forth are created exactly for the purpose of copying the results in separate columns in Excel afterwards.

2. I have to calculate evolutions in % between both years, so I build yearly variables to do so :  (Var&Jahr11.- Var &Jahr12.)=Delta &Jahr11. in %

If you can suggest a way to calculate the Delta with only one variable + an additional variable Jahr, I'd be pleased to use it.

It would simplify the code a LOT.

 

I don't understand what you mean by: proc summary and a data step with array code 

 

thanks and have a nice day

PY

ballardw
Super User

@PierreYvesILY wrote:

hello,

 

thanks for your answer.

 

1. The names of the variables with &Jahr11. and so forth are created exactly for the purpose of copying the results in separate columns in Excel afterwards.

2. I have to calculate evolutions in % between both years, so I build yearly variables to do so :  (Var&Jahr11.- Var &Jahr12.)=Delta &Jahr11. in %

If you can suggest a way to calculate the Delta with only one variable + an additional variable Jahr, I'd be pleased to use it.

It would simplify the code a LOT.

 

I don't understand what you mean by: proc summary and a data step with array code 

 

thanks and have a nice day

PY


Would have to see some example INPUT data to see what is possible.

 

Proc summary (or means) will group data and do sums (or other statistics) into the same variable with less fuss than SQL.

proc summary data=  Liste_teil_5 nway;
   class STUFE MARKTREGION_BT NL_BT NLBEZ_BT MBRBEZ_BT;
   var  ORB_Anzahl_Akt_&Jahr11.  ORB_TVOL_Akt_&Jahr11.
        ORB_Anzahl_Akt_&Jahr11.    /* continue with all the 
                                   variables that need sums*/
   ;
   output out =temp  sum= ;
run;

If it happens that you need every single one of your variables that start with ORB to be summed the above could reduce the VAR statement to something like:

 

   var ORB: ;

as proc summary/ means can use variable lists.

proc summary data=  Liste_teil_5 nway;
   class STUFE MARKTREGION_BT NL_BT NLBEZ_BT MBRBEZ_BT;
   var  ORB: ngs: ;  
   output out =temp  sum= ;
run;

The above would sum ALL numeric variables whose names start with ORB and NGS and place the sum in a variable of the same name in the output set.

 

Then a data step similar to:

data want;
   set temp;
   /* calculate the rates*/
   array n <list the numerator variables>;
   array d <list the denominator variables in corresponding order>;
   array r <list the results variables of the rate calculations>;
   /* this do loop */
   do i= 1 to dim(n);
      r[i] = n[i] / d[i];
   end;
   /* would replace all of these */
   ORB_Anzahl_VZK_&Jahr11. =  ORB_Anzahl_Akt_&Jahr11./Anzahl_VZK_&Jahr11.;

   format orb: NGS: commax17.;
run;

  I wasn't about to copy out all of your variable names.

The ability to use variable lists in data step code can make things much simpler. If you need to process columns that appear in order you can use a list with two dashes in an array definition (or the var statement in Proc Summary / means).

Example: if the variables ABC ABD ABQ BDD BDE are adjacent and I want to process them in some way in that order I could use an array definition of : array v   ABC -- BDE;

Also not the variable list in the Format statement much simpler. Or if I had a lot of variables to assign a format I might be tempted to use Proc Datasets after the set is created to assign the format instead of having so many format commax17. scattered through the code.

PierreYvesILY
Pyrite | Level 9

Hello,

 

thank you for all this. I'll need some time to digest this and be able to move forward with my programm.

 

thanks a lot,
PY

PierreYvesILY
Pyrite | Level 9

Hi,

I sorted in the following table the type of data I have:

- yellow is alphanumeric

- blue is numeric

- from Marktregion_BT until Filbez_bt, it's a cascade of entities, each one being a substructure of the previous one

- Stufe is a generic variable to select the accurate level of data, and also give the last level : AG (being the summary of all Marktregion_BT). It means that when 'MBR' is selected, all FILBEZ_BT values are empty, as the FIL are the level below MBR. When NL is selected, the MBRBEZ_BT and FILBEZ_BT values are empty etc.

- the yellow columns give the 'type' of data I find, not the dependencies between the columns.

 

Datadashboard1.PNG

I need to:

- calculate the sum of each variable VAR1, VAR2... for each level Filbez_bt till Marktregion_bt AND each STUFE, for each year ('JAHR), and each month ('MONAT'), and with additional selections like: all Plattform values / Plattform = O / Plattform = F AND AMPEL = OFFEN or GRUEN or AMPEL = all values. Each new variable (= result variable) has to be named in a customized way

- calculate the evolutions 2019/2018 for each result variable.

 

My original program works but is very heavy and obviously can be optimized, as I'm new to SAS I'm learning by doing, quite slowly.

 

I'd appreciate any suggestion to solve the many problems mentioned above.

 

Thanks a lot,

PY

 

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 18 replies
  • 1179 views
  • 4 likes
  • 4 in conversation