BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
_Manhattan
Obsidian | Level 7

Good afternoon,

I would like to create a report via ods pdf. The goal for the report is to print general Information for different Variables aswell as the corresponding frequency or mean tables. You can see an example screenshot of the desired output below. Furthermore, you can see a screenshot of the Data Structure of "LoopQuestion.xlsx" in case you do not intend to download the excel file. The screenshot shows the desired output for 1 Variable (AGHK50A-C). In the example code I attach below, I would like to create this output for the other variables aswell. There should result about 3 pdf pages with the general Variable Information and their corresponding frequency or mean tables. It's completely fine if page 1 already contains information from variable 2, or if the information from variable 1 occupies more than one page.

Problem: As far as I can tell, I am probably missing out on an inner do-loop that identifies the Variables in the general Information Table and reports the corresponding statistics-tables (which are intented to be produced before the macro "Testloop" by the way). I have tried some trial and error with the %if %sysfunc(%exist(dataset)) %then do% kind of statements, but I am failing to apply the correct conditional logic here.

Question: Do you know how to code the inner do loop so that it "scans" my general Information Table and just reports the corresponding statistic tables? Any other conditional logic that prints the tables where they are needed is welcome aswell!

Note: I do need some kind of automatised solution for that as I am just showing example code here. My original data has several Variables and there is no defined pattern when to print a specific frequency or mean table. Best case would be if the code itself identifies specific variables and prints the corresponding statistic tables.

/*** Example Data ***/
data work.sample_data;
  input EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;
  datalines;
  10 20 30 40 50  1 2 3 2 3
  15 25 35 45 55  1 3 4 3 2
  5  10 15 20 25  2 2 3 3 2
  20 30 40 50 60  4 3 2 1 1
  25 35 45 55 65  1 1 1 2 3
  ;
run;

proc import
  datafile="yourpath\LoopQuestion.xlsx"
  out=Excel_Items
  dbms=xlsx
  replace;
run;

/* Using proc sql for a Sorting Variable which is used to iterate trough the do-loop */ proc sql noprint; select distinct Sort into :Sort separated by " " from Excel_Items; quit; %put &Sort; %let varlistE = EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E; %let varlistA = AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B; /* Mean Calculation */ proc means data=work.sample_data noprint; var &varlistE; output out=work.mean_ezgh; run; %macro meandata; %do i = 1 %to %sysfunc(countw(&varlistE)); %let vmean = %scan(&varlistE, &i); data work.&vmean; set work.mean_ezgh (keep=&vmean); run; %end; %mend meandata; %meandata; /* Freq calculation */ %macro freqdata; %do i = 1 %to %sysfunc(countw(&varlistA)); %let vfreq = %scan(&varlista, &i); proc freq data=work.sample_data; tables &vfreq/ out=work.&vfreq; run; %end; %mend freqdata; %freqdata; /*** Macro Loop ***/ %macro Testloop; %do i = 1 %to %sysfunc(countw(&Sort.)); %let SortID = %scan(&sort, &i); %let VariableA = %scan(&varlistA, &i); %let VariableE = %scan(&varlistE, &i); ods pdf file="yourpath\LoopTest.pdf"; ods pdf startpage=now; ods layout start width=19cm height=28cm; /* Compute Variable Information for each Variable */ data VariableInfo; set Excel_items; where Sort = &SortID.; drop Sort; run; /* Report Variable Information for each Variable */ proc report data=VariableInfo noheader; column Info1 Info2; run; /* Print Frequency Table */ proc report data=&VariableA; column &VariableA COUNT PERCENT; define &VariableA / display "&VariableA"; define COUNT / display "Absolut"; define PERCENT / display "Percent"; run; /* Print Mean Table */ proc report data=&VariableE; column &VariableE; run; ods layout end; %end; ods pdf close; %mend Testloop; %Testloop;

LoopQuestion.xlsx:

LoopQ.JPG

Desired Output for 1 Variable:

LoopQ2.JPG

1 ACCEPTED SOLUTION

Accepted Solutions
Oligolas
Barite | Level 11

Hi,

I kept as much of your code as possible and commented what I changed:

Try this:

proc datasets lib=work kill nolist;run;quit;/*new*/
%let yourpath=C:\;/*new*/
/*** Example Data ***/
data work.sample_data;
  input EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;
  datalines;
  10 20 30 40 50  1 2 3 2 3
  15 25 35 45 55  1 3 4 3 2
  5  10 15 20 25  2 2 3 3 2
  20 30 40 50 60  4 3 2 1 1
  25 35 45 55 65  1 1 1 2 3
  ;
run;

proc import
  datafile="&yourpath.\LoopQuestion.xlsx"/*changed yourpath to &yourpath.*/
  out=Excel_Items
  dbms=xlsx
  replace;
run;
/* Using proc sql for a Sorting Variable which is used to iterate trough the do-loop */
proc sql noprint;
   select distinct Sort into :Sort separated by " "
   from Excel_Items;
quit;

%put &Sort; 

%let varlistE = EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E;
%let varlistA = AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;

/* Mean Calculation */
proc means data=work.sample_data noprint;
  var &varlistE;
  output out=work.mean_ezgh;
run;

%macro meandata;
   %do i = 1 %to %sysfunc(countw(&varlistE));
   %let vmean = %scan(&varlistE, &i);
   
   data work.mean&vmean;/*changed work.&vmean to work.mean&vmean*/
      set work.mean_ezgh (keep=&vmean);
   run;
   
   %end;
%mend meandata;
%meandata;

/* Freq calculation */
%macro freqdata;
   %do i = 1 %to %sysfunc(countw(&varlistA));
   %let vfreq = %scan(&varlista, &i);
   
   proc freq data=work.sample_data;
      tables &vfreq/ out=work.freq&vfreq;/*changed work.&vfreq to work.freq&vfreq*/
   run;
   
   %end;
%mend freqdata;

%freqdata;


/*** Macro Loop ***/
%macro Testloop;
  %do i = 1 %to %sysfunc(countw(&Sort.));
    %let SortID = %scan(&sort, &i);
    %let VariableA = %scan(&varlistA, &i);
    %let VariableE = %scan(&varlistE, &i);
    
    ods pdf file="&yourpath.\LoopTest.pdf";/*changed yourpath to &yourpath.*/
    ods pdf startpage=now;
    ods layout start width=19cm height=28cm;

/* Compute Variable Information for each Variable */
    data VariableInfo;
      set Excel_items;
      where Sort = &SortID.;
      drop Sort;
    run;
    
   %local Info1Items;/*new*/
   %let Info1Items=;/*new*/
   PROC SQL noprint;/*new*/
      SELECT distinct Info1 INTO :Info1Items SEPARATED BY " "/*new*/
      FROM Excel_Items/*new*/
      WHERE Sort = &SortID./*new*/
      AND Info1 not in ('' 'Variable');/*new*/
   QUIT;/*new*/



/* Report Variable Information for each Variable */
    proc report data=VariableInfo noheader;
      column Info1 Info2;
    run;
    
   %MACRO PrinItemStats(Item=);/*new*/
      %if %sysfunc(exist(freq&Item.)) %then %do; /*new*/
         /* Print Frequency Table */
         proc report data=freq&Item.;/*changed VariableA to Item*/
            column &Item. COUNT PERCENT;/*changed VariableA to Item*/
            define &Item / display "&Item.";/*changed VariableA to Item*/
            define COUNT / display "Absolut";
            define PERCENT / display "Percent";
         run;
      %end;/*new*/

      %if %sysfunc(exist(mean&Item.)) %then %do; /*new*/
         /* Print Mean Table */
         proc report data=mean&Item.;/*changed VariableE to Item*/
            column &Item.; /*changed VariableE to Item*/
         run;
      %end;/*new*/
   %MEND PrinItemStats;/*new*/


   %local j currItem;/*new*/
   %LET j=1;/*new*/
   %LET currItem=%SCAN(&Info1Items.,&j.,%STR( ));/*new*/
   %DO %WHILE(%LENGTH(&currItem.)>0);/*new*/

      %PrinItemStats(Item=&currItem.);/*new*/
      %LET j=%EVAL(&j.+1);/*new*/
      %LET currItem=%SCAN(&Info1Items.,&j.,%STR( ));/*new*/
   %END;/*new*/


    ods layout end;
  %end;
  ods pdf close;
%mend Testloop;
%Testloop;
________________________

- Cheers -

View solution in original post

11 REPLIES 11
Tom
Super User Tom
Super User

Rather than posting a photograph of your metadata why not just post it the same way you posted the actual data.

 

Without data I cannot try out any solutions.

 

But it looks like you just need to do an actual MERGE of the generated statistics with the metadata so that you can select the statistics you want and order them by the order indicated in the metaata.

ballardw
Super User

Many users here don't want to download Excel files because of virus potential, others have such things blocked by security software. Also if you give us Excel we have to create a SAS data set and due to the non-existent constraints on Excel data cells the result we end up with may not have variables of the same type (numeric or character) and even values.

 

If all of the variables of interest are indeed numeric variables I would reshape the data so that you have data in the form:

Variablename Value.

 

Then use the variable Variablename to report/group data.

Example:

data work.sample_data;
  input EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;
  datalines;
  10 20 30 40 50  1 2 3 2 3
  15 25 35 45 55  1 3 4 3 2
  5  10 15 20 25  2 2 3 3 2
  20 30 40 50 60  4 3 2 1 1
  25 35 45 55 65  1 1 1 2 3
  ;
run;

data reshape;
   set sample_data;
   array n(*) _numeric_;
   do i=1 to dim(n);
      VariableName= vname(n[i]);
      value= n[i];
      output;
   end;
   keep variablename value;
run;

proc tabulate data=reshape;
   class variablename;
   var value;
   table variablename,
         value*(mean min max std sum q1 median q3)
   ;
run;

Proc tabulate data=reshape;
   class variablename value;
   table variablename,
         value,
         n Colpctn
   ;
run;

I used Tabulate here because of the way the second creates a single table per value of Variablename (page dimension).

 

PaigeMiller
Diamond | Level 26

@ballardw wrote:

 

If all of the variables of interest are indeed numeric variables I would reshape the data so that you have data in the form:

Variablename Value.

 

Then use the variable Variablename to report/group data.


Excellent point. People create data in a poor layout for SAS and then jump into macros when its not necessary. Re-shaping the data into a more usable form is the approach that will work, and then relatively simple PROC TABULATE gets the job done. All this macro code not needed and in fact the OP stating the solution must have "conditional logic" with do-loops and with macros sends the discussion in the wrong direction.

--
Paige Miller
_Manhattan
Obsidian | Level 7

Hey, thank you for your input. I am sorry if my approach to formulate the question lead you in the wrong direction. It is not always easy to formulate a question that is easily understandable while also encompassing a topic area that gets to the heart of the matter. I would appreciate it if you could briefly explain your approach to data restructuring, as I don't understand how the code could be made more efficient.

_Manhattan
Obsidian | Level 7

Thank you for your reply! If I understand your code correctly the  "reshaping the structure" approach leads to the display of wanted freq/mean tables with the corresponding Variablenames for each variable, right? Unfortunately, the variables for the VariableInfo are character variables. The idea is/was to produce some information about the variables (e. g. Itemtext for each of the 5 variables (BGKO28A, BKG028B,..., BKGO28E)) and print all of the corresponding statistics below the Information (5 tables in total). Maybe I do not understand you correctly but as far as I can tell your approach would not include the VariableInfo but only the statistics, correct?

Quentin
Super User

Big picture, you're making something like a data dictionary, right?  That will summarize each variable.  And the variables are in groups, is that right?  

 

The first variable group would be AGHK50, which has variables AGHK50A AGHK50B AGHK50C.

The second variable group would be BGKO28 which has variables BGKO28A BGKO28B

 

Do you really need the Excel file?  If the variables are sorted alphabetically, I don't think it's providing any information that isn't available in your dataset.  Maybe the point is to use the column B ("item") to store variable labels that are not present in the dataset?

 

If you are going to use the Excel file to store information, you probably want it to store all the necessary information.  So instead of having a list to separate the PROC freq variables from the PROC means variables:

%let varlistE = EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E;
%let varlistA = AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;

You might want to add a column to the Excel sheet to indicate whether a variable is categorical (PROC FREQ) or continuous (PROC MEANS).

 

I know I'm not helping you with an answer.  Just trying to understand the setup of what your are doing.  If you're going to use the Excel file to store information to drive this report, I would probably suggest a revised format for that file, and then a revised macro structure to generate the report.  

 

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
_Manhattan
Obsidian | Level 7

Thank you all for your fast replies. I will try to include the excel data directly in the code and answer you in detail tomorrow!

Oligolas
Barite | Level 11

Hi,

I kept as much of your code as possible and commented what I changed:

Try this:

proc datasets lib=work kill nolist;run;quit;/*new*/
%let yourpath=C:\;/*new*/
/*** Example Data ***/
data work.sample_data;
  input EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;
  datalines;
  10 20 30 40 50  1 2 3 2 3
  15 25 35 45 55  1 3 4 3 2
  5  10 15 20 25  2 2 3 3 2
  20 30 40 50 60  4 3 2 1 1
  25 35 45 55 65  1 1 1 2 3
  ;
run;

proc import
  datafile="&yourpath.\LoopQuestion.xlsx"/*changed yourpath to &yourpath.*/
  out=Excel_Items
  dbms=xlsx
  replace;
run;
/* Using proc sql for a Sorting Variable which is used to iterate trough the do-loop */
proc sql noprint;
   select distinct Sort into :Sort separated by " "
   from Excel_Items;
quit;

%put &Sort; 

%let varlistE = EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E;
%let varlistA = AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;

/* Mean Calculation */
proc means data=work.sample_data noprint;
  var &varlistE;
  output out=work.mean_ezgh;
run;

%macro meandata;
   %do i = 1 %to %sysfunc(countw(&varlistE));
   %let vmean = %scan(&varlistE, &i);
   
   data work.mean&vmean;/*changed work.&vmean to work.mean&vmean*/
      set work.mean_ezgh (keep=&vmean);
   run;
   
   %end;
%mend meandata;
%meandata;

/* Freq calculation */
%macro freqdata;
   %do i = 1 %to %sysfunc(countw(&varlistA));
   %let vfreq = %scan(&varlista, &i);
   
   proc freq data=work.sample_data;
      tables &vfreq/ out=work.freq&vfreq;/*changed work.&vfreq to work.freq&vfreq*/
   run;
   
   %end;
%mend freqdata;

%freqdata;


/*** Macro Loop ***/
%macro Testloop;
  %do i = 1 %to %sysfunc(countw(&Sort.));
    %let SortID = %scan(&sort, &i);
    %let VariableA = %scan(&varlistA, &i);
    %let VariableE = %scan(&varlistE, &i);
    
    ods pdf file="&yourpath.\LoopTest.pdf";/*changed yourpath to &yourpath.*/
    ods pdf startpage=now;
    ods layout start width=19cm height=28cm;

/* Compute Variable Information for each Variable */
    data VariableInfo;
      set Excel_items;
      where Sort = &SortID.;
      drop Sort;
    run;
    
   %local Info1Items;/*new*/
   %let Info1Items=;/*new*/
   PROC SQL noprint;/*new*/
      SELECT distinct Info1 INTO :Info1Items SEPARATED BY " "/*new*/
      FROM Excel_Items/*new*/
      WHERE Sort = &SortID./*new*/
      AND Info1 not in ('' 'Variable');/*new*/
   QUIT;/*new*/



/* Report Variable Information for each Variable */
    proc report data=VariableInfo noheader;
      column Info1 Info2;
    run;
    
   %MACRO PrinItemStats(Item=);/*new*/
      %if %sysfunc(exist(freq&Item.)) %then %do; /*new*/
         /* Print Frequency Table */
         proc report data=freq&Item.;/*changed VariableA to Item*/
            column &Item. COUNT PERCENT;/*changed VariableA to Item*/
            define &Item / display "&Item.";/*changed VariableA to Item*/
            define COUNT / display "Absolut";
            define PERCENT / display "Percent";
         run;
      %end;/*new*/

      %if %sysfunc(exist(mean&Item.)) %then %do; /*new*/
         /* Print Mean Table */
         proc report data=mean&Item.;/*changed VariableE to Item*/
            column &Item.; /*changed VariableE to Item*/
         run;
      %end;/*new*/
   %MEND PrinItemStats;/*new*/


   %local j currItem;/*new*/
   %LET j=1;/*new*/
   %LET currItem=%SCAN(&Info1Items.,&j.,%STR( ));/*new*/
   %DO %WHILE(%LENGTH(&currItem.)>0);/*new*/

      %PrinItemStats(Item=&currItem.);/*new*/
      %LET j=%EVAL(&j.+1);/*new*/
      %LET currItem=%SCAN(&Info1Items.,&j.,%STR( ));/*new*/
   %END;/*new*/


    ods layout end;
  %end;
  ods pdf close;
%mend Testloop;
%Testloop;
________________________

- Cheers -

_Manhattan
Obsidian | Level 7

Thank you so much Oligolas! Your code is working exactly as intended. I appreciate that you took your time and commented the code aswell 🙂

Quentin
Super User

To use a data-driven approach, I would probably have three macros: %varinfo(), %freq(), %means() to generate each section of the report.  Then you can read in the dictionary as data, and use CALL EXECUTE to invoke the macros.  Something like:

 

data work.have;
  input EZGH25A EZGH25B EZGH25C EZGH25D EZGH25E AGHK50A AGHK50B AGHK50C BGKO28A BGKO28B;
  datalines;
  10 20 30 40 50  1 2 3 2 3
  15 25 35 45 55  1 3 4 3 2
  5  10 15 20 25  2 2 3 3 2
  20 30 40 50 60  4 3 2 1 1
  25 35 45 55 65  1 1 1 2 3
  ;
run;

data work.dict;
  input VariableGroup :$10. Variable :$10. Label :$10. Freq Means;
  cards ;
EZGH25 EZGH25A ItemA 1 0
EZGH25 EZGH25B ItemB 1 0
EZGH25 EZGH25C ItemC 1 1
EZGH25 EZGH25D ItemD 0 1
EZGH25 EZGH25E ItemE 0 1
AGHK50 AGHK50A ItemA 1 1
AGHK50 AGHK50B ItemB 1 1
AGHK50 AGHK50C ItemC 0 1
BGKO28 BGKO28A ItemA 1 0
BGKO28 BGKO28B ItemB 0 1
;
run;

%macro VariableInfo(VariableGroup=,dict=) ;
  ods pdf startpage=now ;
  proc print data=&dict noobs;
    where VariableGroup="&VariableGroup" ;
  run ;
%mend ;

%macro Freq(variable=,data=) ;
  proc freq data=&data ;
    tables &variable ;
  run ;
%mend freq ;

%macro means(variable=,data=) ;
  proc means data=&data ;
    var &variable ;
  run ;
%mend freq ;

ods pdf file="%sysfunc(pathname(work))/want.pdf" startpage=never; 

data _null_ ;
  set dict ;
  by VariableGroup notsorted;
  if first.VariableGroup then do ;
    call execute 
      (
      '%nrstr('
          ||  '%VariableInfo'
          ||    '('
          ||    ' VariableGroup='     || trim(VariableGroup)
          ||    ',dict=dict'
          ||    ' )'
          || ')'          
      );
  end ;

  if freq then call execute 
    (
    '%nrstr('
        ||  '%Freq'
        ||    '('
        ||    ' Variable=' || trim(Variable)
        ||    ',data=have'
        ||    ' )'
        || ')'          
    );

  if means then call execute 
    (
    '%nrstr('
        ||  '%Means'
        ||    '('
        ||    ' Variable=' || trim(Variable)
        ||    ',data=have'
        ||    ' )'
        || ')'          
    );

run ;

ods pdf close ;

From an efficiency standpoint, it's not very efficient, because you call PROC FREQ and PROC MEANS  once per variable.

 

But  it keeps the code simple.  You don't have to write any looping code yourself, because you use the data to drive everything.  And it's easy to develop and test each of the three macros to give you the output you want.  

BASUG is hosting free webinars Next up: Mark Keintz presenting History Carried Forward, Future Carried Back: Mixing Time Series of Differing Frequencies on May 8. Register now at the Boston Area SAS Users Group event page: https://www.basug.org/events.
_Manhattan
Obsidian | Level 7

Thank you Quentin, I appreciate your time but I need to compute the mean/freq tables before the macro for several reasons (e. g. I do not work with the proc freq/mean procedure to produce the tables but I use certain macros for the calculation which take some time to resolve).

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 11 replies
  • 1819 views
  • 5 likes
  • 6 in conversation