BookmarkSubscribeRSS Feed
rgettys
Fluorite | Level 6

Context: I am creating an automated report about feelings toward study. In the full version, I have responses from 25 regions with 10-40 cities each, and 100-300 responses from each city (50,000+ students total). The report has 40 modules that are similar to the one I am working on, so automating everything is essential, else I will be hand writing 400+ reports 2 times a year. I use macros for the report because I can isolate the code for each module.

 

Task: I am trying to make the report bullet proof by allowing it to work no matter how many cities there are per region (1 region might have 2 cities, another might have 30, and reports aggregate by region). In the report, I want to list the city frequency next to the region frequency.

 

Version: SAS EG 7.13. It is posted in this forum because I am hard coding everything, not using drag and drop menus. 

 

Dataset description: The sample dataset 'SCHOOL.csv' is simplified to have 2 regions with 2-3 cities each, with 5-45 students each. There are two study quality variables studyqual1_1 (Study brings personal fulfillment) studyqual1_2 (I feel study prepares me for a career). region is the administrative region and city is the metro area the respondent lives in.

 

Help needed:

I want to SAS to detect how many cities are in each region automatically for the macro. As of now, I would have to manually input the number of cities per region. The current code produces one page with all '.' responses, and scaling this up would produce hundreds of pages of blank reports.

 

Code:

 

/*I set up an index variable to loop in the macros*/
proc sort data=school out= edu ;
by region city;
run;

data regionindex; set edu ;
by region city;
retain regionidx 0;
retain cityidx 0;
if first.region then regionidx=regionidx+1;
if first.region or first.city then do;
if first.region then cityidx=0;
cityidx=cityidx+1;
end;
run;

/*Creating report*/
ods escapechar='~';
ods pdf file="path/testnew.pdf";
options nobyline nodate; 

%macro region;
%do j=1 %to 2;
/*I want to change this to be automated, like a dim(var) function in an array*/
	%macro city;
	%do k=1 %to 2;
	/*I want to change this to be automated, like a dim(var) function in an array, but nested within region*/
		%macro m22;
		%let vars=studyqual1_1|studyqual1_2;
		%let names=one|two;

			proc format;
			value onefive(notsorted)
			0='0'
			1='1'
			2='2'
			3='3'
			4='4'
			5='5'; run;
/*			calculate city responses*/
			%do I=1 %to 2;
				%let vars1=%scan(&vars,&I,|);
				%let names1=%scan(&names,&I,|);
/*				data output by city*/
				proc summary data=regionindex nway completetypes;
				class &vars1 /preloadfmt order=data missing;
				where regionidx=&j and cityidx=&k;
				format &vars1 onefive.;
				output out=citykeeprow;
				run;

				proc freq data=citykeeprow order=data noprint;
				tables &vars1 /out=citydatcol&i (keep=percent rename=(Percent=&names1 )) missing nocol nocum nofreq;
				weight _freq_/ zeros;
				format &vars1 onefive.;
				run;

				data citydattab (drop=i); merge citydatcol1-citydatcol2;
				array new (*) one--two;
					do I= 1 to 2;
					new(I)=new(I)/100;
				end;
				run;
/*				region output setup*/
				proc summary data=regionindex nway completetypes;
				class &vars1 /preloadfmt order=data missing;
				where regionidx=&j;
				format &vars1 onefive.;
				output out=keeprowregion;
				run;

				proc freq data=keeprowregion order=data noprint;
				tables &vars1 /out=datcolreg&i (keep=percent rename=(Percent=&names1 )) missing nocol nocum nofreq;
				weight _freq_/ zeros;
				format &vars1 onefive.;
				run;

				data regdattab (drop=i); merge datcolreg1-datcolreg2;
				array new (*) one--two;
					do I= 1 to 2;
					new(I)=new(I)/100;
				end;
				run;


			%end;

			proc transpose data=citydattab out=citydatatable prefix=citycol;
			format _NUMERIC_ Percent8.0;
			run;

			proc transpose data=regdattab out=regdatatable prefix=regcol;
			format _NUMERIC_ Percent8.0;
			run;

			data labelformats; 
			length formatlab $ 500;
			formatlab='~S={color=#7F7F7F font_face="Arial Light" font_size=14pt}My study brings personal fulfillment~S={}' ; output;
			formatlab='~S={color=#7F7F7F font_face="Arial Light" font_size=14pt}My study prepares me for a career~S={}' ; output;
			run;

			data combtab;
			merge labelformats citydatatable regdatatable;
			citycol4=citycol4+citycol5;
			regcol4=regcol4+regcol5;
			run;

			proc report data=combtab
			style(report)=[frame=void bordercolor=white background=white]
			style(header)={font_size=10pt font_face='Arial Light' color=#1F4E79}
			style(column)={font_size=14pt font_face='Arial' just=center};
			col formatlab citycol4 regcol4;
			define formatlab/style(column)={cellwidth=3.75in cellheight=.25in} "~S={ color=#026FA6 font_face='Arial' font_size=16pt just=left}Fulfillment from study";
			define citycol4 /right "Almost always or always" style(column)={cellheight=.33in cellwidth=.75in};
			define regcol4 /right "Region" style(column)={cellheight=.33in cellwidth=.75in};
			run;

			%mend m22;
		%m22;
	%end;
	%mend city;
	%city;
%end;
%mend region;
%region;
ods pdf close ;

 

8 REPLIES 8
Astounding
PROC Star

It sounds like you just need to capture the number of regions as a macro variable.  If you have that, you could just switch to:

 

%do j=1 %to &n_regions;

 

If that's really all you need, you can make that happen before the macros begin.  You have a DATA step that creates REGIONINDEX.  You can add to the program at that point.  Adding within the DATA step is possible, or adding a step after it would also be possible:

 

proc sql noprint;

select max(regionidx) into : n_regions from regionindex;

quit;

Shmuel
Garnet | Level 18

To expand @Astounding answer, you can create the macro variables in your first steo:

data regionindex;
  set edu end=eof ;
   by region city;
        retain regionidx 0;
        retain cityidx 0;
        if first.region then regionidx=regionidx+1;
        if first.region or first.city then do;
           if first.region then cityidx=0;
          cityidx=cityidx+1;
      end;

      if eof then do;
         call symput('n_regionidx, strip(regionidx));
         call symput('n_cityidx', strip(cityidx));
     end;
run;

later replace the do loops into:

        do i=1 to &n_regionidx;   /* or */

        do i=1 to &b_cityidx;

ballardw
Super User

Why do you think you need all those macros?

Nesting macro definitions inside another macro is usually a very poor idea if any macro is actually needed. Write one generic macro that does one task and pass needed values as parameters. One macro then call them as needed.

 

It is likely that if you could provide a small example of the data as a data step and what the report looks like we can simplify this significantly, most likely using BY Group processing.

 

It would help to have some data and code that works for at least one region and city without any macro variables at all and indicate what the

 

Often it is better to create your summary data for all the groups. And then if you need a macro such as creating to different report sections so they group correctly then you only need to have something that controls selecting which data to use.

 

I have one project where I have statewide, state region, county, school district, school grade, school type (public private) for 30 odd report variables.

I use proc summary with all of those categories to create the summary values needed and then select the _type_ for specific pages of an 800+ report. I do use macro variables in controlling the selection because I am creating specific groups of output to documents such as a report document for each region, school district and school type and the macro variables provide name elements as well as data selection. But defining what you want clearly is needed before getting into all of that.

 

Some other hints:

Define macros outside of output.

Do all of the data manipulation needed before the ODS destination output and only include procedures or data steps inside the ODS PDF/ ODS PDF Close; that are creating output for the report.

 

rgettys
Fluorite | Level 6

All excellent tips, it would make it easier if the definitions weren't nested within one another. Thanks.

 

The original post had it creating 2 region macros which was just a typo, but beside that, I feel like I need the 3 levels of macros because there are 45+ proc report segments that need to go out to each city, and each segment needs to have region data within the report. 

 

After I posted this, a colleague contacted me and we are working on it in person to simplify as much as possible, I will post our solution when we are done.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

You could eliminate almost all, if not all of the macro code and looping and such like, just by using by group processing.   For instance, just one proc:

proc summary data=regionindex nway completetypes;
  by regionidx cityidx;
  ...
run;

...

proc report data=...;
  by regionidx cityidx;
...
run;

This by grouping automatically does the processing on each by group and being just one dataset will be both quicker and easier to work with.  For more ocmplete code, post example test data in the form of a datastep in the body of the post using the code window (its the {i} above post window) - not going to guess what your data should look like from attachments.  Also post what the expected output should be.

Reeza
Super User

@RW9 wrote:

You could eliminate almost all, if not all of the macro code and looping and such like, just by using by group processing.   For instance, just one proc:

proc summary data=regionindex nway completetypes;
  by regionidx cityidx;
  ...
run;

...

proc report data=...;
  by regionidx cityidx;
...
run;

This by grouping automatically does the processing on each by group and being just one dataset will be both quicker and easier to work with.  For more ocmplete code, post example test data in the form of a datastep in the body of the post using the code window (its the {i} above post window) - not going to guess what your data should look like from attachments.  Also post what the expected output should be.


That doesn't work in cases like this because BY processing occurs sequentially and what s/he wants is a report with all tables for each region in a single report, with multiple reports. 

ballardw
Super User

CSV (or spreadsheet or other text) files do NOT tell us your SAS dataset. Values that appear numeric may be character in your data, dates/times/datetime values could be character or SAS date/time/datetime valued numeric. If I have to convert your data to a SAS data set to attempt to test code with I am very likely to end up with something that differs from your data and provide a solution that will not work with your data

 

Best way to provide example data is in the form of data step code. There is code to create data step code from a data set:

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the {i} icon or attached as text to show exactly what you have and that we can test code against.

ballardw
Super User

See if this gives you an idea of one way to select things.

data junk;
   input region $ city $ school $;
datalines;
A  Milltown Jeffers
B  Bigcity  Orange
A  Milltown Wash
A  Treetown Franklin
A  Treetown Oddem
B  Bigcity  Red
A  Treetown Betsy
B  Bigcity  Blue
A  Milltown Franklin
B  Bigcity  Green
B  Bigcity  Purple
;
run;

%macro dostuff ();
   /* determine how many REGIONS and create macro variable to hold them*/
   proc sql noprint;
      select distinct region into : regions separated by ' '
      from junk
      ;
   quit;
   /* proc sql creates an automatic variable that has the
      number of results for the last query. Save it for use
   */
   %let NumRegions = &sqlobs;
   %do i=1 %to &NumRegions;
      %let region = %scan(&regions,&i);
         Proc print data=junk;
            Title "Here is where operations involving all of Selected Region would go";
            where region="&region";
         run; title;



      proc sql noprint;
         select distinct city into : cities separated by ' '
         from junk
         where region ="&region"
         ;
      quit;
      %let NumCities= &sqlobs;
      %do j=1 %to &NumCities;
         %let city = %scan(&cities,&j);

         Proc print data=junk;
            Title "Here is where operations involving each city of Selected Region would go";
            where region="&region" and city="&city";
         run; title;

         Proc sql noprint;
            select distinct school into : schools separated by ' '
            from junk
            where region="&region" and city="&city"
            ;
         quit;
         %let NumSchools=&sqlobs;

         %do k = 1 %to &NumSchools;
            %let school= %scan(&schools,&k);
            Proc print data=junk;
               Title "Here is where operations involving each school of Selected Region and City would go";
               where region="&region" and city="&city" and school="&school";
            run; title;

         %end; /* k loop (schools)*/
      %end; /*j loop (city)*/
   %end; /*i loop (region)*/
%mend;

%dostuff;

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 2620 views
  • 4 likes
  • 6 in conversation