The SAS Output Delivery System and reporting techniques

Create dynamic format for use in proc report

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 5
Accepted Solution

Create dynamic format for use in proc report

In SAS 9.3

 

I'm creating a table in Proc Report and would like to display the values for a variable named "country" with all values (country names and their corresponding counts) displayed unless the country represents <20% of the total count.  My dataset is huge, so I've created a sample of the data for illustration.  

 

SAMPLE DATA

PersonID   country      count       

1               Australia     1    

2               USA            1   

3               Australia     1

4               Canada      1

5               Canada      1

6               Canada      1

5               Germany    1   

 

I created new variable "countrycount" to sum totals for each country.


proc sql /*noprint */;
create table finalsql as
select * , sum(count) as countrycount label='Country Count'
from final
group by country
quit;

 

to give me output:

PersonID   country      count    countrycount      

1               Australia     1          2

2               USA            1          1

3               Australia     1          2

4               Canada      1          3

5               Canada      1          3

6               Canada      1          3

5               Germany    1          1

 

I would like to show a table where Canada and Australia are displayed, and USA and Germany are collapsed into the 'other' category.  I'd like to display the total counts by country, like this:

 

TABLE

Country    Total

Canada    3

Australia   2

Other        2

 

I am trying to automate report generation, and the arrival data will be changing monthly, so I need a dynamic process to collapse countries into the "other" category.  Is there a way to create a format that would essentially work like this? My challenge is to create a format based on the values from two variables, one character and one numeric.  OR is there another way to do this? 

 

 /*I realize this wouldn't actually work in proc format - hope it helps explains my question*/  

proc format

      value      $country_collapsed

                     country = country                         /* do this for all values where countrycount >=20% of total */

                     country totals <20% = 'other';       /* collapse all countries into 'other' category if countrycount <20% total */

                                                             

run;

 

 


Accepted Solutions
Solution
‎09-25-2015 06:23 AM
Grand Advisor
Posts: 10,239

Re: Create dynamic format for use in proc report

I think this is an example of what you might want. Basically get a summary of the data, created a cntlin dataset for proc format filtered on your value range, feed that to proc format and use. This overwrites the existing, if any format, and recreates it.

 

 

proc freq data=sashelp.cars noprint ;
tables cylinders / out=percent;
run;

data cylindercntlin;
set percent;
fmtname = 'Cylinders';      
start = cylinders; end=cylinders;
label = 'Other';
Type='N';  /* for character this would be C*/
IF .< percent < 20; /* so only values with the percent less than 20 will be mapped*/
run;

proc format cntlin=cylindercntlin library=work;
run;

proc freq data=sashelp.cars ;
tables cylinders ;
format cylinders Cylinders.;
run;

View solution in original post


All Replies
Occasional Contributor
Posts: 12

Re: Create dynamic format for use in proc report

If proc format is not the only option you have than you can also try creating a view and use the view in your program

example:

 

proc sql;
        create view collapsed as
           select personid,country,count,
           case country
            when 'Canada' then 'Canada'
            when 'Australia' then 'Australia'
            else 'other' end
      from sample;

Occasional Contributor
Posts: 5

Re: Create dynamic format for use in proc report

Thanks, Stevyfargose.  


I will try that - I am not tied to proc format, but I do need to figure out how to dynamically create a collapsed "other" category (I don't want to hard code anything because the country names will change from month to month, and so will the counts associated with them).

SAS Super FREQ
Posts: 8,721

Re: Create dynamic format for use in proc report

Hi:
It sounds to me like something you would implement with a SAS Macro program, where the collapsing would happen inside a macro and the categories would be created.
cynthia
Occasional Contributor
Posts: 5

Re: Create dynamic format for use in proc report

Thanks, Cynthia.  I'm going to try that and the CNTLIN options that you guys have suggested.  I appreciate your thoughts on how to do this!

Solution
‎09-25-2015 06:23 AM
Grand Advisor
Posts: 10,239

Re: Create dynamic format for use in proc report

I think this is an example of what you might want. Basically get a summary of the data, created a cntlin dataset for proc format filtered on your value range, feed that to proc format and use. This overwrites the existing, if any format, and recreates it.

 

 

proc freq data=sashelp.cars noprint ;
tables cylinders / out=percent;
run;

data cylindercntlin;
set percent;
fmtname = 'Cylinders';      
start = cylinders; end=cylinders;
label = 'Other';
Type='N';  /* for character this would be C*/
IF .< percent < 20; /* so only values with the percent less than 20 will be mapped*/
run;

proc format cntlin=cylindercntlin library=work;
run;

proc freq data=sashelp.cars ;
tables cylinders ;
format cylinders Cylinders.;
run;
Occasional Contributor
Posts: 5

Re: Create dynamic format for use in proc report

Thanks, ballardw.  I'm going to try this, as well.  If I can figure it out, I'll post again with my code to make it work.

Occasional Contributor
Posts: 5

Re: Create dynamic format for use in proc report

I wanted to follow up.  I took the cntlin suggested by ballardw and with a little modification, it worked!  

 

The only thing I am still working through is how to force the "Other" and missing categories to sort to the last position while allowing everything else to sort alphabetically (and again, I can't hard code the formatting since the countries will change, more will be added, etc.).  I may end up having to manually do that in the end report - I can't figure out an order that will do the trick.

 

Thank you all for your help and suggestions.  Here is my final code:

 


proc freq data=final noprint;
tables country_of_birth / out=r.percent ;
run;

 

data countryofbirthcntlin;
set r.percent;
fmtname= 'COB';
start=country_of_birth; end=country_of_birth;
Type='C';                                                /*this would be N for numeric*/
IF 0< percent < 1 then label='Other* ' ;  /*so only values with percent less than 20 will be mapped to the other category*/
ELSE IF percent >=1 then label= country_of_birth;
run;

 

proc format cntlin=countryofbirthcntlin library=work;
run;

 

proc report data=countryofbirthcntlin;

column country_of_birth count;
define country_of_birth /  group format=$COB.;
define count / 'Number';

rbreak after / summarize page ;

compute after / style={just=l};

endcomp;

run;

Grand Advisor
Posts: 10,239

Re: Create dynamic format for use in proc report

If you look at the Proc Format documentation you will see that there are a number of options that we have not addressed in this topic. One thing is to code a value format with different options and see if that helps. Then use the CNTLOUT option on proc format to see what the values set in the format used were so you can incorporate those elements into your format.

 

Also note that several procs have ORDER = directives that may interact with formats. For instance the DEFINE block supporst ORDER = with options of Data, Formatted, Freq and Internal. Without a specific instruction the default is Formatted, which will generally be the alphabetic order of your country format. So you would Lithuania, Other, Poland.

 

For Proc report you could either insure that the order of the data read into the procedure is the order you want and use the Order=Data option or go another step to create a variable that would be in the sequence you want, probably a numeric, and assign that sequence variable an equivalent and use the Order=internal option.

 

The steps above to get the format could also get you an informat, Invalue in proc format, to transform the name to a sequence variable.

 

 

 

 

 

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 805 views
  • 3 likes
  • 4 in conversation