DATA Step, Macro, Functions and more

Retaining proc format values in the print regardless of the raw values

Reply
Contributor
Posts: 43

Retaining proc format values in the print regardless of the raw values

Hi All,

 

Is there a way to retail the proc format values in the proc print regardless of the raw values? 

 

Below is the below sample data & the code. I need to have the proc format values in the output even there is no data for the proc format buckets. when there is no data it is fine to populate it with 0 values. Please let me know if there is a way to do this. Thanks

 

Proc format buckets:
"0"
"1-1,000"
"1,001-10,000"
"10,001-20,000"
"20,001-30,000"
"30,001-40,000"
"40,001-50,000"
"50,001-60,000"
"GT 60,000"
"ND"

 

 

 

 

Sample data & the Code:
data test1;
infile datalines dsd missover dlm="#";
input VAR1$ VAR2$ VAR3$ ID;
datalines;
LLLL#1#ND#101
MMMM#2#2454#101
XXXX#3#0#102
YYYY#4#195#101
PPPP#6#70000#102
ZZZZ#7#25000#104
;
run;

Proc format;
value gfmt
low-0 = " 0"
1-1000 = " 1-1,000"
1001-10000 = " 1,001-10,000"
10001-20000 = " 10,001-20,000"
20001-30000 = " 20,001-30,000"
30001-40000 = " 30,001-40,000"
40001-50000 = " 40,001-50,000"
50001-60000 = " 50,001-60,000"
60001-high = " GT 60,000"
other = "ND"
;
run;


 
data test2;
set test1;
if VAR3 = "ND" then VAR4=. ;
else VAR4=input(VAR3,12.);
run;
 
%macro M_Format (app);
proc sql;
create table test3 as
select strip(&app.) as Group
,case when VAR1 = "LLLL" and ID in (101) then "USA"
when VAR1 = "MMMM" and ID in (101) then "ASIA"
when VAR1 = "XXXX" and ID in (102) then "AUSTRALIA"
when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK"
when VAR1 = "ZZZZ" and ID in (104) then "EUROPE"
when VAR1 = "PPPP" and ID in (102,103,104) then "JAPAN"
when VAR1 = "VVVV" and ID ne . then "ROI"
else VAR1 end as Region,
count(VAR2) as Freq_Month1

from test2
group by Group, Region
order by (case when VAR4 is null then 9999999999999999999999 else VAR4 end), Region
;
quit;
%mend;
%M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);

proc print data=test3;
run;

Current Output:

SAS Output

Obs Group Freq_Month1 1 2 3 4 5 6
01
1-1,0001
1,001-10,0001
20,001-30,0001
GT 60,0001
ND

1

 

Expected Output:

ObsGroupFreq_Month1
101
21-1,0001
31,001-10,0001
420,001-30,0001
530,001-40,0000
640,001-50,0000
750,001-60,0000
8GT 60,0001
9ND1

 

Super User
Posts: 17,963

Re: Retaining proc format values in the print regardless of the raw values

You have to add the missing level in manually if you're using SQL. 

 

You can use CNTLOUT to generate the table of all values from PROC FORMAT and merge that table in, if you were using a SAS proc, ie proc tabulate you could use the PRELOADFMT option to have all values generated. 

 

AFAIK there is no way to get this behaviour via PROC PRINT, it has to be done prior to this step. 

Contributor
Posts: 43

Re: Retaining proc format values in the print regardless of the raw values

Thanks for your views on this. Can someone explain how can I use PRELOADFMT option in the below code? or may by adding anything before or after this code, but I do not want to change the below code as lots of data needs to be passed through it using macro functions.

 

%macro M_Format (app);
proc sql;
create table test3 as
select strip(&app.) as Group
,case when VAR1 = "LLLL" and ID in (101) then "USA"
when VAR1 = "MMMM" and ID in (101) then "ASIA"
when VAR1 = "XXXX" and ID in (102) then "AUSTRALIA"
when VAR1 = "YYYY" and ID in (101,102,103,106) then "UK"
when VAR1 = "ZZZZ" and ID in (104) then "EUROPE"
when VAR1 = "PPPP" and ID in (102,103,104) then "JAPAN"
when VAR1 = "VVVV" and ID ne . then "ROI"
else VAR1 end as Region,
count(VAR2) as Freq_Month1

from test2
group by Group, Region
order by (case when VAR4 is null then 9999999999999999999999 else VAR4 end), Region
;
quit;
%mend;
%M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);

Contributor
Posts: 43

Re: Retaining proc format values in the print regardless of the raw values

Also please note that, it's not just for one table. I am looking to have all the format values for more than 100 tables & the sample code I have posted here is the exact format in which I am trying to achieve it. Please suggest if there a way to get this done by tweaking that code. Thanks

PROC Star
Posts: 290

Re: Retaining proc format values in the print regardless of the raw values

You're going to want to look for "preloadfmt" using proc report. 

Super User
Posts: 10,552

Re: Retaining proc format values in the print regardless of the raw values

Brief SAS terminology note: Retain means to keep something that is in the data, not to assume a value not present.

 

Proc Tabulate with the Preloadfmt option will display all of the formatted values as will proc report. Other procedures not so much.

Respected Advisor
Posts: 3,777

Re: Retaining proc format values in the print regardless of the raw values

data test1;
   infile datalines dsd missover dlm="#";
   input VAR1$ VAR2$ VAR3 ID;
   datalines;
LLLL#1#ND#101
MMMM#2#2454#101
XXXX#3#0#102
YYYY#4#195#101
PPPP#6#70000#102
ZZZZ#7#25000#104
;;;;
   run;
proc print;
   run;
Proc format;
   value gfmt(notsorted)
      low-0 = " 0"
      1-1000 = " 1-1,000"
      1001-10000 = " 1,001-10,000"
      10001-20000 = " 10,001-20,000"
      20001-30000 = " 20,001-30,000"
      30001-40000 = " 30,001-40,000"
      40001-50000 = " 40,001-50,000"
      50001-60000 = " 50,001-60,000"
      60001-high = " GT 60,000"
      . = "ND"
      ;
   run;
proc summary nway missing completetypes;
   class var3 / order=data preloadfmt;
   format var3 gfmt.;
   output out=test2;
   run;
proc print;
   run;

Capture.PNG

Contributor
Posts: 43

Re: Retaining proc format values in the print regardless of the raw values

@data_null__ Thanks very much for the workaround. But can we the results like by tweaking the code I have posted here? As I mentioned, lots of tables needs to be created by passing the different fields so I am looking to tweak my code and get the results like you achieve above.

Respected Advisor
Posts: 3,777

Re: Retaining proc format values in the print regardless of the raw values

[ Edited ]

My output matches you "Expected Output". Your program doesn't make much sense with regards to your expected output.  Why don't you show a representative example of what you have and what you want.

Contributor
Posts: 43

Re: Retaining proc format values in the print regardless of the raw values

@data_null__. I understand your code is giving the results as expected. But if you notice there is a macro function in the end in my code [%M_Format(case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end);] through which i have to pass over 100 variables to get the formatted & it's corresponding values. Hence it would be great if we can tweak my code and get the results as expected. Is there a workaround to get it done please?

Super User
Posts: 10,552

Re: Retaining proc format values in the print regardless of the raw values

Note that Proc tabulate can generate multiple tables with one procedure call as long the variables are used in the same role, class or var, in all table statements and any subsets (where statemen) are the same. Example:

proc tabulate data=sashelp.class;
   class sex age;
   var height weight;
   table sex,
         height*(max min mean std)
   ;
   table age,
         (height weight) *(max mean);
run;

I have some reports using Proc tabulate to generate as many as 15 tables with one procedure call. If the tables are similar it is very easy. And with a few style options can make the appearance very uniform.

 

Contributor
Posts: 43

Re: Retaining proc format values in the print regardless of the raw values

@ballardw. Thanks for looking into it & for your workaround. But I am looking to achieve it by tweaking my code as I have few variables to pass through the macro function defined. Do you have any thoughts on it please?

 

Thanks

DU

Super User
Posts: 10,552

Re: Retaining proc format values in the print regardless of the raw values


Data_User wrote:

@ballardw. Thanks for looking into it & for your workaround. But I am looking to achieve it by tweaking my code as I have few variables to pass through the macro function defined. Do you have any thoughts on it please?

 

Thanks

DU


1) describe the Actual purpose of the macro

2) describe the exact output desired by the macro. Since your apparent desire is to have random values inserted into the data then there are LOTS of things that would have to be considered, such as what values are assingned to any other variables involved in the data set

3) describe the variables that may be used how many and what types.

4) describe HOW the resulting data set is to be used.

 

As a minimum from when I seea parameter like "case when VAR4 = . then "ND" else put(input(VAR3,12.),gfmt.) end" I get very concerned about maintainability because passing the start of a sytax element (CASE) without the appropiate complete statement (ending with AS Var insted of just End) it looks very fragile and like the conditions aren't really well determined before writing the macro or process in general. And really have little interest in tweaking such

 

 

Contributor
Posts: 43

Re: Retaining proc format values in the print regardless of the raw values

@ballardw Please see below. I am getting the results as expected, but the only problem is if there is no raw data then all the format values are not displaying. I want the format values to display with 0 or missing values.

 

Example:

 

Current   
Field2Count1Count2Count3
"0"
684203896
"1-10"807537767
"30-40"793834817
"50-60"420892597
"ND" 745252935
    
Expecting:   
Field2Count1Count2Count3
"0"
684203896
"1-10"807537767
"10-20"000
"20-30"000
"30-40"793834817
"40-50"000
"50-60"420892597
">60"000
"ND" 745252935

 

1) describe the Actual purpose of the macro -

Ans1: To create seperate datasets for each field passing through the macro

2) describe the exact output desired by the macro. Since your apparent desire is to have random values inserted into the data then there are LOTS of things that would have to be considered, such as what values are assingned to any other variables involved in the data set -

Ans2: I have different formats for each variable passing through the macro. Based on the raw values for each field corresponding formats will be applied

3) describe the variables that may be used how many and what types

Ans3: There are over 100 variables & it's combination of character and numeric variables, but the end results will be grouped as per the formats applied.

4) describe HOW the resulting data set is to be used.

Ans4: Resulting dataset will be having groups created through the formats & it's counts, sums, percentages, etc.....

 

Examples as below:

 

Output1   
Field1Count1Count2Count3

"0"

14777199
"1-1,000"66881521
"1,001-10,000"708759522
"10,001-20,000"660592895
"20,001-30,000"577422711
"30,001-40,000"599783187
"40,001-50,000"801786796
"50,001-60,000"75567701
"GT 60,000"460512407
"ND"964547843
    
Output2   
Field2Count1Count2Count3

"0"

93189195
">1 <200"766813156
">200 <400"517370299
">400 <600"75792985
">600 <800"89547079
">800 <1000"315495808
"1000+"25857024
"ND"698393255
    
Output3   
Field2Count1Count2Count3

"0"

684203896
"1-10"807537767
"10-20"246172940
"20-30"127892301
"30-40"793834817
"40-50"529158866
"50-60"420892597
">60"81221421
"ND"745252935
Super User
Posts: 17,963

Re: Retaining proc format values in the print regardless of the raw values

If you want to stay in the method you're using, you need to create the empty table and then fill in the values.

 

1. Create table with all possible values - can extract from your format if desired

2. Join with desired table using a left/right/full join to ensure all values are in the table. You'll need to add some coalesces to make sure the 0's are in the table.  Unfortunately this means you need to have a consistent structure to be able to automate it and ensure that the zero's are set properly. 

 

 

Your 'macro' is not dynamic enough to do what you're stating. You have a fixed table name which gets replaced each iteration. 

 

You may also want to look into PUTC which allows the 'format' to be a text string, then maybe you can avoid some of these CASE statements if you have predefined formats. If it's a table you can automate it, if it's in your head you can't. All the 'region' calculations should also be placed in a table/format for automation instead of hard coding the values. 

 

Here's a paper that goes over the multiple methods to get the zero rows.

http://www.lexjansen.com/pharmasug/2005/CodersCorner/cc22.pdf 

 

You're looking at the first option in your case.

 

 

 

Ask a Question
Discussion stats
  • 15 replies
  • 176 views
  • 3 likes
  • 5 in conversation