BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mtnbikerjoshua
Obsidian | Level 7

I am using proc report with the `out=` option and I am seeing some strange behaviour using SAS Enterprise Guide version 7.15 HF8 (7.100.5.6214) (32-bit).

 

I have measurements of several metrics (CAT2) as well as change in measurements for the same metrics (CAT1). My intent is to create a table with summary statistics for each metric. To ensure that the metrics are listed in the same order for each category, and to keep the report clean, I created a format that standardizes the param names between the two categories. In theory, proc report should order by and print the formatted value, but the values of param in the output dataset should be the unformatted values.

 

What I am seeing instead, is some strange mix of  param values from both categories as shown in the table below. Here is a reproducible version of the issue with invented data and the output table `unexpected` for reference.

 

CATPARAMNMEAN_BREAK_
CAT1Anxiety and Fears22 
CAT1Change in Autism20 
CAT1Change in Hyperactivity and Inattention22 
CAT2Anxiety and Fears25 
CAT2Change in Autism26 
CAT2Change in Hyperactivity and Inattention24 

 

proc format;
  value $ my_format
    "Change in Anxiety and Fears" = "Anxiety and Fears"
    "Change in Autism" = "Global Severity of Autism"
    "Change in Hyperactivity and Inattention" = "Hyperactivity and Inattention";
run;

data indata;
  length cat $4 param $50;
  input cat $ param $ val;
  infile datalines delimiter=',';
  datalines;
CAT1, Change in Anxiety and Fears, 1
CAT1, Change in Anxiety and Fears, 3
CAT1, Change in Autism, -1
CAT1, Change in Autism, 1
CAT1, Change in Hyperactivity and Inattention, 2
CAT1, Change in Hyperactivity and Inattention, 2
CAT2, Anxiety and Fears, 5
CAT2, Anxiety and Fears, 5
CAT2, Global Severity of Autism, 7
CAT2, Global Severity of Autism, 5
CAT2, Hyperactivity and Inattention, 3
CAT2, Hyperactivity and Inattention, 5
run;

proc report data=indata out=unexpected;
  column cat param n val=mean;
  define cat / group;
  define param / group format=$my_format.;
  define mean / analysis mean;
run;

data expected;
  length cat $4 param $50;
  input cat $ param $ n sum _break_;
  infile datalines delimiter=',' dsd;
  datalines;
CAT1, Change in Anxiety and Fears, 2, 2,
CAT1, Change in Autism, 2, 0,
CAT1, Change in Hyperactivity and Inattention, 2, 2,
CAT2, Anxiety and Fears, 2, 5,
CAT2, Global Severity of Autism, 2, 6,
CAT2, Hyperactivity and Inattention, 2, 4,
;
run;

Can anybody explain this strange behaviour?

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Is it just that you printed UNEXPECTED without using the display format?

Why not just use a FORMAT statement to attach the format instead of an option on the DEFINE statement?

proc report data=indata out=unexpected;
  column cat param n val=mean;
  define cat / group;
  define param / group ;
  define mean / analysis mean;
  format param $my_format.;

run;

proc print data=unexpected; 
run;

ods select position;
proc contents data=unexpected varnum;
run;
SAS 9.4 on WINDOWS                                                                                  13:52 Thursday, April 21, 2022  10

  cat   param                                  n        val
  CAT1  Anxiety and Fears                      2          2
        Global Severity of Autism              2          0
        Hyperactivity and Inattention          2          2
  CAT2  Anxiety and Fears                      2          5
        Global Severity of Autism              2          6
        Hyperactivity and Inattention          2          4

SAS 9.4 on WINDOWS                                                                                  13:52 Thursday, April 21, 2022  11

Obs    cat     param                            n    mean    _BREAK_

 1     CAT1    Anxiety and Fears                2      2
 2     CAT1    Global Severity of Autism        2      0
 3     CAT1    Hyperactivity and Inattention    2      2
 4     CAT2    Anxiety and Fears                2      5
 5     CAT2    Global Severity of Autism        2      6
 6     CAT2    Hyperactivity and Inattention    2      4

SAS 9.4 on WINDOWS                                                                                  13:52 Thursday, April 21, 2022  12

The CONTENTS Procedure

             Variables in Creation Order

#    Variable    Type    Len    Format         Label

1    cat         Char      4                   cat
2    param       Char     50    $MY_FORMAT.    param
3    n           Num       8                   n
4    mean        Num       8                   val
5    _BREAK_     Char     32

Formats to not change the stored values, just how they are displayed.  Most procedures that group the formatted still will return one of the actual values (I assume the first one seen for that grouped value).

 

If you want to change the value add another step to store the formatted value. Either into the same variable (since it is character) or into a new variable (required if it is numeric).

data want;
  set unexpected;
  param=put(param,$my_format.);
run;

 

 

PS Who uses an output dataset from PROC REPORT?  If you want calculate statistics use procedures that are designed to do that.

View solution in original post

4 REPLIES 4
ballardw
Super User

When a formatted value does not appear correct it may be that the actual value of the variable is different than the value the format is looking for/expecting so displays the raw value unless an other= option is used in the format.

 

You need to double check, NOT by printing, that your data does not have any leading spaces in the value of the "Change in Hyperactivity". If the function length for that value in the data set returns something other than 39 that is likely (or some other non-printable character)

 

Note that proc print, freq and most of the procedures that display values will left justify text values and in effect hide the existence of leading characters. There is also a minor chance that there is a non-space non-printable character somewhere in the value or a UTF character that doesn't quite align with the font used to create the format which is likely ASCII characters.

Tom
Super User Tom
Super User

Is it just that you printed UNEXPECTED without using the display format?

Why not just use a FORMAT statement to attach the format instead of an option on the DEFINE statement?

proc report data=indata out=unexpected;
  column cat param n val=mean;
  define cat / group;
  define param / group ;
  define mean / analysis mean;
  format param $my_format.;

run;

proc print data=unexpected; 
run;

ods select position;
proc contents data=unexpected varnum;
run;
SAS 9.4 on WINDOWS                                                                                  13:52 Thursday, April 21, 2022  10

  cat   param                                  n        val
  CAT1  Anxiety and Fears                      2          2
        Global Severity of Autism              2          0
        Hyperactivity and Inattention          2          2
  CAT2  Anxiety and Fears                      2          5
        Global Severity of Autism              2          6
        Hyperactivity and Inattention          2          4

SAS 9.4 on WINDOWS                                                                                  13:52 Thursday, April 21, 2022  11

Obs    cat     param                            n    mean    _BREAK_

 1     CAT1    Anxiety and Fears                2      2
 2     CAT1    Global Severity of Autism        2      0
 3     CAT1    Hyperactivity and Inattention    2      2
 4     CAT2    Anxiety and Fears                2      5
 5     CAT2    Global Severity of Autism        2      6
 6     CAT2    Hyperactivity and Inattention    2      4

SAS 9.4 on WINDOWS                                                                                  13:52 Thursday, April 21, 2022  12

The CONTENTS Procedure

             Variables in Creation Order

#    Variable    Type    Len    Format         Label

1    cat         Char      4                   cat
2    param       Char     50    $MY_FORMAT.    param
3    n           Num       8                   n
4    mean        Num       8                   val
5    _BREAK_     Char     32

Formats to not change the stored values, just how they are displayed.  Most procedures that group the formatted still will return one of the actual values (I assume the first one seen for that grouped value).

 

If you want to change the value add another step to store the formatted value. Either into the same variable (since it is character) or into a new variable (required if it is numeric).

data want;
  set unexpected;
  param=put(param,$my_format.);
run;

 

 

PS Who uses an output dataset from PROC REPORT?  If you want calculate statistics use procedures that are designed to do that.

mtnbikerjoshua
Obsidian | Level 7

You have some good points there @Tom . As you say, a format does not change the stored value, therefore I expected to the `unexpected` dataset to have the original unformatted values, but if you look closely at the table I included in my first post, you can see that the first row is formatted, the second two are not, and the last three are repeats of the first three even those rows were not in the value statement on the left side of the equals at all. It seems to have to do with those last three rows that already have the formatted values because when I change those rows in the input data the issue goes away and I get the unformatted values as expected.

Anyway, your advice to use a format statement does seem to work and I'm sure there are other ways to accomplish my task that make more sense, but that strange output still puzzles me.

Tom
Super User Tom
Super User

Your FORMAT does not include an OTHER category so any value that does not EXACTLY match the values being formatted will print as the raw value.

 

So if the format definition is:

proc format;
value $ my_format
  "Change in Anxiety and Fears" = "Anxiety and Fears"
  "Change in Autism" = "Global Severity of Autism"
  "Change in Hyperactivity and Inattention" = "Hyperactivity and Inattention"
;
run;

And you are seeing UNFORMATTED values like:

Anxiety and Fears
Change in Autism
Change in Hyperactivity and Inattention

Then those are the values in the data.

 

Play around with some values to see what it happening.  Note that leading spaces in the values makes a difference. And that ODS output (like HTML, PDF or RTF) does NOT do a good job of showing leading spaces.  Adding quotes helps.

proc format;
value $ my_format
  "Change in Anxiety and Fears" = "Anxiety and Fears"
  "Change in Autism" = "Global Severity of Autism"
  "Change in Hyperactivity and Inattention" = "Hyperactivity and Inattention"
;
run;


data have;
  input string $char50. ;
cards;
Anxiety and Fears
Change in Autism
Hyperactivity and Inattention
Change in Anxiety and Fears
Change in Autism
Change in Hyperactivity and Inattention
other text
  Change in Anxiety and Fears
  Change in Autism
  Change in Hyperactivity and Inattention
;

data test;
  set have ;
  raw = quote(trim(string));
  formatted =quote(trim(put(string,$my_format.)));
run;

proc print;
run;
Obs    string                                       raw                                            formatted

  1    Anxiety and Fears                            "Anxiety and Fears"                            "Anxiety and Fears"
  2    Change in Autism                             "Change in Autism"                             "Global Severity of Autism"
  3    Hyperactivity and Inattention                "Hyperactivity and Inattention"                "Hyperactivity and Inattention"
  4    Change in Anxiety and Fears                  "Change in Anxiety and Fears"                  "Anxiety and Fears"
  5    Change in Autism                             "Change in Autism"                             "Global Severity of Autism"
  6    Change in Hyperactivity and Inattention      "Change in Hyperactivity and Inattention"      "Hyperactivity and Inattention"
  7    other text                                   "other text"                                   "other text"
  8      Change in Anxiety and Fears                "  Change in Anxiety and Fears"                "  Change in Anxiety and Fears"
  9      Change in Autism                           "  Change in Autism"                           "  Change in Autism"
 10      Change in Hyperactivity and Inattention    "  Change in Hyperactivity and Inattention"    "  Change in Hyperactivity and"

Tom_0-1651261890858.png

 

 

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 4 replies
  • 934 views
  • 0 likes
  • 3 in conversation