<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to report the best format/informat for values within categories in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784935#M250480</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have searched far and wide and haven't been able to find an answer to the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am curious whether SAS can report directly which format or informat is best suited for a set of values within some categories. For instance:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let's say we have a categorical variable TEST that contains a number of values, e.g. "TEST1", "TEST2", "TEST3". Furthermore, we have a character variable, RESULT, that contains the associated values, which can be either character or numeric (float or integer), e.g.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;TEST&amp;nbsp; &amp;nbsp;| RESULT&lt;/U&gt;&lt;/P&gt;&lt;P&gt;TEST1 | UNKNOWN&lt;/P&gt;&lt;P&gt;TEST2 | 10.411&lt;/P&gt;&lt;P&gt;TEST3 | 1&lt;/P&gt;&lt;P&gt;TEST1 | 10.1&lt;/P&gt;&lt;P&gt;TEST2 | 10.334&lt;/P&gt;&lt;P&gt;TEST3 | 11&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For each TEST I would like to find the best format matching the values in RESULT. Is there a way to report the best format for each TEST, e.g.:&lt;/P&gt;&lt;P&gt;TEST1 = character with length 7&lt;/P&gt;&lt;P&gt;TEST2 = float with length 6.3&lt;/P&gt;&lt;P&gt;TEST3 = integer&lt;/P&gt;</description>
    <pubDate>Wed, 08 Dec 2021 16:33:27 GMT</pubDate>
    <dc:creator>_phintaC_</dc:creator>
    <dc:date>2021-12-08T16:33:27Z</dc:date>
    <item>
      <title>How to report the best format/informat for values within categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784935#M250480</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I have searched far and wide and haven't been able to find an answer to the following:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am curious whether SAS can report directly which format or informat is best suited for a set of values within some categories. For instance:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Let's say we have a categorical variable TEST that contains a number of values, e.g. "TEST1", "TEST2", "TEST3". Furthermore, we have a character variable, RESULT, that contains the associated values, which can be either character or numeric (float or integer), e.g.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;U&gt;TEST&amp;nbsp; &amp;nbsp;| RESULT&lt;/U&gt;&lt;/P&gt;&lt;P&gt;TEST1 | UNKNOWN&lt;/P&gt;&lt;P&gt;TEST2 | 10.411&lt;/P&gt;&lt;P&gt;TEST3 | 1&lt;/P&gt;&lt;P&gt;TEST1 | 10.1&lt;/P&gt;&lt;P&gt;TEST2 | 10.334&lt;/P&gt;&lt;P&gt;TEST3 | 11&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For each TEST I would like to find the best format matching the values in RESULT. Is there a way to report the best format for each TEST, e.g.:&lt;/P&gt;&lt;P&gt;TEST1 = character with length 7&lt;/P&gt;&lt;P&gt;TEST2 = float with length 6.3&lt;/P&gt;&lt;P&gt;TEST3 = integer&lt;/P&gt;</description>
      <pubDate>Wed, 08 Dec 2021 16:33:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784935#M250480</guid>
      <dc:creator>_phintaC_</dc:creator>
      <dc:date>2021-12-08T16:33:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to report the best format/informat for values within categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784940#M250482</link>
      <description>&lt;P&gt;Only thing that jumps to mind:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;OL&gt;
&lt;LI&gt;Transpose so each Test is in its own column&lt;/LI&gt;
&lt;LI&gt;Write the data to a text file&lt;/LI&gt;
&lt;LI&gt;Use PROC IMPORT to import the data and let it guess the types. However, the result depends on your rules: a column with 99% numbers and 1% characters will be read as character, provided the non-numeric values fall within the rows PROC IMPORT scans to guess (the GUESSINGROWS setting).&lt;/LI&gt;
&lt;/OL&gt;
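&lt;P&gt;A minimal sketch of these three steps, not a definitive implementation. It assumes the data set is named HAVE with the variables TEST and RESULT from the question, and that every three consecutive rows form one record (the ID variable is an assumption added here so PROC TRANSPOSE has a BY group). GUESSINGROWS=MAX assumes a recent SAS 9.4 release; a number can be used instead:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sample data from the question; ID is a made-up record number */
data have;
  infile cards dlm='|';
  id = ceil(_n_/3);
  input test :$32. result :$200.;
cards;
TEST1 | UNKNOWN
TEST2 | 10.411
TEST3 | 1
TEST1 | 10.1
TEST2 | 10.334
TEST3 | 11
;

/* Step 1: one column per TEST value */
proc transpose data=have out=wide(drop=_name_);
  by id;
  id test;
  var result;
run;

/* Step 2: write the wide data to a temporary CSV file */
filename csv temp;
proc export data=wide(drop=id) outfile=csv dbms=csv replace;
run;

/* Step 3: let PROC IMPORT guess the types from all rows */
proc import datafile=csv out=guessed dbms=csv replace;
  getnames=yes;
  guessingrows=max;
run;

/* Report the guessed type, length, format and informat per column */
proc contents data=guessed;
run;&lt;/CODE&gt;&lt;/PRE&gt;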
&lt;P&gt;I guess my question is why?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Dec 2021 16:46:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784940#M250482</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2021-12-08T16:46:28Z</dc:date>
    </item>
    <item>
      <title>Re: How to report the best format/informat for values within categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784942#M250483</link>
      <description>&lt;P&gt;Character variables basically have one format ($w.), so there is no such thing as an "integer" or "float" format for them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The fact that some of your values look like they should be displayed as numbers suggests that you may have the wrong variable type. You appear to want a numeric variable for most of the values, with the option of displaying the text "Unknown". That calls for a custom format: instead of storing the value "Unknown" you store a missing value, possibly a special missing (.U perhaps), and use BEST for the remaining values.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;A custom informat can read text into numeric values such as:&lt;/P&gt;
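&lt;PRE&gt;proc format;
  /* Informat: read the text UNKNOWN (any case) as special missing .U */
  invalue result (upcase)
    'UNKNOWN'=.U
  ;

  /* Format: display .U as text and everything else with BEST8. */
  value result
    .U='Unknown'
    other=[best8.]
  ;
run;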
data example;
  input result :result.;
datalines;
UNKNOWN
10.411
1
10.1 
10.334
11 
;

proc print data=example;
  format result result.;
run;&lt;/PRE&gt;
&lt;P&gt;The special missing values .A to .Z and ._ can represent different reasons why data is missing, and a format can provide a description for each as desired. The example above includes an informat that reads your values into a numeric variable and a matching format. The other=[best8.] range means that any value other than .U is displayed in the best manner that fits within 8 characters.&lt;/P&gt;
&lt;P&gt;When an input value matches no range in the INVALUE statement, it is read with the default informat for the variable type, so the numeric values are treated as if read with a 12. informat.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 08 Dec 2021 16:51:27 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784942#M250483</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2021-12-08T16:51:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to report the best format/informat for values within categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784947#M250484</link>
      <description>&lt;P&gt;That is essentially the exercise I went through in creating this macro:&lt;/P&gt;
&lt;P&gt;&lt;A href="https://github.com/sasutils/macros/blob/master/csv2ds.sas" target="_blank" rel="noopener"&gt;https://github.com/sasutils/macros/blob/master/csv2ds.sas&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So, using the idea of converting your tall file into a wide file:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
  infile cards dlm='|';
  id = ceil(_n_/3);
  input name :$32. value :$200.;
cards;
TEST1 | UNKNOWN
TEST2 | 10.411
TEST3 | 1
TEST1 | 10.1 
TEST2 | 10.334
TEST3 | 11 
;

proc transpose data=have out=wide(drop=_name_);
  by id;
  id name;
  var value;
run;

filename csv temp;
data _null_ ;
  file csv dsd ;
  set have;
  where id=1;
  put name @;
run;

data _null_;
  set wide (drop=id);
  file csv dsd mod ;
  put (_all_) (+0);
run;

%csv2ds(csv,out=test,replace=1);

proc print data=_types_;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;PRE&gt;Obs  varnum  name   xtype      type  length  informat  format  label  minlength  maxlength  min     max      nonmiss  numeric  comma  integer  date  datetime  time  yymmdd  mmddyy  ddmmyy  e8601dz  anydtdtm

1    1       TEST1  character  char  $7                               4          7          10.1    UNKNOWN  2        1        1      0        0     0         0     0       0       0       0        0
2    2       TEST2  numeric    num   8                                6          6          10.334  10.411   2        2        2      0        0     0         0     0       0       0       0        2
3    3       TEST3  integer    num   8                                1          2          1       11       2        2        2      2        0     0         0     0       0       0       0        0
&lt;/PRE&gt;
&lt;P&gt;Code generated by the macro to read the CSV file:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  infile CSV dlm=',' dsd truncover firstobs=2 ;
  length TEST1 $7 TEST2 8 TEST3 8 ;
  input TEST1 -- TEST3 ;
run;
&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Wed, 08 Dec 2021 18:00:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/784947#M250484</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2021-12-08T18:00:11Z</dc:date>
    </item>
    <item>
      <title>Re: How to report the best format/informat for values within categories</title>
      <link>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/785120#M250542</link>
      <description>&lt;P&gt;Given data like this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;                        
  infile cards dlm='|';           
  input name :$32. value :$200.;  
cards;                            
TEST1 | UNKNOWN                   
TEST2 | 10.411                    
TEST3 | 1                         
TEST1 | 10.1                      
TEST2 | 10.334                    
TEST3 | 11                        
;run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You can check the values you want like this:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First scan for length and decimals:&lt;/P&gt;
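&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  set have;
  /* one or more digits, an optional period, then zero or more digits (captured) */
  prxid=prxparse('/^\d+\.?(\d*)$/');
  if prxmatch(prxid,trim(value)) then do;
    length=length(trim(value));                /* total width of the number */
    decimals=lengthn(prxposn(prxid,1,value));  /* digits after the period, 0 if none */
  end;
  keep name length decimals;
run;&lt;/CODE&gt;&lt;/PRE&gt;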
&lt;P&gt;The PRX expression looks for a string consisting of one or more digits, an optional period, and then zero or more digits (and nothing else). I used the TRIM function to remove trailing blanks, which would otherwise prevent the $ anchor from matching.&lt;/P&gt;
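&lt;P&gt;To illustrate which strings the expression accepts, here is a small sketch. The sample values are made up for illustration; note that the pattern as written allows no sign and requires a digit before the period:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data _null_;
  length value $200;
  prxid = prxparse('/^\d+\.?(\d*)$/');
  /* '-1.5' and '.5' do not match: no sign is allowed and a leading digit is required */
  do value = '10.411', '11', '10.', 'UNKNOWN', '-1.5', '.5';
    if prxmatch(prxid, trim(value)) then do;
      /* capture group 1 holds the digits after the period (may be empty) */
      decimals = lengthn(prxposn(prxid, 1, value));
      put value= decimals=;
    end;
    else put value= 'does not match';
  end;
run;&lt;/CODE&gt;&lt;/PRE&gt;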
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Now you just have to find the maximum values:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc summary data=test nway;       
  class name;                      
  var length decimals;             
  output out=want (drop=_:) max=;  
run;&lt;/CODE&gt;&lt;/PRE&gt;
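&lt;P&gt;As a sketch under assumptions, the scan can be extended to also flag non-numeric values and to build a format string per NAME. The variable names WIDTH, IS_CHAR and BEST and the data set names SCAN, MAXIMA and REPORT are made up for this example, and any NAME with at least one non-numeric VALUE is treated as character:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Same sample data as above, repeated so the sketch runs on its own */
data have;
  infile cards dlm='|';
  input name :$32. value :$200.;
cards;
TEST1 | UNKNOWN
TEST2 | 10.411
TEST3 | 1
TEST1 | 10.1
TEST2 | 10.334
TEST3 | 11
;

/* Per row: total width, number of decimals, and a non-numeric flag */
data scan;
  set have;
  prxid = prxparse('/^\d+\.?(\d*)$/');
  width = lengthn(trim(value));
  if prxmatch(prxid, trim(value)) then do;
    is_char = 0;
    decimals = lengthn(prxposn(prxid, 1, value));
  end;
  else do;
    is_char = 1;
    decimals = 0;
  end;
  keep name width decimals is_char;
run;

/* Per NAME: the maximum of each measure */
proc summary data=scan nway;
  class name;
  var width decimals is_char;
  output out=maxima (drop=_:) max=;
run;

/* Turn the maxima into a suggested format string */
data report;
  set maxima;
  length best $20;
  if is_char then best = cats('$', width, '.');
  else if decimals gt 0 then best = cats(width, '.', decimals);
  else best = cats(width, '.');
run;

proc print data=report;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;The maximum width and the maximum number of decimals can come from different rows, so the suggested w.d format is a starting point rather than a guarantee that every value fits.&lt;/P&gt;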
&lt;P&gt;Edit note: in the scan step I changed the second LENGTH function to LENGTHN, so that zero decimals are not reported as one.&lt;/P&gt;</description>
      <pubDate>Thu, 09 Dec 2021 10:28:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/How-to-report-the-best-format-informat-for-values-within/m-p/785120#M250542</guid>
      <dc:creator>s_lassen</dc:creator>
      <dc:date>2021-12-09T10:28:22Z</dc:date>
    </item>
  </channel>
</rss>

