<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Character values in numeric variables in New SAS User</title>
    <link>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613923#M18500</link>
    <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304766"&gt;@PrinceAladdin&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Dear SAS community,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am trying to run an array to convert all blank values for all variables to "." However, when I try to run the array, I receive an error message indicating some of my numeric variables are actually character variables and thus the array cannot run. I am hypothesizing that these variables (which should be numeric) are considered character variables because the data collectors input a value of "N/A" for several observations instead of the intended 1-5 scale. If this is true, is there a way to (1) run an array to identify all non-numeric values and (2) convert those non-numeric values to "."? And will doing so automatically reclassify the intended numeric variable as a numeric variable?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Typically I would consider this a failure in how the data was originally read into SAS. If you know a variable is intended to be numeric then use an approach that reads the data into the correct type.&lt;/P&gt;
&lt;P&gt;I read lots of files with a variety of poor data entry standards and have to create appropriate "missing" or recoded values depending upon the actual characters encountered. If you use a data step to read the data then SAS provides a very nice tool in the form of custom informats to read commonly occurring text into a desired value.&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;PRE&gt;Proc format library=work;
invalue ynx (upcase)
'Y','YES' = 1
'N','NO'  = 0
'X'       = .x
' '       = .
other = _error_;
invalue lickert (upcase)
1,2,3,4,5 =_same_
'NA','N/A' = .n
' '        = .
other    =_error_
;

data example;
   infile datalines dlm=',' truncover;
   informat x ynx. y lickert. ;
   input x y;
datalines;
y,1
Yes,na
N,N/a
 ,5
Y,8
n, 
x,3
;
run;&lt;/PRE&gt;
&lt;P&gt;This creates two custom formats one that expects values of Y or Yes, N or No, X, and blank. The second expects values of 1 to 5, integers, NA or N/A and blank. Any other value encountered will be treated as an invalid data value and provide a note in the log.&lt;/P&gt;
&lt;P&gt;The UPCASE on the invalue statements says to make the encountered value upper case before comparing to the rules in the format. So in the first format Y, y, Yes, YEs, yES, yEs , YeS, yeS would all be considered valid values and treated as 1 on input.&lt;/P&gt;
&lt;P&gt;The informats also use special missing to indicate that special input codes X or NA can be differentiated from a blank though the numeric value is missing and will not be used for any summary statistics. But you can determine between the two types of missing if needed.&lt;/P&gt;
&lt;P&gt;The keywords _same_ means the value assigned is as read so you get numeric 1,2,3,4 and 5. The other=_error_ is the part that says any other value is an invalid value. This will cause the "invalid data" in the log and show the suspect line of data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I use this because I have some sources that just plain refuse to pay attention to spelling and I have to constantly update the informats to accommodate other random spelling changes like inserting hyphens, extra spaces, reversing the order of word values and such.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I also have&amp;nbsp;similar informats to list things like Site location codes. The data sources will either misspell the site names, renamed the site without notification or add new sites. So I get warnings when I read their data and can ask about appropriate behavior for the "new" site code encountered.&lt;/P&gt;</description>
    <pubDate>Thu, 26 Dec 2019 16:11:47 GMT</pubDate>
    <dc:creator>ballardw</dc:creator>
    <dc:date>2019-12-26T16:11:47Z</dc:date>
    <item>
      <title>Character values in numeric variables</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613853#M18489</link>
      <description>&lt;P&gt;Dear SAS community,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I am trying to run an array to convert all blank values for all variables to "." However, when I try to run the array, I receive an error message indicating some of my numeric variables are actually character variables and thus the array cannot run. I am hypothesizing that these variables (which should be numeric) are considered character variables because the data collectors input a value of "N/A" for several observations instead of the intended 1-5 scale. If this is true, is there a way to (1) run an array to identify all non-numeric values and (2) convert those non-numeric values to "."? And will doing so automatically reclassify the intended numeric variable as a numeric variable?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thank you!&lt;/P&gt;</description>
      <pubDate>Thu, 26 Dec 2019 05:39:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613853#M18489</guid>
      <dc:creator>PrinceAladdin</dc:creator>
      <dc:date>2019-12-26T05:39:56Z</dc:date>
    </item>
    <item>
      <title>Re: Character values in numeric variables</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613859#M18490</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304766"&gt;@PrinceAladdin&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If my understanding is right, here is an approach.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;[1] Read into a string of all variables.&lt;/P&gt;
&lt;P&gt;[2] Replace 'n/a' by '.'.&lt;/P&gt;
&lt;P&gt;[3] Scan each variable so that '.' will also be read by SCAN function using 'S' option.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
array a a1 - a5;
input ;
_infile_ = tranwrd(_infile_,'n/a', '.');
do i = 1 to dim(a);
   a[i] = scan(_infile_,i,' ', 's');
end;
drop i;
datalines;
1  2   n/a  2  1
2  1   .    1  2
3 n/a  n/a  2  1
;
run;
&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Edited:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To avoid the&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;BR /&gt;NOTE: Character values have been converted to numeric values at the places given by&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;on the LOG, replace the statement&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;a[i] = scan(_infile_,i,' ', 's');&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;By&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;a[i] = input(scan(_infile_,i,' ', 's'),8.);&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&lt;BR /&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Dec 2019 08:46:21 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613859#M18490</guid>
      <dc:creator>KachiM</dc:creator>
      <dc:date>2019-12-26T08:46:21Z</dc:date>
    </item>
    <item>
      <title>Re: Character values in numeric variables</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613885#M18492</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304766"&gt;@PrinceAladdin&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;I am hypothesizing that these variables (which should be numeric) are considered character variables because the data collectors input a value of "N/A" for several observations instead of the intended 1-5 scale. If this is true, is there a way to (1) run an array to identify all non-numeric values and (2) convert those non-numeric values to "."? And will doing so automatically reclassify the intended numeric variable as a numeric variable?&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;I think perhaps you have formulated the problem incorrectly. You don't want to identify all non-numeric "values", you want to identify all non-numeric &lt;EM&gt;variables&lt;/EM&gt;, then create a numeric variable as its replacement, in which a non-numeric value becomes a missing value which is a dot (without quotes around it). You don't want to leave these variables as character variables because if you did, you would not be able to do any math (like calculate an average) on them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First determine which variables are character, and then you can create replacement variables that don't have this issue.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;To determine all the character variables in your data set, replace "libraryname" with the actual name of the library where this data set exists, and 'datasetname' is the actual name of your data set.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc sql noprint;
    select distinct name into :names separated by ' ' from sashelp.vcolumn 
    where libname="libraryname" and memname='datasetname' and type='char';
quit;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Then, a macro creates corresponding numeric variables where the N/A or other character values are set to missing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro create_num;
    data want;
         set have;
         %do i=1 %to %sysfunc(countw(&amp;amp;names));
             %let thisname=%scan(&amp;amp;names,&amp;amp;i,%str( ));
             length &amp;amp;thisname._N 8;
             &amp;amp;thisname._N = &amp;amp;thisname;
          %end;
     run;
%mend;
%create_num&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Thu, 26 Dec 2019 12:35:44 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613885#M18492</guid>
      <dc:creator>PaigeMiller</dc:creator>
      <dc:date>2019-12-26T12:35:44Z</dc:date>
    </item>
    <item>
      <title>Re: Character values in numeric variables</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613923#M18500</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/304766"&gt;@PrinceAladdin&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;Dear SAS community,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am trying to run an array to convert all blank values for all variables to "." However, when I try to run the array, I receive an error message indicating some of my numeric variables are actually character variables and thus the array cannot run. I am hypothesizing that these variables (which should be numeric) are considered character variables because the data collectors input a value of "N/A" for several observations instead of the intended 1-5 scale. If this is true, is there a way to (1) run an array to identify all non-numeric values and (2) convert those non-numeric values to "."? And will doing so automatically reclassify the intended numeric variable as a numeric variable?&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thank you!&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Typically I would consider this a failure in how the data was originally read into SAS. If you know a variable is intended to be numeric then use an approach that reads the data into the correct type.&lt;/P&gt;
&lt;P&gt;I read lots of files with a variety of poor data entry standards and have to create appropriate "missing" or recoded values depending upon the actual characters encountered. If you use a data step to read the data then SAS provides a very nice tool in the form of custom informats to read commonly occurring text into a desired value.&lt;/P&gt;
&lt;P&gt;Example:&lt;/P&gt;
&lt;PRE&gt;Proc format library=work;
invalue ynx (upcase)
'Y','YES' = 1
'N','NO'  = 0
'X'       = .x
' '       = .
other = _error_;
invalue lickert (upcase)
1,2,3,4,5 =_same_
'NA','N/A' = .n
' '        = .
other    =_error_
;

data example;
   infile datalines dlm=',' truncover;
   informat x ynx. y lickert. ;
   input x y;
datalines;
y,1
Yes,na
N,N/a
 ,5
Y,8
n, 
x,3
;
run;&lt;/PRE&gt;
&lt;P&gt;This creates two custom formats one that expects values of Y or Yes, N or No, X, and blank. The second expects values of 1 to 5, integers, NA or N/A and blank. Any other value encountered will be treated as an invalid data value and provide a note in the log.&lt;/P&gt;
&lt;P&gt;The UPCASE on the invalue statements says to make the encountered value upper case before comparing to the rules in the format. So in the first format Y, y, Yes, YEs, yES, yEs , YeS, yeS would all be considered valid values and treated as 1 on input.&lt;/P&gt;
&lt;P&gt;The informats also use special missing to indicate that special input codes X or NA can be differentiated from a blank though the numeric value is missing and will not be used for any summary statistics. But you can determine between the two types of missing if needed.&lt;/P&gt;
&lt;P&gt;The keywords _same_ means the value assigned is as read so you get numeric 1,2,3,4 and 5. The other=_error_ is the part that says any other value is an invalid value. This will cause the "invalid data" in the log and show the suspect line of data.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I use this because I have some sources that just plain refuse to pay attention to spelling and I have to constantly update the informats to accommodate other random spelling changes like inserting hyphens, extra spaces, reversing the order of word values and such.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I also have&amp;nbsp;similar informats to list things like Site location codes. The data sources will either misspell the site names, renamed the site without notification or add new sites. So I get warnings when I read their data and can ask about appropriate behavior for the "new" site code encountered.&lt;/P&gt;</description>
      <pubDate>Thu, 26 Dec 2019 16:11:47 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613923#M18500</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2019-12-26T16:11:47Z</dc:date>
    </item>
    <item>
      <title>Re: Character values in numeric variables</title>
      <link>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613932#M18502</link>
      <description>&lt;P&gt;How did you create the dataset? Why did you make the variables as character if you wanted them to be numeric?&lt;/P&gt;</description>
      <pubDate>Thu, 26 Dec 2019 16:53:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/New-SAS-User/Character-values-in-numeric-variables/m-p/613932#M18502</guid>
      <dc:creator>Tom</dc:creator>
      <dc:date>2019-12-26T16:53:56Z</dc:date>
    </item>
  </channel>
</rss>

