<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic How to identify unacceptable account numbers in the data set? in SAS Procedures</title>
    <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104856#M29280</link>
    <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi SAS Forum,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a data set like this. "account_number" is a character variable in it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data test;&lt;/P&gt;&lt;P&gt;input account_number $;&lt;/P&gt;&lt;P&gt;cards;&lt;/P&gt;&lt;P&gt;2342342&lt;/P&gt;&lt;P&gt;450525&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;0&lt;/P&gt;&lt;P&gt;1254&lt;/P&gt;&lt;P&gt;0&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;4522&lt;/P&gt;&lt;P&gt;ABC&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My objective: I need to identify and remove unacceptable values of the "account_number" variable.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A priori, we do not know what kinds of unacceptable account numbers are hidden in my data set of one million records.&lt;/P&gt;&lt;P&gt;Could you please suggest a method for this?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This is what I have done.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(1) Step 1: Using proc freq, I identified two missing account numbers, two zeros and one ABC as unacceptable values, because a customer's account_number cannot take any of them.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc freq data=test;&lt;/P&gt;&lt;P&gt;&amp;nbsp; tables account_number/missing;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, I cannot use proc freq this way on my large data set.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(2) Step 2: I then used the following code to remove the unacceptable values.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;set test;&lt;/P&gt;&lt;P&gt;if account_number NOTIN (' ', 'ABC', '0');&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(3) Step 3: Now I have a cleaned data set.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Question:&lt;/P&gt;&lt;P&gt;I do not think this method is realistic. Could anyone suggest a realistic method to achieve my objective?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Mirisage&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
    <pubDate>Wed, 24 Oct 2012 03:05:17 GMT</pubDate>
    <dc:creator>Mirisage</dc:creator>
    <dc:date>2012-10-24T03:05:17Z</dc:date>
    <item>
      <title>How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104856#M29280</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi SAS Forum,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I have a data set like this. "account_number" is a character variable in it.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data test;&lt;/P&gt;&lt;P&gt;input account_number $;&lt;/P&gt;&lt;P&gt;cards;&lt;/P&gt;&lt;P&gt;2342342&lt;/P&gt;&lt;P&gt;450525&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;0&lt;/P&gt;&lt;P&gt;1254&lt;/P&gt;&lt;P&gt;0&lt;/P&gt;&lt;P&gt;.&lt;/P&gt;&lt;P&gt;4522&lt;/P&gt;&lt;P&gt;ABC&lt;/P&gt;&lt;P&gt;;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;My objective: I need to identify and remove unacceptable values of the "account_number" variable.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;A priori, we do not know what kinds of unacceptable account numbers are hidden in my data set of one million records.&lt;/P&gt;&lt;P&gt;Could you please suggest a method for this?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;This is what I have done.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(1) Step 1: Using proc freq, I identified two missing account numbers, two zeros and one ABC as unacceptable values, because a customer's account_number cannot take any of them.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;proc freq data=test;&lt;/P&gt;&lt;P&gt;&amp;nbsp; tables account_number/missing;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;However, I cannot use proc freq this way on my large data set.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(2) Step 2: I then used the following code to remove the unacceptable values.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;set test;&lt;/P&gt;&lt;P&gt;if account_number NOTIN (' ', 'ABC', '0');&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(3) Step 3: Now I have a cleaned data set.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Question:&lt;/P&gt;&lt;P&gt;I do not think this method is realistic. Could anyone suggest a realistic method to achieve my objective?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Mirisage&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Oct 2012 03:05:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104856#M29280</guid>
      <dc:creator>Mirisage</dc:creator>
      <dc:date>2012-10-24T03:05:17Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104857#M29281</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;So you are defining your rules on the fly. One rule to start with, from your description: it has to be digits larger than zero:&lt;/P&gt;&lt;P&gt;data want;&lt;/P&gt;&lt;P&gt;&amp;nbsp; set test;&lt;/P&gt;&lt;P&gt;&amp;nbsp; if account_number &amp;gt; 0;&lt;/P&gt;&lt;P&gt;run;&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Well, your log will look ugly if you have too many 'ABC' values: account_number is character, so the comparison forces a character-to-numeric conversion, and every value that is not a number writes an invalid-data note to the log.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Haikuo&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Oct 2012 03:18:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104857#M29281</guid>
      <dc:creator>Haikuo</dc:creator>
      <dc:date>2012-10-24T03:18:43Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104858#M29282</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;What are the criteria that define a valid account number?&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Oct 2012 03:19:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104858#M29282</guid>
      <dc:creator>art297</dc:creator>
      <dc:date>2012-10-24T03:19:05Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104859#M29283</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;One of the key questions to answer is whether your account_number must contain only digits.&amp;nbsp; Another is whether its length is no more than 15 digits.&amp;nbsp; (Account numbers with 16 or more digits, e.g. credit cards, are best left as character values because they exceed the limit of numeric precision in SAS unless you are working on a mainframe.)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If the answer to both is yes, you can convert account_number to a true number:&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp; account_num = input(account_number, ?? best32.);&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;(?? suppresses the log messages when non-numeric values are encountered. The width is given explicitly because a bare best. informat reads only the first 12 characters.)&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Any account with a missing value for account_num would be invalid.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Then use proc univariate to analyze the number range.&amp;nbsp; In addition to means and medians you can get the largest and smallest values.&amp;nbsp; Examine these to check whether they are in range.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Also look in the data for accounts with 11111111, 999999, 123456, 000000 and the like.&lt;/P&gt;&lt;P&gt;You have to think like the front-line staff who find they have to enter some value into a form and make up a number (which they can later replace if necessary).&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Richard in Oz&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
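      <!--
A minimal sketch of the conversion and range check described in this reply, not code from the thread. It assumes the TEST data set and the character variable account_number from the question. The names check, account_num, is_invalid and same_char are hypothetical, and the informat width of 32 is an assumption chosen so that long values are read in full.

```sas
/* Assumes the TEST data set and character variable account_number from the question. */
data check;
  set test;
  /* ?? suppresses the invalid-data notes; non-numeric values such as ABC become missing. */
  account_num = input(account_number, ?? best32.);
  is_invalid = missing(account_num);
  /* 1 when every character equals the first one (11111111, 999999, 000000). */
  /* Blank and single-character values are also flagged with 1.              */
  same_char = (lengthn(compress(account_number, substr(account_number, 1, 1))) = 0);
run;

/* The default output includes the five lowest and five highest values of account_num. */
proc univariate data=check;
  var account_num;
  id account_number;
run;
```

The same_char flag only catches repeated-character placeholders; made-up sequences such as 123456 would need a separate rule.
      -->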
      <pubDate>Wed, 24 Oct 2012 05:45:13 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104859#M29283</guid>
      <dc:creator>RichardinOz</dc:creator>
      <dc:date>2012-10-24T05:45:13Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104860#M29284</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;If you want to analyze data quality repeatedly, you might want to look at a tool instead of trying to fix it by ad hoc programming. It seems that DataFlux should suit your needs.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Wed, 24 Oct 2012 08:40:56 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104860#M29284</guid>
      <dc:creator>LinusH</dc:creator>
      <dc:date>2012-10-24T08:40:56Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104861#M29285</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Without actual rules this is going to be difficult in places.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;I think one place to look is the NOTDIGIT function, if your accounts are only supposed to contain digits. It returns the position of the first character that is not a digit, so it identifies any string containing anything besides digits.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;If this were my data I would also look very closely at any number of only one digit.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;PROC FREQ with NOPRINT, sending the output to a data set, may also be useful. Depending on the nature of the data, any account with only one record might be suspect.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Of course, there is a lot of other information that should likely be examined at the same time, such as the office entering the data and the date entered, to find different types of suspect accounts.&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
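      <!--
A minimal sketch of the NOTDIGIT and PROC FREQ suggestions in this reply, not code from the thread. It assumes the TEST data set and the character variable account_number from the question. The names flagged, non_digit, one_char, is_blank, acct_counts and singletons are hypothetical.

```sas
/* Assumes the TEST data set and character variable account_number from the question. */
data flagged;
  set test;
  /* NOTDIGIT returns the position of the first non-digit character, or 0 if there is none. */
  /* STRIP removes the trailing blanks, which would otherwise count as non-digits.          */
  non_digit = (notdigit(strip(account_number)) > 0);
  /* A blank value contains no non-digit characters after STRIP, so it is flagged separately. */
  is_blank = missing(account_number);
  one_char = (lengthn(account_number) = 1);
run;

/* One row per distinct account number, with COUNT and PERCENT, written to a data set. */
proc freq data=test noprint;
  tables account_number / missing out=acct_counts;
run;

/* Account numbers that occur exactly once, kept for review. */
data singletons;
  set acct_counts;
  where count = 1;
run;
```

Writing the frequencies to a data set rather than to the listing means the counts can be filtered and sorted like any other table, which matters when there are on the order of a million distinct values.
      -->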
      <pubDate>Wed, 24 Oct 2012 14:45:57 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104861#M29285</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2012-10-24T14:45:57Z</dc:date>
    </item>
    <item>
      <title>Re: How to identify unacceptable account numbers in the data set?</title>
      <link>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104862#M29286</link>
      <description>&lt;HTML&gt;&lt;HEAD&gt;&lt;/HEAD&gt;&lt;BODY&gt;&lt;P&gt;Hi Ballardw, Haikuo, Art, Richard in Oz and LinusH,&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Thank you very much, every one of you, for sharing this knowledge.&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Mirisage&lt;/P&gt;&lt;/BODY&gt;&lt;/HTML&gt;</description>
      <pubDate>Tue, 30 Oct 2012 01:45:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Procedures/How-to-identify-unaccpetable-account-numbers-in-the-data-set/m-p/104862#M29286</guid>
      <dc:creator>Mirisage</dc:creator>
      <dc:date>2012-10-30T01:45:25Z</dc:date>
    </item>
  </channel>
</rss>

