<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: dropping data with low freq in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259628#M269016</link>
    <pubDate>Tue, 29 Mar 2016 12:10:50 GMT</pubDate>
    <dc:creator>Fersal</dc:creator>
    <dc:date>2016-03-29T12:10:50Z</dc:date>
    <item>
      <title>dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259321#M269011</link>
      <description>&lt;P&gt;Hi all,&lt;/P&gt;&lt;P&gt;I have a dataset from which I wish to drop (for subsequent analysis) those values of each variable that occur in only one observation, that is, values with a frequency count of 1. The following table shows a similar example. The values in bold are those I do not want to consider in my analysis. I would appreciate any suggestions. Thanks.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Trait Var1 Var2 ...Var300&lt;BR /&gt;1 200/200 120/120 100/100&lt;BR /&gt;1 150/150 140/140 &lt;STRONG&gt;99/99&lt;/STRONG&gt;&lt;BR /&gt;1 100/100 180/180 135/135&lt;BR /&gt;1 150/150 120/120 100/100&lt;BR /&gt;1 100/100 180/180 160/160&lt;BR /&gt;1 150/150 &lt;STRONG&gt;119/119&lt;/STRONG&gt; 160/160&lt;BR /&gt;1 200/200 140/140 135/135&lt;BR /&gt;0 150/150 180/180&lt;BR /&gt;0 &lt;STRONG&gt;149/149&lt;/STRONG&gt; 140/140 160/160&lt;BR /&gt;0 100/100 120/120 &lt;STRONG&gt;133/133&lt;/STRONG&gt;&lt;BR /&gt;0 100/100 140/140 100/100&lt;BR /&gt;0 200/200 180/180 135/135&lt;BR /&gt;0 150/150 140/140 160/160&lt;BR /&gt;0 200/200 180/180 100/100&lt;BR /&gt;0 100/100 120/120 135/135&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 10:06:31 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259321#M269011</guid>
      <dc:creator>Fersal</dc:creator>
      <dc:date>2016-03-28T10:06:31Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259334#M269012</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;This needs a little refinement (the macro variable allvars should not have to be created manually), but it may be a solution:&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;data have;&lt;BR /&gt;infile datalines missover;&lt;BR /&gt;input Trait $ Var1 :$7. Var2 :$7. Var300 :$7.;&lt;BR /&gt;datalines;&lt;BR /&gt;1 200/200 120/120 100/100&lt;BR /&gt;1 150/150 140/140 99/99&lt;BR /&gt;1 100/100 180/180 135/135&lt;BR /&gt;1 150/150 120/120 100/100&lt;BR /&gt;1 100/100 180/180 160/160&lt;BR /&gt;1 150/150 119/119 160/160&lt;BR /&gt;1 200/200 140/140 135/135&lt;BR /&gt;0 150/150 180/180&lt;BR /&gt;0 149/149 140/140 160/160&lt;BR /&gt;0 100/100 120/120 133/133&lt;BR /&gt;0 100/100 140/140 100/100&lt;BR /&gt;0 200/200 180/180 135/135&lt;BR /&gt;0 150/150 140/140 160/160&lt;BR /&gt;0 200/200 180/180 100/100&lt;BR /&gt;0 100/100 120/120 135/135&lt;BR /&gt;;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;/* Variables to process (listed manually for now) */&lt;BR /&gt;%let allvars=Var1 Var2 Var300;&lt;BR /&gt;&lt;BR /&gt;%macro h;&lt;BR /&gt;&lt;BR /&gt;%let i=1;&lt;BR /&gt;&lt;BR /&gt;/* Loop over the variable names in allvars */&lt;BR /&gt;%do %while (%scan(&amp;amp;allvars,&amp;amp;i) ne);&lt;BR /&gt;&lt;BR /&gt;%let var=%scan(&amp;amp;allvars,&amp;amp;i);&lt;BR /&gt;&lt;BR /&gt;/* Values of the current variable that occur exactly once */&lt;BR /&gt;proc sql;&lt;BR /&gt;create table int as&lt;BR /&gt;select &amp;amp;var, count(&amp;amp;var) as no&lt;BR /&gt;from have&lt;BR /&gt;group by &amp;amp;var&lt;BR /&gt;having no eq 1;&lt;BR /&gt;quit;&lt;BR /&gt;&lt;BR /&gt;/* Set those values to missing in have */&lt;BR /&gt;data have;&lt;BR /&gt;set have;&lt;BR /&gt;if _n_=1 then&lt;BR /&gt;&amp;nbsp;do;&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;declare hash i(dataset:'int');&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;i.definekey("&amp;amp;var");&lt;BR /&gt;&amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;i.definedone();&lt;BR /&gt;&amp;nbsp;end;&lt;BR /&gt;&lt;BR /&gt;/* check() returns 0 when the current value is one of the single-occurrence values */&lt;BR /&gt;rc=i.check();&lt;BR /&gt;if rc=0 then call missing(&amp;amp;var);&lt;BR /&gt;drop rc;&lt;BR /&gt;run;&lt;BR /&gt;&lt;BR /&gt;%let i=%eval(&amp;amp;i+1);&lt;BR /&gt;%end;&lt;BR /&gt;&lt;BR /&gt;%mend h;&lt;BR /&gt;&lt;BR /&gt;%h&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 12:44:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259334#M269012</guid>
      <dc:creator>Loko</dc:creator>
      <dc:date>2016-03-28T12:44:39Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259339#M269013</link>
      <description>&lt;P&gt;Look at PROC SORT with the NOUNIQUEKEY option.&amp;nbsp;&lt;/P&gt;
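&lt;P&gt;A minimal sketch for a single variable, assuming a SAS release in which PROC SORT supports NOUNIQUEKEY and UNIQUEOUT= (9.3 or later, as far as I know) and the dataset and variable names from the sample data. The output names multi and singles are made up for illustration. Note that this removes whole observations whose Var1 value occurs only once, rather than setting that value to missing:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* multi: observations whose Var1 value occurs more than once */
/* singles: observations whose Var1 value occurs exactly once */
proc sort data=have out=multi nouniquekey uniqueout=singles;
    by Var1;
run;&lt;/CODE&gt;&lt;/PRE&gt;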
&lt;P&gt;You'll have to run it once for each variable, so a macro loop is probably still required.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 13:18:05 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259339#M269013</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-03-28T13:18:05Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259388#M269014</link>
      <description>&lt;P&gt;Hi&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/78747"&gt;@Fersal&lt;/a&gt;,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;First of all, you should be sure that your analysis will still be valid after excluding values just because they happen to occur only once within a trait.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;If your dataset contains only 7-8 observations per trait (as your sample data), isn't it quite likely that some valid values (in some of the 300 variables) will occur only once?&amp;nbsp;On the other hand, if your real dataset has thousands of observations, some "improbable" values might occur 2, 3 or 4 times, or only once, just by chance.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;What I'm trying to say is that the rule "exclude values with frequency count 1" is not necessarily sensible.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Maybe you can specify exclusion criteria based on the values themselves and apply these by means of (possibly user-defined) functions or formats.&lt;/P&gt;
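&lt;P&gt;For illustration only, here is a sketch of the format approach. The listed values are hypothetical placeholders for whatever your exclusion criteria turn out to be, and the names $excl and want are made up. It also assumes that the variables are character, share the prefix Var, and hold values no longer than 7 characters, as in your sample:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;proc format;
    /* hypothetical list of values to exclude, mapped to blank (missing) */
    value $excl '99/99', '119/119' = ' ';
run;

data want;
    set have;
    array a{*} Var:;
    do i = 1 to dim(a);
        /* width 7 keeps values that the format does not list unchanged */
        a{i} = put(a{i}, $excl7.);
    end;
    drop i;
run;&lt;/CODE&gt;&lt;/PRE&gt;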
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Also, it could simplify the code if you transposed the data from wide to long format (esp. if the 300 variables have identical types and lengths).&lt;/P&gt;</description>
      <pubDate>Mon, 28 Mar 2016 17:05:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259388#M269014</guid>
      <dc:creator>FreelanceReinh</dc:creator>
      <dc:date>2016-03-28T17:05:25Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259447#M269015</link>
      <description>&lt;P&gt;If you still want to do it:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data have;
infile datalines truncover;
input Trait (Var1 - Var3) ($);
datalines;
1 200/200 120/120 100/100
1 150/150 140/140 99/99
1 100/100 180/180 135/135
1 150/150 120/120 100/100
1 100/100 180/180 160/160
1 150/150 119/119 160/160
1 200/200 140/140 135/135
0 150/150 180/180
0 149/149 140/140 160/160
0 100/100 120/120 133/133
0 100/100 140/140 100/100
0 200/200 180/180 135/135
0 150/150 140/140 160/160
0 200/200 180/180 100/100
0 100/100 120/120 135/135
;

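/* Wide to long: one row per non-missing value, remembering its source variable */
data list;
set have;
obs = _n_;
length name $32;
array a{*} var:;
do i = 1 to dim(a);
    if not missing(a{i}) then do;
        name = vname(a{i});
        value = a{i};
        output;
        end;
    end;
keep Trait obs name value;
run;

/* Keep only values that occur more than once for the same variable within a Trait */
proc sql;
create table newList as
select Trait, obs, name, value
from list
group by Trait, name, value
having count(*) &amp;gt; 1
order by Trait, obs, name;
quit;

/* Long to wide: ID puts each value back into the variable it came from */
proc transpose data=newList out=want(drop=_name_ obs);
var value;
by Trait obs;
id name;
run;&lt;/CODE&gt;&lt;/PRE&gt;</description>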
      <pubDate>Mon, 28 Mar 2016 19:50:29 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259447#M269015</guid>
      <dc:creator>PGStats</dc:creator>
      <dc:date>2016-03-28T19:50:29Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259628#M269016</link>
      <description>&lt;P&gt;Hi Loko, thank you for your macro; it worked well. One more thing: I also want to remove variables that are left with only one class. For example, a variable with only two classes (see below) will be left with a single class (170/170) after running the macro. How could I drop such variables from my data using the same macro?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Var4&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;&lt;STRONG&gt;178/178&lt;/STRONG&gt;&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;BR /&gt;170/170&lt;/P&gt;</description>
      <pubDate>Tue, 29 Mar 2016 12:10:50 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259628#M269016</guid>
      <dc:creator>Fersal</dc:creator>
      <dc:date>2016-03-29T12:10:50Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259640#M269017</link>
      <description>&lt;P&gt;Hi FreelanceReinhard,&lt;/P&gt;&lt;P&gt;Thanks for your comments. Actually, the original dataset is pretty big, and the number of observations is much higher than the number of variables (unlike the example I showed). I have had a convergence problem when I try to fit a logit model to this data (even after applying the Firth option), so there are no likelihood estimates for these variables. For that reason I want to know what happens if I remove those classes.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;</description>
      <pubDate>Tue, 29 Mar 2016 12:45:54 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259640#M269017</guid>
      <dc:creator>Fersal</dc:creator>
      <dc:date>2016-03-29T12:45:54Z</dc:date>
    </item>
    <item>
      <title>Re: dropping data with low freq</title>
      <link>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259694#M269018</link>
      <description>&lt;P&gt;I thought it was cells with a count of 0 that caused the issue, not cells with a count of 1?&lt;/P&gt;
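&lt;P&gt;One way to look for such cells, sketched under the assumption that Trait is the response and that the predictors are named as in the sample data (only Var1 and Var2 are listed here): cross-tabulate each predictor against the response and check for levels that never occur with one of the two Trait values.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;/* Sketch only: dataset and variable names are taken from the sample data */
proc freq data=have;
    /* one two-way table per predictor; a 0 cell means that level */
    /* never occurs together with that Trait value */
    tables (Var1 Var2)*Trait / norow nocol nopercent;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;Note that a level occurring only once necessarily has a 0 in one of the two Trait columns, so the two situations are related.&lt;/P&gt;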
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 29 Mar 2016 14:45:48 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/dropping-data-with-low-freq/m-p/259694#M269018</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2016-03-29T14:45:48Z</dc:date>
    </item>
  </channel>
</rss>

