<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: How to remove variables in a Data set that contain a certain percentage of special values in SAS Enterprise Guide</title>
    <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778687#M39821</link>
    <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18293"&gt;@Norman21&lt;/a&gt;.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I will take a look and i will let you know if it works.&amp;nbsp;&lt;/P&gt;</description>
    <pubDate>Fri, 05 Nov 2021 10:01:51 GMT</pubDate>
    <dc:creator>EduanR</dc:creator>
    <dc:date>2021-11-05T10:01:51Z</dc:date>
    <item>
      <title>How to remove variables in a Data set that contain a certain percentage of special values</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778515#M39819</link>
      <description>&lt;P&gt;Good day!&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope all is well with you. This is my first time posting a question on the SAS blog so I apologize if a listed it under the wrong location. I thought of listing it to enterprise guide since I am working in enterprise guide&amp;nbsp;&lt;span class="lia-unicode-emoji" title=":grinning_face:"&gt;😀&lt;/span&gt;.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have the following problem: I want to remove variables in a data set that have more than 90% of the following values:-1 , -2, -3, -4, -5 , -6 , -7. Stated otherwise, if a variable is made up of more than 90% of the above listed values, the variable should be removed from the data set. Below is an example of the data set I am working with.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;TABLE border="1"&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TD&gt;ID&lt;/TD&gt;&lt;TD&gt;Target Variable&lt;/TD&gt;&lt;TD&gt;Var1&lt;/TD&gt;&lt;TD&gt;Var2&lt;/TD&gt;&lt;TD&gt;...&lt;/TD&gt;&lt;TD&gt;Var675&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;001&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;-2&lt;/TD&gt;&lt;TD&gt;-2&lt;/TD&gt;&lt;TD&gt;..&lt;/TD&gt;&lt;TD&gt;-3&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;002&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;-2&lt;/TD&gt;&lt;TD&gt;-2&lt;/TD&gt;&lt;TD&gt;...&lt;/TD&gt;&lt;TD&gt;-2&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;003&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;24&lt;/TD&gt;&lt;TD&gt;1000&lt;/TD&gt;&lt;TD&gt;...&lt;/TD&gt;&lt;TD&gt;100&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;004&lt;/TD&gt;&lt;TD&gt;0&lt;/TD&gt;&lt;TD&gt;-2&lt;/TD&gt;&lt;TD&gt;-4&lt;/TD&gt;&lt;TD&gt;...&lt;/TD&gt;&lt;TD&gt;-4&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;005&lt;/TD&gt;&lt;TD&gt;1&lt;/TD&gt;&lt;TD&gt;-3&lt;/TD&gt;&lt;TD&gt;-5&lt;/TD&gt;&lt;TD&gt;...&lt;/TD&gt;&lt;TD&gt;-3&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have not yet written any code. But I got the feeling that I will probably have to use a loop (to loop through all the columns). So therefore I created the following macro definition that will perform this process for all the numeric columns in any given data set. All the columns from Var1 - Var675 are numeric.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;%macro control(dsn, method = Quantile);

%let lib = %upcase(%scan(&amp;amp;dsn,1,.));
%let dsn = %upcase(%scan(&amp;amp;dsn,2,.));
%let method = %upcase(&amp;amp;method);

/*	Creatign macro's for each column in the data set	*/

data _null_;
	set sashelp.vcolumn(keep=libname memname name type
	where=(libname = "&amp;amp;lib" AND memname = "&amp;amp;dsn" and type = "num")) end=eof;
		call symputx('col'||strip(_n_),strip(name));
		if eof then call symputx('max',strip(_n_));
run;

%do i = 1 %to &amp;amp;max;

I presume here is where the code will go for my problem.
 
%end;

%mend control&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Thanks in advance!&lt;/P&gt;</description>
      <pubDate>Thu, 04 Nov 2021 16:05:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778515#M39819</guid>
      <dc:creator>EduanR</dc:creator>
      <dc:date>2021-11-04T16:05:37Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove variables in a Data set that contain a certain percentage of special values</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778664#M39820</link>
      <description>&lt;P&gt;Welcome to the Community!&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;You might find the following helpful:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&lt;A href="https://blogs.sas.com/content/iml/2011/09/19/count-the-number-of-missing-values-for-each-variable.html" target="_blank"&gt;https://blogs.sas.com/content/iml/2011/09/19/count-the-number-of-missing-values-for-each-variable.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Nov 2021 06:39:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778664#M39820</guid>
      <dc:creator>Norman21</dc:creator>
      <dc:date>2021-11-05T06:39:22Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove variables in a Data set that contain a certain percentage of special values</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778687#M39821</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18293"&gt;@Norman21&lt;/a&gt;.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I will take a look and i will let you know if it works.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Fri, 05 Nov 2021 10:01:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778687#M39821</guid>
      <dc:creator>EduanR</dc:creator>
      <dc:date>2021-11-05T10:01:51Z</dc:date>
    </item>
    <item>
      <title>Re: How to remove variables in a Data set that contain a certain percentage of special values</title>
      <link>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778975#M39824</link>
      <description>&lt;P&gt;Thank you&amp;nbsp;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/18293"&gt;@Norman21&lt;/a&gt;&amp;nbsp;fro the help.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I managed to write the following code and it seems to work/provide the desired output.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Firstly, I created an empty table with the necessary specification before the macro as follows:&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;proc sql;
	Create table Datasets.Special_Perc_&amp;amp;Split
	(Variable			char 	length = 23 	format=$50.,
	Sum_Special			num 	length = 8		format = best12.,
	Sum_positive		num		length = 8		format = best12.,
	Percentage 			num		length = 8		format = Percent6.2);		
quit; &lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;&amp;nbsp;Secondly,&amp;nbsp; I wrote a macro to calculate the percentage of special values for each column in any given data set.&amp;nbsp; This is then added to the table that I created above for each iteration of the loop.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;PRE&gt;&lt;CODE class=""&gt;%macro Special(dsn);

	%let lib = %upcase(%scan(&amp;amp;dsn,1,.));
	%let dsn = %upcase(%scan(&amp;amp;dsn,2,.));

	/*	Creatign macro's for each column in the data set	*/

	data _null_;
		set sashelp.vcolumn(keep=libname memname name type
		where=(libname = "&amp;amp;lib" AND memname = "&amp;amp;dsn" and type = "num")) end=eof;
			call symputx('col'||strip(_n_),strip(name));
			if eof then call symputx('max',strip(_n_));
	run;

	%do i = 1 %to &amp;amp;max;
	
		proc sql noprint;
		select sum(&amp;amp;&amp;amp;col&amp;amp;i in (-1,-2,-3,-4,-5,-6,-7)) as NegS,
		       sum(&amp;amp;&amp;amp;col&amp;amp;i not in (-1,-2,-3,-4,-5,-6,-7)) as Pos,
			round(calculated NegS/(calculated NegS + calculated Pos), .01) as Perct
			into :SN, :SP, :Perc
		from &amp;amp;lib..&amp;amp;dsn;
		quit;

		data b;
			Variable = "&amp;amp;&amp;amp;col&amp;amp;i";
			Sum_Special = &amp;amp;sn;
			Sum_positive = &amp;amp;sp;
			Percentage = &amp;amp;Perc;
		run;

		proc append base = Datasets.Special_Perc_&amp;amp;split data = b force; 
		run;

		proc sql;
			drop table b;
		quit;
   %end;

%mend special;

%special(Datasets.Longevity_Numeric_&amp;amp;split);&lt;/CODE&gt;&lt;/PRE&gt;&lt;P&gt;Lastly, i use a proc sql to exclude those variables that have a special value percentage of more than 90%.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;It may not be the best way to do it, but it got the job done.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="EduanR_0-1636283540132.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/65460iFD71F2E25BFAD1D7/image-size/medium?v=v2&amp;amp;px=400" role="button" title="EduanR_0-1636283540132.png" alt="EduanR_0-1636283540132.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Sun, 07 Nov 2021 11:13:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Enterprise-Guide/How-to-remove-variables-in-a-Data-set-that-contain-a-certain/m-p/778975#M39824</guid>
      <dc:creator>EduanR</dc:creator>
      <dc:date>2021-11-07T11:13:49Z</dc:date>
    </item>
  </channel>
</rss>

