<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>topic Re: Delete variables with 90% missing values in SAS Programming</title>
    <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940181#M369067</link>
    <description>I also noticed that your Python call "dfbig.isnull()" only checks for NULL values; you also need to check for EMPTY values.&lt;BR /&gt;Unlike a database, SAS treats an EMPTY value as a NULL value, whereas a database treats them as two different things.</description>
    <pubDate>Wed, 21 Aug 2024 01:03:11 GMT</pubDate>
    <dc:creator>Ksharp</dc:creator>
    <dc:date>2024-08-21T01:03:11Z</dc:date>
    <item>
      <title>Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859816#M339675</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have a dataset with around 520 variables, many of them with missing values. I want to delete variables with &amp;gt;90% missing values (I don't want to impute). Is there a macro or any other way to do it?&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I tried ChatGPT, but I got an error that GPT could not solve.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Here is the GPT code:&lt;/P&gt;&lt;P&gt;/* Step 1: Calculate the percentage of missing values for each variable */&lt;BR /&gt;ods output onewayfreqs=missing;&lt;BR /&gt;proc freq data=dxccsr_transposed;&lt;BR /&gt;tables _all_ / missing;&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/* Step 2: Identify variables with 90% or more missing values */&lt;BR /&gt;data _null_;&lt;BR /&gt;set missing end=lastobs;&lt;BR /&gt;if _n_ = 1 then call symputx('numvars',_nobs_);&lt;BR /&gt;array pctmiss(*) _:;&lt;BR /&gt;array vars(*) $ _CHARACTER_;&lt;BR /&gt;do i = 1 to dim(pctmiss);&lt;BR /&gt;if pctmiss(i) &amp;gt;= 90 then vars(i) = scan(vlabel(pctmiss(i)), 1, ' ');&lt;BR /&gt;end;&lt;BR /&gt;if lastobs then call symputx('delvars',catx(' ', of vars(*)));&lt;BR /&gt;run;&lt;/P&gt;&lt;P&gt;/* Step 3: Define a macro to drop variables */&lt;BR /&gt;%macro dropvars;&lt;BR /&gt;data dxccsr_transposed(drop=&amp;amp;delvars);&lt;BR /&gt;set dxccsr_transposed;&lt;BR /&gt;run;&lt;BR /&gt;%mend;&lt;/P&gt;&lt;P&gt;/* Step 4: Call the macro to drop variables */&lt;BR /&gt;%dropvars;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The log error:&lt;/P&gt;&lt;P&gt;NOTE: Line generated by the macro variable "DELVARS".&lt;BR /&gt;1 Table DXCCSR_EAR006 3&lt;BR /&gt;-&lt;BR /&gt;214&lt;BR /&gt;23&lt;/P&gt;&lt;P&gt;ERROR 214-322: Variable name 3 is not valid.&lt;/P&gt;&lt;P&gt;ERROR 23-7: Invalid value for the DROP option.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 02:59:28 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859816#M339675</guid>
      <dc:creator>lansoprazole</dc:creator>
      <dc:date>2023-02-21T02:59:28Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859862#M339693</link>
      <description>&lt;P&gt;What are the names of your original variables? I ask because you are using&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;array pctmiss(*) _:;&lt;/PRE&gt;
&lt;P&gt;And the ODS output for onewayfreqs will only have variables with names that start with _ for the variables I have created, none of the Proc Freq variables. If that is the case and some of your variables of interest do not start with _ that could be the problem. Or did you mean to use F_: for the formatted variables that Ods output creates?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The proc freq code you show would have the variable PERCENT with values of 90 or greater as your value of interest, not any of the other variables.&amp;nbsp; I don't see where you are checking for any missing values. I am afraid that your code checks to see if the original value was greater than 90, not the percent missing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;This will show the variables that have any value with a percent &amp;gt; 90, but it does not check whether that value is missing.&lt;/P&gt;
&lt;PRE&gt;data reduce;
   set missing;
   length var $ 32;
   var=scan(table,2);
   if percent&amp;gt;90;

   keep var percent;
run;&lt;/PRE&gt;
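&lt;P&gt;A small Python sketch of the distinction, assuming toy data in which None stands for a missing value (the column names and helper function are made up for illustration): filtering on any level above 90 percent also catches a column dominated by a single real value, while filtering on the missing level alone does not.&lt;/P&gt;
&lt;PRE&gt;
```python
from collections import Counter

# Hypothetical toy data: None marks a missing value.
data = {
    "mostly_missing": [None] * 19 + [1],
    "mostly_zero": [0] * 19 + [1],
}

def level_percents(values):
    """Percent of rows at each distinct level, with missing kept as its own level."""
    counts = Counter(values)
    return {level: 100 * n / len(values) for level, n in counts.items()}

# Any level above 90 percent: flags both columns.
any_level = [name for name, vals in data.items()
             if any(p > 90 for p in level_percents(vals).values())]

# Only the missing level above 90 percent: flags just the sparse column.
missing_only = [name for name, vals in data.items()
                if level_percents(vals).get(None, 0) > 90]

print(any_level)
print(missing_only)
```
&lt;/PRE&gt;
&lt;P&gt;The second filter is the one that matches the original question, since it looks only at the share of rows where the value is missing.&lt;/P&gt;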
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 08:33:37 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859862#M339693</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-02-21T08:33:37Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859872#M339697</link>
      <description>&lt;P&gt;You need a SAS programmer to help you, not ChatGPT bull*hit, so it's good that you are writing here &lt;span class="lia-unicode-emoji" title=":slightly_smiling_face:"&gt;🙂&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Try this:&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;data test;
  output;
  output;
  output;
  a=4;
  output;
  b=4;
  output;
  c=4;
  output;
  d=4;
  output;
  e=4;
  output;
  f=4;
  output;
  g=4;
  output;
  h=4;
  output;
  i=4;
  output;
  j=4;
  output;
  k=4;
  output;
  l=4;
  m=4;
  n=4;
  o=4;
  output;
run;
proc print data=test;
run;


data _null_;

  if 0 then set test; 
  array _variables_ _NUMERIC_; /* list your variables here */ 

  length _variableName_ $ 32 _numberOfMissing_ 8;
  declare hash _H_(ordered:'a');
  _H_.defineKey("_variableName_");
  _H_.defineData("_variableName_");
  _H_.defineData("_numberOfMissing_");
  _H_.DefineDone();

  do until(EOF);
    set test NOBS=NOBS end=EOF;
    
    do over _variables_;
      _variableName_  = vname(_variables_);
      _numberOfMissing_ = 0;
      _RC_=_H_.find(); 
      _numberOfMissing_ + nmiss(_variables_);
      _H_.replace();
    end;

  end;

  _H_.output(dataset:"missing_count");
  stop;
run;

proc print data=missing_count;
run;

data missing_count;
  if 0 then set test(drop=_ALL_) nobs=_nobs_;
  set missing_count;
  nobs=_nobs_;
  _percentOfMissing_ = _numberOfMissing_/nobs;
  format _percentOfMissing_ percent10.2;
run;

proc print data=missing_count;
run;

proc sql noprint;
  select _variableName_
  into :_variableNameMissing_ separated by " "
  from missing_count
  where _percentOfMissing_ &amp;gt; .90 /* &amp;lt;------------- 90% */
  ;
quit;

options symbolgen;
data test2;
  set test(drop=&amp;amp;_variableNameMissing_.);
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Bart&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 09:24:55 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859872#M339697</guid>
      <dc:creator>yabwon</dc:creator>
      <dc:date>2023-02-21T09:24:55Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859898#M339703</link>
      <description>&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro drop_vars(dsn=,pct=);
proc transpose data=&amp;amp;dsn.(obs=0) out=vname;
var _all_;
run;
data _null_;
 set vname end=last;
 if _n_=1 then call execute('proc sql;create table n_miss as select ');
 call execute(catx(' ','nmiss(',_name_,') as ',_name_));
 if last then call execute("from &amp;amp;dsn.;quit;");
  else call execute(',');
run;
proc transpose data=n_miss out=n_miss2;
var _all_;
run;

%let dsid=%sysfunc(open(&amp;amp;dsn.));
%let nobs=%sysfunc(attrn(&amp;amp;dsid.,nlobs));
%let dsid=%sysfunc(close(&amp;amp;dsid.));

data n_miss3;
 set n_miss2;
 per_missing=col1/&amp;amp;nobs.;
run;
proc sql noprint;
select _NAME_ into :drops separated by ','
 from n_miss3
  where per_missing &amp;gt; &amp;amp;pct. ;
alter table &amp;amp;dsn.
 drop &amp;amp;drops.;
quit;
%mend;

data have;
 set sashelp.heart;
run;
/*
dsn is the dataset you want to modify (variables are dropped in place)
pct is the proportion of missing values (0 to 1) above which a variable is dropped
*/
%drop_vars(dsn=have,pct=0.6)&lt;/CODE&gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 21 Feb 2023 11:38:17 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859898#M339703</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2023-02-21T11:38:17Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859935#M339720</link>
      <description>&lt;P&gt;As an aside, because I already see a couple answers posted...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That code that ChatGPT spit out is really horrific.&amp;nbsp; You can imagine what it was trying to do (or I guess what it was predicting should be done), but it seems to me as so far off that it's not helpful.&amp;nbsp; I wonder if ChatGPT is significantly better at predicting code for other languages, where it might have had more training data.&amp;nbsp; I've seen plenty of programming folks saying they could use ChatGPT as a first draft of code, or to generate code for the basic/boring stuff.&amp;nbsp; But I'm not at all convinced that the SAS code it can generate is helpful. Interesting, maybe, but I'm not sure it's helpful.&amp;nbsp;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 15:49:07 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859935#M339720</guid>
      <dc:creator>Quentin</dc:creator>
      <dc:date>2023-02-21T15:49:07Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859954#M339728</link>
      <description>&lt;BLOCKQUOTE&gt;&lt;HR /&gt;&lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/19879"&gt;@Quentin&lt;/a&gt;&amp;nbsp;wrote:&lt;BR /&gt;
&lt;P&gt;As an aside, because I already see a couple answers posted...&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;That code that ChatGPT spit out is really horrific.&amp;nbsp; You can imagine what it was trying to do (or I guess what it was predicting should be done), but it seems to me as so far off that it's not helpful.&amp;nbsp; I wonder if ChatGPT is significantly better at predicting code for other languages, where it might have had more training data.&amp;nbsp; I've seen plenty of programming folks saying they could use ChatGPT as a first draft of code, or to generate code for the basic/boring stuff.&amp;nbsp; But I'm not at all convinced that the SAS code it can generate is helpful. Interesting, maybe, but I'm not sure it's helpful.&amp;nbsp;&amp;nbsp;&lt;/P&gt;
&lt;HR /&gt;&lt;/BLOCKQUOTE&gt;
&lt;P&gt;Considering that SAS Proc Import for text files, which knows what you are attempting to do and does a fair job, has recurring issues with making many id type variables numeric when that is likely not the best choice, I sort of cringe at what ChatGPT might suggest for some of the files I read.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Of course we have &lt;STRONG&gt;absolutely no idea&lt;/STRONG&gt; what &lt;a href="https://communities.sas.com/t5/user/viewprofilepage/user-id/339703"&gt;@lansoprazole&lt;/a&gt; asked ChatGPT to do in the first place. Considering the percentage of poorly formed questions we see on this forum I would be surprised if the question was phrased well to begin with and then the bot used the very few parts that it understood, &amp;gt;90 for example, to build from.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Not to mention the likelihood of it coming up with use of a custom informat to provide data validations on reading is extremely small.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 16:35:39 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859954#M339728</guid>
      <dc:creator>ballardw</dc:creator>
      <dc:date>2023-02-21T16:35:39Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859958#M339732</link>
      <description>Interesting response from ChatGPT. It seems to have the right steps but the code cobbled together doesn't align at all. The second step input wouldn't be what would come out of the PROC FREQ and it seems to be working on character variables only. And using labels not variable names.</description>
      <pubDate>Tue, 21 Feb 2023 17:01:51 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859958#M339732</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-02-21T17:01:51Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859959#M339733</link>
      <description>&lt;P&gt;Great. Thanks for your help.&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 17:03:25 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859959#M339733</guid>
      <dc:creator>lansoprazole</dc:creator>
      <dc:date>2023-02-21T17:03:25Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859962#M339735</link>
      <description>&lt;P&gt;Apparently I wrote a macro to do this a few years ago...&lt;/P&gt;
&lt;P&gt;&lt;A href="https://gist.github.com/statgeek/feedc3fc520cb0d2018ca2a8cab241d8" target="_blank" rel="noopener"&gt;https://gist.github.com/statgeek/feedc3fc520cb0d2018ca2a8cab241d8&lt;/A&gt;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;%macro drop_missing_pct(input_dsn = , output_dsn=, pct = , id_vars=);

*input_dsn = input data set name;
*output_dsn = output data set name;
*pct = missing percent, variables with a percentage of missing above this value are dropped;
*id_vars = space delimited list of variables that you do not want to include in the analysis such as ID variables;


*create format for missing;
proc format;
    value $ missfmt ' '="Missing" other="Not Missing";
    value nmissfmt .="Missing" other="Not Missing";
run;

*Proc freq to count missing/non missing;
ods select none;
*turns off the output so the results do not get too messy;
ods table onewayfreqs=temp;

proc freq data=&amp;amp;INPUT_DSN. (drop = &amp;amp;ID_Vars);
    table _all_ / missing;
    format _numeric_ nmissfmt. _character_ $missfmt.;
run;


ods select all;
*Format and organize output;

data long;
    length variable $32. variable_value $50.;
    set temp;
    Variable=scan(table, 2);
    Variable_Value=strip(trim(vvaluex(variable)));
    presentation=catt(frequency, " (", trim(put(percent/100, percent7.1)), ")");
    keep variable variable_value frequency percent cum: presentation;
    label variable='Variable' variable_value='Variable Value';
run;

*not required for display purposes;
proc sort data=long;
    by variable;
run;

*select variables more than x% missing;
proc sql noprint;
select variable into :drop_var_list separated by " "
from long where variable_value = 'Missing' and percent &amp;gt; &amp;amp;pct;
quit;

*Drop variables;
data &amp;amp;output_dsn;
set &amp;amp;input_dsn;
drop &amp;amp;drop_var_list;
run;

*clean up;
*uncomment after testing;
/* proc sql; */
/* drop table long; */
/* drop table temp; */
/* quit; */

%mend;

***************************************************************************************************
*Example Usage
***************************************************************************************************;

data class;
    set sashelp.class;

    if age=14 then
        call missing(height, weight, sex);

    if name='Alfred' then
        call missing(sex, age, height);
    label age="Fancy Age Label";
run;


%drop_missing_pct(input_dsn = class, output_dsn = want, pct = 20, id_vars = Name);



*check output;
proc contents data=want;
run;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 21 Feb 2023 17:19:09 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/859962#M339735</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2023-02-21T17:19:09Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/860298#M339862</link>
      <description>&lt;P&gt;Thanks. Very helpful.&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Wed, 22 Feb 2023 19:58:49 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/860298#M339862</guid>
      <dc:creator>lansoprazole</dc:creator>
      <dc:date>2023-02-22T19:58:49Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/939931#M368999</link>
      <description>I wanted to find the columns with 100% missing values in SAS and in a pandas DataFrame, but I get a different number of columns in each. Any idea why?&lt;BR /&gt;&lt;BR /&gt;Python import:&lt;BR /&gt;df2=pd.read_csv("&lt;A href="https://raw.githubusercontent.com/CharuSAS/SASPythonDataScientists/main/pattern_widespread_decline__N_American_Bumblebees.csv" target="_blank"&gt;https://raw.githubusercontent.com/CharuSAS/SASPythonDataScientists/main/pattern_widespread_decline__N_American_Bumblebees.csv&lt;/A&gt;", encoding='latin-1')&lt;BR /&gt;&lt;BR /&gt;Here's the Python code I used:&lt;BR /&gt;cols = dfbig.columns[dfbig.isnull().all()]&lt;BR /&gt;dfbig.drop(cols, axis=1, inplace=True)&lt;BR /&gt;24 columns are returned.&lt;BR /&gt;&lt;BR /&gt;SAS import:&lt;BR /&gt;filename dst2 url "&lt;A href="https://raw.githubusercontent.com/CharuSAS/SASPythonDataScientists/main/pattern_widespread_decline__N_American_Bumblebees.csv" target="_blank"&gt;https://raw.githubusercontent.com/CharuSAS/SASPythonDataScientists/main/pattern_widespread_decline__N_American_Bumblebees.csv&lt;/A&gt;";&lt;BR /&gt;&lt;BR /&gt;proc import file=dst2 out=work.dst2 dbms=csv;&lt;BR /&gt;run;&lt;BR /&gt;I made just 2 changes to your SAS code:&lt;BR /&gt; where per_missing = &amp;amp;pct. ;&lt;BR /&gt;%drop_vars(dsn=have,pct=1.0)&lt;BR /&gt;33 columns are returned.</description>
      <pubDate>Mon, 19 Aug 2024 17:02:12 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/939931#M368999</guid>
      <dc:creator>sqlGoddess</dc:creator>
      <dc:date>2024-08-19T17:02:12Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/939982#M369016</link>
      <description>Which is correct?&lt;BR /&gt;&lt;BR /&gt;Confirm the data is read in correctly in each program; likely only one is reading it in correctly. &lt;BR /&gt;&lt;BR /&gt;Both 'data import' procedures being used make a bunch of guesses in the background as to datatypes. Likely they're different, resulting in different data being processed by the drop_vars macro or the .drop() operation in Python.</description>
      <pubDate>Mon, 19 Aug 2024 19:14:03 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/939982#M369016</guid>
      <dc:creator>Reeza</dc:creator>
      <dc:date>2024-08-19T19:14:03Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940001#M369023</link>
      <description>&lt;P&gt;Good point.&lt;/P&gt;
&lt;P&gt;I went back &amp;amp; ran a proc contents after the SAS import:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sqlGoddess_0-1724098046270.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/99428i739829A37AA3A2EA/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sqlGoddess_0-1724098046270.png" alt="sqlGoddess_0-1724098046270.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;And here's the Python describe on the DataFrame:&lt;/P&gt;
&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="sqlGoddess_1-1724098093457.png" style="width: 400px;"&gt;&lt;img src="https://communities.sas.com/t5/image/serverpage/image-id/99429i537F4C9C73789907/image-size/medium?v=v2&amp;amp;px=400" role="button" title="sqlGoddess_1-1724098093457.png" alt="sqlGoddess_1-1724098093457.png" /&gt;&lt;/span&gt;&lt;/P&gt;
&lt;P&gt;Both have read in 66907 rows and 170 columns.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;The problem seems to be at the point of dropping columns with 100% missing values.&lt;/P&gt;</description>
      <pubDate>Mon, 19 Aug 2024 20:38:43 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940001#M369023</guid>
      <dc:creator>sqlGoddess</dc:creator>
      <dc:date>2024-08-19T20:38:43Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940032#M369032</link>
      <description>&lt;P&gt;OK. Here is another way to check if a variable is all missing.&lt;/P&gt;
&lt;PRE&gt;&lt;CODE class=" language-sas"&gt;
data have;
 set sashelp.heart;
 call missing(sex,weight);
run;

ods select none;
ods output nlevels=want(where=(NNonMissLevels=0));
proc freq data=have  nlevels;
table _all_;
run;
ods select all;&lt;/CODE&gt;&lt;/PRE&gt;
&lt;P&gt;You can compare it with my PROC SQL.&lt;/P&gt;
&lt;P&gt;BTW, my code drops both character and numeric variables; I am not sure your Python code would drop both of them.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
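&lt;P&gt;A minimal Python sketch of one way the two counts could diverge, assuming the gap comes from blank character values (an assumption, not something confirmed for this CSV). SAS treats an all-blank character value as missing, while a strict rule that counts only zero-length fields does not. The sample text and the helper names below are hypothetical.&lt;/P&gt;
&lt;PRE&gt;
```python
import csv
import io

# Hypothetical sample: "empty" holds nothing at all, "blank" holds only spaces.
SAMPLE = 'id,empty,blank,partial\n1,,"  ",x\n2,," ",\n'

def is_empty(value):
    """Strict rule: only a zero-length field counts as missing."""
    return value == ""

def is_blank(value):
    """SAS-like rule for character data: all-whitespace also counts as missing."""
    return value.strip() == ""

def all_missing_columns(text, is_missing):
    """Return the column names whose every value satisfies is_missing."""
    rows = list(csv.DictReader(io.StringIO(text)))
    return [name for name in rows[0] if all(is_missing(r[name]) for r in rows)]

strict = all_missing_columns(SAMPLE, is_empty)
sas_like = all_missing_columns(SAMPLE, is_blank)
print(strict)
print(sas_like)
```
&lt;/PRE&gt;
&lt;P&gt;Running both rules on the real file and comparing the two lists would show whether blank values explain any of the difference.&lt;/P&gt;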
&lt;P&gt;I am unable to test it because I do not have access to that CSV.&lt;/P&gt;
&lt;P&gt;You could also check the missing values in both SAS and Python to see whether they differ.&lt;/P&gt;</description>
      <pubDate>Tue, 20 Aug 2024 01:15:46 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940032#M369032</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-08-20T01:15:46Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940033#M369033</link>
      <description>Also make sure there are no ERROR or WARNING messages in the log after running this macro.</description>
      <pubDate>Tue, 20 Aug 2024 00:58:22 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940033#M369033</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-08-20T00:58:22Z</dc:date>
    </item>
    <item>
      <title>Re: Delete variables with 90% missing values</title>
      <link>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940181#M369067</link>
      <description>I also noticed that your Python call "dfbig.isnull()" only checks for NULL values; you also need to check for EMPTY values.&lt;BR /&gt;Unlike a database, SAS treats an EMPTY value as a NULL value, whereas a database treats them as two different things.</description>
      <pubDate>Wed, 21 Aug 2024 01:03:11 GMT</pubDate>
      <guid>https://communities.sas.com/t5/SAS-Programming/Delete-variables-with-90-missing-values/m-p/940181#M369067</guid>
      <dc:creator>Ksharp</dc:creator>
      <dc:date>2024-08-21T01:03:11Z</dc:date>
    </item>
  </channel>
</rss>

