BookmarkSubscribeRSS Feed
Sheeba
Lapis Lazuli | Level 10

Hi,

I have a Xlsxfile which I would like to import and create a SAS data set. The excel has 5 columns out of which 3 are numeric. For the 3 numeric columns, I would like to remove the observations if the column has a non numeric value.

It will be great if suggest some way to achieve this?

Thanks in advance,

6 REPLIES 6
esjackso
Quartz | Level 8

So the columns really arent numeric if they have other values in them. If they are no missing values to begin with, you can read those columns as numeric and weed out the observations by deleting those that come in as missing (SAS will set non numeric values to missing). This might require an infile statement instead of an import through the wizard or proc depending on how often the values show up in the column (simple explanation - proc import or the wizard sets the column value by scanning part of each column to see what the column needs to be, If it reads in as numeric through this scan the non numeric values will be set to missing through this process as well. IF it reads non numeric it will set the data type to character).

If the list of values is already known you can use a datastep to delete out the observations:

data restricted;

     set raw;

     if var1 in ("A" "B" C") or var2 in ("D" "C" "A") or .... then delete; /*alternative would be to output to a separate dataset*/

run;

You could combine the two tactics if the variables dont read in all the same.

data restricted;

     set raw;

     if var1 in ("A" "B" "C") or var2 = . or then delete;

run;

Hopefully this gives you a start at what you need!

EJ

Anotherdream
Quartz | Level 8

If you know that the variable is SUPPOSED to be numeric and people are accidently creating character values AND you ALWAYS want to remove all of the character data, just do an input to manually convert the values to numeric.

Note you are changing data in doing this, and losing the record for all variables that are not numeric, but this is what you asked for (I still wouldn't recommend this without understanding why your variables aren't numeric if they're supposed to be).

Anyways just do something like this

proc import datafile='myfile' out=newdata replace;

guessingrows=10000;

run;

data newdata_fixed;

set newdata;

NewNumericfield=input(character_Field,8.);

run;

This will convert all of the characters to null values within your numeric fields.

Sheeba
Lapis Lazuli | Level 10

Hi Anotherdream ,

Thanks a lot for the reply.

Yes, the variable in the xlsx is supposed to be numeric. But if people are putting some values by mistake, I would like to identify the records with invalid values and produce a report to the user.

Regards,

Sheeba S

Astounding
PROC Star

Here's one way:

data invalid;

   set have;

   length var_name var_value $ 32;

   array check {5} A B C D E;

   do _n_=1 to 5;

      if check{_n_} > ' ' and input(check{_n_}, ??12.) = . then do;
         var_name = vname(check{_n_});

         var_value = check{_n_};

         output;

      end;

   end;

run;

It hard-codes the variable names, and assumes that they are already character.  Automating that functionality might be your next step once this part is working.

Good luck.

Sheeba
Lapis Lazuli | Level 10

Hi Astounding,

Thanks a lot for the reply.

I will definitely give it a try.

Regards,

Sheeba Swaminathan

Sheeba
Lapis Lazuli | Level 10

Hi EJ,

Thanks a lot for the reply.

I am trying to create a reusable project to be used by n number of users. The idea is to allow the users to upload an xlsx file. Once the file is in the server, plan is to use proc import. The structure of xlsx is known ( 3 character columns and 2 numeric columns). Numeric columns are cost price and list price. My requirement is to check if the numeric columns ( cost and list price) has any non numeric value like character values or invalid values and produce a report with those observations. so the list of values is not known

Regards,

Sheeba S

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 4486 views
  • 0 likes
  • 4 in conversation