BookmarkSubscribeRSS Feed
hamhocks
Calcite | Level 5

I am importing a spreadsheet with a few thousand rows of data and two columns

All of the first field are importing properly but some of the second field are not.

For example, this is what the import file would look like

SKU_NUMBER              PARENT_SKU

123                                  400

124                                  411        

125                                  414

130                                  418

but this is what might import

SKU_NUMBER              PARENT_SKU

123                                  400

124                                  .      

125                                  .

130                                  418

I can't figure out why some of the data points are getting missed.

I am assuming it's a formatting issue but all of the cells seem to have the same number format. (though it's weird, they appear slightly different -the ones not getting imported are sligthly indented on the right... but i can't find anywhere in excel that can tell me the difference in formatting - number format, alignement, font, etc all seems to be the same)

any help would be much appreciated!  (tried to upload a sample file but not excel allowed!)

9 REPLIES 9
RamKumar
Fluorite | Level 6

How did you import? e.g. proc import, infile statement.

Provide your code as well.

hamhocks
Calcite | Level 5

I didn't use code to import that data.  I used the "import data" wizard from the SAS Enterprise menu.

both are number fields.

i thought both with BEST12 but looks like the first is F12 and the second is BEST12  (not sure what the difference is)

Scott_Mitchell
Quartz | Level 8

You are saying that that the data points not being imported are indented.

Is there a possiblity that some of these are actually character values?

Do a check using the excel function ISNUMBER.  The Syntax is ISNUMBER(CELL REF).

hamhocks
Calcite | Level 5

smart man.  they are not nujmbers - and there is a trailing space in these cells.  (thanks didn't know the isnumber formula)

Now, barring going into each cell and fixing them, how can i fix this en masse?

(lots of people input numbers into this file so how can i make sure it imports properly in the future so i don't have to go and manually change their inputs)

Scott_Mitchell
Quartz | Level 8

If this is a file you only import from and not export to then I would use data validation in excel to limit the values to numeric only.

If you both import and export to this file then you are going to override the data validation as part of the export.  I would import the data as text.  I would then clean the text using the SAS COMPRESS function and convert it to numeric.

You could even do this if turn excel data validation on just to be safe...I would.

Kurt_Bremser
Super User

Save the file as .csv and import it with a data step. Either SAS will tolerate the blanks, or you can do the initial input as character and then convert in the data step.

Saving to a text file also gives you the luxury of visually inspecting what data you try to read.

RW9
Diamond | Level 26 RW9
Diamond | Level 26

First, I would agree with KurtBremser, CSV + datastep is the way with most control over what you are doing.

I you are fortunate enough to be able to change the Excel file, then adding Validation into the spreadsheet upfront so they can only enter specific data will help a fair.  Excel however isn't a database/data entry/transport system and so you will still get problems, for instance you have no versioning on the data, no access control, no fixed structure etc.

One other option is to have a VBA macro which will process the sheet and save a CSV for you.  Obviously you need to know a bit of VBA, but its relatively straight forward.  In the macro you could run every cell through a validation script and then export to CSV.  Here is a starter page: http://msdn.microsoft.com/en-us/library/office/ee814737(v=office.14).aspx

Scott_Mitchell
Quartz | Level 8

Something like this:

DATA HAVE;
HAVE1 = "123A";
HAVE2 = "123 ";
RUN;

DATA WANT;
SET HAVE;
WANT1 = INPUT(COMPRESS(HAVE1,"1234567890","K"),8.);
WANT2 = INPUT(COMPRESS(HAVE2,"1234567890","K"),8.);
RUN;

Ksharp
Super User

Add  an option  mixed=yes  into your code .

It seems 124 125 has some characters, but SAS force to take it as a numeric type variable.

Xia Keshan

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1420 views
  • 1 like
  • 6 in conversation