BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
mcmaxwell
Fluorite | Level 6

I am new to SAS and appreciate any help! I am using SAS Studio.

I have a dataset with three variables (ID number, exam score, and percentage points) and 619 observations. 

I import my excel file to SAS and examine the dataset using the following SAS code: 

/** Import grades file.  **/  
PROC IMPORT DATAFILE="/folders/myfolders/sasuser.v94/Exam 1 Scores.xlsx"  
		    OUT=GRADES  
		    DBMS=XLSX  
		    REPLACE;  
RUN;  
 
Proc Print data=GRADES; 
run;

I noticed that SAS added three new variables to the right of the data table labeled "D", "E", and "F" (see image below). I tried converting the excel file to a csv file and then uploading the csv file but the same thing happened.

Why is this happening and how do I fix it? Is it better to use excel files or csv files? Are there any tips or tricks for preparing files for import to SAS to make things go more smoothly? 

Thanks!image1.PNG

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

One the many annoyances of Excel is what is a "used cell".

Enter a value in a column. Delete the value. The Column is treated by Excel as used. If you do a file save as to CSV then you will get a set of commas with no values because the column is used, so needs to be "saved" but has no values.

You will also get phantom rows of values when an entire row of values is deleted.

 

This is happening because the the Excel engine that SAS connects to is told those columns are used.

 

Doubly annoying, is often deleting those columns just promotes the replacement columns that shifted over to used.

 

The only way I have ever found a way to correct this is exporting the data to csv and then writing a data step to read what is wanted. If you look in the log after using Proc import with a CSV version you will see a data step program  with those very missing variables because there were the exported commas mentioned. And proc import doesn't know that you only want some variables. However you can copy that data step from the log into the editor, clean up things like line numbers and remove all the references to the blank column variables, both in the informat and input sections. Re-run the modified code and only the variables you want should be in the output. Caveat: CSV has to read ALL the variables up to the rightmost one that you want in the output, no skipping columns entirely.

 

I have been set "data" that consisted of 5 or 6 columns of values and 100+ empty columns that were imported.

 

The KEEP statement is your friend.

Data want;

    set have;

   keep <names of the variables you want>.

run;

 

PS. Don't forget about possible hidden columns or rows. Those will also appear when imported.

View solution in original post

2 REPLIES 2
SASKiwi
PROC Star

There could be some hidden content in columns D, E and F. Try deleting the columns in Excel and re-save the workbook.

ballardw
Super User

One the many annoyances of Excel is what is a "used cell".

Enter a value in a column. Delete the value. The Column is treated by Excel as used. If you do a file save as to CSV then you will get a set of commas with no values because the column is used, so needs to be "saved" but has no values.

You will also get phantom rows of values when an entire row of values is deleted.

 

This is happening because the the Excel engine that SAS connects to is told those columns are used.

 

Doubly annoying, is often deleting those columns just promotes the replacement columns that shifted over to used.

 

The only way I have ever found a way to correct this is exporting the data to csv and then writing a data step to read what is wanted. If you look in the log after using Proc import with a CSV version you will see a data step program  with those very missing variables because there were the exported commas mentioned. And proc import doesn't know that you only want some variables. However you can copy that data step from the log into the editor, clean up things like line numbers and remove all the references to the blank column variables, both in the informat and input sections. Re-run the modified code and only the variables you want should be in the output. Caveat: CSV has to read ALL the variables up to the rightmost one that you want in the output, no skipping columns entirely.

 

I have been set "data" that consisted of 5 or 6 columns of values and 100+ empty columns that were imported.

 

The KEEP statement is your friend.

Data want;

    set have;

   keep <names of the variables you want>.

run;

 

PS. Don't forget about possible hidden columns or rows. Those will also appear when imported.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

SAS Enterprise Guide vs. SAS Studio

What’s the difference between SAS Enterprise Guide and SAS Studio? How are they similar? Just ask SAS’ Danny Modlin.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 2026 views
  • 0 likes
  • 3 in conversation