BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
Tom
Super User Tom
Super User

@kajal_30 wrote:

Hi Tom,

 

AS I am reimporting the csv to a dataset I can see 0 variables . Is there a way we can retain column names while importing from a csv file ? refraining from using input statement as in future business might change the sequence of the columns in the sheet

 

Regards

Kajal


If you ran the code I posted that means the TEMPLATE dataset you created did not have any variables.  Instead of a TEMPLATE dataset you can just add the code for defining the variables and reading them into the data step.

data want;
  length var1 8 var2 $20 .... varlast 8 ;
  informat var1 date.;
  format var1 date9.;
  infile csv dsd truncover;
  input var1 -- varlast ;
run;

If your XSLX files CHANGE then you need a more sophisticated process.

1) Import the XLSX file.

2) Get the list of variables from the resulting dataset.

3) Compare the list of variables to to acceptable list of variables.

4) If the file is valid the fix it. Otherwise issue an error.

 

Fixing it might mean using the CSV file intermediate like I posted before.

Or it might just mean reading the imported dataset and apply changes such as changing the length of character variables converting numeric variables to character or the reverse. 

kajal_30
Quartz | Level 8

one more concern while creating the csv file from dataset I can see there are 2 empty lines after every record. but actual excel file doesn't have any empty lines.

- importing xlsx as dataset

- exporting dataset into a csv file

May I know why am I seeing these empty lines.

regards

Kajal

ballardw
Super User

@kajal_30 wrote:

one more concern while creating the csv file from dataset I can see there are 2 empty lines after every record. but actual excel file doesn't have any empty lines.

- importing xlsx as dataset

- exporting dataset into a csv file

May I know why am I seeing these empty lines.

regards

Kajal


One of the truly obnoxious elements, in terms of data exchange, is the stuff people can put in cells.

One of them is entered using the Alt-Enter key combination. If done between visible text in cell it will force the cell to display the text with a vertical adjustment so the text after the first bit is displayed below it (may not be very obvious with long text in a narrow cell).

However, those can be entered as characters and you will not "see" anything in the spreadsheet. The exported text will usually treat those cell contents as "new lines".

There are other issues such as which operating system you are running and where/how you read the file. Conversion of files from UNIX to Windows (or vice versa) have well-known issues caused because the operating systems use different characters to tell applications that you have reached the end of a line. Windows uses two characters: Carriage Return and Line Feed (ancient names deriving from teletype), UNIX and derivatives the Line Feed (one character) . So conversion and movement can add extra characters that get treated as end-of-line.

Sometimes use of options on the INFILE statement of a data step TERMSTR can be used to force behavior that is not native to your operating system.

 

You will have to show 1) the code used to read that csv and 2) a couple lines of text.

Reeza
Super User

How did you convert the XLSX to CSV? It's possible there's an issue in that process. You can check it by manually doing the conversion (in Excel click Save As, change type to CSV) and verify the file. 

 


@kajal_30 wrote:

one more concern while creating the csv file from dataset I can see there are 2 empty lines after every record. but actual excel file doesn't have any empty lines.

- importing xlsx as dataset

- exporting dataset into a csv file

May I know why am I seeing these empty lines.

regards

Kajal


 

kajal_30
Quartz | Level 8

Hi @Reeza ,

 

Yeah I am converting EXCEL to csv the same way you suggested saving the excel file as csv.

I have also validated the csv file and data is correct but after reading it using the data step I am getting merged columns in the output and empty lines. Is there any option I am missing in the code below?

libname work2 '/apps/sas/data';

data work2.XL_OUT;
	infile "/apps/sas/orig.csv"
		delimiter = ","
		missover 
		dsd
		firstobs=2;
	Length  
		CUST_ID 8
		CUST_REG $10
	;
	informat 	CUST_ID  BEST.;
	informat 	CUST_REG $10.;
	format 	CUST_ID  BEST.;
	format 	CUST_REG $10.;
	input   
		CUST_ID	
		CUST_REG $

		run;

proc print data= XL_OUT  (obs=6) noobs;
run;
Tom
Super User Tom
Super User

You are missing the semicolon to end the INPUT statement. So you are reading an extra numeric variable named RUN.

But that would not explain the symptoms you describe.  It would instead just result in an extra variable that was all missing values.

Are you sure the file is actually delimited with commas?  Look at the start of the file. For example by dumping the first 10 lines into the SAS log.

data _null_;
  infile "/apps/sas/orig.csv" obs=10;
  input;
  list;
run;

 

You have mistakenly using MISSOVER instead of TRUNCOVER on the INFILE statement, but since your INPUT statement is using LIST MODE style that will not have any impact.

 

The code has a lot of unneeded things.  There is no need to attach either an informat nor a format to either of those variables.  There is no need to include the $ in the input statement since the variables' types have already been defined by the LENGTH statement. 

 

 

kajal_30
Quartz | Level 8

sorry missing ; was just a typo while pasting the code here. Actual code has ; after input statement and yes I printed 10 lines and they have comma separated values

 

kajal_30
Quartz | Level 8

Hi @Tom ,

I tried all the options suggested but still getting some errors like one value is getting into another column 

also missing values for some of the column. May concern is even if values in the columns are undefined but at least they should not get into another column. Can you please suggest some more options 

Regards

Kajal

Tom
Super User Tom
Super User

@kajal_30 wrote:

one more concern while creating the csv file from dataset I can see there are 2 empty lines after every record. but actual excel file doesn't have any empty lines.

- importing xlsx as dataset

- exporting dataset into a csv file

May I know why am I seeing these empty lines.

regards

Kajal


If the XLSX file has end of line characters in it then that will make the CSV file unusable.  Remove them.

data _null_:
  set have;
  file csv dsd ;
  array _character_ _character_ ;
  do over _character_;
    _character_=translate(_character_,'  ','0D0A'x);
  end;
  put (_all_) (+0);
run;
Reeza
Super User
I'd just add on if this is a 'production' level job I would add a check to ensure that all column types in the data set were correct, not just that one field. Can save you hours of debugging if the job fails at loading the data because types have been modified for some reason. PROC IMPORT doesn't guarantee that types are the same between files especially for Excel files.

kajal_30
Quartz | Level 8
Hi Reeza,

So it can be different each time we are reading the excel file? is it better for me to convert this file into csv format and then read that file using data step and assign lengths and formats as per the final target dataset to be appended to?
Thanks
Kajal
Kurt_Bremser
Super User

@kajal_30 wrote:
Hi Reeza,

So it can be different each time we are reading the excel file? is it better for me to convert this file into csv format and then read that file using data step and assign lengths and formats as per the final target dataset to be appended to?
Thanks
Kajal

A-B-S-O-L-U-T-E-L-Y.

It is not just the better, it is the only way to go for a production process, IMHO.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 26 replies
  • 2030 views
  • 4 likes
  • 6 in conversation