BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
djbateman
Lapis Lazuli | Level 10

I need help converting an XLSX file to a true CSV file (i.e., a CSV file that doesn't contain any Excel formatting or highlighting).  I don't have SAS/ACCESS to PC files.

 

I write a lot of programs for data management to detect dirty data.  When the report doesn't produce any results, we aren't sure if it is because the data is clean or if my programming is wrong.  When the report is blank, I export the associated SAS datasets and give them to the data manager to update.  I will read in the modified file and run that data through the program to see if the programming is correct.

 

My problem is that if I read in a straight Excel file, SAS likes to pick and choose what formats and data types to apply to variables, which in turn causes errors in my code or causes more programming to get the right types, formats, and lengths applied to the test data.  My workaround was to just create a new dataset by setting the original table and then manually changing the fields that were updated by the DM.  For example:

 

data new;
    set original.old;
    if subjid='101-001-005' then weight=400;
    if subjid='101-002-003' then sex='F';
run;

However, there are times when the DM makes a LOT of changes to many datasets, and it is too time consuming to manually enter every single update.  In those cases, I prefer to read in the data.  Since I mentioned above that SAS likes to automatically assigns data types and formats, I have found that I can save the XLSX file as a CSV and then read in a CSV file by using the meta data from the original library to assign the exact same data type, formats, and lengths.

 

That was a long background, but here is the root of my problem.  I wrote some VBscript that reads all the XLSX files in a folder and converts them to CSV.  If I were to manually open an XLSX file, I can "save as" CSV.  This CSV file will loose any Excel formatting (like highlighting, Excel formats, etc.).  But my problem is that in my code, it doesn't seem to save as a true CSV file.  Instead, it keeps all the highlighting as though it is still an XLSX file with a CSV extension.  When I then read in this newly converted CSV file, it read in as jumbled characters.

 

Attached is the original XLSX file that was updated by a DM.  I tried attaching the CSV file and the jumbled SAS dataset, but I couldn't get them to attach.  I can try to email those individually if you think you can help.

 

Can someone help me find out how I can convert XLSX to a true CSV?

 

Here is the VB script that I am using.

 

%macro convert_files(default=,inext=,outext=);
	data _null_;
		file "'&default\temp.vbs'";
		put "set xlapp = CreateObject(""Excel.Application"")";
		put "set fso = CreateObject(""scripting.filesystemobject"")";
		put "set myfolder = fso.GetFolder(""&default"")";
		put "set myfiles = myfolder.Files";
		put "xlapp.DisplayAlerts = False";
		put " ";
		put "for each f in myfiles";
		put "  ExtName = fso.GetExtensionName(f)";
		put "  Filename= fso.GetBaseName(f)";
		put "    if ExtName=""&inext"" then";
		put "           set mybook = xlapp.Workbooks.Open(f.Path)"; 
		put "           xlapp.Visible = false";
		put "           mybook.SaveAs ""&default.\"" & Filename & "".&outext"", 51";
		put "    End If";
		put "  Next";
		put "  mybook.Close";
		put "  xlapp.DisplayAlerts = True";

		put " xlapp.Quit";
		put " Set xlapp = Nothing";
		put " strScript = Wscript.ScriptFullName";
		put " FSO.DeleteFile(strScript)"; 
	run; 
 
	x "cscript ""&default\temp.vbs""";
%mend convert_files;

%convert_files(default=C:\Users,inext=xlsx,outext=csv);

 

1 ACCEPTED SOLUTION

Accepted Solutions
djbateman
Lapis Lazuli | Level 10

I got a lot of replies in different ways that all helped.

 

 

 

First, Chris Hemedinger sent me private messages and helped me realize that I really to have access to SAS/ACCESS to PC files.  Using this method I can directly access XLSX files without the need to convert to CSV.  Here is the code I used to convert my files.  It seems that using this method can have one tab per dataset, and SAS will automatically convert each tab into a dataset.  But these datasets are not readable as is.  You will need to use an additional data step to set the new XLSX library table into another table to be able to read (hence the DO loop in the macro):

 

%macro getdata;
	/* Assign the XLSX library as the Excel test file */
	/* https://blogs.sas.com/content/sasdummy/2015/05/20/using-libname-xlsx-to-read-and-write-excel-files/ */
	libname xlsx XLSX "&testdir.\&testfile..xlsx";

	proc sql noprint;
		select count(distinct memname) into :ntables from sashelp.vtable where libname='XLSX';
		%let ntables=%sysfunc(compress(&ntables.));
		select distinct memname into :table1-:table&ntables. from sashelp.vtable where libname='XLSX';
	quit;

	%do i=1 %to &ntables.;
		data test.&&table&i.;
			set xlsx.&&table&i.;
		run;

		%maketestdata(dsin=&&table&i.);
	%end;
%mend getdata;

%getdata;

 

Second, Ballardw and Reeza both helped me see that with my method, I just needed to change "51" to "6" in the saveas function of the VBScript.  This is very helpful if you do not have SAS/ACCESS to PC files.  That simple fix will do the trick if you need to convert a file to CSV.  But Chris' method will be less code if you want simplicity and have the proper access.

View solution in original post

5 REPLIES 5
ballardw
Super User

 

I suspect the specific might be the 51 in the SAVEAS which I have is associated with an xlOpenXMLWorkbook file type, xlsx without macros.

I found a reference with these fileformat values for different CSV formats in Excel.

 

xlCSV 6 CSV *.csv 
xlCSVMac 22 Macintosh CSV *.csv 
xlCSVMSDOS 24 MSDOS CSV *.csv 
xlCSVUTF8 62 UTF8 CSV *.csv 
xlCSVWindows 23 Windows CSV *.csv 
djbateman
Lapis Lazuli | Level 10

I got a lot of replies in different ways that all helped.

 

 

 

First, Chris Hemedinger sent me private messages and helped me realize that I really to have access to SAS/ACCESS to PC files.  Using this method I can directly access XLSX files without the need to convert to CSV.  Here is the code I used to convert my files.  It seems that using this method can have one tab per dataset, and SAS will automatically convert each tab into a dataset.  But these datasets are not readable as is.  You will need to use an additional data step to set the new XLSX library table into another table to be able to read (hence the DO loop in the macro):

 

%macro getdata;
	/* Assign the XLSX library as the Excel test file */
	/* https://blogs.sas.com/content/sasdummy/2015/05/20/using-libname-xlsx-to-read-and-write-excel-files/ */
	libname xlsx XLSX "&testdir.\&testfile..xlsx";

	proc sql noprint;
		select count(distinct memname) into :ntables from sashelp.vtable where libname='XLSX';
		%let ntables=%sysfunc(compress(&ntables.));
		select distinct memname into :table1-:table&ntables. from sashelp.vtable where libname='XLSX';
	quit;

	%do i=1 %to &ntables.;
		data test.&&table&i.;
			set xlsx.&&table&i.;
		run;

		%maketestdata(dsin=&&table&i.);
	%end;
%mend getdata;

%getdata;

 

Second, Ballardw and Reeza both helped me see that with my method, I just needed to change "51" to "6" in the saveas function of the VBScript.  This is very helpful if you do not have SAS/ACCESS to PC files.  That simple fix will do the trick if you need to convert a file to CSV.  But Chris' method will be less code if you want simplicity and have the proper access.

Tom
Super User Tom
Super User

No need to use macros to copy all sheets in an XLSX workbook.  Just use PROC COPY.

libname xlsx XLSX "&testdir.\&testfile..xlsx";
libname test 'some path I can write to';
proc copy inlib=xlsx outlib=test;
run;
miantanzeel
Calcite | Level 5

To convert XLSX to csv a tool can be used that is very simple it can convert files online specifically excel files. with simple GUI, fast processing and easy to use feature.Total Excel Converter Support input formats include Excel and Excel 2007, XLSM, XLT, XLTM as well as OpenOffice formats ODS, XML, SQL, WK2, WKS, WAB, DBF, TEX, and DIF. The list of target formats is even more extensive: convert your tabled files into DOC, DOCX, PDF, HTML, Access, TXT, ODT, ODS, XML, SQL, CSV, Lotus, DBF, TEX, DIFF, SYLK, and LaTeX. It is product of Coolutils, you can search it on google.

Focho78
Calcite | Level 5
Salut,
Avec proc import c est + simple.
Cdlt

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 5 replies
  • 9977 views
  • 7 likes
  • 5 in conversation