@AyoSho wrote:
Hi Tom,
Unfortunately, I don't have the original text file.... I only have access
to the excel file. Does that mean the excel file would be unusable? Is
there a way to manipulate the excel file at least and make it usable?.
Thanks!
You will need to be careful using the file as you need watch out for the type of changes that excel is likely to make when reading a text file. Your method of just asking EXCEL to write out a text file and then reading it in again could work. You need to try to figure our what delimiter you want to insert between the cells on a row in the spreadsheet. Or you could do the same thing in SAS.
So I would do it in three steps. (1) Convert the XLSX file into a sas dataset. (2) Write it back out using comma (or tab or space) as the delimiter. (3) read the new text file using semi-colon as the delimiter.
libname old xlsx 'the_exisitng_excel_file.xlsx';
data work.step1;
set old.sheetname;
run;
filename text 'new_name_for_a_text_file.txt';
data _null_;
file text dsd dlm=',' ;
set step1;
put (_all_) (+0);
run;
data want;
infile text dsd dlm=';' truncover ;
input (var1-var20) (:$30.);
run;
Do you have dataset? Or just a text file that you are trying to read?
If it is a text file then use semi-colon as the delimiter when reading the file.
data want;
infile 'myfile.txt' dsd dlm=';' truncover ;
input var1 var2 var3 ;
run;
If it is already a dataset then what do the other variables have in them if all of the data is in just one variable? Or do you mean you only have one variable and you want to parse it into more variables? Anyway use the SCAN() function. Let's assume your dataset is named HAVE the variable is named ONLYVAR. So something like this. Note you should really add a LENGTH statement to define the lengths of the new character variables.
data want;
set have;
var1 = scan(onlyvar,1,';');
var2 = scan(onlyvar,2,';');
run;
If you can to interpret some of the text in ONLYVAR as numbers then use the INPUT() function.
data want;
set have;
var1 = scan(onlyvar,1,';');
var2 = input(scan(onlyvar,2,';'),32.);
run;
A B C D E 1 name;age;state;country;labs | |
2 duf;45;lopos;;7 | |
3 steve;;bol;90 rolo;76 | |
4 kevin;30;nu;zip;32 | |
5 bela;9;hun; zip;87;Neo;lot;9 |
Looks like somewhere along the way someone took a text file and opened it in EXCEL. So your third line was probably originally something like this in the text file:
3,steve;;bol;90,rolo;76
So EXCEL decided the comma was the separator and converted that line to 3 cells in the spreadsheet.
By saving the Excel file back into a text file you have made something similar to the original file. But note that opening a text file in EXCEL can cause it to change some of the values. For example strings that look like numbers are converted to numbers. So leading zeros can be removed. Strings like 10-12 might be interpreted as a date.
Do you have the original text file? It would be better to read that instead.
@AyoSho wrote:
Hi Tom,
Unfortunately, I don't have the original text file.... I only have access
to the excel file. Does that mean the excel file would be unusable? Is
there a way to manipulate the excel file at least and make it usable?.
Thanks!
You will need to be careful using the file as you need watch out for the type of changes that excel is likely to make when reading a text file. Your method of just asking EXCEL to write out a text file and then reading it in again could work. You need to try to figure our what delimiter you want to insert between the cells on a row in the spreadsheet. Or you could do the same thing in SAS.
So I would do it in three steps. (1) Convert the XLSX file into a sas dataset. (2) Write it back out using comma (or tab or space) as the delimiter. (3) read the new text file using semi-colon as the delimiter.
libname old xlsx 'the_exisitng_excel_file.xlsx';
data work.step1;
set old.sheetname;
run;
filename text 'new_name_for_a_text_file.txt';
data _null_;
file text dsd dlm=',' ;
set step1;
put (_all_) (+0);
run;
data want;
infile text dsd dlm=';' truncover ;
input (var1-var20) (:$30.);
run;
Hi Tom,
The syntax you provided worked wonderfully!.
I now have the data in SAS well delineated.
Thank you for your help!
I appreciate it.
Best wishes.
Hi Tom,
Just fix the final data step to use the right variable names. You din't tell us what variables your file had so I just used a simple generic data step to read it in my example code.
Is the problem that you don't know what variables are in the file? Do you want SAS to guess what names it should use for the variables? If you include the header line in the fixed file and then use PROC IMPORT to GUESS what the variables are it will make up names from the header line.
One way you keep the header line is to change how you first read the Excel file so that it includes the first line as part of the data. Then it will get written as the first line of the resulting text file. So if you used PROC IMPORT to read the Excel file add the GETNAMES=NO statement to the proc step.
Or you could write the line with the variable names that SAS generated when it read the Excel file to the fixed text file. Insert steps to write that line and then modify the step that writes the data to use the MOD option on the FILE statement so that it appends the data to the file you just wrote with the header line.
filename text 'new_name_for_a_text_file.txt';
proc transpose data=step1(obs=0) out=names;
var _all_;
run;
data _null_;
file text dsd dlm=',';
set names ;
put _name_ @;
run;
data _null_;
file text dsd dlm=',' MOD ;
set step1;
put (_all_) (+0);
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
ANOVA, or Analysis Of Variance, is used to compare the averages or means of two or more populations to better understand how they differ. Watch this tutorial for more.
Find more tutorials on the SAS Users YouTube channel.