Hi experts,
I received a xlsx dataset with some text fields that are pretty massive, for example.. brief narratives or diagnosis history from a doctors visit. I can import it to stata 14 without problems, but I can't import the xlsx to SAS with either the proc import, infile or import wizard. Any idea why?
And because SAS does not import Stata 14 files, even if I replace it with the older Stata version 12, it only saves those text fields to lengths of 244.
Thanks for any advice or SAS code!
SAS varaibles have a limit, 2000 characters. I would suggest dropping the data into plain text file format. Then you can read the plain text file character after character and output to strings. I.e.
data want;
infile "thetextfile.txt";
length str $2000;
input a $char1. @@; /*might need the at's might not */
if lengthn(cat(str,a)) > 2000 then do;
output;
str=a;
end;
else str=cat(str,a);
run;
The question is what are you going to do with that data, its pretty much useless for anything programming or stats related.
So your file is a csv. Well, that adds more complexity to the whole thing then. Its hard to say when we can't see the file (or an example). My code was presented to run through a text file character by character, read it in, append it to a string variable and if the length is 2000 then output as a row. I.e. cutting the string up into blocks of 2000 characters. Not a good fit for what you now say. I would need to know something about what the file looks like and what you intend to be the output from it as each "cell" in a dataset can only have a maximum of 2000 characters, so you will need to cut it out into blocks of this somehow, but is that data really useful to anyone or any process?
Yeah, I tried exporting the variables i needed to a csv using stata and trying to import it into sas that way. there's 3 variables in the file, 2 unique ids and the character variable which is ultimately a bunch of sentences. I plan to search this text variable for key words for race, marital status, etc..
Does that help at all?
Did you use the XLSX engine? It should not have a problem with your file. The limit for a character variable in a SAS dataset is 32K bytes.
libname in xlsx 'myfile.xlsx';
proc copy inlib=in outlib=work;
run;
Hi, I'm trying this now, I've never used this code before, so I hope it works, but it seems like SAS is going to time out, it's been churning this whole time. I'll keep you posted. Thanks for your help!
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
Still thinking about your presentation idea? The submission deadline has been extended to Friday, Nov. 14, at 11:59 p.m. ET.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.