Hi experts,
I received a xlsx dataset with some text fields that are pretty massive, for example.. brief narratives or diagnosis history from a doctors visit. I can import it to stata 14 without problems, but I can't import the xlsx to SAS with either the proc import, infile or import wizard. Any idea why?
And because SAS does not import Stata 14 files, even if I replace it with the older Stata version 12, it only saves those text fields to lengths of 244.
Thanks for any advice or SAS code!
SAS varaibles have a limit, 2000 characters. I would suggest dropping the data into plain text file format. Then you can read the plain text file character after character and output to strings. I.e.
data want; infile "thetextfile.txt"; length str $2000; input a $char1. @@; /*might need the at's might not */ if lengthn(cat(str,a)) > 2000 then do; output; str=a; end; else str=cat(str,a); run;
The question is what are you going to do with that data, its pretty much useless for anything programming or stats related.
So your file is a csv. Well, that adds more complexity to the whole thing then. Its hard to say when we can't see the file (or an example). My code was presented to run through a text file character by character, read it in, append it to a string variable and if the length is 2000 then output as a row. I.e. cutting the string up into blocks of 2000 characters. Not a good fit for what you now say. I would need to know something about what the file looks like and what you intend to be the output from it as each "cell" in a dataset can only have a maximum of 2000 characters, so you will need to cut it out into blocks of this somehow, but is that data really useful to anyone or any process?
Yeah, I tried exporting the variables i needed to a csv using stata and trying to import it into sas that way. there's 3 variables in the file, 2 unique ids and the character variable which is ultimately a bunch of sentences. I plan to search this text variable for key words for race, marital status, etc..
Does that help at all?
Did you use the XLSX engine? It should not have a problem with your file. The limit for a character variable in a SAS dataset is 32K bytes.
libname in xlsx 'myfile.xlsx';
proc copy inlib=in outlib=work;
run;
Hi, I'm trying this now, I've never used this code before, so I hope it works, but it seems like SAS is going to time out, it's been churning this whole time. I'll keep you posted. Thanks for your help!
Registration is now open for SAS Innovate 2025 , our biggest and most exciting global event of the year! Join us in Orlando, FL, May 6-9.
Sign up by Dec. 31 to get the 2024 rate of just $495.
Register now!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.