04-25-2018 10:02 AM
I received a xlsx dataset with some text fields that are pretty massive, for example.. brief narratives or diagnosis history from a doctors visit. I can import it to stata 14 without problems, but I can't import the xlsx to SAS with either the proc import, infile or import wizard. Any idea why?
And because SAS does not import Stata 14 files, even if I replace it with the older Stata version 12, it only saves those text fields to lengths of 244.
Thanks for any advice or SAS code!
04-25-2018 10:23 AM
SAS varaibles have a limit, 2000 characters. I would suggest dropping the data into plain text file format. Then you can read the plain text file character after character and output to strings. I.e.
data want; infile "thetextfile.txt"; length str $2000; input a $char1. @@; /*might need the at's might not */ if lengthn(cat(str,a)) > 2000 then do; output; str=a; end; else str=cat(str,a); run;
The question is what are you going to do with that data, its pretty much useless for anything programming or stats related.
04-25-2018 11:01 AM
04-25-2018 11:09 AM
So your file is a csv. Well, that adds more complexity to the whole thing then. Its hard to say when we can't see the file (or an example). My code was presented to run through a text file character by character, read it in, append it to a string variable and if the length is 2000 then output as a row. I.e. cutting the string up into blocks of 2000 characters. Not a good fit for what you now say. I would need to know something about what the file looks like and what you intend to be the output from it as each "cell" in a dataset can only have a maximum of 2000 characters, so you will need to cut it out into blocks of this somehow, but is that data really useful to anyone or any process?
04-25-2018 11:27 AM
Yeah, I tried exporting the variables i needed to a csv using stata and trying to import it into sas that way. there's 3 variables in the file, 2 unique ids and the character variable which is ultimately a bunch of sentences. I plan to search this text variable for key words for race, marital status, etc..
Does that help at all?
04-25-2018 11:14 AM
Did you use the XLSX engine? It should not have a problem with your file. The limit for a character variable in a SAS dataset is 32K bytes.
libname in xlsx 'myfile.xlsx'; proc copy inlib=in outlib=work; run;
04-25-2018 11:32 AM
Hi, I'm trying this now, I've never used this code before, so I hope it works, but it seems like SAS is going to time out, it's been churning this whole time. I'll keep you posted. Thanks for your help!