- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi experts,
I received a xlsx dataset with some text fields that are pretty massive, for example.. brief narratives or diagnosis history from a doctors visit. I can import it to stata 14 without problems, but I can't import the xlsx to SAS with either the proc import, infile or import wizard. Any idea why?
And because SAS does not import Stata 14 files, even if I replace it with the older Stata version 12, it only saves those text fields to lengths of 244.
Thanks for any advice or SAS code!
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
SAS varaibles have a limit, 2000 characters. I would suggest dropping the data into plain text file format. Then you can read the plain text file character after character and output to strings. I.e.
data want; infile "thetextfile.txt"; length str $2000; input a $char1. @@; /*might need the at's might not */ if lengthn(cat(str,a)) > 2000 then do; output; str=a; end; else str=cat(str,a); run;
The question is what are you going to do with that data, its pretty much useless for anything programming or stats related.
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
My file has 3 variables cid 8. Encid 8. narrative_assess_imp . I tried inputting all the variables, and that didn't work either. I think I'm confused what the a variable is in your code?
Does the following log help at all?
8473 data want;
8474 infile "C:\\narrative.csv";
8475 length narrative_assess_imp $2000;
8476 input a $char1. ; /*might need the at's might not */
8477 if lengthn(cat(narrative_assess_imp,a)) > 2000 then do;
8478 output;
8479 narrative_assess_imp=a;
8480 end;
8481 else narrative_assess_imp=cat(narrative_assess_imp,a);
8482 run;
NOTE: The infile "C:\narrative.csv" is:
Filename=C:\narrative.csv,
RECFM=V,LRECL=32767,
File Size (bytes)=181558671,
Last Modified=25Apr2018:10:06:11,
Create Time=25Apr2018:10:06:08
NOTE: 397651 records were read from the infile "C:\narrative.csv".
The minimum record length was 12.
The maximum record length was 2061.
NOTE: The data set WORK.WANT has 397651 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 19.19 seconds
cpu time 2.01 seconds
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
So your file is a csv. Well, that adds more complexity to the whole thing then. Its hard to say when we can't see the file (or an example). My code was presented to run through a text file character by character, read it in, append it to a string variable and if the length is 2000 then output as a row. I.e. cutting the string up into blocks of 2000 characters. Not a good fit for what you now say. I would need to know something about what the file looks like and what you intend to be the output from it as each "cell" in a dataset can only have a maximum of 2000 characters, so you will need to cut it out into blocks of this somehow, but is that data really useful to anyone or any process?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Yeah, I tried exporting the variables i needed to a csv using stata and trying to import it into sas that way. there's 3 variables in the file, 2 unique ids and the character variable which is ultimately a bunch of sentences. I plan to search this text variable for key words for race, marital status, etc..
Does that help at all?
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Did you use the XLSX engine? It should not have a problem with your file. The limit for a character variable in a SAS dataset is 32K bytes.
libname in xlsx 'myfile.xlsx';
proc copy inlib=in outlib=work;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hi, I'm trying this now, I've never used this code before, so I hope it works, but it seems like SAS is going to time out, it's been churning this whole time. I'll keep you posted. Thanks for your help!