BookmarkSubscribeRSS Feed
jsmall
Calcite | Level 5

Hi experts,

 

I received a xlsx dataset with some text fields that are pretty massive, for example.. brief narratives or diagnosis history from a doctors visit. I can import it to stata 14 without problems, but I can't import the xlsx to SAS with either the proc import, infile or import wizard. Any idea why? 

 

And because SAS does not import Stata 14 files, even if I replace it with the older Stata version 12, it only saves those text fields to lengths of 244. 

Thanks for any advice or SAS code! 

6 REPLIES 6
RW9
Diamond | Level 26 RW9
Diamond | Level 26

SAS varaibles have a limit, 2000 characters.  I would suggest dropping the data into plain text file format.  Then you can read the plain text file character after character and output to strings.  I.e.

data want;
  infile "thetextfile.txt";
  length str $2000;
  input a $char1. @@;  /*might need the at's might not */
  if lengthn(cat(str,a)) > 2000 then do;
    output;
    str=a;
  end;
  else str=cat(str,a);
run;

The question is what are you going to do with that data, its pretty much useless for anything programming or stats related.

jsmall
Calcite | Level 5
Thanks for your help, but I can't get this to work. When I proc print, the narrative field comes out empty. I tried with or without @@.

My file has 3 variables cid 8. Encid 8. narrative_assess_imp . I tried inputting all the variables, and that didn't work either. I think I'm confused what the a variable is in your code?



Does the following log help at all?


8473 data want;
8474 infile "C:\\narrative.csv";
8475 length narrative_assess_imp $2000;
8476 input a $char1. ; /*might need the at's might not */
8477 if lengthn(cat(narrative_assess_imp,a)) > 2000 then do;
8478 output;
8479 narrative_assess_imp=a;
8480 end;
8481 else narrative_assess_imp=cat(narrative_assess_imp,a);
8482 run;

NOTE: The infile "C:\narrative.csv" is:

Filename=C:\narrative.csv,
RECFM=V,LRECL=32767,
File Size (bytes)=181558671,
Last Modified=25Apr2018:10:06:11,
Create Time=25Apr2018:10:06:08

NOTE: 397651 records were read from the infile "C:\narrative.csv".
The minimum record length was 12.
The maximum record length was 2061.
NOTE: The data set WORK.WANT has 397651 observations and 2 variables.
NOTE: DATA statement used (Total process time):
real time 19.19 seconds
cpu time 2.01 seconds




RW9
Diamond | Level 26 RW9
Diamond | Level 26

So your file is a csv.  Well, that adds more complexity to the whole thing then.  Its hard to say when we can't see the file (or an example).  My code was presented to run through a text file character by character, read it in, append it to a string variable and if the length is 2000 then output as a row.  I.e. cutting the string up into blocks of 2000 characters.  Not a good fit for what you now say.  I would need to know something about what the file looks like and what you intend to be the output from it as each "cell" in a dataset can only have a maximum of 2000 characters, so you will need to cut it out into blocks of this somehow, but is that data really useful to anyone or any process?

jsmall
Calcite | Level 5

Yeah, I tried exporting the variables i needed to a csv using stata and trying to import it into sas that way. there's 3 variables in the file, 2 unique ids and the character variable which is ultimately a bunch of sentences. I plan to search this text variable for key words for race, marital status, etc..  

Does that help at all?

Tom
Super User Tom
Super User

Did you use the XLSX engine?  It should not have a problem with your file.  The limit for a character variable in a SAS dataset is 32K bytes.

libname in xlsx 'myfile.xlsx';
proc copy inlib=in outlib=work;
run;
jsmall
Calcite | Level 5

Hi, I'm trying this now, I've never used this code before, so I hope it works, but it seems like SAS is going to time out, it's been churning this whole time. I'll keep you posted. Thanks for your help!

sas-innovate-2024.png

Don't miss out on SAS Innovate - Register now for the FREE Livestream!

Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 837 views
  • 0 likes
  • 3 in conversation