GrokAndRoll
Calcite | Level 5

Assuming I have a plain-text file with contents in an unknown or inconsistent format:

  • Could be spread over multiple lines, or not
  • Could have commas, or not


How would I go about reading it into a single variable of a single observation? I just want a "text" variable to contain the entire contents.


Try as I might I can't write a program which will get me what I want - I hope that I'm missing something very simple.


Example text file:

[{"var1":"ses-

14_14723","var2":652720,"var3":653940}]


My SAS code so far.


data work.test;

     infile "C:\test.txt" flowover dlmstr="&(*&)(*&";

     input alltext :$ 3000.;

run;


As you can see -

  • I've experimented using flowover to search for data on new lines.
  • I'm using a delimiter string with a ridiculous value I'll never hit so that the third line isn't split, and
  • I've chosen a randomly-high length for the alltext variable, but used the colon to tell the input statement the length could be variable


This strikes me as really hacky and I'm sure it can't be right. It also returns me 2 observations - one for line 1, and one for line 3.


Any help would be much appreciated.


Accepted Solutions
SASJedi
SAS Super FREQ

Something like this perhaps?

/* Create a text file */
data _null_ ;
    file "s:\workshop\test.txt";
      set sashelp.class;
      put _all_;
run;

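/* Read the whole text file into a single text variable */
data test;
   length text $32767;            /* maximum length of a character variable */
   retain text '';                /* keep the accumulated value across iterations */
   infile "s:\workshop\test.txt" flowover dlmstr='//' end=last;
   input;                         /* load the next line into _infile_ without reading variables */
   text=cats(text,_infile_);      /* append the line; cats strips leading/trailing blanks */
   if last then output;           /* write one observation after the final line */
run;
proc print; run;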

6 REPLIES
jakarman
Barite | Level 11

Well, I hope you are aware of what you are asking. If the format is truly unknown, the file could even be video or audio; do you want to build your own codecs?

Let me assume you at least expect character data in some encoding, such as Unicode, and that you know which one.

SAS(R) 9.3 Statements: Reference (infile)

SAS(R) 9.3 Companion for Windows (infile)

There is a limit: character variables cannot be longer than 32 KB (32,767 bytes).

You can read the file in binary format using RECFM=N and then split the character data either into an array of variables (what is the maximum size?) or into observations, by reading in logical blocks (reclen) of e.g. 32 KB.

The automatic _infile_ variable can be used as the source after each "input;" statement, that is, an INPUT statement that does not read any variables.

Does your file have a record layout (what is the maximum record length)? If so, you can use the record indicators: the CR, LF, or CRLF characters that give the file its basic format.

---->-- ja karman --<-----
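A minimal sketch of the RECFM=N idea mentioned above, assuming the file from the question ("C:\test.txt"), a single-byte encoding, and contents of at most 32,767 bytes. The dataset and variable names (bytes, byte, pos) are made up for illustration. It reads the file as a byte stream, one byte per observation, and then assembles the bytes into a single variable while skipping CR and LF:

```sas
/* Step 1: treat the file as a stream and read it one byte at a time. */
/* Each byte becomes one observation in WORK.BYTES.                   */
data bytes;
    infile "C:\test.txt" recfm=n;
    input byte $char1.;
run;

/* Step 2: assemble the bytes into one variable, dropping CR and LF.  */
data test;
    length text $32767;
    retain text '' pos 0;
    set bytes end=last;
    /* Skip line terminators; stop copying once the variable is full. */
    if byte not in ('0D'x, '0A'x) and pos < 32767 then do;
        pos = pos + 1;
        substr(text, pos, 1) = byte;   /* place the byte at position pos */
    end;
    if last then output;               /* one observation for the whole file */
    keep text;
run;
```

Unlike concatenating with cats, this keeps blanks inside the text exactly where they occur. In a multi-byte encoding such as UTF-8, a byte-by-byte read can split characters, so treat this as an illustration only.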
GrokAndRoll
Calcite | Level 5

SASJedi (if that is your real name :smileysilly:) - Thanks, that does the job perfectly.

Of course I concede I probably shouldn't be taking this approach, but it's interesting to know how it could be done. I must have read about the _infile_ variable in the past, but had forgotten it.

jakarman
Barite | Level 11

@Grokandroll, SASjedi is of course Mark Jordan http://blogs.sas.com/content/sastraining/?s=sasjedi

Creating json: Base SAS(R) 9.4 Procedures Guide, Second Edition (proc json)

JSON data processing is a possible enhancement, as an alternative to XML.

http://support.sas.com/resources/papers/proceedings13/296-2013.pdf takes much the same approach as the one you want.

---->-- ja karman --<-----
GrokAndRoll
Calcite | Level 5

Tom - Partly to see if it's possible. The other reason was that I am writing a piece of SAS code to parse a JSON response, but my parser assumes that everything is on the same line. Turned out there were a couple of random new lines in there so I wanted to load up the file and prepare it for use by removing all new lines. I hoped there would be a way of loading all contents in one variable, running a tranwrd or compress function against it, and voila.

I realise the answer is probably "Don't do that" but I was curious.
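A minimal sketch of the "load everything, then strip the new lines" idea described above, based on the accepted solution and assuming the file from the original question ("C:\test.txt"). The lrecl=32767 option is an assumption added so that long lines are not truncated. Because the file is read line by line, the line terminators never reach _infile_ in the usual case; the compress call only removes stray CR or LF bytes, for example when a Windows-format file is read on another platform:

```sas
data test;
    length text $32767;
    retain text '';
    infile "C:\test.txt" lrecl=32767 end=last;
    input;                                    /* fill _infile_ with the current line */
    /* Drop any stray carriage returns or line feeds, then append the line. */
    text = cats(text, compress(_infile_, '0D0A'x));
    if last then output;                      /* single observation with all the text */
run;
```

For the example file in the question, text should end up as the JSON on a single line, with "ses-" and "14_14723" joined. Note that cats also strips leading and trailing blanks from every line, which is harmless here but would run words together in ordinary prose.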
