About cooch17

cooch17 · 06-20-2017

This worked perfectly -- I thought I'd tried something like this, but apparently I messed something up in the attempt. Thanks very much.

cooch17 · 06-20-2017

Here is how I read in this particular data file. &MCMCfile and &recl are macro variables set earlier in the program. Note that I'm using recfm=N, but I explicitly set the linesize, which 'forces' records (i.e., splits the big string into discrete lines):

    data MCMC;
      infile &MCMCfile linesize=&recl recfm=N;
      input buffer1 RB4. Chain1 RB8. Chain2 RB8. Chain3 RB8. buffer2 RB4.;
      drop buffer1 buffer2;
    run;

Each line (record) contains output from an MCMC sampler at each step (so, same number of 'variables' per line, different values for each variable).

As per the original post, the following 2-step procedure works: (i) pull in the big binary file (into a data set I call MCMC), then (ii) subset it, keeping every nth record. For step (ii), I simply use the following (as one of a couple of approaches that probably would work), where thin is a macro variable I set earlier in the program:

    ***************************
    * thin the data set.... *
    ***************************;
    data MCMC;
      do point = &thin to nobs by &thin;
        set MCMC point=point nobs=nobs;
        output;
      end;
      stop;
    run;

This works fine, but as per the original post, it seems annoyingly inefficient, since I'm basically taking 2 steps for something I'd like to do in 1 step (during the infile stage).
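For comparison, here is a minimal sketch of folding the thinning into the read step with a subsetting IF on the automatic variable _n_. It assumes the same &MCMCfile, &recl and &thin macro variables and the same record layout as above; the output data set name MCMC_thin is just a placeholder. It is untested against the actual file, and it still reads every byte of the input. What it avoids is writing the full, unthinned data set to disk first.

```sas
/* Sketch only: assumes &MCMCfile, &recl and &thin are set earlier,  */
/* and that each DATA step iteration reads exactly one record.       */
data MCMC_thin;                      /* placeholder data set name    */
  infile &MCMCfile linesize=&recl recfm=N;
  input buffer1 RB4. Chain1 RB8. Chain2 RB8. Chain3 RB8. buffer2 RB4.;
  /* Subsetting IF: only iterations thin, 2*thin, 3*thin, ... are    */
  /* written out; all other records are read and then discarded.     */
  if mod(_n_, &thin) = 0;
  drop buffer1 buffer2;
run;
```

This should keep the same records as the 2-step version (records &thin, 2*&thin, and so on), but it does not skip any I/O on the input side, so it only partly addresses the efficiency concern.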

cooch17 · 06-20-2017

Neat, except it doesn't work with binary files -- the following is from the log:

    The '/' INPUT/PUT statement option is inconsistent with binary mode I/O.
    The execution of the DATA STEP is being terminated.

cooch17 · 06-20-2017

Suppose I have a very large (multiple GB) binary data file that I want to read into SAS (using 9.3 at the moment). I don't want to read in the entire file, but, rather, I want to read in every nth record.

For simple data files (ASCII), this can be done fairly easily using #. For example, suppose I have some file called test.dat containing 3 data fields/columns (x, y and z). The following reads in every 5th record:

    filename in 'c:\users\userDesktop\test.dat';
    data hold;
      infile in;
      input #5 x y z;
    run;

Works fine. But, for some reason, if test.dat is a binary file, this approach doesn't seem to work. To read in the particular binary data file, I use something like the following input syntax:

    input buffer1 RB4. Chain1 RB8. Chain2 RB8.;

Works fine. However,

    input #5 buffer1 RB4. Chain1 RB8. Chain2 RB8.;

doesn't work as expected (or really, at all...).

I know I could probably do this using 2 steps: (i) read in the full binary file, and then (ii) use some 'tricks' with subsetting the data to keep only every nth record, but the original file is so large that I'm trying to avoid having to read the entire thing in, in the first place. Suggestions/pointers to the obvious are welcomed.


Re: keep every nth record while reading in binary data file

keep every nth record while reading in binary data file
