Thanks for your post. It has some very interesting items. I especially like the INPUT @'string' form that searches and inputs at the same time.

This is what I was working on in the meantime:

DATA sleep2;
  INFILE 'M:\temp\all.dat' DLM=',' DSD LRECL=256 PAD MISSOVER;
  LENGTH SleepAlgorithm $ 12;
  INFORMAT InBedDate OutBedDate OnsetDate MMDDYY10.;
  FORMAT InBedDate OutBedDate OnsetDate MMDDYY10.;
  LENGTH InBedTime OutBedTime OnsetTime $ 10;
  RETAIN line line1-line6;

  INPUT line $ 1-256 @; *** input card and hold in buffer ***;

  *** header lines get re-read here ***;
  IF INDEX(line, "Sleep Report for:")>0 THEN INPUT line1 $ 1-256;
  IF INDEX(line, "Subject Name:")>0 THEN INPUT line2 $ 1-256;
  IF INDEX(line, "Serial Number:")>0 THEN INPUT line3 $ 1-256;
  IF INDEX(line, "Sleep Algorithm:")>0 THEN INPUT line4 $ 1-256;
  IF line="" THEN INPUT line5 $ 1-256; * blank line *;
  IF INDEX(line, "In Bed Date")>0 THEN INPUT line6 $ 1-256; * column names line *;

  *** data lines get re-read here ***;
  IF INDEX(line, "Cole-Kripke,")>0 OR INDEX(line, "Sadeh,")>0 THEN DO;
    INPUT @1 SleepAlgorithm InBedDate InBedTime OutBedDate OutBedTime
          OnsetDate OnsetTime Latency TotalCounts Efficiency
          TotalMinutesinBed TotalSleepTime WakeAfterSleepOnset
          NumberofAwakenings AverageAwakeningLength;
    OUTPUT;
  END;
RUN;

PROC PRINT;
RUN;

The final hang-up was how SAS holds lines. I finally discovered that after the first INPUT the column pointer sits at the end of the held line, so my primary data INPUT statement was trying to read starting from there. Because I was using the PAD and MISSOVER INFILE options, that produced the correct number of output obs but with all values missing.

Using the INFILE COLUMN= and LINE= options and PUT (to the log) helped me track this problem down. Adding the @1 resets the column pointer to the start of the held line in the buffer. The earlier INPUT statements for the header lines specify columns, so they don't rely on the column pointer.

My method means reading each line twice, so it's probably not as fast, but this file is relatively small.

Thanks again,
Rick
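
For anyone following along, here is a minimal sketch of the kind of pointer check described above. It is an illustration, not the exact debugging step that was run: the variable names col and ln are made up for the example, and the DLM= and DSD options are left out because only the pointer position matters here.

```sas
DATA _NULL_;
  /* COLUMN= and LINE= name variables (col and ln are hypothetical names) */
  /* that SAS keeps updated with the current input pointer position.      */
  INFILE 'M:\temp\all.dat' LRECL=256 PAD MISSOVER COLUMN=col LINE=ln;
  LENGTH line $ 256;

  INPUT line $ 1-256 @;          /* read the record and hold it            */
  PUT 'after hold: ' ln= col=;   /* col should now be past column 256      */

  INPUT @1 @;                    /* move the pointer back, keep holding    */
  PUT 'after @1:   ' ln= col=;   /* col should be back at 1                */

  INPUT;                         /* null INPUT releases the held record    */
RUN;
```

If the first PUT shows col beyond the end of the record, any list-style INPUT that follows without @1 starts past the data, which under MISSOVER gives missing values rather than an error.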