serge68
Calcite | Level 5

I'm trying to parse an input dataset, to output only certain lines and can't get that working.

 

My input file looks like this:

 

Top
.
.
.
/LOGON
Line 1
Line 2
Etc
/LOGOFF
.
.
.
Bottom

 

I want to write the lines found between the /LOGON and /LOGOFF out to a dataset.

 

I've been playing around with variations of this, but it's not giving me what I'm after:

 

 

DATA COMMAND;                                  
     INFILE FLAGFILE;                          
     INPUT @2  CMD     $50.;                   
                                               
  IF SUBSTR(CMD,2,6) = '/LOGON' then           
    do while (SUBSTR(CMD,2,7) ¬= '/LOGOFF');   
       put CMD;                                
    end;                                       

 

Thanks….

1 ACCEPTED SOLUTION

Shmuel
Garnet | Level 18

Then just change the test to: if index(a_line, ...) = 2, instead of ... > 0


12 REPLIES
ballardw
Super User

Does this help:

 

data want;
   input @;
   length cmd $ 50;
   if index (_infile_,"/LOGON") > 0 then input;
   else if index (_infile_,"/LOGOFF")>0 then input;
   else do;
      cmd=_infile_;
      output;
   end;
datalines;
/LOGON
Line 1
Line 2
Etc
/LOGOFF
/LOGON
second Line 1
second Line 2
second Etc
second etc 2
/LOGOFF
;
run;
Astounding
PROC Star

The problem needs a little clarification.  Are there lines before the first /LOGON that you need to ignore?  Are there lines after the final /LOGOFF that you also need to ignore?  Could there be multiple pairs of logons and logoffs, with garbage in between?

 

The program might be as simple as:

 

data command;

infile flagfile truncover;

input @2 cmd $50.;

if cmd in ('/LOGON', '/LOGOFF') then delete;

run;

Astounding
PROC Star

Anticipating the answers to some of the questions, this would be a flexible way to approach the problem:

 

data command;

infile flagfile truncover;

retain status 'logged out';

input @2  cmd $50.;

if cmd='/LOGON' then status='logged in';

else if cmd='/LOGOFF' then status='logged out';

else if status='logged in' then output;

drop status;

run;

 

It's not clear whether a function (substr, index) needs to be applied when searching for /LOGON or /LOGOFF ... depends on what is actually in your data lines.

Astounding
PROC Star

Based on your last explanation, here is a shortcut that would improve the speed:

 

data command;

infile flagfile truncover;

retain status 'logged out';

input @2  cmd $50.;

if cmd='/LOGON' then status='logged in';

else if cmd='/LOGOFF' then stop;

else if status='logged in' then output;

drop status;

run;

 

Again, this relies on cmd being exactly "/LOGON" or "/LOGOFF".  If there are other characters on the line (including leading blanks), another function might have to be applied.
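
As a minimal sketch of the "another function" idea, the step below compares only the first blank-delimited word of each line against the markers, using SCAN. It assumes the marker is the first word on its line; the variable name first_word is hypothetical, and the FLAGFILE fileref is the one from the question. It has not been run against the asker's file.

```sas
data command;
  infile flagfile truncover;
  length first_word $ 50;
  retain status 'logged out';
  input @2 cmd $50.;
  /* Assumption: the marker is the first blank-delimited word on the line. */
  /* SCAN skips leading blanks and ignores anything after the first word.  */
  first_word = scan(cmd, 1, ' ');
  if first_word = '/LOGON' then status = 'logged in';
  else if first_word = '/LOGOFF' then stop;
  else if status = 'logged in' then output;
  drop status first_word;
run;
```

Note that this matches the marker wherever the first word happens to start, so it is not suitable if the column position of the marker matters.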

Shmuel
Garnet | Level 18

You have two types of lines: lines to skip and lines to output.

In such a case I would do:

 

data out;

  retain phase 0;

  infile ... truncover;

  input a_line $80.;

  if index(a_line, '/LOGON'  ) > 0 then do; phase=1;  input a_line $80.; end;   /* skip the /LOGON line itself; same informat so the whole line is read */

  if index(a_line, '/LOGOFF' ) > 0 then phase = 0;   /* skip the /LOGOFF line and everything that follows */

  if phase=1 then output;

  drop phase;

run;

 

Please check whether it fits your request.

        

serge68
Calcite | Level 5

@ballardw, not quite. This includes data before and after the /LOGON & /LOGOFF that I want to drop.

@Astounding, yes, there are lines before /LOGON that I need to ignore, and lines after /LOGOFF that I need to ignore. There is a single pair of /LOGON & /LOGOFF, and it's the data between them that is valid.

@Shmuel, this drops all lines preceding and including the /LOGON, and it outputs the data I want. However, it then outputs all data after the /LOGOFF, which I need to drop.

 

 

Thanks all.

Shmuel
Garnet | Level 18

When /LOGOFF is found, PHASE is set to 0, so no output is done for the following lines, unless a new /LOGON line is encountered.

serge68
Calcite | Level 5

Hi Shmuel - You are correct. The issue is that I have a second LOGON & LOGOFF, though not in columns 2-8. That's why I had attempted the SUBSTR earlier, as it's only the LOGON/LOGOFF in cols 2-8 that I'm concerned with.

 

 

Thanks.

Shmuel
Garnet | Level 18

Then just change the test to: if index(a_line, ...) = 2, instead of ... > 0

Shmuel
Garnet | Level 18

If there can be only one /LOGON - /LOGOFF pair, then you had better do:

   if index(a_line, '/LOGOFF') = 2 then stop;

No need to continue reading the file.

serge68
Calcite | Level 5

I needed to use =1 rather than =2, but that gives me what I'm after.

 

Appreciate the assistance.

 

 

Thanks
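
Pulling the suggestions in this thread together, a minimal sketch of the whole step might look like the following. It assumes a single /LOGON ... /LOGOFF pair whose markers start in column 2 of the raw record, as described above, and it reuses the FLAGFILE fileref from the question; it has not been run against the asker's file. One plausible reason that =1 was needed with the $80. informat is that $w. left-aligns the value and drops leading blanks, so a marker in column 2 after a blank column 1 ends up at position 1. The sketch reads with $CHAR80. instead, which keeps leading blanks, so INDEX returns the actual column.

```sas
data out;
  infile flagfile truncover;
  retain phase 0;
  /* $CHAR80. preserves leading blanks, so positions in a_line match raw columns. */
  input a_line $char80.;
  /* Assumption: only markers that start in column 2 count. */
  if index(a_line, '/LOGON') = 2 then phase = 1;     /* marker line is not written */
  else if index(a_line, '/LOGOFF') = 2 then stop;    /* nothing after it is needed */
  else if phase = 1 then output;
  drop phase;
run;
```

Because the lines are read with $CHAR80., the values written to a_line keep any leading blanks; LEFT or STRIP can be applied afterwards if that is not wanted.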

PGStats
Opal | Level 21

You need a RETAINed variable:

 

DATA COMMAND;                                  
retain keep;
INFILE datalines;                          
INPUT CMD $50.; 
if left(_infile_) = "/LOGON" then keep=1;
else if left(_infile_) = "/LOGOFF" then keep=0;
else if keep then output;
drop keep;
datalines;
Before
/LOGON
Line 1
Line 2
Etc
/LOGOFF
After
;

proc print; run;
PG
