Info needed from input dataset wraps onto multiple lines.

Contributor
Posts: 21
I am reading in a dataset where the info I need to parse and work with wraps over to multiple lines. This does not happen on every line and may extend to 2 or more lines. Here is what I am working with. (I have truncated my data to make it easier to read)

 

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN            

         Status(Lost control)                              

         User(SCPPSB38) Host(scppsb38)                    

APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed          

MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN                

MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN                

MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/MAIN State

         Jobno(43228) User(SCMSIS47) Host(SCMSIS47)      

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN STATE FAILED

         User(SCPPSB38) Host(scppsb38)                    

APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed          

MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5                    

         Jobno(59568) User(SCMSIS47) Host(SCMSIS47)      

MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/MAIN      

         Jobno(7800) User(SCMSIS47) Host(SCMSIS47)        

MgrMsg: SCMSIS47 KPPSCDPP.QA009/DPPSC5@.1308642          

         Jobno(56156) User(SCMSIS47) Host(SCMSIS47)      

MgrMsg: SCMSIS47 KPPSCDPP.QA/DPPSC5@.1308642/MAIN State  

         Jobno(9672) User(SCMSIS47) Host(SCMSIS47)        

                                                

Whenever "MgrMsg:" appears in columns 2-8 and columns 2-8 of the next line are blank, I need that next line, starting at column 10, appended to the end of the line that has the MgrMsg. The desired output would look like this:

 

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN Status(Lost control) User(SCPPSB38) Host(scppsb38)              

APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed                                                              

MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN                                                                    

MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN                                                                    

MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/MAIN State Jobno(43228) User(SCMSIS47) Host(SCMSIS47)        

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN STATE FAILED User(SCPPSB38) Host(scppsb38)                      

APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed                                                              

MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5 Jobno(59568) User(SCMSIS47) Host(SCMSIS47)                            

MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/MAIN Jobno(7800) User(SCMSIS47) Host(SCMSIS47)              

MgrMsg: SCMSIS47 KPPSCDPP.QA009/DPPSC5@.1308642 Jobno(56156) User(SCMSIS47) Host(SCMSIS47)                  

MgrMsg: SCMSIS47 KPPSCDPP.QA/DPPSC5@.1308642/MAIN State Jobno(9672) User(SCMSIS47) Host(SCMSIS47) Jobno(67160)

OPERCMD: BY OPSMCB01 SPINLOG  


Trusted Advisor
Posts: 1,574

Re: Info needed from input dataset wraps onto multiple lines.

data want (keep=out_line);
      length out_line $200;  /* max length expected for a line on output */
      infile cards truncover;
      input @2 a_line   $100.  ;
      test = substr(a_line,1,7);
      if test = "MgrMsg:"  then do;
          out_line = a_line;
          input @2 a_line  $100. ;
          test =  substr(a_line,1,7);
          do while (test = ' ');
               out_line = catx(' ',trim(out_line), trim(a_line2));
               input @2 a_line  $100. ;
               test =  substr(a_line,1,7);
          end;
          output;
          out_line = a_line;
      end;
      else do;
           out_line = a_line;
           output;
      end;

return;
/*----------*/
cards;
    ... enter here your input data ...
;
run;

Trusted Advisor
Posts: 1,574

Re: Info needed from input dataset wraps onto multiple lines.

On the CATX function line, it should be a_line instead of a_line2.
Contributor
Posts: 21

Re: Info needed from input dataset wraps onto multiple lines.

Thank you for the reply and the time you spent on this. It is not producing the output I hoped for. The first run only produced the "out_line" variable, and this variable does not include all the data.

PROC PRINT

The SAS System

Obs out_line



1 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST

2 PPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed

3 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO

4 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/

5 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST

6 PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed

7 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/

8 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/



So I edited "DATA want(keep=out_line);" to just "DATA want;". This output contained the data I need, so I thought I would just be able to join out_line and a_line.



The SAS System 10:23 Tuesday



Obs out_line a_line



1 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST User(SCPPSB38) Host(scppsb38)

2 PPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed PPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed

3 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO

4 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/ Jobno(43228) User(SCMSIS47) Host(SCMSIS

5 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST User(SCPPSB38) Host(scppsb38)

6 PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed

7 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/ Jobno(59568) User(SCMSIS47) Host(SCMSIS

8 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/ Jobno(7800) User(SCMSIS47) Host(SCMSIS4



But when I insert more than one line below the MgrMsg line that I also need to capture, it does not grab that extra line.



The SAS System 10:27 Tuesda



Obs out_line a_line



1 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST *** FIRST LINE NEEDED *****

2 *** MAY OR MAY NOT BE ANOTHER ***** *** MAY OR MAY NOT BE ANOTHER *****

3 PPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed PPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed

4 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO

5 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/ Jobno(43228) User(SCMSIS47) Host(SCMSIS

6 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST User(SCPPSB38) Host(scppsb38)

7 PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed

8 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/ Jobno(59568) User(SCMSIS47) Host(SCMSIS

9 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/ Jobno(7800) User(SCMSIS47) Host(SCMSIS4







I think this code will give me a good start, just needs some tweaking. Thanks again !!!








Trusted Advisor
Posts: 1,574

Re: Info needed from input dataset wraps onto multiple lines.

Since we need to concatenate input lines to build the output line, a RETAIN statement is required, and I missed it.

Here is code that might work. (Note: the last line(s) may be dropped, because INFILE CARDS doesn't return an end-of-input flag.)

 

data want (keep=out_line);
      length out_line $200;  /* max length expected for a line on output */

      retain out_line;
      infile cards truncover;
      input @2 a_line   $100.  ;
      test = substr(a_line,1,7);
      if test = "MgrMsg:"  then do;
          out_line = a_line;
          input @2 a_line  $100. ;
          test =  substr(a_line,1,7);
          do while (test = ' ');
               out_line = catx(' ',trim(out_line), trim(a_line));
               input @2 a_line  $100. ;
               test =  substr(a_line,1,7);
          end;
          output;
          out_line = a_line;
      end;
      else do;
           out_line = a_line;
           output;
      end;

return;
/*----------*/
cards;
    ... enter here your input data ...

    ... add at the end a line with End Of Input indication ...
;
run;

Super User
Posts: 5,513

Re: Info needed from input dataset wraps onto multiple lines.

This is untested, but should be OK:

 

data want;
infile rawdata end=nomore;
length message $ 1000;
retain message;
input;
if _n_=1 then message = _infile_;
else do;
   if left(_infile_) in : ("MgrMsg:", "APPLMGR:") then do;
      output;
      message = _infile_;
   end;
   else message = catx(message, _infile_);
end;
if nomore then output;
run;

 

The END= option on the INFILE statement probably won't work with a CARDS statement ... INFILE would be more appropriate here.
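
One thing worth checking in the step above is the CATX call. CATX takes the delimiter as its first argument, so catx(message, _infile_) treats MESSAGE as the separator and, with only one item left to join, returns just the stripped input line. A minimal sketch of the difference follows; the variable names (continuation, wrong, right) and the sample text are illustrative assumptions, not part of the posted program.

```sas
data _null_;
   length message $ 200;
   message      = 'MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5';
   continuation = '         Jobno(59568) User(SCMSIS47) Host(SCMSIS47)';

   /* Two-argument form: MESSAGE is used as the delimiter, so only the
      stripped continuation text is returned and the message is lost */
   wrong = catx(message, continuation);

   /* Delimiter first, then the pieces to join */
   right = catx(' ', message, continuation);

   put wrong=;
   put right=;
run;
```

If this is the cause, replacing the call with catx(' ', message, _infile_) should keep the MgrMsg text and append each continuation line after a single blank.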

Contributor
Posts: 21

Re: Info needed from input dataset wraps onto multiple lines.

Posted in reply to Astounding

Thank you again. It is still not quite what I need. It is so close, but it is dropping a line. The difficulty with the input data is that there may be zero, one or two lines that follow and need to be included. Here is the code:

 

OPTION NOCENTER;                                        
 DATA want(keep=out_line);                              
  LENGTH out_line $200;                                 
  RETAIN out_line;                                      
  INFILE datalines TRUNCOVER;                           
  INPUT @2 a_line $100.;                                
test = SUBSTR(a_line,1,7);                              
  IF test = "MgrMsg:"  THEN DO;                         
     out_line = a_line;                                 
     INPUT @2 a_line $100.;                             
     test = SUBSTR(a_line,1,7);                         
     DO WHILE (test = ' ');                             
       out_line = CATX(' ',TRIM(out_line), TRIM(a_line));
       input @2 a_line  $100. ;                         
test = SUBSTR(a_line,1,7);                              
        END;                                            
        OUTPUT;                                         

  END;                                           
  ELSE DO;                                       
out_line = a_line;                               
  OUTPUT;                                        
  END;                                           
RETURN;                                          
                                                 
DATALINES;                                       
 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST
         *** FIRST LINE NEEDED *****             
         *** MAY OR MAY NOT BE ANOTHER *****     
APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed 
 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO
 MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO
 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/
         Jobno(43228)  User(SCMSIS47) Host(SCMSIS
 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST
         User(SCPPSB38) Host(scppsb38)           
APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed 
 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/

         Jobno(59568)  User(SCMSIS47) Host(SCMSIS
 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/
         Jobno(7800)  User(SCMSIS47) Host(SCMSIS4
;                                               
PROC PRINT;                                     

 

**** Here is the output

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST
*** MAY OR MAY NOT BE ANOTHER *****             
PPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed 
MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO
MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/
MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST
PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed 
MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/
MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/

Contributor
Posts: 21

Re: Info needed from input dataset wraps onto multiple lines.

Posted in reply to Astounding

Thank you for the time you spent to respond to my question. Unfortunately this is not working as I expected. My input data looks like this.

 

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST

         *FIRST LINE NEEDED*           

         *MAY OR MAY NOT BE ANOTHER*   

APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed

 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO

 MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO

 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/

         Jobno(43228)  User(SCMSIS47) Host(SCMSIS

 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST

         User(SCPPSB38) Host(scppsb38)          

APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed

 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/

         Jobno(59568)  User(SCMSIS47) Host(SCMSIS

 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/

         Jobno(7800)  User(SCMSIS47) Host(SCMSIS4

 

 

Output looks like this

*** MAY OR MAY NOT BE ANOTHER *****            

APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed

 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO

 MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO

Jobno(43228) User(SCMSIS47) Host(SCMSIS        

User(SCPPSB38) Host(scppsb38)                  

APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed

Jobno(59568) User(SCMSIS47) Host(SCMSIS        

Jobno(7800) User(SCMSIS47) Host(SCMSIS4        

 

 

Desired output would be:

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST*FIRST LINE NEEDED**MAY OR MAY NOT BE ANOTHER*    

APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed

 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO

 MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO

 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/Jobno(43228)  User(SCMSIS47) Host(SCMSIS

 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN STUser(SCPPSB38) Host(scppsb38)          

APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed

 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/Jobno(59568)  User(SCMSIS47) Host(SCMSIS

 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/Jobno(7800)  User(SCMSIS47) Host(SCMSIS4

 

Thank you again for your time.

Trusted Advisor
Posts: 1,574

Re: Info needed from input dataset wraps onto multiple lines.

I have a feeling that a small change will help:

On each INPUT statement, change $100. to $char100.

The $CHARw. informat preserves leading blanks, so TEST (the substring) is really taken from position 1.

The $w. informat left-justifies the value by trimming leading blanks, and thus gives a wrong TEST value.
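
The difference between the two informats can be seen with a small sketch that reads the same columns twice. The data set and variable names here (informat_demo, std_val, char_val and so on) are made up for illustration.

```sas
data informat_demo;
   infile datalines truncover;
   /* @1 moves the pointer back to column 1, so both variables
      are read from the same columns of the same line */
   input @1 std_val  $20.
         @1 char_val $char20.;

   std_test  = substr(std_val, 1, 7);    /* leading blanks already trimmed */
   char_test = substr(char_val, 1, 7);   /* leading blanks preserved */

   /* 1 when the first seven columns are blank, i.e. a continuation line */
   is_continuation = (char_test = ' ');
datalines;
MgrMsg: first
         Jobno(1)
;
run;

proc print data=informat_demo;
run;
```

On the second line, std_test should start with "Jobno" because $20. has shifted the text left, while char_test stays blank, which is what the blank-columns test relies on.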

Contributor
Posts: 21

Re: Info needed from input dataset wraps onto multiple lines.

So close. It is only working on the first observation.

 

MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST * FIRST LINE NEEDED **MAY OR MAY NOT BE ANOTHER *

MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO                                                              

MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/ Jobno(43228) User(SCMSIS47) Host(SCMSIS                      

        User(SCPPSB38) Host(scppsb38)                                                                          

PPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed                                                                

MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/ Jobno(59568) User(SCMSIS47) Host(SCMSIS                      

        Jobno(7800)  User(SCMSIS47) Host(SCMSIS4                                                               

Solution
‎09-19-2016 03:53 PM
Trusted Advisor
Posts: 1,574

Re: Info needed from input dataset wraps onto multiple lines.

I made some changes to the program and tested it. Here is the code and the test data.

Note: I had to add a line at the end of the datalines to mark the end of input (the line 'EOF').

Without this marker, the last out_line won't be written.

Better check it again:

 

DATA want(keep=out_line);
LENGTH out_line $200;
RETAIN out_line ' ' flag_out 0;
drop flag_out;

INFILE datalines TRUNCOVER;
INPUT  a_line $char100.;
test = SUBSTR(a_line,2,7);   /* columns 2-8 of the input line */
/* The end marker is checked on the whole line: TEST starts at column 2, */
/* so it would hold 'OF' and never equal 'EOF'.                          */
if test = "MgrMsg:"   THEN link process1; else
if test = " "         THEN link process2; else
if a_line ne 'EOF'    THEN link process3; else
if flag_out = 1       THEN output;   /* EOF marker: write the pending line */
RETURN;
*---------------------------*;
process1:   /* a new MgrMsg line: write the pending line, start a new one */
if flag_out = 1 then output;
out_line = a_line;
flag_out = 1;
RETURN;
*---------------------------*;
process2:   /* continuation line: append it to the pending line */
out_line = CATX(' ',TRIM(out_line), TRIM(a_line));
flag_out = 1;
RETURN;
*---------------------------*;
process3:   /* any other line: write the pending line, then this one */
if flag_out = 1 then output;
out_line = a_line;
OUTPUT;
flag_out = 0;
RETURN;

DATALINES;
 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST
                *** FIRST LINE NEEDED *****
                *** MAY OR MAY NOT BE ANOTHER *****
APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed
 MgrMsg: SCPPSB3L SCPPSB3L/SRVMON.2057/MAIN RESPO
 MgrMsg: SCPPSB3M SCPPSB3M/SRVMON.2057/MAIN RESPO
 MgrMsg: SCMSIS47 KPPSCDPP.QA029/DPPSC5@.1308642/
                Jobno(43228) User(SCMSIS47) Host(SCMSIS
 MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56912/MAIN ST
                User(SCPPSB38) Host(scppsb38)
APPLMGR: APPL DFUSCBM3.56912 JOB WFUSCBM3 Failed
 MgrMsg: SCMSIS47 KPPSCDPP.QA014/DPPSC5@.1308642/
               Jobno(59568) User(SCMSIS47) Host(SCMSIS
 MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/
              Jobno(7800) User(SCMSIS47) Host(SCMSIS4
EOF
;

run;
PROC PRINT;
run;
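
When the data comes from an external file rather than DATALINES, the EOF marker line may not be needed, because the END= option on INFILE can flag the last record. The following is only a sketch under that assumption: the fileref rawdata, the variable last_rec and the temporary test file are illustrative, and unlike the program above it treats any line with non-blank columns 2-8 as the start of a new output line.

```sas
/* Build a small temporary input file so the sketch is self-contained */
filename rawdata temp;

data _null_;
   file rawdata;
   put ' MgrMsg: SCPPSB38 WFUSCBM3/DFUSCBM3.56910/MAIN ST';
   put '          *** FIRST LINE NEEDED *****';
   put '          *** MAY OR MAY NOT BE ANOTHER *****';
   put 'APPLMGR: APPL DFUSCBM3.56910 JOB WFUSCBM3 Failed';
   put ' MgrMsg: SCMSIS47 KPPSCDPP.QA010/DPPSC5@.1308642/';
   put '          Jobno(7800) User(SCMSIS47) Host(SCMSIS4';
run;

data want(keep=out_line);
   length out_line $200;
   retain out_line ' ';
   infile rawdata truncover end=last_rec;   /* END= flags the final record */
   input a_line $char100.;

   if substr(a_line, 2, 7) = ' ' then
      /* columns 2-8 blank: continuation, append to the pending line */
      out_line = catx(' ', out_line, a_line);
   else do;
      if _n_ > 1 then output;   /* a new message starts: write the previous one */
      out_line = a_line;
   end;

   if last_rec then output;     /* flush the final pending line */
run;

proc print data=want;
run;
```

With the six test lines above, this should give three observations: the first MgrMsg with both continuation lines appended, the APPLMGR line, and the last MgrMsg with its Jobno line.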

Contributor
Posts: 21

Re: Info needed from input dataset wraps onto multiple lines.

This works perfectly! Thank you again for all the help.
☑ This topic is solved.
