DATA Step, Macro, Functions and more

Merge mulitiple line observations to single line using record line terminater

Reply
New Contributor
Posts: 2

Merge mulitiple line observations to single line using record line terminater

Hi All,

 

I have a requirement where I need to merge multiples lines records into single. Each row length in 80 bytes.

The challenge is that that each record that is spread in multiple lines is dynamic in number of lines i.e.

1. a new record is identified by a line having '=============….'

2. After the above record, there are 2 lines which has same structure for every record

3. From 4th record, there can be multiple records with same structure and it will be dynamic (1 or 2 or 3 lines...)

So I need to ignore the 1st line indicating new record (i.e. one with '========...'), merge line 2 and line 3 for each line in the next lines...

 

e.g.

===============================================================================
UNIQUEID_RECORD00001 2018-06-05 01:00:00 USER001  FNAME1   MNAME2   LNAME1    
00001     PERSONAL_RECORD_00000000001         00000000000000000001 JOBPRFL1   
CITY0001 CODE1 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
CITY0002 CODE2 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
===============================================================================
UNIQUEID_RECORD00002 2018-06-05 01:00:00 USER002  FNAME1   MNAME2   LNAME1    
00002     PERSONAL_RECORD_00000000001         00000000000000000002 JOBPRFL2   
CITY0003 CODE3 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
CITY0004 CODE4 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
CITY0005 CODE5 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
CITY0006 CODE6 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
===============================================================================
UNIQUEID_RECORD00003 2018-06-05 01:00:00 USER001  FNAME1   MNAME2   LNAME1    
00003     PERSONAL_RECORD_00000000003         00000000000000000031 JOBPRFL3   
CITY0002 CODE2 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         

 

output should be 

 

UNIQUEID_RECORD00001 2018-06-05 01:00:00 USER001  FNAME1   MNAME2   LNAME1     00001     PERSONAL_RECORD_00000000001         00000000000000000001 JOBPRFL1    CITY0001 CODE1 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
UNIQUEID_RECORD00001 2018-06-05 01:00:00 USER001  FNAME1   MNAME2   LNAME1     00001     PERSONAL_RECORD_00000000001         00000000000000000001 JOBPRFL1    CITY0002 CODE2 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
UNIQUEID_RECORD00002 2018-06-05 01:00:00 USER002  FNAME1   MNAME2   LNAME1     00002     PERSONAL_RECORD_00000000001         00000000000000000002 JOBPRFL2    CITY0003 CODE3 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
UNIQUEID_RECORD00002 2018-06-05 01:00:00 USER002  FNAME1   MNAME2   LNAME1     00002     PERSONAL_RECORD_00000000001         00000000000000000002 JOBPRFL2    CITY0004 CODE4 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
UNIQUEID_RECORD00002 2018-06-05 01:00:00 USER002  FNAME1   MNAME2   LNAME1     00002     PERSONAL_RECORD_00000000001         00000000000000000002 JOBPRFL2    CITY0005 CODE5 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
UNIQUEID_RECORD00002 2018-06-05 01:00:00 USER002  FNAME1   MNAME2   LNAME1     00002     PERSONAL_RECORD_00000000001         00000000000000000002 JOBPRFL2    CITY0006 CODE6 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         
UNIQUEID_RECORD00003 2018-06-05 01:00:00 USER001  FNAME1   MNAME2   LNAME1     00003     PERSONAL_RECORD_00000000003         00000000000000000031 JOBPRFL3    CITY0002 CODE2 ADDRESS1 ADDRESS2          ADDRESS3           PINCODE1         

PROC Star
Posts: 266

Re: Merge mulitiple line observations to single line using record line terminater

Posted in reply to rinkum2007

Something like this should do it:

data want;
  length record_id $40 date time 8 user fname mname lname $20 recordno 8 person_id $40 x 8
         jobprofile $40 city code adr1 adr2 adr3 pincode $20;
  informat date yymmdd. time time.;
  format date yymmdd10. time time8.;
  infile cards firstobs=2 eof=done; /* firstobs=2 to drop the first separator line */
  input record_id date time user fname mname lname /
        recordno person_id x jobprofile
        / @
        ;
  do while(_infile_ ne: '=========');
    input city code adr1 adr2 adr3 pincode /@;
    output;
    end;
done: cards; =============================================================================== UNIQUEID_RECORD00001 2018-06-05 01:00:00 USER001 FNAME1 MNAME2 LNAME1 00001 PERSONAL_RECORD_00000000001 00000000000000000001 JOBPRFL1 CITY0001 CODE1 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 CITY0002 CODE2 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 =============================================================================== UNIQUEID_RECORD00002 2018-06-05 01:00:00 USER002 FNAME1 MNAME2 LNAME1 00002 PERSONAL_RECORD_00000000001 00000000000000000002 JOBPRFL2 CITY0003 CODE3 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 CITY0004 CODE4 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 CITY0005 CODE5 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 CITY0006 CODE6 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 =============================================================================== UNIQUEID_RECORD00003 2018-06-05 01:00:00 USER001 FNAME1 MNAME2 LNAME1 00003 PERSONAL_RECORD_00000000003 00000000000000000031 JOBPRFL3 CITY0002 CODE2 ADDRESS1 ADDRESS2 ADDRESS3 PINCODE1 ;run;

Of course, you will probably want to use an external file as your infile, and drop the CARDS section.

 

The EOF=DONE thing is a workaround in case there is no separator line after the last data line.

New Contributor
Posts: 2

Re: Merge mulitiple line observations to single line using record line terminater

Hi Lassen,

 

Sorry for the delay in coming back on this.

Thanks a lot for your help here and it is working fine as expected.

Only case it has not handled is that the last line is not picked in the output.

There will not be a separator at the EOF.

So if you can further help to handle this condition, it would be a complete solution.

 

Thanks,

Rinku

PROC Star
Posts: 266

Re: Merge mulitiple line observations to single line using record line terminater

Posted in reply to rinkum2007

Sorry, do you have any problems with the code I submitted? I put the EOF=DONE and the DONE label in to handle the case where there was no delimiter line in the end. If that gives an error, you will have to be more specific, maybe show a sample file and the log.

Ask a Question
Discussion stats
  • 3 replies
  • 153 views
  • 0 likes
  • 2 in conversation