DATA Step, Macro, Functions and more

Complicated File Layout

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 12
Accepted Solution

Complicated File Layout

Hi -

 

I'm relatively new to SAS EG and am trying to read in data with a file format similar to the attached.  Please note that I needed to encrypt the data as it can't be shown as-is.  Nonetheless, the format remains original.

 

I am trying to get the unique occurences in 'Column 0' to appear into the subsequent blank fields below them until the next unique occurence is read.  Afterwards, I would like to delete the line containing the unique occurence.  I also won't need the column headings.

 

I've tried a number of approaches but cannot make it work out.

 

Thank you for helping.

 

----------------------------------

 

The before looks like:

 

Column 0                              Column 1                     Column 2                                  Column 3                                          Column 4
2C1c5P9b/EZ/ocCU
                                             DIZS1/oZvywyb           06000\ccpb2015DiZs.osc          CcPB Doc LibrZry 2015 DIZS            4/6/2016 13:06
                                             DIZS2/oZvywyb           06000\ccpb2015DiZs.osc          CcPB Doc LibrZry 2015 DIZS           11/24/2015 2:00
                                             DIZS3/oZvywyb           06000\ccpb2015DiZs.osc          CcPB Doc LibrZry 2015 DIZS           12/25/2015 2:00
ZZmioZ UsmZo/EZ/ocCU
                                             LoZdmio/EZ/ocCU        hylp\hylp85_Zdmio.osc             XSY mdl Domioo                             6/28/2016 13:52
ZZryo oyroyy/MPM/ocCU
                                             LoZpps08/EZ/ocCU      16000\oOCRqst.osc                  oOC Ryquysts                                6/5/2016 2:33

 

-------------------------------------

The after needs to look like:

 


2C1c5P9b/EZ/ocCU                DIZS1/oZvywyb           06000\ccpb2015DiZs.osc          CcPB Doc LibrZry 2015 DIZS            4/6/2016 13:06
2C1c5P9b/EZ/ocCU                DIZS2/oZvywyb           06000\ccpb2015DiZs.osc          CcPB Doc LibrZry 2015 DIZS           11/24/2015 2:00
2C1c5P9b/EZ/ocCU                DIZS3/oZvywyb           06000\ccpb2015DiZs.osc          CcPB Doc LibrZry 2015 DIZS           12/25/2015 2:00
ZZmioZ UsmZo/EZ/ocCU        LoZdmio/EZ/ocCU        hylp\hylp85_Zdmio.osc             XSY mdl Domioo                             6/28/2016 13:52
ZZryo oyroyy/MPM/ocCU        LoZpps08/EZ/ocCU      16000\oOCRqst.osc                  oOC Ryquysts                                6/5/2016 2:33


Accepted Solutions
Solution
‎07-08-2016 03:13 PM
Super User
Posts: 5,076

Re: Complicated File Layout

Here's one way:

 

data want;

set have;

if column_0 > ' ' then column_00 = column_0;

else output;

retain column_00;

drop column_0;

rename column_00=column_0;

run;

 

If you don't already have a SAS data set, the same idea could be applied to a DATA step that reads in the raw data.

View solution in original post


All Replies
Occasional Contributor
Posts: 12

Re: Complicated File Layout

Sorry Column 4 wrapped around
Solution
‎07-08-2016 03:13 PM
Super User
Posts: 5,076

Re: Complicated File Layout

Here's one way:

 

data want;

set have;

if column_0 > ' ' then column_00 = column_0;

else output;

retain column_00;

drop column_0;

rename column_00=column_0;

run;

 

If you don't already have a SAS data set, the same idea could be applied to a DATA step that reads in the raw data.

Occasional Contributor
Posts: 12

Re: Complicated File Layout

Thanks... it kinda works but I think there is something wrong with my file and will need to resolve before I can continue testing your suggestion.

 

Regards.

Super User
Posts: 10,474

Re: Complicated File Layout

[ Edited ]

Take a look at this:

data want;
   infile "test.txt" lrecl=300 pad firstobs=2 ;
   length column0 $ 34 column1 $ 17 column2 $ 50  column3 $ 50  ;
   informat date mmddyy10.;
   informat time time5.;
   format date mmddyy10.;
   format time time5.;
   retain column0 '';
   input @;
   if anyspace(substr(_infile_,1,1)) =0 then do;
      input  column0 $ 1-33;
   end;
   else do;
      input
         column1 $ 35-56
         column2 $ 57-106
         column3 $ 107-156
         
         date
         time
      ;
      output;
   end;
run

Some issues: varying line length, a simple input statement will likely generate lots of lost card from the column0 lines (fix: PAD on infile)

 

Second is the date values don't always start in the same column. This might be due to your "encryption" which I suspect was a couple of editor search and replaces.

 

You do have one line in the source file, line 12145 that looks like it should have column0 data but it is missing. So the above code treats it as the column1 data line type, retaining the previous column0 values and report errors for date on that line(there isn't one on

that line) and therefore has the wrong column0 value from there to the end of the data. If you put a column0 value on line 12145 starting in column1 you should be good to go.

Occasional Contributor
Posts: 12

Re: Complicated File Layout

Hello and thank you for your further expalnation as I think what you are mentioning is what I am seeing too.  I think it has something to do with taking a Lotus Notes category view, then saving to excel, then saving as a text file.  I will need some time to tinker and try to further understand this scenario and the use of the suggestions here.

 

Regards.

Occasional Contributor
Posts: 12

Re: Complicated File Layout

[ Edited ]

It looks like the first recommendation from Astounding works.  I did need to use Access to create a new file to get the data into "fixed columns".

 

I will continue to test both suggestions for more practice and then get back here with final results.

 

Thank you.

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 505 views
  • 0 likes
  • 3 in conversation