Line pointers with raw data with inconsistent lines and embedded headers

The raw data I want to read not only runs across several lines per record, it also doesn't have the same number of lines for every record. To complicate things further, the page headers reappear in the middle of the file after every page and throw off the line pointers. Here is my code so far in SAS Studio:

libname mylib '/folders/myfolders/';

data myfile;
  length itm $ 4 itemnum 5 itemdesc $ 40 inac $ 2 assetcl $ 4 invcl 3 dspunit $ 2 ordunit $ 2 convr 4 loc $ 4 vndnum 4 manufnum $ 20 vendinfo $ 80;
  infile '/folders/myfolders/ItemstrSM.txt' missover;
  input  #1 itm $ itemnum itemdesc $ &
         #2 inac $ assetcl $ invcl dspunit $ ordunit $ convr loc $ vndnum manufnum
         #3 vendinfo $ & ;
run;

proc print data=myfile noobs;
run;

SAMPLE OF RAW DATA.....

Time: 1:47pm                                      Item Master Report For 06/06/2013                                 Report: GMRIMMSB

Item Type: Nonstock

Item  Asset  Inven  Dsp   ---Order---  ---Primary----                           Substute     Contract     Hazd   Count    

Stat  Class  Class  Unt   Unit   Conv   Loc    Vendor        Manufacturer Nbr   Item Nbr      Number      Flag   Cycle 

------------------------------------------------------------------------------------------------------------------------------------

  ITEM     20049 TEST PNEUMONIA S LATEX ZL22 (30859001)

A    0173    6      PK     PK       1   NSL   2431       R30859001                                                    

     Vendor  1:      2431 FISHER SCIENTIFIC COMPANY                 2:      2658 REMEL

             3:       536 ABBOTT LABS - DIAGNOSTIC DIVISION         4:      1404 MUREX DIAGNOSTICS INC.

  ITEM     20051 ANTIGEN BACTER. WELLCOGEN ZL26 B1901-51

A    0173    6      PK     PK       1   NSL   2431       30859602                                                     

     Vendor  1:      2431 FISHER SCIENTIFIC COMPANY                 2:      3804 CARDINAL HEALTH-ALLEGIANCE

             3:      2658 REMEL                                     4:       536 ABBOTT LABS - DIAGNOSTIC DIVISION

             5:      1404 MUREX DIAGNOSTICS INC.

  ITEM     20053 FILM DUPLICATING 10X12

I    0173    14     BX     BX       1   NSX   1335       112010                                                       

     Vendor  1:      1335 AGFA CORPORATION

  ITEM     20055 FILM HTU 10 X 12

I    0173    14     BX     BX       1   NSX   1335       094010                                                       

     Vendor  1:      1335 AGFA CORPORATION

  ITEM     20056 FILM HTU 8 X 10

I    0173    14     BX     BX       1   NSX   1335       094008                                                       

     Vendor  1:      1335 AGFA CORPORATION

  ITEM     20057 SOL AXSYM FLUIDIES CHECK (09A3401)

A    0173    119    BX     BX       1   NSL   536                                                                     

     Vendor  1:       536 ABBOTT LABS - DIAGNOSTIC DIVISION

  ITEM     20058 FILM DUPLICATING 8 X 10

I    0173    14     BX     BX       1   NSX   1335       112008                                                       

     Vendor  1:      1335 AGFA CORPORATION

  ITEM     20059 FILM HTU 14 X 17

I    0173    14     BX     BX       1   NSX   1335       094014                                                       

     Vendor  1:      1335 AGFA CORPORATION

Item  Asset  Inven  Dsp   ---Order---  ---Primary----                           Substute     Contract     Hazd   Count    

Stat  Class  Class  Unt   Unit   Conv   Loc    Vendor        Manufacturer Nbr   Item Nbr      Number      Flag   Cycle 

------------------------------------------------------------------------------------------------------------------------------------

  ITEM     20060 FILM HTU 30 X 35

I    0173    14     BX     BX       1   NSX   1335       094030                                                       

     Vendor  1:      1335 AGFA CORPORATION

  ITEM     20061 FILM HTU 14 X 14

I    0173    14     BX     BX       1   NSX   1335       094001                                                       

     Vendor  1:      1335 AGFA CORPORATION

Re: Line pointers with raw data with inconsistent lines and embedded headers

Welcome to the world of fun coding. If a point-and-click interface ever gets to the point where it can generate code for situations like this (at least without access to the code that wrote the data), its developer can say they really have something.

One general approach is to use the SAS automatic variable _infile_, which holds the entire current "line" of input data.

You can search this variable for content of interest.
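
As a minimal sketch of that idea, the step below loads each line with a null INPUT statement and classifies it by looking only at _infile_. It assumes the file path from the question and the layout of the posted sample; the data set name line_types, the variables line_type and text, and the width of 132 are made up for illustration.

```sas
data line_types;
  infile '/folders/myfolders/ItemstrSM.txt' truncover;
  length line_type $ 6 text $ 132;   /* 132 is an assumed maximum line width */

  input;                             /* no variables listed: the line is only loaded into _infile_ */
  text = _infile_;                   /* keep a copy, because _infile_ itself is not written out */

  /* SCAN with a blank delimiter returns the first word; the comparison is case-sensitive */
  if scan(_infile_, 1, ' ') = 'ITEM' then line_type = 'item';
  /* vendor rows in the sample start with "Vendor 1:" or with a bare "3:" / "5:" */
  else if prxmatch('/^\s*(Vendor\s+)?\d+:/', _infile_) then line_type = 'vendor';
  /* everything else: page headers, rule lines, and the status line under each ITEM line */
  else line_type = 'other';
run;

proc freq data=line_types;
  tables line_type;
run;
```

Printing a few rows of line_types next to the raw file is a quick way to check that the tests pick out the lines you expect before writing the real INPUT logic.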

A brief example:

do until (scan(_infile_,1) = "ITEM");
     input;   /* read and discard lines until one starts with ITEM */
end;

/* _infile_ still holds the ITEM line: this is where the actual reading would occur */

OUTPUT; /* explicit OUTPUT after successful reading so that is the only output in the dataset */

This advances through the file until it finds a line that starts with ITEM; the header rows that start with Item are ignored because the comparison is case-sensitive.

The meat of your program would be reading that item information. After reading the two lines of data, you could then test for the presence of a colon to find your vendors. The reading block should also test for a marker such as 3: or 5: at the start of a line (going by your sample, and I'm not sure how many you might have), which would indicate another row of vendors.
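
Putting those pieces together, here is one possible sketch rather than a definitive solution. It reuses the variable names and file path from the question and makes several assumptions that should be checked against the real file: the status line immediately follows its ITEM line, item descriptions never contain two consecutive blanks, nothing appears to the right of the manufacturer number, and 400 characters is enough for the combined vendor text. The flag have_item and the data set name items are hypothetical.

```sas
data items;
  infile '/folders/myfolders/ItemstrSM.txt' truncover end=last;
  length itm $ 4 itemnum 5 itemdesc $ 40 inac $ 2 assetcl $ 4 invcl 3
         dspunit $ 2 ordunit $ 2 convr 4 loc $ 4 vndnum 4 manufnum $ 20
         vendinfo $ 400;               /* 400 is a guess at the longest vendor list */
  /* keep the item values across iterations while its vendor rows are collected */
  retain itm itemnum itemdesc inac assetcl invcl dspunit ordunit convr loc
         vndnum manufnum vendinfo;
  retain have_item 0;                  /* hypothetical flag: 1 once an item has been read */
  drop itm have_item;

  input @;                             /* load the line and hold it so it can be re-read */

  if scan(_infile_, 1, ' ') = 'ITEM' then do;
    if have_item then output;          /* the previous item is now complete */
    vendinfo = ' ';
    /* re-read the held ITEM line, then use / to move on to the status line */
    input itm itemnum itemdesc & $40.
        / inac assetcl invcl dspunit ordunit convr loc vndnum manufnum;
    have_item = 1;
  end;
  /* vendor rows start with "Vendor 1:" or with a bare number and colon */
  else if have_item and prxmatch('/^\s*(Vendor\s+)?\d+:/', _infile_) then
    vendinfo = catx(' ', vendinfo, compbl(_infile_));
  /* any other line (page header, rule, blank) is skipped */

  if last and have_item then output;   /* write the final item at end of file */
run;

proc print data=items noobs;
run;
```

Because each line is classified on its own, a page header that lands between two items, or between two vendor rows, is simply skipped. If the columns to the right of the manufacturer number are ever filled in, column or formatted input for the status line would be safer than list input.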

Have fun.
