Help using Base SAS procedures

Scan messy data for values

Reply
Contributor
Posts: 30

Scan messy data for values

Thank you anyone for helping with this.  I’m trying to pick up
the values left of ‘T’ and right of ‘$’.   Rows without T are
eliminated.   Any help would be appreciated.  I can’t get scan(
to work here for lack of experience. I've tried the following:

data test;

infile "j:\test.txt" truncover;

input i $char100. ;

i=upcase(i);

i = compress(i,',');

Tons=input(scan(i,10," ",'b'),6.); /*T and $ seem to be the only consistency*/

Price=input(scan(i,8," ",'b'),7.); /*space as a delimiter is inconsistent */

format
Tons
6.2 price 7.2;

Here is a sample of the text file:

Medium square, soy bean straw, on ground 57B $ 17.00 ] ] ] ] ] ] ]

5x6 round, soy bean straw, net wrap, on ground 23B $ 29.00 ] ] ] ] ] ] ]

Medium square, alfalfa, 2nd cutting 21.75T $ 180.00 87.39 12.61 18.24 30.5 37.46 162 137

3x4 square, alfalfa, 3rd cut 26.64T $ 260.00 85.42 14.58 22.2 29.1 34.21 180 166

Medium square, alfalfa, 3rd cutting, 48B 25.28T $ 90.00 78.71 21.29 19.14 39.22 45.67 119 95

Medium square, alfalfa/fescue mix, 4th cutting, 36B 13.26T $ 155.00 84.03 15.97 19.41 36.84 47.07 119 124

5x6 round, alfalfa, 3rd cut, net wrap 24.74T $ 215.00 81.77 18.23 21.37 29.85 37.38 163 151

Large round, grass hay, 1st crop 7.06T $ 85.00 81.83 18.17 12.79 45.22 60.64 82 65

4x5 round, grass mix hay, stored inside, 42B 16.21T $ 130.00 86.42 13.58 6.54 39.13 55.52 98 95

Medium square, grass mix, 24B 8.18T $ 125.00 83.72 16.28 13.59 39.61 51.39 105 97

3x4 medium square, wheat straw, roto, stored inside 25.85 $ 112.00 ] ] ] ] ] ] ]

Medium square, oat straw 72B $ 30.00 ] ] ] ] ] ] ]

The file is attached. Thanks!

Attachment
Super User
Posts: 9,682

Re: Scan messy data for values

If you want the rows with T but without $ ?

data x;
infile 'c:\temp\test.txt' ;
input;
if findc(_infile_,'T') and findc(_infile_,'$') then do;
 var1=scan(_infile_,1,'T');var2=scan(_infile_,-1,'$'); output;
end;
run;

Xia Keshan

Respected Advisor
Posts: 3,896

Re: Scan messy data for values

Below code works for the data you've posted.

data sample(drop=_Smiley Happy;

  infile 'c:\temp\test.txt' lrecl=1000 dlm=' T';

  input @;

  retain _prxid;

  if _n_=1 then

    _prxid=prxparse('/(\d+\.\d+)T \$ (\d+\.\d+)/o');

  if prxmatch(_prxid,_infile_) then

    do;

      call prxposn(_prxid, 1, _pos);

      input @_pos Quantity ?? :best32. @;

      call prxposn(_prxid, 2, _pos);

      input @_pos Price ?? :best32.;

      output;

    end;

run;

Contributor
Posts: 30

Re: Scan messy data for values

Patric,  this works fantastic for picking up Ton and Price value .  I did not foresee a pearl response.  I had been using a 'class' statement such as  indexw(i,"ALFALFA")   or GRASS OR STRAW etc.  to create a classification by price level variable called 'TYPE'.

for lack of experience in Pearl prxparse(...  i don't know how to augment your code to include the class ification element to create "TYPE" variable.

can you give a clue?  Thanks you!

Respected Advisor
Posts: 3,896

Re: Scan messy data for values

I'd probably go for a simple scan() for this.

data sample(drop=_Smiley Happy;

  infile 'c:\temp\test.txt' lrecl=1000 dlm=' T';

  input @;

  length Type $40;

  retain _prxid;

  if _n_=1 then

    _prxid=prxparse('/(\d+\.\d+)T \$ (\d+\.\d+)/o');

  if prxmatch(_prxid,_infile_) then

    do;

      Type=scan(_infile_,2,',');

      call prxposn(_prxid, 1, _pos);

      input @_pos Quantity ?? :best32. @;

       call prxposn(_prxid, 2, _pos);

      input @_pos Price ?? :best32.;

       output;

    end;

run;

Super User
Super User
Posts: 7,407

Re: Scan messy data for values

As an alternative:

data have (keep=resultt result_dollar);

  length resultt result_dollar $200.;

  array var{20} $200.;

  infile "S:\Temp\Rob\test.txt" dlm=' ' dsd missover lrecl=32767;

  input var1-var20 $;

  do i=1 to 20;

    if index(var{i},"T")>1 then resultt=var{i};

    if strip(var{i})="$" then result_dollar=var{i+1};

  end;

run;

Ask a Question
Discussion stats
  • 5 replies
  • 288 views
  • 0 likes
  • 4 in conversation