BookmarkSubscribeRSS Feed
Mgarret
Obsidian | Level 7

Hi,

I’m trying to extract a price from a string and I’m having some with my code.

This is what I start with:

PROMOTION_TITLE
Aug Earn $25 v2 external
Aug Earns $125 v2 external
Aug Earns $50 v2 external

This is what I need:

Price
$25
$125
$50

This is my program so far. I can't seem to get the syntax right.

data test (keep=PROMOTION_TITLE price) ;

set  a.promo;

  price =(input(prxchange('s/(\w+)_(\d+)_(\w+)/$2/',-1,PROMOTION_TITLE),8.))/100;

  format price dollar8.2;

run;

Any assistance will be greatly appreciated.

Thanks! 

16 REPLIES 16
sbb
Lapis Lazuli | Level 10 sbb
Lapis Lazuli | Level 10

Consider other SAS functions...

* DATA step with instream data-rows for illustration. ;

data _null_;

input ;

format price dollar8.2 ;

* parse input record and extract dollar-amount, convert to numeric. ;

price = input(scan(substr(_infile_,find(_infile_,'$','t')+1),1,' '),8.);

putlog _all_;

datalines;

Aug Earn $25 v2 external 

Aug Earns $125 v2 external 

Aug Earns $50 v2 external 

run;

Scott Barry

SBBWorks, Inc.

data_null__
Jade | Level 19

This looks about right.  Instead of removing everything around the dollar amount use a capture buffer to retrieve it.

data test;
   input PROMOTION_TITLE $80.;
  
retain rx;
   if _n_ eq 1 then rx = prxparse('/(\$\d+\.?\d?\d?)/');

  
if prxmatch(rx,promotion_title) then price = prxposn(rx,1,promotion_title);
   cards;
Aug Earn $25 v2 external
Aug Earns $125.75x v2 external
Aug Earns $50 v2 external
;;;;
   run;
proc print;
  
run;
Mgarret
Obsidian | Level 7

This is good but is it possible to change the var to numeric?

slchen
Lapis Lazuli | Level 10

data want;

input  ;

price =prxchange('s/.*(\$\d+).*/\1/',-1,_infile_) ;

datalines;

Aug Earn $25 v2 external

Aug Earns $125 v2 external

Aug Earns $50 v2 external

run;

PGStats
Opal | Level 21

My proposition is similar to DN's :

data have;

input pt &:$32.;

datalines;

Aug Earn $25 v2 external

Aug Earns $125 v2 external

Aug Earns $50 v2 external

;

data want;

if not prxId then prxId + prxparse("/\$\s?\d+(\.\d{2})?/");

set have;

call prxsubstr(prxId, pt, pos, len);

if pos > 0 then price = input (substr(pt, pos, len), dollar10.);

format price dollar11.2;

drop prxId pos len;

run;

proc print data=want noobs; run;

PG

PG
Peter_C
Rhodochrosite | Level 12

There is a feature in proc format that enableS the definition of the prx in a INVALUE range

The feature has been there for a few years. Rick Langston has been demonstrating it at SAS GF (along with the concept of using a function in  "label" area.)

Once compiled the syntax to pull that "price" could just be the usual INPUT statement with the very unusual user INFORMAT.

Peter_C
Rhodochrosite | Level 12

Patrick

you're welcome. And thanks for reporting the link.

Of course, it doesn't make prx any simpler   just their implementation 

regards

Peter

PGStats
Opal | Level 21

Thanks Patrick and Peter, very good to know. I wish ALL the features of my licenced SAS software were described in the main doc.I find proc format particularly poor in that respect.

PG

PG
Patrick
Opal | Level 21

The question is now if these are unsupported features or just something missing in the documentation. I've asked SAS TechSupport this question and will post the answer here.

And here how this could work using the REGEXPE option:

proc format;

  invalue Price (default=20)

    's/.*\$(\d+\.?\d*).*/\1/' (REGEXPE) = [best32.]

    other=0

    ;

run;

data have;

  input pt &:Price.;

  datalines;

Aug Earn $25 v2 external

Aug Earns $125.45 v2 external

Aug Earns $50 v2 external

Aug Earns v2 external

;

run;

Patrick
Opal | Level 21

The answer I've got from SAS TechSupport: REGEXP and REGEXPE are supported features. The SAS documentation will get updated and they will be fully documented in the future.

PGStats
Opal | Level 21

Thank you Patrick! According to Rick Langston's paper, the feature was introduced with SAS 9.3. Makes me wonder how many goodies I am missing. Maybe the slow disclosure rate is just another step in the testing process. Smiley Happy

PG

PG
FriedEgg
SAS Employee

data have;

input pt &:$32.;

datalines;

Aug Earn $25 v2 external

Aug Earns $125 v2 external

Aug Earns $50 v2 external

Aug Earns $125.01 v2 external

Aug Earns $50x v2 external

;

You can parse using double scan (it will be faster than prx)

data want;

set have;

price=input(scan(scan(pt,2,'$'),1,,'as'),best.);

format price dollar8.2;

Alternatively, using prx function

data want2;

re=prxparse('#(?<=\$)(\d+(\.\d{2})?)#');

format price dollar8.2;

do until(done);

set have end=done;

if prxmatch(re,pt) then do;

price=input(prxposn(re,1,pt),best.);

output;

end;

end;

run;

Using a regexpe informat

proc format;

invalue ReDollar 's#^.*(?<=\$)(\d+(\.\d{2})?).*$#\1#' (regexpe) = [14.];

run;

data want3;

set have;

price=input(pt,ReDollar.);

format price dollar8.2;

run;

^.*(?<=\$)(\d+(\.\d{2})?).*$

Options: ^ and $ match at line breaks

Assert position at the beginning of a line (at beginning of the string or after a line break character) «^»

Match any single character that is not a line break character «.*»

   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»

Assert that the regex below can be matched, with the match ending at this position (positive lookbehind) «(?<=\$)»

   Match the character “$” literally «\$»

Match the regular expression below and capture its match into backreference number 1 «(\d+(\.\d{2})?)»

   Match a single digit 0..9 «\d+»

      Between one and unlimited times, as many times as possible, giving back as needed (greedy) «+»

   Match the regular expression below and capture its match into backreference number 2 «(\.\d{2})?»

      Between zero and one times, as many times as possible, giving back as needed (greedy) «?»

      Match the character “.” literally «\.»

      Match a single digit 0..9 «\d{2}»

         Exactly 2 times «{2}»

Match any single character that is not a line break character «.*»

   Between zero and unlimited times, as many times as possible, giving back as needed (greedy) «*»

Assert position at the end of a line (at the end of the string or before a line break character) «$»

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 16 replies
  • 3226 views
  • 4 likes
  • 8 in conversation