Text Parsing

Hi,

I'm looking for help in figuring out the logic to parse a text string.

The variable contains a string with purchase type, dollar value, and conversions, for example:

ID PurchaseType
1 CPA ($2.50 x 20)
2 CPAA ($13.00 x 10)
3 CPC  ($0.10 x 100)
etc.

I'm looking to parse the string by purchase type and value, for example:

ID Type Value
1 CPA 2.50
2 CPAA 13.00
3 CPC 0.10
etc.

Thanks for your help.

KD


Accepted Solutions
Solution
‎11-08-2012 01:21 PM
Super Contributor
Posts: 1,636

Re: Text Parsing

Posted in reply to Danglytics

try:

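data have;
  id=1;
  PurchaseType='CPA ($0.50 x 20)';
run;

data want;
  set have;
  length type $8;
  /* first word of the string is the purchase type */
  type=scan(PurchaseType,1);
  /* take the text before 'x', then keep only digits and the decimal point */
  value=compress(scan(PurchaseType,1,'x'),'.','kd');
  keep id type value;
run;

proc print;run;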


All Replies
PROC Star
Posts: 7,468

Re: Text Parsing

Posted in reply to Danglytics

data have;
  informat PurchaseType $50.;
  input id 1 PurchaseType 3-50;
  cards;
1 CPA ($2.50 x 20)
2 CPAA ($13.00 x 10)
3 CPC  ($0.10 x 100)
;

data want;
  set have;
  format value 6.2;
  Type=scan(PurchaseType,1);
  value=input(scan(PurchaseType,2,"("),dollar6.);
run;
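
One thing to watch with the code above: dollar6. reads only the first six characters of the text after the "(", which is enough for the three sample rows but not for a longer amount. The sketch below is only an illustration, and the $125.75 value and the variable names narrow and wide are made up for it, not taken from the original data. It isolates the dollar token with a second SCAN and then reads it with a wider informat.

```sas
data _null_;
  length PurchaseType $50;
  /* hypothetical value, longer than anything in the sample data */
  PurchaseType = 'CPA ($125.75 x 20)';

  /* dollar6. sees only '$125.7', so the last digit is lost */
  narrow = input(scan(PurchaseType, 2, '('), dollar6.);

  /* cut out the '$125.75' token first, then read it with a wide informat */
  wide = input(scan(scan(PurchaseType, 2, '('), 1, ' '), dollar32.);

  put narrow= wide=;
run;
```

If the amounts are known never to exceed six characters including the dollar sign, the original dollar6. version is fine as it stands.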

Super Contributor
Posts: 1,636

Re: Text Parsing

Art,

It is unfair. You always come up with a better solution.

PROC Star
Posts: 7,468

Re: Text Parsing

Linlin:  Not true!  You have offered better solutions than me on a number of occasions.

Respected Advisor
Posts: 3,156

Re: Text Parsing

OK, here is a challenge to Art and Linlin:

Are you able to get the second cluster of numbers (20, 10, 100) without using PRX functions?

Haikuo

BTW, here is one using PRX:

data want;
  set have;
  retain _p;
  _p=prxparse("/\s\d+\)/");
  call prxsubstr(_p,PurchaseType,_st,_len);
  value=substr(PurchaseType,_st+1,_len-2);
  drop _:;
run;

PROC Star
Posts: 7,468

Re: Text Parsing

Too easy of a challenge!

data have;
  informat PurchaseType $50.;
  input id 1 PurchaseType 3-50;
  cards;
1 CPA ($2.50 x 20)
2 CPAA ($13.00 x 10)
3 CPC  ($0.10 x 100)
;

data want;
  set have;
  format value 6.2;
  Type=scan(PurchaseType,1);
  value=input(scan(PurchaseType,2,"("),dollar6.);
  amount=input(scan(PurchaseType,-1),8.);
run;

Respected Advisor
Posts: 3,156

Re: Text Parsing

Well, like they just said, you are the best!

Respected Advisor
Posts: 3,156

Re: Text Parsing

Posted in reply to Danglytics

SAS does come with a very powerful set of character functions. Here is one way to do it with a regular expression:

data have;
  informat PurchaseType $50.;
  input id PurchaseType 3-50;
  cards;
1 CPA ($2.50 x 20)
2 CPAA ($13.00 x 10)
3 CPC ($0.10 x 100)
;

data want;
  set have;
  retain _p;
  _p=prxparse("/\$\d+\.\d+ /");
  call prxsubstr(_p,PurchaseType,_st,_len);
  value=substr(PurchaseType,_st+1,_len-1);
  drop _:;
run;

proc print;run;

Haikuo

Super User
Posts: 10,023

Re: Text Parsing

SCAN() is also very powerful.

data have;
  informat PurchaseType $50.;
  input id PurchaseType 3-50;
  cards;
1 CPA ($2.50 x 20)
2 CPAA ($13.00 x 10)
3 CPC ($0.10 x 100)
;
run;
data want;
 set have;
 type=scan(PurchaseType,1, ,'ka');
 value=scan(PurchaseType,1,',.','kd');
run;

Ksharp
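
For anyone who wants all three pieces (type, dollar value, and count) in one pass, here is a minimal sketch that combines the ideas in this thread using PRX capture groups. It is not one of the posted solutions. The pattern, the _rx variable, and the Amount name are assumptions made for illustration, and it assumes every string follows the "TYPE ($value x count)" layout of the sample data.

```sas
data have;
  input id 1 PurchaseType $ 3-50;
  datalines;
1 CPA ($2.50 x 20)
2 CPAA ($13.00 x 10)
3 CPC  ($0.10 x 100)
;
run;

data want;
  set have;
  length Type $8;
  /* constant pattern, so SAS compiles it only once */
  /* group 1 = type, group 2 = dollar value, group 3 = count */
  _rx = prxparse('/^\s*(\w+)\s*\(\$([\d.]+)\s*x\s*(\d+)\)/');
  if prxmatch(_rx, PurchaseType) then do;
    Type   = prxposn(_rx, 1, PurchaseType);
    Value  = input(prxposn(_rx, 2, PurchaseType), 32.);
    Amount = input(prxposn(_rx, 3, PurchaseType), 32.);
  end;
  format Value 8.2;
  drop _rx;
run;

proc print data=want noobs; run;
```

Rows that do not match the pattern are left with missing Type, Value, and Amount, which makes malformed strings easy to spot. The SCAN-based answers above are shorter if the layout is known to be clean.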
