BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Gladis6680
Obsidian | Level 7

Hi there,

I have a character variable called Description

HAVEWANT
Sauce Alfredo #2 4/2 KgSauce Alfredo
Sauce Alfredo 2/3.7 LtSauce Alfredo
Assist 1Al1 1/18.9 LTAssist 1Al1
Assouplissant Solid Navisoft Bp S#Ncf# 2/6LbAssouplissant Solid Navisoft Bp S#Ncf#
100% Whole Wheat 1/675 Gr100% Whole Wheat
15X20X1/2 White Cut/Board 6/1 Ea15X20X1/2 White Cut/Board
16Oz White Plastic Funnel 1/1 Ea16Oz White Plastic Funnel
A&W Diet Root Beer 12/355 MlA&W Diet Root Beer
A&W Rootbeer Fridge Pack 12/355 MlA&W Rootbeer Fridge Pack
Almond Slivered Blanched (20067261010715) 2/1.50 KgAlmond Slivered Blanched (20067261010715)
Almond Slivered Blanched 1/3 KgAlmond Slivered Blanched
Almond Whole Blanched (20067261010418) 2/1.50 KgAlmond Whole Blanched (20067261010418)
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 KgBacon 14/16 Ct Center Cut 8.75 In Sliced
Bacon 16/18 Ct Applewood Smoked 1/5 KgBacon 16/18 Ct Applewood Smoked
Bacon 16/18 Ct Center Cut 1/5 KgBacon 16/18 Ct Center Cut
Bag Garbage 35X50 Heavy Duty Natural 1/200 CtBag Garbage 35X50 Heavy Duty Natural
Bag Garbage 35X50 Regular Black 1/100 CtBag Garbage 35X50 Regular Black
Bar Granola Fruit & Nut Trail Mix (000065633408811) 8/12/35 GrBar Granola Fruit & Nut Trail Mix (000065633408811)
Bar Granola Oats N' Honey Crunchy 6/12/46 GrBar Granola Oats N' Honey Crunchy
Spoon Set Measure SS 6/1 EaSpoon Set Measure SS
Spoon Soup Celine 18/0 Round 7 - 17.8Cm (502513) 12/1 EaSpoon Soup Celine 18/0 Round 7 - 17.8Cm (502513)
Spoon Soup Ecopro Poly 1/1000 Ea (10057937706427)Spoon Soup Ecopro Poly (10057937706427)
Spoon Soup Medium Weight Polypro Dense (00620868514701) 1/1000 CtSpoon Soup Medium Weight Polypro Dense (00620868514701)
Spoon Wooden 18 In W/Long Handle 1/1 EaSpoon Wooden 18 In W/Long Handle
Spoon Wooden Heavy Duty 16 In - 40.6 Cm 1/1 Ea (575386)Spoon Wooden Heavy Duty 16 In - 40.6 Cm (575386)
Spread Apple Raspberry 1/14 KgSpread Apple Raspberry

 

I would like to take the Quantity out for eg (Ea, Ct, Kg, Kg), some of the Ea's are in between the description.

I would appreciate any help.

Thanks

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

The following could give a start.

 

 

data have;
infile cards dsd truncover;
input x : $80.;
pid=prxparse('/[\.\/\d]+\s*(Ea|Kg|Ct|Gr|Lt|Lb|Ml)\b/i');
call prxsubstr(pid,x,p,l);
if p then want=substr(x,p,l);
drop pid p l;
cards;
Sauce Alfredo #2 4/2 Kg	 4/2 Kg
Sauce Alfredo 2/3.7 Lt	2/3.7 Lt
Assist 1Al1 1/18.9 LT	1/18.9 LT
Assouplissant Solid Navisoft Bp S#Ncf# 2/6Lb	2/6Lb
100% Whole Wheat 1/675 Gr	1/675 Gr
15X20X1/2 White Cut/Board 6/1 Ea	6/1 Ea
16Oz White Plastic Funnel 1/1 Ea	1/1 Ea
A&W Diet Root Beer 12/355 Ml	12/355 Ml
A&W Rootbeer Fridge Pack 12/355 Ml	12/355 Ml
Almond Slivered Blanched (20067261010715) 2/1.50 Kg	2/1.50 Kg
Almond Slivered Blanched 1/3 Kg	1/3 Kg
Almond Whole Blanched (20067261010418) 2/1.50 Kg	2/1.50 Kg
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 Kg	1/5 Kg
Bacon 16/18 Ct Applewood Smoked 1/5 Kg	1/5 Kg
Bacon 16/18 Ct Center Cut 1/5 Kg	1/5 Kg
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct	1/200 Ct
Bag Garbage 35X50 Regular Black 1/100 Ct	1/100 Ct
Bar Granola Fruit & Nut Trail Mix (000065633408811) 8/12/35 Gr	8/12/35 Gr
Bar Granola Oats N' Honey Crunchy 6/12/46 Gr	6/12/46 Gr
Spoon Set Measure SS 6/1 Ea	6/1 Ea
Spoon Soup Celine 18/0 Round 7 - 17.8Cm (502513) 12/1 Ea	12/1 Ea
Spoon Soup Ecopro Poly 1/1000 Ea (10057937706427)	1/1000 Ea
Spoon Soup Medium Weight Polypro Dense (00620868514701) 1/1000 Ct	1/1000 Ct
Spoon Wooden 18 In W/Long Handle 1/1 Ea	1/1 Ea
Spoon Wooden Heavy Duty 16 In - 40.6 Cm 1/1 Ea (575386)	 1/1 Ea
Spread Apple Raspberry 1/14 Kg
;
run;

View solution in original post

20 REPLIES 20
Reeza
Super User
Data cleaning. Welcome to the world of fun with functions.

Functions that will be helpful here:

COMPRESS()
TRANSLATE
TRANWRD
SCAN()
COUNTW()
FIND/FINDW
INDEX/INDEXW
and
Regular Expressions.

TRANWRD will replace words, translate will replace strings.
SCAN() will separate words
COUNTW() will count the number of words
Gladis6680
Obsidian | Level 7

Is it possible to give me an example with some code 

Thanks

ballardw
Super User

If this were my project before spending a lot time with what is going to be messy code I would go to the data source and see if there is a way to provide the information with the units and quantity as a separate field from the product or whatever.

 

It may be that whoever supplied this to you actually combined things from multiple fields into a single field thinking that was what was desired.

 

If you have an inventory data base floating around talk to the users or DB admins and see if there isn't a description separate from the packing units. It is very likely as many times you have "case" as well as "box" or "package" units which are handled in the inventory program to help when someone needs xxx of something.

Gladis6680
Obsidian | Level 7

Thank you for your advise, I have reached out to the source.

Ksharp
Super User

The following could give you a start.

 

data have;
input have $80. ;
want=prxchange('s/\S+\s*(Ea|Ct|Kg|Gr|Lt|Lb|Ml)\b//i',-1,have);
cards;
Sauce Alfredo #2 4/2 Kg	
Sauce Alfredo 2/3.7 Lt
Assist 1Al1 1/18.9 LT	
Assouplissant Solid Navisoft Bp S#Ncf# 2/6Lb
100% Whole Wheat 1/675 Gr	
15X20X1/2 White Cut/Board 6/1 Ea	
16Oz White Plastic Funnel 1/1 Ea	
A&W Diet Root Beer 12/355 Ml	
A&W Rootbeer Fridge Pack 12/355 Ml	
Almond Slivered Blanched (20067261010715) 2/1.50 Kg	
Almond Slivered Blanched 1/3 Kg	
Almond Whole Blanched (20067261010418) 2/1.50 Kg	
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 Kg
Bacon 16/18 Ct Applewood Smoked 1/5 Kg	
Bacon 16/18 Ct Center Cut 1/5 Kg	
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct
;
Gladis6680
Obsidian | Level 7

Thank you so much !!! this will help me get started for sure

Gladis6680
Obsidian | Level 7

Hi,

Also is it possible to put the quantity (what was deleted in the prxchange command) into another field?

For example:

HAVEQuantity
Sauce Alfredo #2 4/2 Kg4/2 Kg
Sauce Alfredo 2/3.7 Lt2/3.7 Lt
Assist 1Al1 1/18.9 LT1/18.9 LT
Assouplissant Solid Navisoft Bp S#Ncf# 2/6Lb2/6Lb
100% Whole Wheat 1/675 Gr1/675 Gr
15X20X1/2 White Cut/Board 6/1 Ea6/1 Ea
16Oz White Plastic Funnel 1/1 Ea1/1 Ea
A&W Diet Root Beer 12/355 Ml12/355 Ml
A&W Rootbeer Fridge Pack 12/355 Ml12/355 Ml
Almond Slivered Blanched (20067261010715) 2/1.50 Kg2/1.50 Kg
Almond Slivered Blanched 1/3 Kg1/3 Kg
Almond Whole Blanched (20067261010418) 2/1.50 Kg2/1.50 Kg
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 Kg1/5 Kg
Bacon 16/18 Ct Applewood Smoked 1/5 Kg1/5 Kg
Bacon 16/18 Ct Center Cut 1/5 Kg1/5 Kg
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct1/200 Ct
Bag Garbage 35X50 Regular Black 1/100 Ct1/100 Ct
Bar Granola Fruit & Nut Trail Mix (000065633408811) 8/12/35 Gr8/12/35 Gr
Bar Granola Oats N' Honey Crunchy 6/12/46 Gr6/12/46 Gr
Spoon Set Measure SS 6/1 Ea6/1 Ea
Spoon Soup Celine 18/0 Round 7 - 17.8Cm (502513) 12/1 Ea12/1 Ea
Spoon Soup Ecopro Poly 1/1000 Ea (10057937706427)1/1000 Ea
Spoon Soup Medium Weight Polypro Dense (00620868514701) 1/1000 Ct1/1000 Ct
Spoon Wooden 18 In W/Long Handle 1/1 Ea1/1 Ea
Spoon Wooden Heavy Duty 16 In - 40.6 Cm 1/1 Ea (575386) 1/1 Ea
Spread Apple Raspberry 1/14 Kg1/14 Kg

thank you for your help

Ksharp
Super User
Bacon 16/18 Ct Center Cut 1/5 Kg 1/5 Kg
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct 1/200 Ct

 

Why 16/18 Ct in the first obs is not picked up ?

hashman
Ammonite | Level 13

@Ksharp:

Because the OP's pattern rule appears to be simpler than you might have thought. It is merely "find the last slash and split on the position of the first blank before it". 

 

Kind regards

Paul D.

Ksharp
Super User

Then SCAN() is more than suffice .

 

data have ;                                                               
  input have $80. ;                                                       
  cards;                                                                  
Sauce Alfredo #2 4/2 Kg                                                   
Sauce Alfredo 2/3.7 Lt                                                    
Assist 1Al1 1/18.9 LT                                                     
Assouplissant Solid Navisoft Bp S#Ncf# 2/6Lb                              
100% Whole Wheat 1/675 Gr                                                 
15X20X1/2 White Cut/Board 6/1 Ea                                          
16Oz White Plastic Funnel 1/1 Ea                                          
A&W Diet Root Beer 12/355 Ml                                              
A&W Rootbeer Fridge Pack 12/355 Ml                                        
Almond Slivered Blanched (20067261010715) 2/1.50 Kg                       
Almond Slivered Blanched 1/3 Kg                                           
Almond Whole Blanched (20067261010418) 2/1.50 Kg                          
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 Kg                           
Bacon 16/18 Ct Applewood Smoked 1/5 Kg                                    
Bacon 16/18 Ct Center Cut 1/5 Kg                                          
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct
; 
run ;                                                                     
                                                                          
data want ;                                                               
  set have ;    
  want=catx('/',scan(scan(have,-2,'/'),-1,' '),scan(have,-1,'/'));
run;
hashman
Ammonite | Level 13

@Ksharp:

To form the "quantity" part, which you name here WANT, - yes, it does (though I'd rather not leave it at the default length 200). However, it leaves HAVE uncut. Methinks to rid it of the trailing "quantity", determining the split position and making SUBSTR rely on it is still called for. In principle, using TRANWRD to replace the trailing "quantity" with blanks is possible, but it's iffy since the presence of the same substring in HAVE, though unlikely, cannot be ruled out.

 

Best regards

Paul D. 

Ksharp
Super User

The following could give a start.

 

 

data have;
infile cards dsd truncover;
input x : $80.;
pid=prxparse('/[\.\/\d]+\s*(Ea|Kg|Ct|Gr|Lt|Lb|Ml)\b/i');
call prxsubstr(pid,x,p,l);
if p then want=substr(x,p,l);
drop pid p l;
cards;
Sauce Alfredo #2 4/2 Kg	 4/2 Kg
Sauce Alfredo 2/3.7 Lt	2/3.7 Lt
Assist 1Al1 1/18.9 LT	1/18.9 LT
Assouplissant Solid Navisoft Bp S#Ncf# 2/6Lb	2/6Lb
100% Whole Wheat 1/675 Gr	1/675 Gr
15X20X1/2 White Cut/Board 6/1 Ea	6/1 Ea
16Oz White Plastic Funnel 1/1 Ea	1/1 Ea
A&W Diet Root Beer 12/355 Ml	12/355 Ml
A&W Rootbeer Fridge Pack 12/355 Ml	12/355 Ml
Almond Slivered Blanched (20067261010715) 2/1.50 Kg	2/1.50 Kg
Almond Slivered Blanched 1/3 Kg	1/3 Kg
Almond Whole Blanched (20067261010418) 2/1.50 Kg	2/1.50 Kg
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 Kg	1/5 Kg
Bacon 16/18 Ct Applewood Smoked 1/5 Kg	1/5 Kg
Bacon 16/18 Ct Center Cut 1/5 Kg	1/5 Kg
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct	1/200 Ct
Bag Garbage 35X50 Regular Black 1/100 Ct	1/100 Ct
Bar Granola Fruit & Nut Trail Mix (000065633408811) 8/12/35 Gr	8/12/35 Gr
Bar Granola Oats N' Honey Crunchy 6/12/46 Gr	6/12/46 Gr
Spoon Set Measure SS 6/1 Ea	6/1 Ea
Spoon Soup Celine 18/0 Round 7 - 17.8Cm (502513) 12/1 Ea	12/1 Ea
Spoon Soup Ecopro Poly 1/1000 Ea (10057937706427)	1/1000 Ea
Spoon Soup Medium Weight Polypro Dense (00620868514701) 1/1000 Ct	1/1000 Ct
Spoon Wooden 18 In W/Long Handle 1/1 Ea	1/1 Ea
Spoon Wooden Heavy Duty 16 In - 40.6 Cm 1/1 Ea (575386)	 1/1 Ea
Spread Apple Raspberry 1/14 Kg
;
run;
Gladis6680
Obsidian | Level 7

Thank you so much !!! I have to do some manual parsing in order to the quantity, but this has definitely helped me!!!

hashman
Ammonite | Level 13

@Gladis6680:

Since your pattern is rather simple, surely it is possible, even without the sophistication of regexen. All you have to do is find the position of the blank before the last slash:

data have ;                                                               
  input have $80. ;                                                       
  cards;                                                                  
Sauce Alfredo #2 4/2 Kg                                                   
Sauce Alfredo 2/3.7 Lt                                                    
Assist 1Al1 1/18.9 LT                                                     
Assouplissant Solid Navisoft Bp S#Ncf# 2/6Lb                              
100% Whole Wheat 1/675 Gr                                                 
15X20X1/2 White Cut/Board 6/1 Ea                                          
16Oz White Plastic Funnel 1/1 Ea                                          
A&W Diet Root Beer 12/355 Ml                                              
A&W Rootbeer Fridge Pack 12/355 Ml                                        
Almond Slivered Blanched (20067261010715) 2/1.50 Kg                       
Almond Slivered Blanched 1/3 Kg                                           
Almond Whole Blanched (20067261010418) 2/1.50 Kg                          
Bacon 14/16 Ct Center Cut 8.75 In Sliced 1/5 Kg                           
Bacon 16/18 Ct Applewood Smoked 1/5 Kg                                    
Bacon 16/18 Ct Center Cut 1/5 Kg                                          
Bag Garbage 35X50 Heavy Duty Natural 1/200 Ct                             
run ;                                                                     
                                                                          
data want ;                                                               
  set have ;                                                              
  _n_ = findc (substr (have, 1, findc (have, "/", -32767)), " ", -32767) ;
  quantity = substr (have, _n_ + 1) ;                                     
  have     = substr (have,  1, _n_) ;                                     
run ;                                                                     

 Note that you cannot swap the order of the assignment statements for QUANTITY and HAVE without breaking the algorithm. 

 

Kind regards

Paul D.

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 20 replies
  • 2478 views
  • 6 likes
  • 5 in conversation