Desktop productivity for business analysts and programmers

How to create a computed column that contains a numerical value found in a text string from....

Reply
Occasional Contributor
Posts: 13

How to create a computed column that contains a numerical value found in a text string from....

Good morning

 

I have another silly noob questions for you

 

I have a variable that contains a long text like

"Mary had 50 lambs"

"Julie had 100 lambs"

"Laura has 75 lambs"

 

I would like to create a computed numerical column that retrieves the number contained in the text string and will return the value 50 or 100 or 75

 

I only need to retrieve a list of 7 different numbers (10 , 20 , 50 , 75, 100, 200 , 300)

 

what are the functions or the procedures to use ?

I have tried FINDC but it does not do what I need

 

many thanks for your help

Super User
Super User
Posts: 9,840

Re: How to create a computed column that contains a numerical value found in a text string from....

Have a look at this post (the one right before yours):

https://communities.sas.com/t5/Base-SAS-Programming/How-to-keep-only-digits-when-data-contents-numer...

 

The function you want is compress() with a couple of options 'kd'.

Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

Hi thanks for the quick reply

 

I forgot to mention that I might have other numbers in the string

that s why I thought it would be better to match what I was looking for based on a list of matching numbers

 

I was wondering if and how I could use PRXMATCH or PRXCHANGE

Super User
Super User
Posts: 9,840

Re: How to create a computed column that contains a numerical value found in a text string from....

Well, not sure about computed columns.  In actual code, yes you could do, the prx would be:

"([0-9]+)"
Super User
Posts: 13,934

Re: How to create a computed column that contains a numerical value found in a text string from....


Rickyboy wrote:

Hi thanks for the quick reply

 

I forgot to mention that I might have other numbers in the string

that s why I thought it would be better to match what I was looking for based on a list of matching numbers

 

I was wondering if and how I could use PRXMATCH or PRXCHANGE


The you have to provide a rule for either which number you want extracted or if you want all of them then what the output looks like.

As in show the structure of the desired output data set.

 

Really best is to show some example input data and the desired output for that. Make sure your example data includes extreme values such as 1) no number at all, 2) exactly one number, 3) the largest number of numbers you expect to see. If you do not know the largest number of numbers expected then that is also important (and likely makes your data set structure a very good question in light of that info)

Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

The data I have is a string containing some numbers, and the numbers can be anywhere.

I am interested in only a few of these numbers , for example 50 , 75 , 100 , and this is why I m not sure the compress function will work

 

essentially I am interested in finding this list of numbers (50, 75, 100) and creating a column that contains it, (I can then convert the column to numerical from text)

 

Mary has 50 lambs, and 23 rabbits                 List i m interested in 50 , 75, 100   output       50

Julie has 20 goats and 75 lambs                                                                                             75

Michelle has 100 lambs and 10 dogs                                                                                      100

Grace has 50 goats and 13 lambs                                                                                           50

 

I hoper this makes sense

 

Super User
Super User
Posts: 9,840

Re: How to create a computed column that contains a numerical value found in a text string from....

Well (again not sure about computed fields) in code:

data want (drop=i);
  set have;
  length wrd $100;
  do i=1 to countw(thestring," ");
    wrd=scan(thestring,i," ");
    if lengthn(compress(wrd," ","kd")) > 0 then output;
  end;
run;
Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

Hi

 

I run the code but I get two rows of new output per original row when the string contains 2 number

so in effect the output I get is :

 

50

23

20

75

100

10

50

13

 

when all I need is just the numbers contained in a list (50 , 75, 100)

 

thanks for taking time to help me

Super User
Super User
Posts: 9,840

Re: How to create a computed column that contains a numerical value found in a text string from....

Lets start again.  Post test data - in the form of a dataset - using the code window which is the {i} above the post area.  Follow this post if you are unsure:

https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat...

 

Next, post example required output based on that test data.  

 

Then we will have something definite to work to.

Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

namestringDesired output
rickmary had 50 lambs and 10 geese50
markgary had 20 lambs and 75 geese75
johnfred had 0.50 fish.
stevemark had 100 cats100
nickpaul had no dogs.

 

basically I only want to find 50 , 75 and 100 (3 possible cases) from the string (I have no interest for the 10 , 20 and 0.50)

Super User
Super User
Posts: 9,840

Re: How to create a computed column that contains a numerical value found in a text string from....

That is not a datastep, I am not here to type in that data, or get it looking as it is on your machine.

 

Now in your example, how am I to know that you do not want 10 or 20, is it that you want the maximum number in the sentance?

 

You can get the maximum out using a slight modification of the code I presented earlier:

data want (drop=i);
  set have;
  length wrd $100;
  do i=1 to countw(thestring," ");
    wrd=scan(thestring,i," ");
    if lengthn(compress(wrd," ","kd")) > 0 then do;
if input(wrd,best.) > max_num then max_num=input(wrd,best.);
end; end; run;

This will for each line take any number (which is all numeric separated by spaces) and keep the maximum from each it finds.  Note I can't test this as have no test data to use. 

Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

Sorry I am a noob

 

no I don't want the maximum number , just the number that matches from a list of my numbers

which in this case is 50 , 75 and 100

 

If I had 200 I would still want to discard it

 

I thought you could copy and paste the data easily and I don't know what a data step is

Super User
Super User
Posts: 9,840

Re: How to create a computed column that contains a numerical value found in a text string from....

A datastep is something which begins data and ends run; which I am posting.  Maybe this isn't the solution for you then.  This computed thing leads me to believe you are using Visual Analytics or something like that yes.  If so then its perhaps not a straight forward answer.  Computed columns only work as part of a select clause in SQL, you could try (and this is pushing it):

catx(',',ifc(findw(yourstring,"50"," "),"50",""),ifc(findw(yourstring,"75"," "),"75",""),ifc(findw(yourstring,"100"," "),"100",""))

This basically does an if statement for each of 50,75,100, and if it finds it in the string, then puts it back for the catx() to concat all results into a comma delimited list.

Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

I m using enterprise guide

 

Occasional Contributor
Posts: 13

Re: How to create a computed column that contains a numerical value found in a text string from....

@RW9

 

Thanks for your help

 

in the end I had to use excel

I created a summary statistics of all the 1200 possible cases I had and for each one I created the desired response value I wanted in a separate excel column

I then did a left join with the original dataset

 

 

Ask a Question
Discussion stats
  • 14 replies
  • 339 views
  • 0 likes
  • 3 in conversation