turn on suggestions

Auto-suggest helps you quickly narrow down your search results by suggesting possible matches as you type.

Showing results for

Find a Community

- Home
- /
- BI
- /
- Enterprise Guide
- /
- How to create a computed column that contains a nu...

Topic Options

- RSS Feed
- Mark Topic as New
- Mark Topic as Read
- Float this Topic for Current User
- Bookmark
- Subscribe
- Printer Friendly Page

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-08-2017 05:55 AM

Good morning

I have another silly noob questions for you

I have a variable that contains a long text like

"Mary had 50 lambs"

"Julie had 100 lambs"

"Laura has 75 lambs"

I would like to create a computed numerical column that retrieves the number contained in the text string and will return the value 50 or 100 or 75

I only need to retrieve a list of 7 different numbers (10 , 20 , 50 , 75, 100, 200 , 300)

what are the functions or the procedures to use ?

I have tried FINDC but it does not do what I need

many thanks for your help

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-08-2017 06:06 AM

Have a look at this post (the one right before yours):

The function you want is compress() with a couple of options 'kd'.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-08-2017 06:57 AM

Hi thanks for the quick reply

I forgot to mention that I might have other numbers in the string

that s why I thought it would be better to match what I was looking for based on a list of matching numbers

I was wondering if and how I could use PRXMATCH or PRXCHANGE

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-08-2017 09:28 AM

Well, not sure about computed columns. In actual code, yes you could do, the prx would be:

"([0-9]+)"

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-08-2017 10:50 AM

Rickyboy wrote:

Hi thanks for the quick reply

I forgot to mention that I might have other numbers in the string

that s why I thought it would be better to match what I was looking for based on a list of matching numbers

I was wondering if and how I could use PRXMATCH or PRXCHANGE

The you have to provide a rule for either which number you want extracted or if you want all of them then what the output looks like.

As in show the structure of the desired output data set.

Really best is to show some example input data and the desired output for that. Make sure your example data includes extreme values such as 1) no number at all, 2) exactly one number, 3) the largest number of numbers you expect to see. If you do not know the largest number of numbers expected then that is also important (and likely makes your data set structure a very good question in light of that info)

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to ballardw

12-11-2017 05:15 AM

The data I have is a string containing some numbers, and the numbers can be anywhere.

I am interested in only a few of these numbers , for example 50 , 75 , 100 , and this is why I m not sure the compress function will work

essentially I am interested in finding this list of numbers (50, 75, 100) and creating a column that contains it, (I can then convert the column to numerical from text)

Mary has 50 lambs, and 23 rabbits List i m interested in 50 , 75, 100 output 50

Julie has 20 goats and 75 lambs 75

Michelle has 100 lambs and 10 dogs 100

Grace has 50 goats and 13 lambs 50

I hoper this makes sense

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-11-2017 05:27 AM

Well (again not sure about computed fields) in code:

data want (drop=i); set have; length wrd $100; do i=1 to countw(thestring," "); wrd=scan(thestring,i," "); if lengthn(compress(wrd," ","kd")) > 0 then output; end; run;

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-13-2017 07:59 AM

Hi

I run the code but I get two rows of new output per original row when the string contains 2 number

so in effect the output I get is :

50

23

20

75

100

10

50

13

when all I need is just the numbers contained in a list (50 , 75, 100)

thanks for taking time to help me

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-13-2017 08:07 AM

Lets start again. Post test data - in the form of a dataset - using the code window which is the {i} above the post area. Follow this post if you are unsure:

Next, post example required output based on that test data.

Then we will have something definite to work to.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-13-2017 08:26 AM

name | string | Desired output |

rick | mary had 50 lambs and 10 geese | 50 |

mark | gary had 20 lambs and 75 geese | 75 |

john | fred had 0.50 fish | . |

steve | mark had 100 cats | 100 |

nick | paul had no dogs | . |

basically I only want to find 50 , 75 and 100 (3 possible cases) from the string (I have no interest for the 10 , 20 and 0.50)

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-13-2017 08:46 AM

That is not a datastep, I am not here to type in that data, or get it looking as it is on your machine.

Now in your example, how am I to know that you do not want 10 or 20, is it that you want the maximum number in the sentance?

You can get the maximum out using a slight modification of the code I presented earlier:

data want (drop=i); set have; length wrd $100; do i=1 to countw(thestring," "); wrd=scan(thestring,i," "); if lengthn(compress(wrd," ","kd")) > 0 then do;

if input(wrd,best.) > max_num then max_num=input(wrd,best.);

end; end; run;

This will for each line take any number (which is all numeric separated by spaces) and keep the maximum from each it finds. Note I can't test this as have no test data to use.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-13-2017 08:49 AM

Sorry I am a noob

no I don't want the maximum number , just the number that matches from a list of my numbers

which in this case is 50 , 75 and 100

If I had 200 I would still want to discard it

I thought you could copy and paste the data easily and I don't know what a data step is

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-13-2017 09:10 AM

A datastep is something which begins data and ends run; which I am posting. Maybe this isn't the solution for you then. This computed thing leads me to believe you are using Visual Analytics or something like that yes. If so then its perhaps not a straight forward answer. Computed columns only work as part of a select clause in SQL, you could try (and this is pushing it):

catx(',',ifc(findw(yourstring,"50"," "),"50",""),ifc(findw(yourstring,"75"," "),"75",""),ifc(findw(yourstring,"100"," "),"100",""))

This basically does an if statement for each of 50,75,100, and if it finds it in the string, then puts it back for the catx() to concat all results into a comma delimited list.

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

12-13-2017 10:12 AM

I m using enterprise guide

- Mark as New
- Bookmark
- Subscribe
- RSS Feed
- Permalink
- Email to a Friend
- Report Inappropriate Content

Posted in reply to Rickyboy

12-14-2017 04:29 AM

Thanks for your help

in the end I had to use excel

I created a summary statistics of all the 1200 possible cases I had and for each one I created the desired response value I wanted in a separate excel column

I then did a left join with the original dataset