## Calculation from a string

Frequent Contributor
Posts: 143

# Calculation from a string

Hi

I am working in a bank.

We have a data set with scores information for every customer.

Score for each customer can be any integer between 0 to 11.(11 is a failure status).

I created a vector of last 12 scores for each customer(with comma between values).

I have 3 questions:

Question1-Calculate How many non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 7

Question2-Calculate How many missing+non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 12

Question3-Calculate How many  times 11 appear

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 5

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 0

Question4-Calculate location of first failure(11)

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here we need to get:6

3,.,.,.,.,.,0,0,0,0,6,7   Here we need to get:0

Data tbl1;
input ID Score1 Score2 Score3 Score4 Score5 Score6
Score7 Score8 Score9 Score10 Score11 Score12;
cards;
1 7 8 9 10 9 11 11 11 11 10 11 9
2 8 8 7 7 7 7 6 7 7 8 7 8
3 3 3 3 3 2 2 2 3 3 3 2 2
3 . . . . . 0 0 0 0 6 7
;
run;

Data want;
length ScoreVector1 ScoreVector2 ScoreVector3 ScoreVector4 \$32.;
set tbl1;
ScoreVector1=compress(Score1)||','|| compress(Score1)||','|| compress(Score3)
||','|| compress(Score4)||','|| compress(Score5)||','|| compress(Score6)
||','|| compress(Score7)||','|| compress(Score8)||','|| compress(Score9)
||','|| compress(Score10)||','|| compress(Score11)||','|| compress(Score12);
Run;

Frequent Contributor
Posts: 143

## calculations from a string

Hi

I am working in a bank.

We have a data set with scores information for every customer.

Score for each customer can be any integer between 0 to 11.(11 is a failure status).

I created a vector of last 12 scores for each customer(with comma between values).

I have 3 questions:

Question1-Calculate How many non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 7

Question2-Calculate How many missing+non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 12

Question3-Calculate How many  times 11 appear

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 5

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 0

Question4-Calculate location of first failure(11)

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here we need to get:6

3,.,.,.,.,.,0,0,0,0,6,7   Here we need to get:0

Data tbl1;
input ID Score1 Score2 Score3 Score4 Score5 Score6
Score7 Score8 Score9 Score10 Score11 Score12;
cards;
1 7 8 9 10 9 11 11 11 11 10 11 9
2 8 8 7 7 7 7 6 7 7 8 7 8
3 3 3 3 3 2 2 2 3 3 3 2 2
3 . . . . . 0 0 0 0 6 7
;
run;

Data tbl2;
length  ScoreVector  \$32.;
set tbl1;
ScoreVector=compress(Score1)||','|| compress(Score1)||','|| compress(Score3)
||','|| compress(Score4)||','|| compress(Score5)||','|| compress(Score6)
||','|| compress(Score7)||','|| compress(Score8)||','|| compress(Score9)
||','|| compress(Score10)||','|| compress(Score11)||','|| compress(Score12);
Run;

Super User
Posts: 9,599

## Re: Calculation from a string

[ Edited ]

In general, it is better to keep data points in their own variables.  In this instance it may actually help, but you could do the same thing with an array and catx, e.g:

```data want;
length scorevector1 \$2000;
set tbl1;
scorevector1=catx(',',score1,score2,score3...score12);
run;```

Note I use consistent case, the code window (its the {i} above post area) and indentation.

1) Note, this assumes the variable are in order, score1, score2 etc.  If not use and array and replace score1--score12 with of <arrayname>{*}.

```data want;
set tbl1;
num_miss=nmiss(score1--score12);
run;```

2) I don't get this question, the answer is always 12?

3)

```data want;
set tbl1;
num_11=count(catx(',',score1--score12),"11");
run;```

4) Get this by using the index function.

These are all basic string manipulation functions which you can find in the SAS manual:

http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=titlepage.htm&docsetVersion=9.4&l...

Make a point of reading the manual.

Frequent Contributor
Posts: 143

## Re: Calculation from a string

In my question 2 we need to get 12 for all rows because in all rows there are 12 arguments (numbers).

How can we calculate it please?

Super User
Posts: 9,599

## Re: Calculation from a string

`num_arg=12;`

Why would you want to calculate something that is fixed and known, just a waste of processing.  If you still want to use the countw() function which you can find in the docs.

Posts: 4,743

## Re: Calculation from a string

@Ronein

If working with a wide table structure then I'd be using proper SAS array processing for this as shown below.

``````Data tbl1_wide tbl1_long;
infile datalines dlm=' ' truncover;
input ID Score1 Score2 Score3 Score4 Score5 Score6
Score7 Score8 Score9 Score10 Score11 Score12;
cards;
1 7 8 9 10 9 11 11 11 11 10 11 9
2 8 8 7 7 7 7 6 7 7 8 7 8
3 3 3 3 3 . 2 2 3 3 3 2 2
3 . . . . . 0 0 0 0 6 7
;
run;

data want;
set tbl1_wide;
array scores {*} Score1 - Score12;

/*Question1-Calculate How many non-missing arguments exits in the vector.*/
non_miss=n(of scores[*]);

/*Question2-Calculate How many missing+non-missing arguments exits in the vector.*/
miss_non_miss=dim(scores);

do _i=1 to dim(scores);
/*Question3-Calculate How many  times 11 appear*/
n_11=sum(0 ,n_11 ,scores[_i]=11);

/*Question4-Calculate location of first failure(11)*/
if missing(pos_11) then
do;
if scores[_i]=11 then pos_11=_i;
end;
end;

run;``````
Frequent Contributor
Posts: 143

## Re: Calculation from a string

Hi

INDEX function returns the position of the first occurrence of the string’s first character.

I need a different thing.

I need to know in which position   (1 until 12) was first failure (value11)

1 7 8 9 10 9 11 11 11 11 10 11 9    (Here the answer is 7)
2 8 8 7 7 7 7 6 7 7 8 7 11              (Here the answer is 12)
3 3 3 3 3 2 2 2 3 3 3 2 2                (Here the answer is 0)
4 . . . . . . 0 0 0 0 6 7                      (Here the answer is 0)

Super User
Posts: 9,599

## Re: Calculation from a string

```data want;
set tbl1;
num_11=findw(catx(',',score1--score12),"11");
run;```

That should find first occurence of word.

Super User
Posts: 13,583

## Re: Calculation from a string

@Ronein wrote:

Hi

INDEX function returns the position of the first occurrence of the string’s first character.

I need a different thing.

I need to know in which position   (1 until 12) was first failure (value11)

1 7 8 9 10 9 11 11 11 11 10 11 9    (Here the answer is 7)
2 8 8 7 7 7 7 6 7 7 8 7 11              (Here the answer is 12)
3 3 3 3 3 2 2 2 3 3 3 2 2                (Here the answer is 0)
4 . . . . . . 0 0 0 0 6 7                      (Here the answer is 0)

If these values are numeric and you will be numeric calculations later I strongly suggest looking at @Patrick's array approach.

An alternate of finding first value would be the WHICHN function (referencing Patricks array definition)

First11 = whichn(11, of scores(*));

which does return 0 when the value in the first position is not found in any of the array elements.

Super User
Posts: 10,280

## Re: Calculation from a string

Quote:

"I created a vector of last 12 scores for each customer(with comma between values)."

Since you are in command, I strongly (and I mean STRONGLY) recommend to use the correct data structure for this, creating a long dataset with 12 observations with a single variable. Then most of your calculations can be done by simple counts in SQL or data steps or with proc summary.

If you add time-point variables to it, you gain by simply not writing a particular observation when the value is missing.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Discussion stats
• 9 replies
• 176 views
• 1 like
• 5 in conversation