BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hi

I am working in a bank.

We have a data set with scores information for every customer.

Score for each customer can be any integer between 0 to 11.(11 is a failure status).

I created a vector of last 12 scores for each customer(with comma between values).

I have 3 questions:

 

Question1-Calculate How many non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 7

 

Question2-Calculate How many missing+non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 12

 

Question3-Calculate How many  times 11 appear

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 5

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 0

 

Question4-Calculate location of first failure(11)

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here we need to get:6

3,.,.,.,.,.,0,0,0,0,6,7   Here we need to get:0

 

 

Data tbl1;
input ID Score1 Score2 Score3 Score4 Score5 Score6
Score7 Score8 Score9 Score10 Score11 Score12;
cards;
1 7 8 9 10 9 11 11 11 11 10 11 9
2 8 8 7 7 7 7 6 7 7 8 7 8
3 3 3 3 3 2 2 2 3 3 3 2 2
3 . . . . . 0 0 0 0 6 7
;
run;

 

 

Data want;
length ScoreVector1 ScoreVector2 ScoreVector3 ScoreVector4 $32.;
set tbl1;
ScoreVector1=compress(Score1)||','|| compress(Score1)||','|| compress(Score3)
||','|| compress(Score4)||','|| compress(Score5)||','|| compress(Score6)
||','|| compress(Score7)||','|| compress(Score8)||','|| compress(Score9)
||','|| compress(Score10)||','|| compress(Score11)||','|| compress(Score12);
Run;

9 REPLIES 9
Ronein
Meteorite | Level 14

Hi

I am working in a bank.

We have a data set with scores information for every customer.

Score for each customer can be any integer between 0 to 11.(11 is a failure status).

I created a vector of last 12 scores for each customer(with comma between values).

I have 3 questions:

 

Question1-Calculate How many non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 7

 

Question2-Calculate How many missing+non-missing arguments exits in the vector.

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 12

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 12

 

Question3-Calculate How many  times 11 appear

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here there are 5

3,.,.,.,.,.,0,0,0,0,6,7     Here there are 0

 

Question4-Calculate location of first failure(11)

For example:

7,7,9,10,9,11,11,11,11,10,11,9     Here we need to get:6

3,.,.,.,.,.,0,0,0,0,6,7   Here we need to get:0

 

 

Data tbl1;
input ID Score1 Score2 Score3 Score4 Score5 Score6
Score7 Score8 Score9 Score10 Score11 Score12;
cards;
1 7 8 9 10 9 11 11 11 11 10 11 9
2 8 8 7 7 7 7 6 7 7 8 7 8
3 3 3 3 3 2 2 2 3 3 3 2 2
3 . . . . . 0 0 0 0 6 7
;
run;

 

 

Data tbl2;
length  ScoreVector  $32.;
set tbl1;
ScoreVector=compress(Score1)||','|| compress(Score1)||','|| compress(Score3)
||','|| compress(Score4)||','|| compress(Score5)||','|| compress(Score6)
||','|| compress(Score7)||','|| compress(Score8)||','|| compress(Score9)
||','|| compress(Score10)||','|| compress(Score11)||','|| compress(Score12);
Run;

RW9
Diamond | Level 26 RW9
Diamond | Level 26

In general, it is better to keep data points in their own variables.  In this instance it may actually help, but you could do the same thing with an array and catx, e.g:

data want;
  length scorevector1 $2000;
  set tbl1;
  scorevector1=catx(',',score1,score2,score3...score12);
run;

Note I use consistent case, the code window (its the {i} above post area) and indentation.

For your questions:

1) Note, this assumes the variable are in order, score1, score2 etc.  If not use and array and replace score1--score12 with of <arrayname>{*}.

data want;
  set tbl1;
  num_miss=nmiss(score1--score12);
run;

2) I don't get this question, the answer is always 12?  

3) 

data want;
  set tbl1;
  num_11=count(catx(',',score1--score12),"11");
run;

4) Get this by using the index function.

 

These are all basic string manipulation functions which you can find in the SAS manual:

http://documentation.sas.com/?docsetId=lefunctionsref&docsetTarget=titlepage.htm&docsetVersion=9.4&l...

Make a point of reading the manual.

 

Ronein
Meteorite | Level 14

Hi and thank you so much for your reply.

In my question 2 we need to get 12 for all rows because in all rows there are 12 arguments (numbers).

How can we calculate it please?

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26
num_arg=12;

Why would you want to calculate something that is fixed and known, just a waste of processing.  If you still want to use the countw() function which you can find in the docs.

Patrick
Opal | Level 21

@Ronein

If working with a wide table structure then I'd be using proper SAS array processing for this as shown below.

Data tbl1_wide tbl1_long;
  infile datalines dlm=' ' truncover;
  input ID Score1 Score2 Score3 Score4 Score5 Score6
    Score7 Score8 Score9 Score10 Score11 Score12;
  cards;
1 7 8 9 10 9 11 11 11 11 10 11 9
2 8 8 7 7 7 7 6 7 7 8 7 8
3 3 3 3 3 . 2 2 3 3 3 2 2
3 . . . . . 0 0 0 0 6 7
;
run;

data want;
  set tbl1_wide;
  array scores {*} Score1 - Score12;

  /*Question1-Calculate How many non-missing arguments exits in the vector.*/
  non_miss=n(of scores[*]);

  /*Question2-Calculate How many missing+non-missing arguments exits in the vector.*/
  miss_non_miss=dim(scores);

  do _i=1 to dim(scores);
    /*Question3-Calculate How many  times 11 appear*/
    n_11=sum(0 ,n_11 ,scores[_i]=11);

    /*Question4-Calculate location of first failure(11)*/
    if missing(pos_11) then
      do;
        if scores[_i]=11 then pos_11=_i;
      end;
  end;

run;
Ronein
Meteorite | Level 14

Hi

INDEX function returns the position of the first occurrence of the string’s first character.

I need a different thing.

I need to know in which position   (1 until 12) was first failure (value11)

1 7 8 9 10 9 11 11 11 11 10 11 9    (Here the answer is 7)
2 8 8 7 7 7 7 6 7 7 8 7 11              (Here the answer is 12)
3 3 3 3 3 2 2 2 3 3 3 2 2                (Here the answer is 0)
4 . . . . . . 0 0 0 0 6 7                      (Here the answer is 0)

RW9
Diamond | Level 26 RW9
Diamond | Level 26
data want;
  set tbl1;
  num_11=findw(catx(',',score1--score12),"11");
run;

That should find first occurence of word. 

ballardw
Super User

@Ronein wrote:

Hi

INDEX function returns the position of the first occurrence of the string’s first character.

I need a different thing.

I need to know in which position   (1 until 12) was first failure (value11)

1 7 8 9 10 9 11 11 11 11 10 11 9    (Here the answer is 7)
2 8 8 7 7 7 7 6 7 7 8 7 11              (Here the answer is 12)
3 3 3 3 3 2 2 2 3 3 3 2 2                (Here the answer is 0)
4 . . . . . . 0 0 0 0 6 7                      (Here the answer is 0)


If these values are numeric and you will be numeric calculations later I strongly suggest looking at @Patrick's array approach.

An alternate of finding first value would be the WHICHN function (referencing Patricks array definition)

 

First11 = whichn(11, of scores(*));

which does return 0 when the value in the first position is not found in any of the array elements.

 

Kurt_Bremser
Super User

Quote:

 

"I created a vector of last 12 scores for each customer(with comma between values)."

 

Since you are in command, I strongly (and I mean STRONGLY) recommend to use the correct data structure for this, creating a long dataset with 12 observations with a single variable. Then most of your calculations can be done by simple counts in SQL or data steps or with proc summary.

If you add time-point variables to it, you gain by simply not writing a particular observation when the value is missing.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 9 replies
  • 1048 views
  • 1 like
  • 5 in conversation