Count the first digits

Occasional Contributor
Posts: 16

Count the first digits

Hello,

I'm new to SAS and am having trouble writing code for the following task.


In a very large data set, I'm trying to count, for each observation, how many values (variables) start with '1', '2', '3', ..., '9'.

The data set contains both numeric and character variables.

The frequency of each digit (1-9) needs to be stored in new variables in the original data set (or in a separate data set), as I need to use the result in later calculations.

P.S. Please see the attachment for the sample data.

Could anyone help me with the code?

Many thanks!

Esteemed Advisor
Posts: 6,699

Re: Count the first digits

Use an array (declared with _numeric_) to reference all numeric variables.

Then you can iterate in a do loop over the array, convert the numerical values to strings, strip() those of leading blanks (and minus signs), and get the first character for counting.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 16

Re: Count the first digits

Thanks for your reply,

Could you please give me more detail on how to write the code? I'm not familiar with these steps :(

Esteemed Advisor
Posts: 6,699

Re: Count the first digits

Since you are here to learn, I will not write your code for you, as I assume you know the basics of SAS programming.

array declaration in a data step:

SAS(R) 9.4 Statements: Reference, Third Edition

(look at the _numeric_ option to include all numeric variables of the data step in the array)

iterating:

SAS(R) 9.4 Statements: Reference, Third Edition

determining the size of an array for the iteration:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition

Convert numeric to character with the put() function:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition

(use a suitable numeric format, often the simple BEST. is the best option)

remove unwanted blanks with the strip function:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition

substr() to get the first character:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition
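Put together, the pieces listed above might look roughly like the following. This is only a sketch: the input data set name have and the names nums, first_digit, _i and _c are made up for illustration, and only numeric variables are handled. Values below 1 (such as 0.00145) yield '0' as their first character here, so they are not counted by this version.

```sas
data want;
  set have;                        /* "have" is an assumed input data set name */
  array nums {*} _numeric_;        /* all numeric variables defined so far */
  array first_digit {9};           /* creates first_digit1-first_digit9 */
  length _c $ 1;
  drop _i _c;                      /* helper variables, not wanted in the output */
  do _i = 1 to 9;                  /* start every counter at zero */
    first_digit{_i} = 0;
  end;
  do _i = 1 to dim(nums);
    /* numeric -> string, strip blanks, drop a minus sign, take first character */
    _c = substr(compress(strip(put(nums{_i}, best32.)), '-'), 1, 1);
    if '1' <= _c <= '9' then
      first_digit{input(_c, 1.)} = first_digit{input(_c, 1.)} + 1;
  end;
run;
```

Because the nums array is declared before the counters and helper variables exist, those are not picked up by _numeric_ and do not count themselves.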

Frequent Contributor
Posts: 144

Re: Count the first digits

And if you want the first non-zero digit, I would remove every character that doesn't interest you with the compress function (after converting numeric to character), like this:

compress(Variable,'123456789','K')

The 'K' modifier of compress keeps only the characters you have specified, so periods and zeros are removed.

After that you can use substr() to take the first character.
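As a rough illustration of that idea, the snippet below applies compress() with the 'K' modifier to a single hard-coded value. The variable names x, digits and first are placeholders, not part of the original data.

```sas
data _null_;
  x = 0.00145;
  /* keep only the characters 1-9 of the formatted value: "145" */
  digits = compress(put(x, best32.), '123456789', 'K');
  /* first remaining character is the first non-zero digit: "1" */
  first = substr(digits, 1, 1);
  put digits= first=;
run;
```

For a value of zero or a missing value nothing is kept, so the result is blank and should be checked before it is used for counting.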

Esteemed Advisor
Posts: 6,699

Re: Count the first digits

Thanks. That was the last missing piece of the puzzle: dealing with values < 1.

Grand Advisor
Posts: 10,239

Re: Count the first digits

How are decimal values treated? Would 0.00145 be considered as "starting" with 0 or 1?

Or if you have a character variable "0001234"?

Occasional Contributor
Posts: 16

Re: Count the first digits

Thanks for your reply.

I'm counting the first non-zero digit, so 0.00145 should be counted as starting with 1.

Most of my character variables are letters so it doesn't matter much.

Super Contributor
Posts: 336

Re: Count the first digits

Storing the result in a single new variable is not possible; you need an array that holds how often 1, 2, etc. occur, don't you?

Data Want;
  Set Try;
  Array AN {*} _NUMERIC_;            * numeric variables read from Try;
  Array AC {*} _CHARACTER_;          * character variables read from Try;
  Array ACount {9} ACount1-ACount9;  * declared after AN, so not part of AN;
  Drop _I _P _D;
  Do _D = 1 To 9;                    * start each counter at zero;
    ACount{_D} = 0;
  End;
  Do _I = 1 To Dim(AC);              * first non-zero digit in strings;
    _P = PRXMatch('/[1-9]/', AC{_I});
    If _P Then Do;
      _D = Input(Substr(AC{_I}, _P, 1), 1.);
      ACount{_D} = ACount{_D} + 1;
    End;
  End;
  Do _I = 1 To Dim(AN);              * leading digit of numeric values;
    If Not Missing(AN{_I}) And AN{_I} Ne 0 Then Do;
      _D = Abs(Input(Scan(Put(AN{_I}, E12.), 1, '.'), 12.));
      ACount{_D} = ACount{_D} + 1;
    End;
  End;
Run;

Occasional Contributor
Posts: 16

Re: Count the first digits

Hi,

I tried the code.

In the "Do over AC" step, can I use AN instead? (I need to count the first non-zero digit of the numeric values.)

I tried using PUT with AN inside the Substr function in this step:

Do over AN; * count 1st non-zero number in strings (?);

    If Input(Substr(put(AN,PRXMatch('/\d/',AN),1),1,best.),Best.) Then ACount{Input(Substr(put(AN,PRXMatch('/\d/',AN),1),1,best.),Best.)}=

      Sum(Input(Substr(ACount{Input(Substr(put(AN,PRXMatch('/\d/',AN),1),1,best.),Best.)},PRXMatch('/\d/',AN),1),Best.),1);

  End;

But I think PRXMatch cannot be used inside the PUT function.

Can I convert all the numeric values to character? If I use PUT, I have to convert the variables one by one, and I have almost 800 of them.

Occasional Contributor
Posts: 16

Re: Count the first digits

Thanks everyone!

I'll try this code and get back to you guys :)

Thanks a lot!
