Count the first digits

06-11-2015 02:21 AM

Hello,

I'm new to SAS and am finding it difficult to write code for the following task.

I'm trying to count how many values (variables) in each observation start with '1', '2', '3', ..., '9' in a very large data set.

The variables include both numeric and character types.

The frequency of each digit (1-9) needs to be stored in a new variable in the original data set (or a separate data set), as I need to use the result for calculations later.

P.S. Please see the attachment for the sample data.

Could anyone help me with the code?

Many thanks!

06-11-2015 03:23 AM

Use an array (declared with _numeric_) to reference all numeric variables.

Then you can iterate over the array in a do loop, convert each numeric value to a string, strip() the leading blanks (and remove any minus sign), and take the first character for counting.
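A minimal sketch of the conversion step on a single variable may help here. The data set `demo`, the variable `x`, and the BEST32. format are assumptions made for illustration, and abs() is used to drop the sign because strip() only removes blanks.

```sas
/* Hypothetical demo data: the names demo, x and first are assumptions */
data demo;
  input x;
  length first $1;
  /* abs() drops the sign, put() converts to text, strip() removes blanks */
  first = substr(strip(put(abs(x), best32.)), 1, 1);
  datalines;
123
-45.6
7
;
run;

proc print data=demo;
run;
```

Note that a value below 1, such as 0.5, would yield '0' with this approach, which is the case taken up further down the thread.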

---------------------------------------------------------------------------------------------

Maxims of Maximally Efficient SAS Programmers

06-11-2015 10:10 PM

Thanks for your reply,

Could you please give me more detail on how to write the code? I'm not familiar with these techniques.

06-12-2015 01:42 AM

Since you are here to learn, I will not write your code for you, as I assume you know the basics of SAS programming.

array declaration in a data step:

SAS(R) 9.4 Statements: Reference, Third Edition

(look at the _numeric_ option to include all numeric variables of the data step in the array)

iterating:

SAS(R) 9.4 Statements: Reference, Third Edition

determining the size of an array for the iteration:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition

Convert numeric to character with the put() function:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition

(use a suitable numeric format, often the simple BEST. is the best option)

remove unwanted blanks with the strip function:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition

substr() to get the first character:

SAS(R) 9.4 Functions and CALL Routines: Reference, Third Edition
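The building blocks listed above could be combined roughly as follows. This is only a sketch under stated assumptions: the data set names `have` and `want`, the variables `a`, `b`, `c`, and the counters `cnt1`-`cnt9` are hypothetical, and it only handles values whose printed form starts with a digit from 1 to 9.

```sas
/* Hypothetical input: data set and variable names are assumptions */
data have;
  input a b c;
  datalines;
12 250 31
9 14 1.5
;
run;

data want;
  set have;
  array nums {*} _numeric_;   /* all numeric variables known at this point */
  array cnt {9} cnt1-cnt9;    /* declared afterwards, so not part of nums */
  length ch $1;
  do d = 1 to 9;              /* start every observation with zero counts */
    cnt{d} = 0;
  end;
  do i = 1 to dim(nums);      /* dim() returns the number of array elements */
    ch = substr(strip(put(abs(nums{i}), best32.)), 1, 1);
    d = index('123456789', ch);   /* 0 when ch is not a digit 1-9 */
    if d > 0 then cnt{d} = cnt{d} + 1;
  end;
  drop i d ch;
run;
```

Declaring the counter array after the `_numeric_` array keeps the counters themselves out of the loop, which is why the order of the two array statements matters in this sketch.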

06-12-2015 02:23 AM

And if you want the first non-zero digit, I would delete every character you are not interested in with the compress function (after converting numeric to character), like this:

compress(Variable,'123456789','K')

The 'K' modifier of compress keeps only the characters you specify, so periods and zeros are removed.

After that you can use substr() to take the first character.
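As an illustration of that suggestion, here is a small sketch on a few sample values. The data set `demo`, the variables `x`, `digits` and `first`, and the BEST32. format are assumptions made for the example.

```sas
/* Hypothetical demo data: all names here are assumptions */
data demo;
  input x;
  length digits $32 first $1;
  /* keep only the characters 1-9: blanks, signs, periods and zeros are dropped */
  digits = compress(put(x, best32.), '123456789', 'K');
  /* the first remaining character is the first non-zero digit */
  first = substr(digits, 1, 1);
  datalines;
0.00145
-0.5
2015
;
run;

proc print data=demo;
run;
```

For a value of 0 or a missing value nothing is left after compress, so `first` stays blank and would have to be skipped when counting.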

06-12-2015 02:27 AM

Thanks. That was the last missing piece of the puzzle, needed to deal with values < 1.

06-11-2015 01:07 PM

How are decimal values treated? Would 0.00145 be considered as "starting" with 0 or 1?

Or if you have a character variable "0001234"?

06-11-2015 10:06 PM

Thanks for your reply.

I'm counting the first non-zero digit, so 0.00145 should be counted as starting with 1.

Most of my character variables are letters so it doesn't matter much.

06-12-2015 03:33 AM

Storing the result in a single new variable is not possible; you need an array that tells you how often 1, 2, etc. occur, don't you?

Data Want;
  Set Try;
  Array AN _NUMERIC_;   * all numeric variables of Try;
  Array AC _CHARACTER_; * all character variables of Try;
  Array ACount {9} ACount1-ACount9; * ACount{d} counts values whose first non-zero digit is d;
  Do _D = 1 To 9;
    ACount{_D} = 0;
  End;
  Do over AC; * count 1st non-zero digit in strings;
    _P = PRXMatch('/[1-9]/', AC); * 0 when the string contains no digit 1-9;
    If _P Then Do;
      _D = Input(Substr(AC, _P, 1), 1.);
      ACount{_D} = ACount{_D} + 1;
    End;
  End;
  Do over AN; * count leading significant digit of numbers;
    If Not Missing(AN) And AN Ne 0 Then Do;
      _D = Input(Substr(Compress(Put(AN, E12.), '123456789', 'K'), 1, 1), 1.);
      ACount{_D} = ACount{_D} + 1;
    End;
  End;
  Drop _D _P;
Run;

06-14-2015 01:22 AM

Hi,

I tried the code.

Can I use the "Do over AC" step on AN as well? (I need to count the first non-zero digit of the numeric values.)

I tried to use PUT with AN inside the substr function in this step:

Do over AN; * count 1st non-zero number in strings (?);

If Input(Substr(put(AN,PRXMatch('/\d/',AN),1),1,best.),Best.) Then ACount{Input(Substr(put(AN,PRXMatch('/\d/',AN),1),1,best.),Best.)}=

Sum(Input(Substr(ACount{Input(Substr(put(AN,PRXMatch('/\d/',AN),1),1,best.),Best.)},PRXMatch('/\d/',AN),1),Best.),1);

End;

But I think PRXMatch cannot be used inside the PUT function.

Can I convert all the numeric values to character? If I use PUT, I have to convert each variable one by one, and I have almost 800 variables.

06-14-2015 12:25 AM

Thanks everyone!

I'll try the code and get back to you guys.

Thanks a lot!