Desktop productivity for business analysts and programmers

How can I code for a uniform string length where there is variation in data entry

Accepted Solution Solved
Reply
Contributor
Posts: 33
Accepted Solution

How can I code for a uniform string length where there is variation in data entry

While using EGP 6.1, I am trying to get a common look for each row of a specific column where the data varies in length.  The full ID Number has a prefix of "00" then is followed by 9 other digits.  However, some people enter the full ID Number while others only enter the last 9 digits.  I am trying to just return the last 9 digits but am having trouble finding what function to use.  I have been trying to use the SCAN, the SUBPAD, the combination of SUBPAD with REVERSE functions, etc. and am getting nowhere fast.  To make matters more difficult, some of the last 9 digits do begin with "00" which I want to include.

 

Here is an example of data and what I am getting back (the first column is the ID Number as people have entered (I tried to reverse the order and then do a SUBSTRN with a limit of 9. As you can see, the good data that started with just 9 digits, now only displays 7 digits but shows a length of 9.  On the other hand, the numbers too long numbers that showed 11, are now correctly showing just 9 digits.)  Any thoughts on the best practice to limit the number of digits to the last 9?  Ideally I would like to start from the end of the number and move to the left 9 digits and return only those 9 digits.

ID NumberReversed ID NumberSUBSTRN ID NumberLNGTH of SUBSTRN
000359260  062953000  06295309
001094641  146490100  14649019
00161892070070298161000702981619
001750341  143057100  14305719
001750357  753057100  75305719
01000009831138900000101389000009
01000047161161740000101617400009
01000053759957350000109573500009
016506959  959605610  95960569

Accepted Solutions
Solution
‎10-12-2017 05:25 PM
Super User
Posts: 9,611

Re: How can I code for a uniform string length where there is variation in data entry

How about this:

data have;
input idnum :$11.;
cards;
000359260
001094641
00161892070
001750341
001750357
01000009831
01000047161
01000053759
016506959
;
run;

data want;
set have;
length new_idnum $9;
new_idnum = substr(put(input(idnum,11.),z11.),3);
run;

Note how I presented example data in a data step, so that others (and you!) only need to copy/paste and submit the code to recreate the dataset.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Super User
Super User
Posts: 7,860

Re: How can I code for a uniform string length where there is variation in data entry

Since you only have 9 digits you could just convert the string to a number and then back to a string using the Z format to make sure it prefixes zeros for the small numbers.

want = put(input(id_number,32.),Z9.);
Solution
‎10-12-2017 05:25 PM
Super User
Posts: 9,611

Re: How can I code for a uniform string length where there is variation in data entry

How about this:

data have;
input idnum :$11.;
cards;
000359260
001094641
00161892070
001750341
001750357
01000009831
01000047161
01000053759
016506959
;
run;

data want;
set have;
length new_idnum $9;
new_idnum = substr(put(input(idnum,11.),z11.),3);
run;

Note how I presented example data in a data step, so that others (and you!) only need to copy/paste and submit the code to recreate the dataset.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 33

Re: How can I code for a uniform string length where there is variation in data entry

Posted in reply to KurtBremser

Hi Kurt,

That makes sense, except in the full dataset, I have thousands of rows with separate ID Numbers.

Super User
Posts: 9,611

Re: How can I code for a uniform string length where there is variation in data entry


gregor1 wrote:

Hi Kurt,

That makes sense, except in the full dataset, I have thousands of rows with separate ID Numbers.


And? What works on 10 rows also works on 10 billion rows.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 33

Re: How can I code for a uniform string length where there is variation in data entry

Posted in reply to KurtBremser

I get it, so I don't need to enter the individual ID  Numbers in the code?

Super User
Posts: 9,611

Re: How can I code for a uniform string length where there is variation in data entry


gregor1 wrote:

I get it, so I don't need to enter the individual ID  Numbers in the code?


No, not at all. That first data step is just creating some example data on the fly from the examples you gave in your initial post. Instead of the cards section, one could also use an infile statement to read from an external file.

Using a datastep with cards or datalines is the preferred way to present example data here, as it allows easy (and exact) recreation of data by everyone reading the posts.

 

Just apply the second data step to your dataset, and adapt the variable names to your needs.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Contributor
Posts: 33

Re: How can I code for a uniform string length where there is variation in data entry

Posted in reply to KurtBremser
Great...I will give it a try. Thanks Kurt!
Contributor
Posts: 33

Re: How can I code for a uniform string length where there is variation in data entry

Posted in reply to KurtBremser
Thanks Kurt...this worked great!
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 208 views
  • 1 like
  • 3 in conversation