DATA Step, Macro, Functions and more

Replacing Embedded blanks with 0's

Reply
Contributor
Posts: 25

Replacing Embedded blanks with 0's

Problem: For some observations, the SSN variable contains embedded blanks. Create a data file with those observations. You need to create a new (corrected) SSN variable by placing those embedded blanks with zeros. Be sure to use the appropriate SAS function. Screenshot the partial SAS log, showing the number of observations in this output data.

 

For this question I tried:

 

data hi.Missing_SSN;
infile 'C:\Users\larkj\Desktop\STUDENT.DAT';
input ssn 1-9;
SSN_Zeros= compress(SSN,' ');
drop SSN;
rename SSN_Zeros = SSN;
run;

 

I really have no idea do I need some sort of length statement and do I need to make a character value?

PROC Star
Posts: 7,492

Re: Replacing Embedded blanks with 0's

Your description definitely sounds like a homework assignment AND you haven't provide enough info for anyone to help.

 

You have to know what your raw data looks like. Since SSN always have nine digits, the field they're in is either greater than nine characters (with spaces imbedded within the numbers e.g., 11 characters with spaces separating the 3 digit-2digit-4 digit combinations), or numbers rather than characters, thus missing the leading zeros.

 

You'll only know once you look at some of the files records.

 

Art, CEO, AnalystFinder.com

 

PROC Star
Posts: 1,760

Re: Replacing Embedded blanks with 0's

The translate function may be of interest to you.

PROC Star
Posts: 7,492

Re: Replacing Embedded blanks with 0's

Doubt it. Just saw another post from the same poster. Looks like SSN is a numeric field where the leading zeros were dropped.

 

Needs to create a new variable (like, e.g., cSSN) using a put statement (e.g.:

 

cSSN=put(SSN, z9.);

Art, CEO, AnalystFinder.com

Contributor
Posts: 25

Re: Replacing Embedded blanks with 0's

The z9. worked great is there any way to create a new data file with just the observations that added the extra 0's?

Super User
Posts: 11,343

Re: Replacing Embedded blanks with 0's


larkjr18 wrote:

The z9. worked great is there any way to create a new data file with just the observations that added the extra 0's?


I think this may work:

data want;
   set have;
   if length(put(ssn,f9.-L)) <9 then output;
run;
Contributor
Posts: 25

Re: Replacing Embedded blanks with 0's

Wow that worked great thank you!!! It is still printing all the other observations but just as 0's any way to stop that?


2017-03-30.png
Contributor
Posts: 20

Re: Replacing Embedded blanks with 0's

 

What you wrote will remove embedded blanks with the COMPRESS statament.

The questions askes to change the embedded blanks to 0's - so we need to keep those in the dataset so we know where the 0's should go.

 

I can think of a few ways this may be set up without seeing the data set - 1) need leading zeroes or 2) need zeros inside or 3) need leading and zeros inside which we will not know unless you have a variable like '  3-29-3 9 '  so i'm assuming you have a numeric variable and you need either #1 or #2 done. Smiley Happy

****************************************************************************************************************************************

 

So if we want to add the leading zeros - we can define the new SSN variable and tell it it'll be 9 spaces.

We will conver the numeric to a character value format with leading zeros.

...
SSN_LEADING= put(SSN,z9.);

format NEW_SSN z9.;
...

More info here about leading zeros: http://support.sas.com/documentation/cdl/en/lrdict/64316/HTML/default/viewer.htm#a000205244.htm

 

****************************************************************************************************************************************

 To change the embedded blanks that are not leading you can try this function:

....

SSN_INSIDE= tranwrd(strip(SSN)," ","0");

... 

PROC Star
Posts: 7,492

Re: Replacing Embedded blanks with 0's

[ Edited ]

@ccarel: if you look at the jpegs in the OP's other post (different thread from tonight), you'll see that SSN is a numeric field, but not all entries have 9 digits.

 

Art, CEO, AnalystFinder.com

 

Super User
Posts: 11,343

Re: Replacing Embedded blanks with 0's

Note that the SSN Format will assume any numeric that looks like an SSN but is missing digits that the leading zeros were dropped.

 

data junk;
   x=12345678;
   put x ssn.;
   format x ssn.;
run;
Ask a Question
Discussion stats
  • 9 replies
  • 322 views
  • 0 likes
  • 5 in conversation