How to remove digits from the observations?

Accepted Solution Solved
Reply
Regular Contributor
Posts: 183
Accepted Solution

How to remove digits from the observations?

Hi Experts,

The following is the sample of my dataset. I need to remove the first two and last two digits of "firms".

date               firms

22/10/2008AN8068571086
26/02/2010AN8068571086
05/08/2011AN8068571086
27/09/2011AN8068571086
25/10/2011AN8068571086
23/11/2011AN8068571086
26/06/2012AN8068571086

It means, now "firms" will be like 80685710.

Regards,

Abu


Accepted Solutions
Solution
‎03-17-2015 09:02 AM
Super User
Posts: 9,676

Re: How to remove digits from the observations?

data have;

input date ddmmyy10. firms :$15.;

datalines;

22/10/2008    AN8068571086

26/02/2010    AN8068571086

05/08/2011    AN8068571086

27/09/2011    AN8068571086

25/10/2011    AN8068571086

23/11/2011    AN8068571086

26/06/2012    AN8068571086

;

data want;

set have;

length want $ 20;

want=substr(firms,3,length(firms)-4);

run;

View solution in original post


All Replies
SAS Employee
Posts: 232

Re: How to remove digits from the observations?

Hi Abu, I was out on the communities today and noticed your question. I'm sure that a SAS expert will answer you soon. I happened to see this, just in case it helps:

Frequent Contributor
Posts: 115

Re: How to remove digits from the observations?

Hi Abu,

data have;

input date ddmmyy10. firms :$15.;

datalines;

22/10/2008    AN8068571086

26/02/2010    AN8068571086

05/08/2011    AN8068571086

27/09/2011    AN8068571086

25/10/2011    AN8068571086

23/11/2011    AN8068571086

26/06/2012    AN8068571086

;

data want;

set have;

do i= 1 to length(firms);

new_firms=substr(firms,i,1);

if i in (1,2) or i=length(firms)-1 or i=length(firms) then continue;

group+1;

output;

end;

call missing(group);

run;

data final;

array a(15) $1 _temporary_;

call missing (of a{*});

do until(last.date);

set want;

by date;

a(group)=new_firms;

firm=cats(of a{*});

end;

keep date firm;

format date ddmmyy10.;

run;

HTH,

Naveen Srinivasan

L&T Infotech

Regular Contributor
Posts: 183

Re: How to remove digits from the observations?

Hi,

After running your codes, more observations are shown in the "Final" dataset than the original dataset. How to address this issue?

Frequent Contributor
Posts: 115

Re: How to remove digits from the observations?

I tested mine using the sample data you provided. It appeared to me the dataset is sorted by date. So, in the 1st datastep, the dataset gets flipped as each date duplicates forms a group. So here unless your real matches with your sample, i can't tell until you let us know what it really looks like. I just reran the same code that I gave you and that seems ok.

Either way, prx is perhaps the most ideal for character manipulation, so I'd suggest implement Haikuo's code.

Regular Contributor
Posts: 183

Re: How to remove digits from the observations?

I include the SAS file.I cannot attach the SAS file, but I have attached the CSV file.

Attachment
Super User
Posts: 5,081

Re: How to remove digits from the observations?

It sounds like you are looking for a simple "chop off some characters":

firms = substr(firms, 3, 8);

If it is possible that number of characters changes from record to record, you could use:

firms = substr(firms, 3, length(firms)-4);

Good luck.

Regular Contributor
Posts: 183

Re: How to remove digits from the observations?

Hi,

It is not working and shows the following messages:

NOTE: Numeric values have been converted to character values at the places given by:

      (Line)Smiley SadColumn).

      127:16

NOTE: Character values have been converted to numeric values at the places given by:

      (Line)Smiley SadColumn).

      127:9

Solution
‎03-17-2015 09:02 AM
Super User
Posts: 9,676

Re: How to remove digits from the observations?

data have;

input date ddmmyy10. firms :$15.;

datalines;

22/10/2008    AN8068571086

26/02/2010    AN8068571086

05/08/2011    AN8068571086

27/09/2011    AN8068571086

25/10/2011    AN8068571086

23/11/2011    AN8068571086

26/06/2012    AN8068571086

;

data want;

set have;

length want $ 20;

want=substr(firms,3,length(firms)-4);

run;

Super User
Super User
Posts: 6,499

Re: How to remove digits from the observations?

You did not include the key information from the log that would show what statement was on line 127 and what variable was being referenced at columns 9 and 16.

Are you trying to remove digits from the middle of a NUMBER?  If so then those are not errors just notes showing that SAS had to convert the number to a character string to perform the substr() and then convert the string result back to a number.

Regular Contributor
Posts: 183

Re: How to remove digits from the observations?

No, I want to remove the first two digits and last two digits from "cusip" observations. Those notes were shown in log file. And the output dataset shows blank observations i.e. no observation at all.

Respected Advisor
Posts: 3,124

Re: How to remove digits from the observations?

data have;

input date ddmmyy10. firms :$15.;

datalines;

22/10/2008    AN8068571086

26/02/2010    AN8068571086

05/08/2011    AN8068571086

27/09/2011    AN8068571086

25/10/2011    AN8068571086

23/11/2011    AN8068571086

26/06/2012    AN8068571086

;

data want;

set have;

new_firms=prxchange('s/^\w\w|\w\w$//io',-1, strip(firms));

run;

PROC Star
Posts: 7,363

Re: How to remove digits from the observations?

Abu,

Both Astounding's and Haikuo's code should have worked. I think you will have to post an example of your data (in the form of a data step), and the code you ran that produced the notes, in order for anyone here to be able to help you.

Regular Contributor
Posts: 183

Re: How to remove digits from the observations?

Hi,

I cannot attach the SAS file. So I have attached the CSV file.

Attachment
Frequent Contributor
Posts: 115

Re: How to remove digits from the observations?

Thanks Abu for making it more clear. Thumb rule is always to explain the question well by helping us to help you. I have made the necessary changes in the following:

data want;

set have;

do i= 1 to length(cusip);

new_cusip=substr(cusip,i,1);

if i in (1,2) or i=length(cusip)-1 or i=length(cusip) then continue;

group=_n_;

output;

end;

run;

data final;

array a(15) $1 _temporary_;

call missing (of a{*});

do _n_=1 by 1 until(last.group);

set want;

by group;

a(_n_)=new_cusip;

trunc_cusip=cats(of a{*});

end;

keep tradedate trunc_cusip;

format tradedate ddmmyy10.;

run;

Have a nice day,

Naveen Srinivasan

L&T Infotech

☑ This topic is SOLVED.

Need further help from the community? Please ask a new question.

Discussion stats
  • 19 replies
  • 602 views
  • 8 likes
  • 9 in conversation