Hi Experts,
The following is the sample of my dataset. I need to remove the first two and last two digits of "firms".
date firms
22/10/2008 | AN8068571086 |
26/02/2010 | AN8068571086 |
05/08/2011 | AN8068571086 |
27/09/2011 | AN8068571086 |
25/10/2011 | AN8068571086 |
23/11/2011 | AN8068571086 |
26/06/2012 | AN8068571086 |
It means, now "firms" will be like 80685710.
Regards,
Abu
data have;
input date ddmmyy10. firms :$15.;
datalines;
22/10/2008 AN8068571086
26/02/2010 AN8068571086
05/08/2011 AN8068571086
27/09/2011 AN8068571086
25/10/2011 AN8068571086
23/11/2011 AN8068571086
26/06/2012 AN8068571086
;
data want;
set have;
length want $ 20;
want=substr(firms,3,length(firms)-4);
run;
Hi Abu,
data have;
input date ddmmyy10. firms :$15.;
datalines;
22/10/2008 AN8068571086
26/02/2010 AN8068571086
05/08/2011 AN8068571086
27/09/2011 AN8068571086
25/10/2011 AN8068571086
23/11/2011 AN8068571086
26/06/2012 AN8068571086
;
data want;
set have;
do i= 1 to length(firms);
new_firms=substr(firms,i,1);
if i in (1,2) or i=length(firms)-1 or i=length(firms) then continue;
group+1;
output;
end;
call missing(group);
run;
data final;
array a(15) $1 _temporary_;
call missing (of a{*});
do until(last.date);
set want;
by date;
a(group)=new_firms;
firm=cats(of a{*});
end;
keep date firm;
format date ddmmyy10.;
run;
HTH,
Naveen Srinivasan
L&T Infotech
Hi,
After running your codes, more observations are shown in the "Final" dataset than the original dataset. How to address this issue?
I tested mine using the sample data you provided. It appeared to me the dataset is sorted by date. So, in the 1st datastep, the dataset gets flipped as each date duplicates forms a group. So here unless your real matches with your sample, i can't tell until you let us know what it really looks like. I just reran the same code that I gave you and that seems ok.
Either way, prx is perhaps the most ideal for character manipulation, so I'd suggest implement Haikuo's code.
I include the SAS file.I cannot attach the SAS file, but I have attached the CSV file.
It sounds like you are looking for a simple "chop off some characters":
firms = substr(firms, 3, 8);
If it is possible that number of characters changes from record to record, you could use:
firms = substr(firms, 3, length(firms)-4);
Good luck.
Hi,
It is not working and shows the following messages:
NOTE: Numeric values have been converted to character values at the places given by:
(Line):(Column).
127:16
NOTE: Character values have been converted to numeric values at the places given by:
(Line):(Column).
127:9
data have;
input date ddmmyy10. firms :$15.;
datalines;
22/10/2008 AN8068571086
26/02/2010 AN8068571086
05/08/2011 AN8068571086
27/09/2011 AN8068571086
25/10/2011 AN8068571086
23/11/2011 AN8068571086
26/06/2012 AN8068571086
;
data want;
set have;
length want $ 20;
want=substr(firms,3,length(firms)-4);
run;
You did not include the key information from the log that would show what statement was on line 127 and what variable was being referenced at columns 9 and 16.
Are you trying to remove digits from the middle of a NUMBER? If so then those are not errors just notes showing that SAS had to convert the number to a character string to perform the substr() and then convert the string result back to a number.
No, I want to remove the first two digits and last two digits from "cusip" observations. Those notes were shown in log file. And the output dataset shows blank observations i.e. no observation at all.
data have;
input date ddmmyy10. firms :$15.;
datalines;
22/10/2008 AN8068571086
26/02/2010 AN8068571086
05/08/2011 AN8068571086
27/09/2011 AN8068571086
25/10/2011 AN8068571086
23/11/2011 AN8068571086
26/06/2012 AN8068571086
;
data want;
set have;
new_firms=prxchange('s/^\w\w|\w\w$//io',-1, strip(firms));
run;
Abu,
Both Astounding's and Haikuo's code should have worked. I think you will have to post an example of your data (in the form of a data step), and the code you ran that produced the notes, in order for anyone here to be able to help you.
Hi,
I cannot attach the SAS file. So I have attached the CSV file.
Thanks Abu for making it more clear. Thumb rule is always to explain the question well by helping us to help you. I have made the necessary changes in the following:
data want;
set have;
do i= 1 to length(cusip);
new_cusip=substr(cusip,i,1);
if i in (1,2) or i=length(cusip)-1 or i=length(cusip) then continue;
group=_n_;
output;
end;
run;
data final;
array a(15) $1 _temporary_;
call missing (of a{*});
do _n_=1 by 1 until(last.group);
set want;
by group;
a(_n_)=new_cusip;
trunc_cusip=cats(of a{*});
end;
keep tradedate trunc_cusip;
format tradedate ddmmyy10.;
run;
Have a nice day,
Naveen Srinivasan
L&T Infotech
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.