Hi there
I have a data set with a variable "ID", shown below, and would like to create a new ID variable as shown in the "New ID" column. When the ID contains exactly ONE hyphen, I need to get rid of the hyphen and everything after it. IDs with any other number of hyphens should stay unchanged.
ID | New ID |
1170192-01 | 1170192 |
1171775-01 | 1171775 |
1175345-01 | 1175345 |
12-06-0219 | 12-06-0219 |
12-06-0223 | 12-06-0223 |
12-06-0235 | 12-06-0235 |
12-06-0236 | 12-06-0236 |
12-06-0263 | 12-06-0263 |
13-06-0291 | 13-06-0291 |
13-06-0294 | 13-06-0294 |
13-06-0321 | 13-06-0321 |
14-06-0347 | 14-06-0347 |
14-06-0351 | 14-06-0351 |
14-06-0597 | 14-06-0597 |
15-06-0116 | 15-06-0116 |
15-06-0353 | 15-06-0353 |
15-06-0365 | 15-06-0365 |
15-06-0367 | 15-06-0367 |
15-06-0369 | 15-06-0369 |
15-06-0371 | 15-06-0371 |
99-06-9921 | 99-06-9921 |
SP0100470- | SP0100470 |
SP080004-0 | SP080004 |
SP080005-0 | SP080005 |
All help will be appreciated.
Thank you
Here you go:
DATA have;
infile cards dsd;
informat id $10.;
format id $10.;
input id;
cards;
1170192-01
1171775-01
1175345-01
12-06-0219
12-06-0223
12-06-0235
12-06-0236
12-06-0263
13-06-0291
13-06-0294
13-06-0321
14-06-0347
14-06-0351
14-06-0597
15-06-0116
15-06-0353
15-06-0365
15-06-0367
15-06-0369
15-06-0371
99-06-9921
SP0100470-
SP080004-0
SP080005-0
;
run;
data want; *(rename=(new_id=id));
    set have;
    count = countc(id,'-');
    if count = 1 then do;
        New_ID = scan(id,1,'-');
    end;
    if count ne 1 then do;
        New_ID = ID;
    end;
    /*drop id count;*/
run;
After you run it and look at the resulting data set, uncomment the pieces that are commented out (the rename= option on the DATA statement and the DROP statement), then check whether that gives the desired output.
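For reference, here is a rough sketch of what the step might look like once both commented-out pieces are active. The LENGTH statement is my own addition and not part of the code above. I am assuming it is wanted because SCAN would otherwise give New_ID a default length of 200.

```sas
/* Sketch only: the step above with the rename= option and DROP statement active. */
data want(rename=(new_id=id));
    set have;
    length New_ID $10;          /* assumption: match the $10 length of the original id */
    count = countc(id, '-');    /* number of hyphens in the ID */
    if count = 1 then New_ID = scan(id, 1, '-');   /* keep the part before the hyphen */
    else New_ID = id;                              /* zero or several hyphens: unchanged */
    drop id count;              /* the old id is dropped, then New_ID is renamed to id */
run;
```

The output data set then has a single variable named id holding the cleaned values.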
THANK YOU VERY MUCH MARK!!
This is a good set of tools to use here, but you might want to simplify things a bit:
data want;
    set have;
    if countc(id, '-') = 1 then id = scan(id, 1, '-');
run;
Thanks again Mark,
How should I modify the code if I want to keep the hyphen and the numbers after it in a separate column? For the first three cases, this column would contain -01.
Eduardo.
Here's what you would want:
data want;
    set have;
    count = countc(id,'-');
    if count = 1 then do;
        New_ID = scan(id,1,'-');
    end;
    if count ne 1 then do;
        New_ID = ID;
    end;
    drop count;
run;
The changes I made: I removed the commented-out rename= option from the first line, since we won't be renaming anything, and I now drop only count at the bottom. I used count just to tell the IDs with exactly one hyphen apart from the rest.
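If the goal is to hold the removed part itself (the hyphen plus whatever follows it) in its own column, one possible sketch is below. The variable name Suffix and both LENGTH values are my assumptions, not something from the original question. It relies on INDEX to find the position of the hyphen and SUBSTR to take everything from there to the end.

```sas
/* Sketch only: Suffix is a hypothetical variable name chosen for illustration. */
data want;
    set have;
    length New_ID $10 Suffix $10;   /* assumed lengths, matching the $10 id */
    if countc(id, '-') = 1 then do;
        New_ID = scan(id, 1, '-');              /* part before the hyphen */
        Suffix = substr(id, index(id, '-'));    /* hyphen and everything after it */
    end;
    else New_ID = id;   /* IDs with zero or several hyphens: Suffix stays blank */
run;
```

With the sample data, the first three rows should get a Suffix of -01, and rows such as 12-06-0219 keep their full ID with a blank Suffix.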