Dear all,
I would like to split the following column (i.e., table1) into three columns(i.e., table2),
table 1
NAME
DYNAMIC COML.FIN. DEAD - DELIST 23/09/04
THALES (LON) DEAD - DEAD 28/07/03
EDIRECTORY CO UK DEAD - 20/05/08
INTERNET TIMES DEAD - 29/08/03
CYBERES DEAD - DELIST 24/12/04
MTR CORP. (XSQ) DEAD - 01/10/08
INTERCEDE GROUP
XPLOITE DEAD - T/O BY 31135W
TIGER RESOURCE
GALAHAD GOLD DEAD - 21/02/08
INNOBOX DEAD - 19/05/08
PORTMAN (LON) DEAD - DELIST 02/08/04
WARTHOG DEAD - 13/05/08
LASMO ASSD AGIP INV.CSH
MCBRIDE
SUMITOMO CORP. (UK) LTD
NATIONAL GRID (WI)
IST MOBL.ITAL.ADS (LON) DEAD - DELIST 06/02/98
IST MOBL.ITAL.SPA (LON) DEAD - DELIST 03/07/98
ITALGAS (LON) DEAD - DELIST 03/07/98
MITSUBISHI ESTATES (XSQ) DEAD - 01/10/08
MITSUBISHI HEAVY (XSQ) DEAD - 01/10/08
MITSUBISHI MOTORS (XSQ) DEAD - 01/10/08
MITSUI & COMPANY (XSQ) DEAD - 01/10/08
like
table 2
NAME, STATUS, DATE
DYNAMIC COML.FIN., DEAD - DELIST, 23/09/04
THALES (LON) DEAD, DEAD, 28/07/03
EDIRECTORY CO UK, DEAD, 20/05/08
INTERNET TIMES, DEAD, 29/08/03
CYBERES, DEAD - DELIST, 24/12/04
MTR CORP. (XSQ), DEAD, 01/10/08
INTERCEDE GROUP
XPLOITE DEAD - T/O BY 31135W
TIGER RESOURCE
GALAHAD GOLD, DEAD, 21/02/08
INNOBOX, DEAD, 19/05/08
Thanks in advance.
What are the rules used to do this?
Post test data in the form of a datastep in future please. I cannot see any logical way to split that data as there is no consistency. I would advise you go back to the source and fix, or get fixed, there.
Your three variables are: Name, Status, Date.
You probably know what statuses are valid - according to your sample any status starts with 'DEAD'.
Date always start with a number.
Assuming no name include status string and the order is always Name then Status then Date
you can try next code:
data want;
infile datalines truncover;
input a_line $50;
p1 = index(a_line, 'DEAD'); /* status start position */
p2 = indexc(a_line, '0123456789'); /* date start position */
Name = substr(a_line,1,p1-1);
Status = substr(a_line,p1,p2-p1);
Date = input(strip(substr(a_line,p2)),ddmmyy8.);
format date ddmmyy10.;
datalines;
DYNAMIC COML.FIN. DEAD - DELIST 23/09/04
THALES (LON) DEAD - DEAD 28/07/03
EDIRECTORY CO UK DEAD - 20/05/08
INTERNET TIMES DEAD - 29/08/03
CYBERES DEAD - DELIST 24/12/04
MTR CORP. (XSQ) DEAD - 01/10/08
INTERCEDE GROUP
XPLOITE DEAD - T/O BY 31135W
TIGER RESOURCE
GALAHAD GOLD DEAD - 21/02/08
INNOBOX DEAD - 19/05/08
PORTMAN (LON) DEAD - DELIST 02/08/04
WARTHOG DEAD - 13/05/08
LASMO ASSD AGIP INV.CSH
MCBRIDE
SUMITOMO CORP. (UK) LTD
NATIONAL GRID (WI)
IST MOBL.ITAL.ADS (LON) DEAD - DELIST 06/02/98
IST MOBL.ITAL.SPA (LON) DEAD - DELIST 03/07/98
ITALGAS (LON) DEAD - DELIST 03/07/98
MITSUBISHI ESTATES (XSQ) DEAD - 01/10/08
MITSUBISHI HEAVY (XSQ) DEAD - 01/10/08
MITSUBISHI MOTORS (XSQ) DEAD - 01/10/08
MITSUI & COMPANY (XSQ) DEAD - 01/10/08
;
run;
Fixed code is:
data want;
keep Name Status Date;
length Name $20 status $15; /* adapt to max expected */
infile datalines truncover;
input a_line $50;
p1 = index(a_line, 'DEAD'); /* status start position */
if p1 = 0 then Name = trim(a_line);
else Name = substr(a_line,1,p1-1);
p2 = indexc(a_line, '0123'); /* date start position, day is 0n/1n/2n/3n */
if p1 = 0 or p2=0 then date=.;
else do:
Status = substr(a_line,p1,p2-p1);
Date = input(strip(substr(a_line,p2)),ddmmyy8.);
end;
format date ddmmyy10.;
datalines;
...........
; run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.