Dear all,
I would like to split the following column (i.e., table1) into three columns(i.e., table2),
table 1
NAME
DYNAMIC COML.FIN. DEAD - DELIST 23/09/04
THALES (LON) DEAD - DEAD 28/07/03
EDIRECTORY CO UK DEAD - 20/05/08
INTERNET TIMES DEAD - 29/08/03
CYBERES DEAD - DELIST 24/12/04
MTR CORP. (XSQ) DEAD - 01/10/08
INTERCEDE GROUP
XPLOITE DEAD - T/O BY 31135W
TIGER RESOURCE
GALAHAD GOLD DEAD - 21/02/08
INNOBOX DEAD - 19/05/08
PORTMAN (LON) DEAD - DELIST 02/08/04
WARTHOG DEAD - 13/05/08
LASMO ASSD AGIP INV.CSH
MCBRIDE
SUMITOMO CORP. (UK) LTD
NATIONAL GRID (WI)
IST MOBL.ITAL.ADS (LON) DEAD - DELIST 06/02/98
IST MOBL.ITAL.SPA (LON) DEAD - DELIST 03/07/98
ITALGAS (LON) DEAD - DELIST 03/07/98
MITSUBISHI ESTATES (XSQ) DEAD - 01/10/08
MITSUBISHI HEAVY (XSQ) DEAD - 01/10/08
MITSUBISHI MOTORS (XSQ) DEAD - 01/10/08
MITSUI & COMPANY (XSQ) DEAD - 01/10/08
like
table 2
NAME, STATUS, DATE
DYNAMIC COML.FIN., DEAD - DELIST, 23/09/04
THALES (LON) DEAD, DEAD, 28/07/03
EDIRECTORY CO UK, DEAD, 20/05/08
INTERNET TIMES, DEAD, 29/08/03
CYBERES, DEAD - DELIST, 24/12/04
MTR CORP. (XSQ), DEAD, 01/10/08
INTERCEDE GROUP
XPLOITE DEAD - T/O BY 31135W
TIGER RESOURCE
GALAHAD GOLD, DEAD, 21/02/08
INNOBOX, DEAD, 19/05/08
Thanks in advance.
What are the rules used to do this?
Post test data in the form of a datastep in future please. I cannot see any logical way to split that data as there is no consistency. I would advise you go back to the source and fix, or get fixed, there.
Your three variables are: Name, Status, Date.
You probably know what statuses are valid - according to your sample any status starts with 'DEAD'.
Date always start with a number.
Assuming no name include status string and the order is always Name then Status then Date
you can try next code:
data want;
infile datalines truncover;
input a_line $50;
p1 = index(a_line, 'DEAD'); /* status start position */
p2 = indexc(a_line, '0123456789'); /* date start position */
Name = substr(a_line,1,p1-1);
Status = substr(a_line,p1,p2-p1);
Date = input(strip(substr(a_line,p2)),ddmmyy8.);
format date ddmmyy10.;
datalines;
DYNAMIC COML.FIN. DEAD - DELIST 23/09/04
THALES (LON) DEAD - DEAD 28/07/03
EDIRECTORY CO UK DEAD - 20/05/08
INTERNET TIMES DEAD - 29/08/03
CYBERES DEAD - DELIST 24/12/04
MTR CORP. (XSQ) DEAD - 01/10/08
INTERCEDE GROUP
XPLOITE DEAD - T/O BY 31135W
TIGER RESOURCE
GALAHAD GOLD DEAD - 21/02/08
INNOBOX DEAD - 19/05/08
PORTMAN (LON) DEAD - DELIST 02/08/04
WARTHOG DEAD - 13/05/08
LASMO ASSD AGIP INV.CSH
MCBRIDE
SUMITOMO CORP. (UK) LTD
NATIONAL GRID (WI)
IST MOBL.ITAL.ADS (LON) DEAD - DELIST 06/02/98
IST MOBL.ITAL.SPA (LON) DEAD - DELIST 03/07/98
ITALGAS (LON) DEAD - DELIST 03/07/98
MITSUBISHI ESTATES (XSQ) DEAD - 01/10/08
MITSUBISHI HEAVY (XSQ) DEAD - 01/10/08
MITSUBISHI MOTORS (XSQ) DEAD - 01/10/08
MITSUI & COMPANY (XSQ) DEAD - 01/10/08
;
run;
Fixed code is:
data want;
keep Name Status Date;
length Name $20 status $15; /* adapt to max expected */
infile datalines truncover;
input a_line $50;
p1 = index(a_line, 'DEAD'); /* status start position */
if p1 = 0 then Name = trim(a_line);
else Name = substr(a_line,1,p1-1);
p2 = indexc(a_line, '0123'); /* date start position, day is 0n/1n/2n/3n */
if p1 = 0 or p2=0 then date=.;
else do:
Status = substr(a_line,p1,p2-p1);
Date = input(strip(substr(a_line,p2)),ddmmyy8.);
end;
format date ddmmyy10.;
datalines;
...........
; run;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.