HI,
I've data as below and embedded with multiple spaces between time and number columns data. Please let me know how to read this inconsistent "ID" column data
SNO, DATE, TIME,ID, NUMBER, COST, DISCOUNT
0 26/08/2010 16:33:36 LOC MH 5533442211 42 0.90
1 26/08/2010 16:34:29 LOC MH 5533442212 14 0.90
2 26/08/2010 16:35:20 9422 5533442213 74 1.80
3 28/08/2010 12:13:38 CHI - MH 5533442214 41 0.90
4 28/08/2010 12:59:20 9823 5533442215 15 0.90
You could play with the `_infile_` and scan() function:
data want;
input;
length SNO $ 8 DATE TIME 8 ID $ 32 NUMBER COST DISCOUNT 8 _time _number $ 20;
format DATE yymmdd10. TIME time9.;
SNO = scan (_infile_, 1, " ");
DATE = input( scan (_infile_, 2, " "), ddmmyy10.);
_TIME = scan (_infile_, 3, " ");
TIME = input(_TIME, time9.);
_NUMBER = scan (_infile_, -3, " ");
NUMBER = input(_NUMBER, best32.);
COST = input( scan (_infile_, -2, " "), best32.);
DISCOUNT = input( scan (_infile_, -1, " "), best32.);
_n = find(_infile_, _TIME, "T") + 8;
_m = find(_infile_, _NUMBER, "T");
ID = substr(_infile_, _n, _m - _n);
drop _:;
cards;
0 26/08/2010 16:33:36 LOC MH 5533442211 42 0.90
1 26/08/2010 16:34:29 LOC MH 5533442212 14 0.90
2 26/08/2010 16:35:20 9422 5533442213 74 1.80
3 28/08/2010 12:13:38 CHI - MH 5533442214 41 0.90
4 28/08/2010 12:59:20 9823 5533442215 15 0.90
;
run;
all the best
Bart
Is that really what you have in the file?
The header is comma-separated, but the data is not.
I suggest that you view the file with a pure text editor (Windows Editor or Notepad++) and copy/paste the first few lines into a window opened with the indicated button:
This will give us a clearer picture of what you are dealing with.
You could play with the `_infile_` and scan() function:
data want;
input;
length SNO $ 8 DATE TIME 8 ID $ 32 NUMBER COST DISCOUNT 8 _time _number $ 20;
format DATE yymmdd10. TIME time9.;
SNO = scan (_infile_, 1, " ");
DATE = input( scan (_infile_, 2, " "), ddmmyy10.);
_TIME = scan (_infile_, 3, " ");
TIME = input(_TIME, time9.);
_NUMBER = scan (_infile_, -3, " ");
NUMBER = input(_NUMBER, best32.);
COST = input( scan (_infile_, -2, " "), best32.);
DISCOUNT = input( scan (_infile_, -1, " "), best32.);
_n = find(_infile_, _TIME, "T") + 8;
_m = find(_infile_, _NUMBER, "T");
ID = substr(_infile_, _n, _m - _n);
drop _:;
cards;
0 26/08/2010 16:33:36 LOC MH 5533442211 42 0.90
1 26/08/2010 16:34:29 LOC MH 5533442212 14 0.90
2 26/08/2010 16:35:20 9422 5533442213 74 1.80
3 28/08/2010 12:13:38 CHI - MH 5533442214 41 0.90
4 28/08/2010 12:59:20 9823 5533442215 15 0.90
;
run;
all the best
Bart
Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.
Register today!Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.