HI,
I've data as below and embedded with multiple spaces between time and number columns data. Please let me know how to read this inconsistent "ID" column data
SNO, DATE, TIME,ID, NUMBER, COST, DISCOUNT
0 26/08/2010 16:33:36 LOC MH 5533442211 42 0.90
1 26/08/2010 16:34:29 LOC MH 5533442212 14 0.90
2 26/08/2010 16:35:20 9422 5533442213 74 1.80
3 28/08/2010 12:13:38 CHI - MH 5533442214 41 0.90
4 28/08/2010 12:59:20 9823 5533442215 15 0.90
You could play with the `_infile_` and scan() function:
data want;
input;
length SNO $ 8 DATE TIME 8 ID $ 32 NUMBER COST DISCOUNT 8 _time _number $ 20;
format DATE yymmdd10. TIME time9.;
SNO = scan (_infile_, 1, " ");
DATE = input( scan (_infile_, 2, " "), ddmmyy10.);
_TIME = scan (_infile_, 3, " ");
TIME = input(_TIME, time9.);
_NUMBER = scan (_infile_, -3, " ");
NUMBER = input(_NUMBER, best32.);
COST = input( scan (_infile_, -2, " "), best32.);
DISCOUNT = input( scan (_infile_, -1, " "), best32.);
_n = find(_infile_, _TIME, "T") + 8;
_m = find(_infile_, _NUMBER, "T");
ID = substr(_infile_, _n, _m - _n);
drop _:;
cards;
0 26/08/2010 16:33:36 LOC MH 5533442211 42 0.90
1 26/08/2010 16:34:29 LOC MH 5533442212 14 0.90
2 26/08/2010 16:35:20 9422 5533442213 74 1.80
3 28/08/2010 12:13:38 CHI - MH 5533442214 41 0.90
4 28/08/2010 12:59:20 9823 5533442215 15 0.90
;
run;
all the best
Bart
Is that really what you have in the file?
The header is comma-separated, but the data is not.
I suggest that you view the file with a pure text editor (Windows Editor or Notepad++) and copy/paste the first few lines into a window opened with the indicated button:
This will give us a clearer picture of what you are dealing with.
You could play with the `_infile_` and scan() function:
data want;
input;
length SNO $ 8 DATE TIME 8 ID $ 32 NUMBER COST DISCOUNT 8 _time _number $ 20;
format DATE yymmdd10. TIME time9.;
SNO = scan (_infile_, 1, " ");
DATE = input( scan (_infile_, 2, " "), ddmmyy10.);
_TIME = scan (_infile_, 3, " ");
TIME = input(_TIME, time9.);
_NUMBER = scan (_infile_, -3, " ");
NUMBER = input(_NUMBER, best32.);
COST = input( scan (_infile_, -2, " "), best32.);
DISCOUNT = input( scan (_infile_, -1, " "), best32.);
_n = find(_infile_, _TIME, "T") + 8;
_m = find(_infile_, _NUMBER, "T");
ID = substr(_infile_, _n, _m - _n);
drop _:;
cards;
0 26/08/2010 16:33:36 LOC MH 5533442211 42 0.90
1 26/08/2010 16:34:29 LOC MH 5533442212 14 0.90
2 26/08/2010 16:35:20 9422 5533442213 74 1.80
3 28/08/2010 12:13:38 CHI - MH 5533442214 41 0.90
4 28/08/2010 12:59:20 9823 5533442215 15 0.90
;
run;
all the best
Bart
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.