I think that was a slight misstatement. The code you are referring to takes the last word from the line that was input and tries to input it as an up-to-16 digit number.
Hi,
Thanks a lot.
What would it do in this particular case?
kkkkkkkkkkkgggggghhhhhdnfejfudsdhjn:2 (more than 16 letters)
Regards
Using PGStat's code, it would select 2 and read it as a number, since it is preceded by a colon.
The other versions would drop the record, since the last word isn't a number.
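To make the difference concrete, here is a minimal sketch. PGStat's exact code is not shown in this part of the thread, so the delimiter list ': ' is an assumption about how it behaves, and the variable names are made up for illustration.

```sas
data _null_;
  length row $64 last_blank last_colon $64;
  row = 'kkkkkkkkkkkgggggghhhhhdnfejfudsdhjn:2';

  /* Blank as the only delimiter: the whole string is the last word */
  last_blank = scan(row, -1, ' ');
  num_blank  = input(last_blank, ?? 16.);   /* not numeric, so missing */

  /* Assumed variant: colon also treated as a delimiter */
  last_colon = scan(row, -1, ': ');
  num_colon  = input(last_colon, ?? 16.);   /* reads the trailing 2 */

  put last_blank= num_blank= last_colon= num_colon=;
run;
```

The ?? modifier only suppresses the invalid-data notes in the log; the result is still missing when the text is not a number.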
Did you really want to remove the entire line that has the underscores in it, or just that portion from the underscore onward?
Similarly, since all of your other lines end with a number, I would approach the problem slightly differently than my colleagues. Would the following work for you?:
data have (drop=_:);
length line1 line2 $128;
infile datalines truncover;
input line1 $char128. / line2 $char128.;
if find(line2,"_") then line2=substr(line2,1,find(line2,"_")-1);
else do;
call scan(line2,-1,_position,_length," ");
line2=strip(substr(line2,1,_position-1));
end;
datalines;
C) D.C.CIRCUITS
EER-126 D.C. CIRCUITS.............. 98/WI A 2.67
N) FUNDAMENTAL MECHANICAL SYSTEMS
EGR-100 FUNDMNTL MECHNCL SKILLS______________________ 1 course needed
C) ROBOTICS IN CIM SYSTEMS
EGR-128 ROBOTICS IN CIM SYSTEMS.... 99/SP A 2
C) COMPUTER PROGRAMMING APPLICATIONS IN EGR TECHNOLOGY
IET-198 COMP PRGM APP IN ENG TECH.. 00/SP B 1.33
;
While troubleshooting, I found that I was missing some data, and the reason is shown below. I need this row, and the trailing *F is the problem.
Things that I know about the file:
1) The first hyphen is at position 14.
2) A number will be at position 15.
What I have for the IF condition:
if index(row, '-')= 14 and input(scan(row,-1), ??16.) >. and ANYDIGIT(row)= 15
INT-270 INT INTERNSHIP............. 09/SUB A | 3.33 *F |
You could use something like:
data have;
length line1 line2 $128;
infile datalines truncover;
input line1 $char128. / line2 $char128.;
if anyalpha(scan(line2,-1)) then
line2=substr(line2,1,length(line2)-2);
if not missing(input(scan(line2,-1), ??16.));
datalines;
C) D.C.CIRCUITS
EER-126 D.C. CIRCUITS.............. 98/WI A 2.67
N) FUNDAMENTAL MECHANICAL SYSTEMS
EGR-100 FUNDMNTL MECHNCL SKILLS______________________ 1 course needed
C) ROBOTICS IN CIM SYSTEMS
EGR-128 ROBOTICS IN CIM SYSTEMS.... 99/SP A 2
C) COMPUTER PROGRAMMING APPLICATIONS IN EGR TECHNOLOGY
IET-198 COMP PRGM APP IN ENG TECH.. 00/SP B 1.33
X) NEW RECORD JUST ADDED
INT-270 INT INTERNSHIP............. 09/SUB A 3.33 *F
;
That works, but I think I would have to modify what I already have. Is there a way to work from right to left and check a value? For example, I know that a number must be in position 67. I went ahead and attached what I have to parse this file.
data Student_Degree_Audit;
infile 'D:\temp\nick_tmp\Loaded_Degree_Audit\Semester\Degree_audit_1030.txt' length=len;
input row $varying200. len;
IF index(row, "Student...") = 1 then
do;
positionOfName=index(row,"(");
positionOfEndName = index(row,")");
Studentid=substr(row,positionOfName+1,(positionOfEndName - positionOfName -1));
end;
IF index(row, "Program...") = 1 then
do;
positionOfName=index(row,"(");
positionOfEndName = index(row,")");
Program_Nm=substr(row,positionOfName+1,(positionOfEndName - positionOfName -1));
end;
IF index(row, "Catalog...") = 1 then
do;
p=index(row,":");
Catalog_Yr=substr(row,p+2,4);
end;
a=substr(row,anydigit(scan(row,-1),-1));
cfind=input(scan(row,-1), ??16.);
dfind=index(row,')');
IF index(row, '-')= 14 and ANYDIGIT(row)= 15 and index(row,')')NE 8 then
/*
not working. Picking up too much stuff.
*/
do;
Course_Nm= substr(row,findc(row, '-')-3,7);
Term_Id_Cd=substr(row,findc(row,'/', "b")-2,6);
Grade=substr(row,findc(row,'/',"b")+7, 1);
credit=substr(row,62,5);
credit=left(credit);
newcredit= input(credit,5.);
drop credit;
rename newcredit=credit;
/*
revrow = reverse(trim(row));
credit=substr(revrow,1,index(revrow," "));
credit = reverse(trim(credit));
newcredit = input(credit,3.);
drop credit;
rename newcredit=Credit;
*/
end;
run;
Of course there is a way, but I think the forum would have to see at least an example of the file you are working with and what the new rules are. We know that you don't want some lines that end with characters, but do want other lines that end with characters. In short, we don't know the rules that differentiate the two. If it is simply whether position 67 is a number, then that is extremely easy to check.
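For illustration, a position check might look like the following minimal sketch. The row is built by hand here, and column 67 is taken from the description above rather than verified against the real file; the variable name pos_is_digit is hypothetical.

```sas
data _null_;
  length row $80;
  row = 'INT-270 INT INTERNSHIP';
  substr(row, 67, 1) = '3';     /* place a digit in column 67 for the demo */

  /* ANYDIGIT on a one-character string returns 1 for a digit, 0 otherwise */
  pos_is_digit = (anydigit(substr(row, 67, 1)) = 1);
  put pos_is_digit=;
run;
```

In the real step the same expression could simply be added to the existing IF condition.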
Attached is a snippet of the file. I also copied a snippet below to reiterate what I need from the file (in red). Thanks for all the help.
C) ETD-230
!! Exception
kb
INT-297 PRINCIPLES OF MANUFACTURING 09/WI B 2.67 *F
C) C: THIRD QUARTER
Credits: 13.33
C) ALL COURSES LISTED BELOW ARE REQUIRED:
!! Exception
kb
kb
Credits: 8
INT-143 APPLIED SHOP MATH III...... 09/SP A 2.00
INT-270 INT INTERNSHIP............. 09/SP A 3.33
INT-297 CO-OP MACHINE LAB III...... 09/SUA A 2.67
C) ETD-199
!! Exception
kb
INT-270 INT INTERNSHIP............. 09/SUB A 3.33 *F
C) INT ELECTIVE - INT-165 OR INT-211
INT-211 COMP NUM CONTROL PROC I.... 09/SUB B 2
I'm still not sure what you are looking for. That said, would using CALL SCAN, rather than SCAN, provide what you need? For example, I used it in place of two lines in your code:
/* --->*/ CALL SCAN(row, -1, position, length," ");
a=substr(row,anydigit(scan(row,-1),-1));
/* --->*/ cfind=input(substr(row,position,length), ??16.);
dfind=index(row,')');
But, like I said, I'm not sure if that is the functionality you are seeking.
I'm trying to run through a large file and only pick up the lines that are in red above. Pretty much the only piece of the IF statement that I'm missing is to pick up:
INT-270 INT INTERNSHIP............. 09/SUB A 3.33 *F
This was working great until I started to validate the data and found the above line is being omitted because of the scan.
IF index(row, '-')= 14 and ANYDIGIT(row)= 15 and index(row,')')NE 8 and input(scan(row,-1), ??16.)>. then
/* not working. Picking up too much stuff.
*/
do;
Course_Nm= substr(row,findc(row, '-')-3,7);
Term_Id_Cd=substr(row,findc(row,'/', "b")-2,6);
Grade=substr(row,findc(row,'/',"b")+7, 1);
credit=substr(row,62,5);
credit=left(credit);
newcredit= input(credit,5.);
drop credit;
rename newcredit=credit;
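One possible way to keep the line that ends in *F is to work from the right with a blank-only delimiter and step back one word when the last word is a flag. This is only a sketch under assumptions: it treats any trailing word that starts with * as a flag, and it identifies course lines with a regular expression (three letters, a hyphen, three digits) instead of the fixed column positions, because the sample rows here are not indented like the real file. The names lastword and courses are made up for the example.

```sas
data courses;
  infile datalines truncover;
  length row $80 lastword $32;
  input row $char80.;

  /* Assumption: a course line starts with a code such as INT-270 */
  if prxmatch('/^\s*[A-Z]{3}-\d{3}\s/', row);

  /* Last blank-delimited word; skip a trailing flag such as *F */
  lastword = scan(row, -1, ' ');
  if substr(lastword, 1, 1) = '*' then lastword = scan(row, -2, ' ');

  /* Keep the row only when that word reads as a number */
  credit = input(lastword, ?? 16.);
  if not missing(credit);
  drop lastword;
datalines;
C) ETD-199
INT-270 INT INTERNSHIP............. 09/SUB A 3.33 *F
EGR-100 FUNDMNTL MECHNCL SKILLS______________________ 1 course needed
Credits: 8
INT-211 COMP NUM CONTROL PROC I.... 09/SUB B 2
;
run;
```

With these five sample rows, only the INT-270 and INT-211 lines should survive. Passing ' ' as the delimiter matters, because the default SCAN delimiters include the period and the asterisk, which would split 3.33 and hide the *F flag.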