Hello everyone!
I am trying to read this data but have not been able to.
Kindly advise.
Problem 1
P9988 HR Finance Analytics S3498 HR IT Finance
R4634 Finance Analytics Sale
Vocab: EMPID Department
Output Desired
EMPID Department
P9988 HR
P9988 Finance
P9988 Analytics
Not exactly sure if I understand what you're trying to do, but the following should at least give you some idea of how you can parse such data:
data have;
  length string $80;
  input;
  string=_infile_;
  cards;
P9988 HR Finance Analytics S3498 HR IT Finance
R4634 Finance Analytics Sale
;

data want (keep=empid department);
  set have;
  length substring $80 empid $5 department $50;
  retain pattern;
  if _n_ eq 1 then pattern=PRXPARSE("/[a-zA-Z]\d\d\d\d/");
  substring=string;
  do until (start eq 0);
    CALL PRXSUBSTR(pattern, substring, start, length);
    if start gt 0 then do;
      EMPID=substr(substring,start,length);
      substring=substrn(substring,start+length);
      CALL PRXSUBSTR(pattern, substring, start, length);
      if start gt 0 then do;
        Department=substr(substring,1,start-1);
        substring=substrn(substring,start);
      end;
      else Department=substring;
      output;
    end;
  end;
run;
Art, CEO, AnalystFinder.com
The code I suggested did create a table, but apparently not containing what you want. I think the following does match what you want:
data have;
  length string $80;
  input;
  string=_infile_;
  cards;
P9988 HR Finance Analytics S3498 HR IT Finance
R4634 Finance Analytics Sale
;

data want (keep=empid department);
  set have;
  length substring $80 empid $5 department full_department $50;
  retain pattern;
  if _n_ eq 1 then pattern=PRXPARSE("/[a-zA-Z]\d\d\d\d/");
  substring=string;
  do until (start eq 0);
    CALL PRXSUBSTR(pattern, substring, start, length);
    if start gt 0 then do;
      EMPID=substr(substring,start,length);
      substring=substrn(substring,start+length);
      CALL PRXSUBSTR(pattern, substring, start, length);
      if start gt 0 then do;
        Full_Department=substr(substring,1,start-1);
        substring=substrn(substring,start);
      end;
      else Full_Department=substring;
      counter=1;
      do while (scan(Full_Department,counter) ne '');
        department=scan(Full_Department,counter);
        counter+1;
        output;
      end;
    end;
  end;
run;
Art, CEO, AnalystFinder.com
data have;
  input x : $100. @@;
  length id $ 100;
  retain id;
  pid=prxparse('/[a-z]\d+/i');
  if prxmatch(pid,strip(x)) then id=x;
  else do;
    department=x;
    output;
  end;
  drop pid x;
  cards;
P9988 HR Finance Analytics S3498 HR IT Finance
R4634 Finance Analytics Sale
;
run;
Your post looks garbled. Please post the same data using the Insert Code icon on the toolbar in the editor. This will pop up a new window where you can paste the data and/or code, and it will preserve the spacing and line breaks.
If each line of your data holds one EMPID followed by its departments, then something as simple as this will pair the first word with each of the following words on the line.
data want ;
length empid $10 department $20 ;
infile datalines truncover ;
input empid department @ ;
do until (missing(department ));
output;
input department @;
end;
datalines;
P9988 HR Finance Analytics
S3498 HR IT Finance
R4634 Finance Analytics Sale
;
If there are multiple EMPIDs on the same line, then you need some logic to tell an EMPID from a DEPARTMENT name. In your example it looks like they are a letter followed by 4 digits. So something like this should work.
data want ;
length empid $10 department $20 ;
infile datalines flowover ;
retain empid ;
input department @@ ;
if prxmatch('/^[a-z][0-9]{4}$/i',trim(department)) then empid=department;
else output;
datalines;
P9988 HR Finance Analytics S3498 HR IT Finance
R4634 Finance Analytics Sale
;
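To compare the result with the desired output, a quick listing like the one below could be run right after the step above. This is only a sketch: it assumes the data set is still named want, and the title text is just an illustrative label. With the sample data, it should show one row per EMPID/department pair, nine rows in all (three departments for each of P9988, S3498 and R4634).
/* List the EMPID/DEPARTMENT pairs created by the data step above */
proc print data=want noobs;
  var empid department;
  title 'One row per EMPID and department';
run;
title; /* clear the title so it does not carry over to later output */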