Anthony_eng
Obsidian | Level 7

Good Afternoon,

I am trying to extract specific information based on the pattern shown below; my code follows. I have tested my regular expressions in other environments and they worked without issue, so I believe the problem comes from my limited understanding of SAS. My idea was to pull the required information from the end of each line (which is why I use an end-of-line anchor in the RegEx) for each applicable observation and then concatenate as needed. However, when I used that approach outside of my loops, it always pulled the first observation and then kept appending data to the end of the string. When I used the RegEx inside the loops, it pulled the very last piece of data that I wanted but none of the earlier ones (in the example data at the bottom of this message, it would pull the 2nd highlighted portion of line 3, but nothing from line 2).

data test;
	length text $32767;
	retain text '';
	infile msghtml flowover dlmstr='//' end=last;
	input;
	text=cats(text,'~',_infile_);
	patternID1 = prxparse("/,Trans,.*?,Trans,/s");
	patternID2 = prxparse('/^(~)([0-9]*\.[0-9]+),(Trans),(\w+),(\w+)/');
	patternID3 = prxparse('/(?<=\d{2}\.\d{4},Rec,DATATYPE,(\w){8}\s(\w){2})(\s\w\w)+\s+$/');

	if prxmatch(patternID1, text) then do;
			text = substr(text,1,length(text)-length(_infile_)-1);
			/* Tried placing RegEx here with end of line anchors, did not give desired result */

			if prxmatch(patternID2, text) then do;
					Match_2=prxmatch(patternID2, text);
					Match_3=prxmatch(patternID3, text);
					Buffer_Msg_Trans = prxposn(patternID2, 0, text);
					Buffer_Msg_Trans1 = prxposn(patternID2, 2, text);
					Buffer_Msg_Trans2 = prxposn(patternID2, 3, text);
					Buffer_Msg_Trans3 = prxposn(patternID2, 4, text);
					Buffer_Msg_Trans4 = prxposn(patternID2, 5, text);
					Buffer_Msg_3 = prxposn(patternID3, 0, text);
					
			end;
			text=cats('~',_infile_);
	end;
	/* Tried placing code here with end of line anchors, did not give desired results */
run;

I would like the following string to be the end result:

37 46 01 80 11 00 01 02 7A 00 4F FF

OR

1) 37 46 01 80 11 00

2) 01 02 7A 00 4F FF

(So then I can easily concatenate afterwards)

 

String that is currently in my 'text' column (bold portions contain the data that I want to keep):

1)  ~68.1626,Trans,DATATYPE,0601
2)  ~68.1626,Trans,DATATYPE,0601~68.1809,Rec,DATATYPE,29EEF98A 10 37 46 01 80 11 00 6A
3)  ~68.1626,Trans,DATATYPE,0601~68.1809,Rec,DATATYPE,29EEF98A 10 37 46 01 80 11 00 6A~68.1855,Rec,DATATYPE,29EEF98A 27 01 02 7A 00 4F FF FF
4)  ~69.2385,Trans,DATATYPE,0602

 

Any help or advice would be greatly appreciated.  Thank you for your time.

Accepted Solutions
Patrick
Opal | Level 21

@Anthony_eng If the sample data you've posted is sufficiently representative of what you've got, then the code below should work.

data have;
  infile datalines truncover;
  input have_string $200.;
  datalines;
~68.1626,Trans,DATATYPE,0601
~68.1626,Trans,DATATYPE,0601~68.1809,Rec,DATATYPE,29EEF98A 10 37 46 01 80 11 00 6A
~68.1626,Trans,DATATYPE,0601~68.1809,Rec,DATATYPE,29EEF98A 10 37 46 01 80 11 00 6A~68.1855,Rec,DATATYPE,29EEF98A 27 01 02 7A 00 4F FF FF
~69.2385,Trans,DATATYPE,0602
;


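data want(drop=_:);
  row_id=_n_;
  set have;
  length want_string $17;
  /* Capture the six hex bytes that follow "Rec,DATATYPE,<id> <2-char code> ". */
  /* Options: o = compile the pattern once, i = ignore case.                   */
  _prxid=prxparse('/\bRec,DATATYPE,\w+\s\w{2}\s(([0-9a-f]{2}\s){6})/oi');
  _start=1;
  _stop=length(have_string);
  /* Look for the first match; _pos is 0 when nothing is found. */
  call prxnext(_prxid,_start,_stop,strip(have_string),_pos,_len);
  do while(_pos>0);
    /* Capture buffer 1 holds the six bytes; write one row per match. */
    want_string=prxposn(_prxid, 1, strip(have_string));
    output;
    /* Resume the search after the previous match. */
    call prxnext(_prxid,_start,_stop,strip(have_string),_pos,_len);
  end;
run;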

proc print data=want;
run;
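
If a single concatenated string per input row is preferred (the first form asked for in the question), a variation along these lines might work. This is only a sketch and not part of the tested answer above: the data set name want_concat and the $200 length are assumptions chosen for illustration, and it reuses the same pattern and the have data set from above.

```sas
data want_concat(drop=_:);
  set have;
  /* Assumed length; make it large enough for all matches in one row. */
  length want_string $200;
  _prxid=prxparse('/\bRec,DATATYPE,\w+\s\w{2}\s(([0-9a-f]{2}\s){6})/oi');
  _start=1;
  _stop=length(have_string);
  want_string=' ';
  call prxnext(_prxid,_start,_stop,have_string,_pos,_len);
  do while(_pos>0);
    /* CATX trims each piece and joins the pieces with a single blank. */
    want_string=catx(' ',want_string,prxposn(_prxid,1,have_string));
    call prxnext(_prxid,_start,_stop,have_string,_pos,_len);
  end;
  /* Keep only rows in which at least one match was found. */
  if not missing(want_string) then output;
run;
```

The only changes from the step above are that the matches are appended to want_string with CATX inside the loop and the row is written once, after the loop, instead of once per match.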

Anthony_eng
Obsidian | Level 7

Good Morning Patrick,

 

Thank you so much for your help! Your code is wonderful; I had hit a wall trying to figure out how to compile this last group of data. Thank you again, I truly appreciate it.
