Hello,
I am hoping for some help with parsing a string in a data set to create a one-many relationship.
What I have: A data set with two variables/fields. 1 containing an Order_Number, the other containing Order_notes / user input free flow text
Order_Number | Order_Notes |
---|---|
12345 | 14-07-05[userid1] Note1 14-07-05[userid3] Note3 |
What I need: A similiar data set, but with the User_Notes parsed into individual records:
Order_Number | Order_Notes |
---|---|
12345 | 14-07-05[userid1] Note1 |
12345 | 14-07-05[userid2] Note2 |
12345 | 14-07-05[userid3] Note3 |
Obviously, my dataset is much larger, and the number of Order_Notes for each Order_Number is dynamic. Also, the length of the UserID is variable. However, each user note is seperated by a carriage return inside the Order_Notes. Any help is very greatly appreciated.
Thank you,
Cory
You'll need to determine the hex code that represents a carriage return, but then it becomes pretty easy.
data want;
set have;
length just_1_note $ 100;
do i=1 to 10000 until (just_1_note=' ');
just_1_note = scan(order_notes, 'HC'x);
if just_1_note > ' ' then output;
end;
keep order_number just_1_note;
rename just_1_note = order_notes;
run;
Whatever the proper hex code is, replace HC with those characters.
You'll need to determine the hex code that represents a carriage return, but then it becomes pretty easy.
data want;
set have;
length just_1_note $ 100;
do i=1 to 10000 until (just_1_note=' ');
just_1_note = scan(order_notes, 'HC'x);
if just_1_note > ' ' then output;
end;
keep order_number just_1_note;
rename just_1_note = order_notes;
run;
Whatever the proper hex code is, replace HC with those characters.
Astounding,
THANK YOU!
Now, would there be a way to do that same thing, only broken out with a regular expression?
I can't say ... I don't know enough about regular expressions. There are some on the board who do, but you would be more likely to get a reply if you post that as a separate question. An old, answered question will soon be off the radar screens.
Good luck.
With regular expressions, it would be something like :
data test;
length Note $40;
if not prxID then prxID + prxparse("/.*/");
set myData;
start = 1;
stop = length(Order_Notes);
call prxnext(prxID, start, stop, Order_Notes, position, length);
do while (position > 0);
Note = substr(Order_Notes, position, length);
output;
start + 1;
call prxnext(prxID, start, stop, Order_Notes, position, length);
end;
keep Order_Number Note;
run;
proc print data=test noobs; run;
Note: within a regular expression the period (.) matches everything except a newline.
PG
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.