DATA Step, Macro, Functions and more

Help - Splitting a data set

Accepted Solution Solved
Reply
New Contributor
Posts: 2
Accepted Solution

Help - Splitting a data set

Hello,

I am hoping for some help with parsing a string in a data set to create a one-many relationship.

What I have: A data set with two variables/fields.  1 containing an Order_Number, the other containing Order_notes / user input free flow text

Order_NumberOrder_Notes
12345

14-07-05[userid1] Note1
14-07-05[userid2] Note2

14-07-05[userid3] Note3

What I need: A similiar data set, but with the User_Notes parsed into individual records:

Order_NumberOrder_Notes
1234514-07-05[userid1] Note1
1234514-07-05[userid2] Note2
1234514-07-05[userid3] Note3

Obviously, my dataset is much larger, and the number of Order_Notes for each Order_Number is dynamic.  Also, the length of the UserID is variable.  However, each user note is seperated by a carriage return inside the Order_Notes.  Any help is very greatly appreciated.

Thank you,

Cory


Accepted Solutions
Solution
‎08-06-2014 02:34 PM
Super User
Posts: 5,500

Re: Help - Splitting a data set

You'll need to determine the hex code that represents a carriage return, but then it becomes pretty easy.

data want;

   set have;

   length just_1_note $ 100;

   do i=1 to 10000 until (just_1_note=' ');

      just_1_note = scan(order_notes, 'HC'x);

      if just_1_note > ' ' then output;

   end;

   keep order_number just_1_note;

   rename just_1_note = order_notes;

run;

Whatever the proper hex code is, replace HC with those characters.

View solution in original post


All Replies
Solution
‎08-06-2014 02:34 PM
Super User
Posts: 5,500

Re: Help - Splitting a data set

You'll need to determine the hex code that represents a carriage return, but then it becomes pretty easy.

data want;

   set have;

   length just_1_note $ 100;

   do i=1 to 10000 until (just_1_note=' ');

      just_1_note = scan(order_notes, 'HC'x);

      if just_1_note > ' ' then output;

   end;

   keep order_number just_1_note;

   rename just_1_note = order_notes;

run;

Whatever the proper hex code is, replace HC with those characters.

New Contributor
Posts: 2

Re: Help - Splitting a data set

Posted in reply to Astounding

Astounding,

THANK YOU!

Now, would there be a way to do that same thing, only broken out with a regular expression?

Super User
Posts: 5,500

Re: Help - Splitting a data set

I can't say ... I don't know enough about regular expressions.  There are some on the board who do, but you would be more likely to get a reply if you post that as a separate question.  An old, answered question will soon be off the radar screens.

Good luck.

Respected Advisor
Posts: 4,920

Re: Help - Splitting a data set

With regular expressions, it would be something like :

data test;

length Note $40;

if not prxID then prxID + prxparse("/.*/");

set myData;

start = 1;

stop = length(Order_Notes);

call prxnext(prxID, start, stop, Order_Notes, position, length);

do while (position > 0);

    Note = substr(Order_Notes, position, length);

    output;

    start + 1;

    call prxnext(prxID, start, stop, Order_Notes, position, length);

    end;

keep Order_Number Note;

run;

proc print data=test noobs; run;

Note: within a regular expression the period (.) matches everything except a newline.

PG

PG
🔒 This topic is solved and locked.

Need further help from the community? Please ask a new question.

Discussion stats
  • 4 replies
  • 238 views
  • 3 likes
  • 3 in conversation