BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
Ramani
Fluorite | Level 6

Hi,

I am having trouble spliiting some unstructured data.

My data looks like this

ColA                                                                             ColB

Fund1                        [2011-07-01--2013-07-31] Richard Glass
Fund2                [2011-06-30--2015-05-15] William M. Noonan; [2011-06-30--2015-05-15] Philip A. Noonan ;                                                                      [2011-06-30--2015-05-15] David Canal
Fund3                [2012-01-03--] Robert E. Hall; [2012-01-03--] Keith A. Lee; [2012-01-03--] Kempton M. Ingersol; [2012-01-03--2015-                              03-01] Amy Y. Zhang; [2013-12-31--] Damien Davis; [2014-12-31--] Andrew J. Fones

I want my data to look like this

 

ColA             Entry1          Exit1          Manager1                Entry2        Exit2        Manager2           Entry3,exit3, Manager3 and so on

Fund1          20110701     20130731  Richard Glass         

Fund2         20110630     20150515   William M. Noonan  20110630 20150515  Phili A.Noonan

 

Currently I am using a scan function to split using delimiters,but it is taking too long

I am trying something like this, but i feel like I need to repeat this code too much and it is not optimal.

data split;
set ms_fundmanager;
length v1-v10 $100.;
array v(10) $;
do i = 1 to dim(v);
v[i]=scan(manager_history,i,';','M');
end;
run;

data split1;
set split;
length av1-av2 $100.;
array av(2) $;
do i = 1 to dim(av);
av[i]=scan(v1,i,']','M');
end;
rename av2 = manager1;
run;

1 ACCEPTED SOLUTION

Accepted Solutions
PGStats
Opal | Level 21

This would extract the fields into a long format dataset:

 

data have;
infile datalines4 truncover;
input fund $ txt $500. ;
datalines4;
Fund1  [2011-07-01--2013-07-31] Richard Glass
Fund2  [2011-06-30--2015-05-15] William M. Noonan; [2011-06-30--2015-05-15] Philip A. Noonan ;  [2011-06-30--2015-05-15] David Canal
Fund3  [2012-01-03--] Robert E. Hall; [2012-01-03--] Keith A. Lee; [2012-01-03--] Kempton M. Ingersol; [2012-01-03--2015-03-01] Amy Y. Zhang; [2013-12-31--] Damien Davis; [2014-12-31--] Andrew J. Fones
;;;;

data want;
if not prxId then prxId + prxParse("/\[(\d{4}-\d{2}-\d{2})--(\d{4}-\d{2}-\d{2})?\] ([^[;]+)/i");
set have;
start = 1;
stop = length(txt);
call prxnext(prxId, start, stop, txt, pos, len);
do No = 1 by 1 while (pos>0);
    entry = prxposn(prxId, 1, txt);
    exit = prxposn(prxId, 2, txt);
    manager = prxposn(prxId, 3, txt);
    output;
    call prxnext(prxId, start, stop, txt, pos, len);
    end;
drop prxId start stop pos len txt;
run;

proc print; run;
PG

View solution in original post

2 REPLIES 2
PGStats
Opal | Level 21

This would extract the fields into a long format dataset:

 

data have;
infile datalines4 truncover;
input fund $ txt $500. ;
datalines4;
Fund1  [2011-07-01--2013-07-31] Richard Glass
Fund2  [2011-06-30--2015-05-15] William M. Noonan; [2011-06-30--2015-05-15] Philip A. Noonan ;  [2011-06-30--2015-05-15] David Canal
Fund3  [2012-01-03--] Robert E. Hall; [2012-01-03--] Keith A. Lee; [2012-01-03--] Kempton M. Ingersol; [2012-01-03--2015-03-01] Amy Y. Zhang; [2013-12-31--] Damien Davis; [2014-12-31--] Andrew J. Fones
;;;;

data want;
if not prxId then prxId + prxParse("/\[(\d{4}-\d{2}-\d{2})--(\d{4}-\d{2}-\d{2})?\] ([^[;]+)/i");
set have;
start = 1;
stop = length(txt);
call prxnext(prxId, start, stop, txt, pos, len);
do No = 1 by 1 while (pos>0);
    entry = prxposn(prxId, 1, txt);
    exit = prxposn(prxId, 2, txt);
    manager = prxposn(prxId, 3, txt);
    output;
    call prxnext(prxId, start, stop, txt, pos, len);
    end;
drop prxId start stop pos len txt;
run;

proc print; run;
PG
Ramani
Fluorite | Level 6

Thank you,  this works wonderfully well. I did not know about those functions, I will learn about them.

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 2 replies
  • 357 views
  • 1 like
  • 2 in conversation