Hi,
Requesting character string data manipulation help? I have a character variable with values shown below:/*the dataset has 1 million rows*/
Variable_A
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*AED1001199990201*DEBIT
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*ZZY /*this kind ending with ZZY should be deleted*/
AkBGL.0130.AE0010201*AED*TOTAL/*this kind of obs ending with TOTAL to be kept*/
AkBGL.0130.AE0010201*ZZY/*this kind ending with ZZY should be deleted*/
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1404900037*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1424400010*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*ZZY/*this kind ending with ZZY should be deleted*/
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1403400034*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409000161*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409900008*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1412000039*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*ZZY/*this kind ending with ZZY should be deleted*/
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.2.2400.AE....AE0010201*LD1231800018*LIVEDB
Want:/*two new variables*/
new_variable1 should have values in green extracted from the character string.
new_variable2 should have values in blue extracted from the character string
data have;
input Variable_A : $200.;
cards;
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*AED1001199990201*DEBIT
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*ZZY
AkBGL.0130.AE0010201*AED*TOTAL
AkBGL.0130.AE0010201*ZZY
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1404900037*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1424400010*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*ZZY
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1403400034*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409000161*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409900008*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1412000039*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*ZZY
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.2.2400.AE....AE0010201*LD1231800018*LIVEDB
;
run;
data want;
set have;
length var1 var2 $ 100;
if scan(strip(Variable_A),-1,'*') = 'ZZY' then delete;
re=prxparse('/(?<=AED\.)(\d+).*\*(\w+)$/o');
if prxmatch(re,strip(Variable_A)) then do;
var1=prxposn(re,1,strip(Variable_A));
var2=prxposn(re,2,strip(Variable_A));
end;
drop re;
run;
Xia Keshan
Xia Keshan, You are a genius and a SAS super star. Thank you so much. I need one more help in manipulation using the same input, I'll phrase that as a separate question and will look forward to your help if you can.
Many thanks from Down under,
Mark Wik
(from Australia)
OK. If I could . Start a new session . Other SAS super stars would be able to offer you good advice too.
Xia Keshan
please mark question as answered as well
Indeed I want to mark the question answered and embrace the great genius, but the green and orange pointers isn't appearing in my screen for some reason despite the fact i logged in again and again to see if i can spot that. I guess perhaps i made a mistake to have missed out mark as a question when i first wrote the request. Please accept my heartfelt apologies
Available on demand!
Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.