Hi,
Requesting character string data manipulation help? I have a character variable with values shown below:/*the dataset has 1 million rows*/
Variable_A
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*AED1001199990201*DEBIT
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*ZZY /*this kind ending with ZZY should be deleted*/
AkBGL.0130.AE0010201*AED*TOTAL/*this kind of obs ending with TOTAL to be kept*/
AkBGL.0130.AE0010201*ZZY/*this kind ending with ZZY should be deleted*/
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1404900037*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1424400010*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*ZZY/*this kind ending with ZZY should be deleted*/
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1403400034*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409000161*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409900008*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1412000039*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*ZZY/*this kind ending with ZZY should be deleted*/
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.2.2400.AE....AE0010201*LD1231800018*LIVEDB
Want:/*two new variables*/
new_variable1 should have values in green extracted from the character string.
new_variable2 should have values in blue extracted from the character string
data have;
input Variable_A : $200.;
cards;
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*AED1001199990201*DEBIT
AkBGL.0130.AE0010201*AED*AC.1.TR.AED.10011............AE0010201*ZZY
AkBGL.0130.AE0010201*AED*TOTAL
AkBGL.0130.AE0010201*ZZY
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1404900037*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*LD1424400010*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.200004.1.17.2400.AE....AE0010201*ZZY
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1403400034*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409000161*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1409900008*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*LD1412000039*LIVEDB
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.10.2400.AE....AE0010201*ZZY
AkBGL.0920.AE0010201*AED*LD.1.TR.AED.21069.1017.AE.5Y.400022.1.2.2400.AE....AE0010201*LD1231800018*LIVEDB
;
run;
data want;
set have;
length var1 var2 $ 100;
if scan(strip(Variable_A),-1,'*') = 'ZZY' then delete;
re=prxparse('/(?<=AED\.)(\d+).*\*(\w+)$/o');
if prxmatch(re,strip(Variable_A)) then do;
var1=prxposn(re,1,strip(Variable_A));
var2=prxposn(re,2,strip(Variable_A));
end;
drop re;
run;
Xia Keshan
Xia Keshan, You are a genius and a SAS super star. Thank you so much. I need one more help in manipulation using the same input, I'll phrase that as a separate question and will look forward to your help if you can.
Many thanks from Down under,
Mark Wik
(from Australia)
OK. If I could . Start a new session . Other SAS super stars would be able to offer you good advice too.
Xia Keshan
please mark question as answered as well
Indeed I want to mark the question answered and embrace the great genius, but the green and orange pointers isn't appearing in my screen for some reason despite the fact i logged in again and again to see if i can spot that. I guess perhaps i made a mistake to have missed out mark as a question when i first wrote the request. Please accept my heartfelt apologies
Don't miss out on SAS Innovate - Register now for the FREE Livestream!
Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.