Hi there, I'm trying to create a new column with a given pattern found in another column.
My data looks like this:
data have;
infile datalines dlm="¬";
length SERIAID $20;
input SERIAID $;
datalines;
L1CU
L1UL1C
L2C
L2CL3U
L2CU
L3UC
L3CZ
L4C
L4CL5U
L6U
;
run;
I want to add a new column called "Level" where I add the existing pattern "L", "X" ( a number") and a "C". If this pattern does not exist (for example in the last row), that row should be removed from the dataset.
data have2;
set have;
level = scan(SERIALID, 'C');
RUN;
But I get a column with only blanks.
Does anyone knows how to do this in a fast way? I'm new to SAS so I'm not sure which SAS functions is better for stringing.
Thanks
Then do:
data want;
if not id then id + prxParse("/L\d+C/");
set have;
call prxSubstr(id, SERIAID, pos, len);
if pos then newColumn = substr(SERIAID, pos, len);
else delete;
drop id pos len;
run;
SCAN looks for a delimited "word" by specified count position: Scan('this is a list of words', 4) returns the fourth word: "list".
So I am pretty sure your log would show some variation of ;
NOTE: Character values have been converted to numeric values at the places given by: (Line):(Column). 29:13 NOTE: Invalid numeric data, 'c' , at line 29 column 13. NOTE: Argument 2 to function SCAN('abc',.) at line 29 column 6 is invalid.
which tells pretty clearly that SCAN is not what you want.
It is not clear exactly what you are searching for since L and C occur multiple times in multiple positions in some of those values.
You should provide examples of what you expect the result for LEVEL should be for your example data.
Thanks @ballardw I actually forgot to give an example for the output. I want the new column named "level" to show the pattern "LXC", where X is a number, present in SERIAID:
L1C
L1C
L2C
L2C
L2C
NA (not existing)
L3C
L4C
L4C
NA (not existing)
The rows where the pattern "LXC" does not exist (NA (not existing)) should also be removed from the dataset.
Pattern matching is best done with PERL regular expressions:
data want;
set have;
if prxMatch("/L\d+C/", SERIAID);
run;
Then do:
data want;
if not id then id + prxParse("/L\d+C/");
set have;
call prxSubstr(id, SERIAID, pos, len);
if pos then newColumn = substr(SERIAID, pos, len);
else delete;
drop id pos len;
run;
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.