- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
Hello experts,
I have a sample dataset with 10 columns, all of them are text characters, and each column has over 400 rows. I would like to search any rows that are over 50 characters in those 10 columns. Please advise how to approach it. Thank you.
Accepted Solutions
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you've decided on 10 rows.
it's a simple code.
proc sql;
create table want as
select * from sample
where length(col1)>50 or
length(col2)>50 or
length(col3)>50 or
length(col4)>50 or
length(col5)>50 or
length(col6)>50 or
length(col7)>50 or
length(col8)>50 or
length(col9)>50 or
length(col10)>50
;
quit;
If the number of columns fluctuates, you can use the max and length functions to find the maximum number of characters and check if it exceeds 50.
data want;
set sample;
array c{*} col1-col10;
do i=1 to dim(c);
maxl=max(maxl,length(c{i}));
end;
if maxl>50;
drop i;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
And do what with them? What do you want as output?
@ybz12003 wrote:
Hello experts,
I have a sample dataset with 10 columns, all of them are text characters, and each column has over 400 rows. I would like to search any rows that are over 50 characters in those 10 columns. Please advise how to approach it. Thank you.
data want;
set have;
array _chars(10) list of variables here;
array _l(10) len1-len10;
do i=1 to 10;
if length(_chars(i)) >= 50 then _l(i) =1;
else _l(i)=0;
end;
run;
- Mark as New
- Bookmark
- Subscribe
- Mute
- RSS Feed
- Permalink
- Report Inappropriate Content
If you've decided on 10 rows.
it's a simple code.
proc sql;
create table want as
select * from sample
where length(col1)>50 or
length(col2)>50 or
length(col3)>50 or
length(col4)>50 or
length(col5)>50 or
length(col6)>50 or
length(col7)>50 or
length(col8)>50 or
length(col9)>50 or
length(col10)>50
;
quit;
If the number of columns fluctuates, you can use the max and length functions to find the maximum number of characters and check if it exceeds 50.
data want;
set sample;
array c{*} col1-col10;
do i=1 to dim(c);
maxl=max(maxl,length(c{i}));
end;
if maxl>50;
drop i;
run;