DATA Step, Macro, Functions and more

Identifying data longer than defined

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 13
Accepted Solution

Identifying data longer than defined

Hi Team,

 

I have raw data in .csv file. The length of all the attributes are defined and given by business but they still would like to identify if any data has exceeded already defined length. 

How can i do identify ? What if column A has length greater than 11 in Raw File? How can i filter the observation out?

 

DATA <dataset1> <dataset2>;
INFILE <fileName>
DLM=',' LRECL=32000 DSD MISSOVER FIRSTOBS=2 TERMSTR=CRLF;
INPUT
A :$11.
B
C :$54.
;
IF _ERROR_=1 THEN OUTPUT <dataset1>;
ELSE OUTPUT <dataset2>;
RUN;


Accepted Solutions
Solution
‎04-13-2016 01:54 PM
Super User
Posts: 10,044

Re: Identifying data longer than defined

Posted in reply to Vimal_Kurup
Opps.
Check column= option

INFILE 
DLM=',' LRECL=32000 DSD MISSOVER FIRSTOBS=2 TERMSTR=CRLF
column=len ;
INPUT
A :$11. @;
put len=;

INPUT B @;
put len=;

INPUT C :$54. @;
put len=;

View solution in original post


All Replies
Super User
Posts: 5,437

Re: Identifying data longer than defined

Posted in reply to Vimal_Kurup

Try make your informats much wider/longer.

But this will make quite an over head for your data import process. A simple matter of filed length should be verified by the sending party.

Data never sleeps
Super User
Posts: 11,343

Re: Identifying data longer than defined

I might be tempted to read the data with the specification and then turn Proc import loose with a large guessing rows option.. Look at the resulting variable lengths by position in the dataset using either proc contents, Sashelp.vcolumn or Dictionary.columns. If there are variables in the proc import results that are longer then the defined variables then you something to target.  

Respected Advisor
Posts: 4,934

Re: Identifying data longer than defined

Posted in reply to Vimal_Kurup

Do

 

DATA <dataset1> <dataset2>;
INFILE <fileName>
DLM=',' LRECL=32000 DSD MISSOVER FIRSTOBS=2 TERMSTR=CRLF;
INPUT
A :$32.
B
C :$54.
;
IF _ERROR_=1 OR length(A) > 11 THEN OUTPUT <dataset1>;
ELSE OUTPUT <dataset2>;
RUN;

PG
Super User
Posts: 10,044

Re: Identifying data longer than defined

Posted in reply to Vimal_Kurup
Check length= option:


INFILE 
DLM=',' LRECL=32000 DSD MISSOVER FIRSTOBS=2 TERMSTR=CRLF
length=len ;
INPUT
A :$11. @;
put len=;

INPUT B @;
put len=;

INPUT C :$54. @;
put len=;


Solution
‎04-13-2016 01:54 PM
Super User
Posts: 10,044

Re: Identifying data longer than defined

Posted in reply to Vimal_Kurup
Opps.
Check column= option

INFILE 
DLM=',' LRECL=32000 DSD MISSOVER FIRSTOBS=2 TERMSTR=CRLF
column=len ;
INPUT
A :$11. @;
put len=;

INPUT B @;
put len=;

INPUT C :$54. @;
put len=;

Occasional Contributor
Posts: 13

Re: Identifying data longer than defined

I have employed something very similar. Thanks for your time
Super User
Posts: 7,854

Re: Identifying data longer than defined

Posted in reply to Vimal_Kurup

How about using a macro?

%macro do_input(varname,len);
input ___&varname :$%eval(&len+100). @;
length &varname $&len;
&varname = ___&varname;
if length(___&varname) > &len
then &varname._err = 1;
else &varname._err = 0;
drop ___&varname;
%mend;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
Occasional Contributor
Posts: 13

Re: Identifying data longer than defined

Posted in reply to KurtBremser
Hi Kurt,

Thanks for your time. I have employed something very similar though not through macro.

I like your solution.
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 8 replies
  • 422 views
  • 2 likes
  • 6 in conversation