luca87
Obsidian | Level 7

Hi,

I need to import a .dat file that contains this information: FILE DAT 

A                    PER: TEST0  1232156000002024-02-072024-02-07
A                    PER: TEST2 12345678000002024-02-072024-02-07
A                    PER: TEST3 XXXXXXXX000002024-02-072024-02-07
A                    PER: N° Fattura 575000002024-02-072024-02-07

and I use this code:

filename test "path/test.dat";

data test;
	length 
        ONE         $40. 
        TWO         $5.
        three       $10.
        four        $10.
;
	infile test ;
	input;
	        ONE    =ksubstr(_infile_,1	,	41	);
	        TWO    =ksubstr(_INFILE_,41	,	5	);
	        three  =ksubstr(_INFILE_,46	,	10	);
	        four   =ksubstr(_INFILE_,56	,	10	);
run;

I have a problem with the last line because of the character °: SAS truncates the last character of the first column (the final 5 is missing; the correct value ends in "Fattura 575").


How can I import the file correctly?

 

Thanks,

Luca

 

1 ACCEPTED SOLUTION

FreelanceReinh
Jade | Level 19

Hi @luca87,

 

I think the truncation is just due to the insufficiently defined length of variable ONE:


@luca87 wrote:
data test;
	length 
        ONE         $40. 
        TWO         $5.
        three       $10.
        four        $10.
;
	infile test ;
	input;
	        ONE    =ksubstr(_infile_,1	,	41	);
	        TWO    =ksubstr(_INFILE_,41	,	5	);
	        three  =ksubstr(_INFILE_,46	,	10	);
	        four   =ksubstr(_INFILE_,56	,	10	);
run;

With length $41 (no periods needed after length specifications) variable ONE should contain the missing character.

EDIT: This will also append an additional character to the values of ONE in the first three observations, though, causing an overlap with variable TWO. To avoid this overlap, use 40, not 41, in the third argument of the KSUBSTR function:

ONE=ksubstr(_infile_, 1, 40);

EDIT 2: To be on the safe side in case of more or longer multi-byte characters to be stored in variable ONE, just increase the defined length of the variable further (as this length is measured in bytes), but keep the 40 (i.e., 40 characters) in the KSUBSTR argument.
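Putting the points above together, the DATA step might look like the sketch below. The length of $60 for ONE is only an assumed amount of headroom (any value of at least 41 bytes covers the sample data), and the path is the placeholder from the original post.

```sas
filename test "path/test.dat";

data test;
    /* LENGTH is measured in bytes. $60 is an assumed safety margin so that */
    /* 40 characters still fit when some of them are multi-byte in UTF-8.   */
    length ONE $60 TWO $5 three $10 four $10;
    infile test;
    input;
    /* KSUBSTR counts positions and lengths in characters, not bytes. */
    ONE   = ksubstr(_infile_,  1, 40);
    TWO   = ksubstr(_infile_, 41,  5);
    three = ksubstr(_infile_, 46, 10);
    four  = ksubstr(_infile_, 56, 10);
run;
```

The character positions stay exactly as in the file layout (1-40, 41-45, 46-55, 56-65); only the byte length of ONE is enlarged.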


3 REPLIES
LinusH
Tourmaline | Level 20

What encoding is your SAS session running in?

proc options option=ENCODING;
run;

And what is the encoding of your file?

If I copy your input data in my editor and import it using DATALINES, it looks correct (I'm using LATIN9 as encoding).
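The encoding matters because ° is a single byte in LATIN9 but two bytes in UTF-8. A quick way to see the difference is to compare the byte length and the character length of the value from the last line. This is only a sketch; it assumes a UTF-8 session and a program file saved as UTF-8, and the variable names s, bytes and chars are made up for illustration.

```sas
data _null_;
    length s $20;
    s = 'N° Fattura 575';
    bytes = length(s);   /* length in bytes, trailing blanks ignored      */
    chars = klength(s);  /* length in characters, trailing blanks ignored */
    put bytes= chars=;
run;
```

In a UTF-8 session the byte count should come out one higher than the character count; in a single-byte session such as LATIN9 the two should be equal.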

Data never sleeps
luca87
Obsidian | Level 7

Hi!

The .dat file is in UTF-8.

SAS:

 ENCODING=UTF-8    Specifies the default character-set encoding for the SAS session.
NOTE: PROCEDURE OPTIONS used (Total process time):
      real time           0.00 seconds
      cpu time            0.00 seconds


I have added the .dat file to the first post.

 

Thanks,
Luca
