BookmarkSubscribeRSS Feed
AhmedAccordBGro
Fluorite | Level 6

Hi, the case is that I have a data (one column csv file), but some values ending with a trailing space.

 

I use proc import to get it in to SAS.

 

Issue is that I need to use the strings with space, but it always gets imported with no space.

 

It guesses the max length properly, but when retrieving the data, it always trim the space.

 

Also, it's saved to the dataset without the trailing space.

 

anyone can help here?

12 REPLIES 12
PeterClemmensen
Tourmaline | Level 20

Welcome to the SAS Community.

 

Use a data step to import CSV files instead. Do a Google search or search this community. Plenty of code examples out there 🙂

AhmedAccordBGro
Fluorite | Level 6

Thanks for welcoming me aboard:)

 

I tried dataset as well, and same issue, for two days i'm exploring the internet for a solution, so I decided to ask here..

 

below is my code, 

 

sample data "ABDCEF" that got processed normally, but like "ABCDEF " (with trailing space) it got trimmed.

 


data IMPORT;
infile "/folders/myfolders/ecsql193/DM.csv"
delimiter = ','
missover
firstobs = 2
DSD
lrecl = 32767;
Length File_Name $12.;
Length Worksheet_Name $12.;
Length Start_Position $2.;
Length End_Position $4.;
Length Structure $8;
Length File_Process_Flag $8;
Length KPI_Process_Flag $31;
Length Fact_Name $103;
Length F_DIM_Category $35;
Length F_DIM_Product_Group $6;
Length F_DIM_Product $21;
Length F_DIM_Subject_Area $22;
Length DIM_Month $1;
Length DIM_RatePlan_Type $1;
Length DIM_Cust_Seg $1;
Length DIM_Traffic_Type $1;
Length DIM_BB_Spd $1;
Length DIM_BB_Technology $1;
Length DIM_TER_NW__FLAG_Y_N_ $1;
Length DIM_USG_TIME__Flag_Y_N_ $1;
Length DIM_BNDL_Flag__PAYG_Flag_Y_N_ $1;
Length DIM_REV_CAT $1;
Length DIM_ROAM_CNTY $1;
Length DIM_DATA_BUNDLE $1;
Length Measures $26;
Length Conditions $17;
input
File_Name
Worksheet_Name
Start_Position
End_Position
Structure
File_Process_Flag
KPI_Process_Flag
Fact_Name
F_DIM_Category
F_DIM_Product_Group
F_DIM_Product
F_DIM_Subject_Area
DIM_Month
DIM_RatePlan_Type
DIM_Cust_Seg
DIM_Traffic_Type
DIM_BB_Spd
DIM_BB_Technology
DIM_TER_NW__FLAG_Y_N_
DIM_USG_TIME__Flag_Y_N_
DIM_BNDL_Flag__PAYG_Flag_Y_N_
DIM_REV_CAT
DIM_ROAM_CNTY
DIM_DATA_BUNDLE
Measures
Conditions
;
Run;

Kurt_Bremser
Super User

Once a character variable is defined with a certain length, it will always be padded up to that length with blanks. A missing character variable contains only blanks.

So if your character column has the correct length, everything's OK.

AhmedAccordBGro
Fluorite | Level 6

What if data looks like:

 

ABC

  DEF

XYZ  (two trailing spaces)

 

How to handle this?

Kurt_Bremser
Super User

See this code as an example:

data test;
infile datalines dlm="," dsd;
input number string:$char12.;
test_of_string = put(string,$hex24.);
datalines4;
1,ABC
2,  DEF
3,XYZ
;;;;

proc print data=test noobs;
run;

Result:

number    string         test_of_string

   1      ABC       414243202020202020202020
   2        DEF     202044454620202020202020
   3      XYZ       58595A202020202020202020

As you can see, all values are padded with blanks to the defined length, and the use of the $CHARw. informat preserves leading blanks. With the standard $w. informat, leading blanks would be discarded.

AhmedAccordBGro
Fluorite | Level 6

Thanks man, but if you add extra space after row#3, you'll get my point exactly. spaces are trimmed, and I need to preserve them!

 

data test;
infile datalines dlm="," dsd;
input number string:$char12.;
test_of_string = put(string,$hex24.);
datalines4;
1,ABC
2, ABC
3,ABC   
;;;;

proc print data=test noobs;

%PUT

AhmedAccordBGro
Fluorite | Level 6

Thanks buddy anyways, i got stuck and I think I should handle it out of SAS, on the data level.

Kurt_Bremser
Super User

@AhmedAccordBGro wrote:

Thanks man, but if you add extra space after row#3, you'll get my point exactly. spaces are trimmed, and I need to preserve them!

 

data test;
infile datalines dlm="," dsd;
input number string:$char12.;
test_of_string = put(string,$hex24.);
datalines4;
1,ABC
2, ABC
3,ABC   
;;;;

proc print data=test noobs;

%PUT


PS what is that %PUT doing in there?

AhmedAccordBGro
Fluorite | Level 6

yes I need to preserve the blank space, 

 

I even executed below query, and found the space is trimmed!

 

Proc SQL;
Select Length(string) From test;
Quit;

 

Results in 3, 4, 3

FreelanceReinh
Jade | Level 19

Hi @AhmedAccordBGro,

 

I think I see your point: You need to preserve the blanks contained in your CSV file because they carry valuable information, e.g., 'XYZ', ' XYZ', 'XYZ ' and 'XYZ     ' must later be recognized as different character values in the SAS dataset. Normally this would fail because of the padding (trailing) blanks which are contained in every character variable of sufficient length: You couldn't distinguish between these blanks and the original blanks from the CSV file as they would look exactly the same (characters with ASCII code [decimal] 32, i.e. hex 20).

 

For a possible solution you could replace the original blanks by a suitable replacement character which does not occur in the CSV file, perhaps the "non-breakable space" with ASCII code 160 (hex A0), while reading the CSV file.

 

Here's an example:

/* Create a text file for demonstration */

filename test temp;

data _null_;
file test;
put 'ABC' / ' DEF' / 'XYZ  ' / '2 words';
run;

/* Read the text file preserving the blanks contained 
   (as protected blanks, hex A0) */

data want;
infile test;
input @;
_infile_=translate(_infile_,'A0'x,' ');
input c :$6.;
run;

/* Show the result */

proc print data=want;
format c $hex12.;
run;

Result:

Obs    c

 1     414243202020
 2     A04445462020
 3     58595AA0A020
 4     32A0776F7264

As you can see, leading blanks (obs. 2), trailing blanks (obs. 3) and embedded blanks (obs. 4) can still be recognized in character variable C (by their hex code A0) and distinguished from those trailing blanks (hex 20) which SAS introduced as usual (such as the last blank in obs. 3 and the trailing blanks in obs. 1 and 2).

 

This approach should work for a comma-delimited text file as well.

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 12 replies
  • 5563 views
  • 0 likes
  • 4 in conversation