BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
daisy6
Quartz | Level 8

Hello,

 

I would like to keep the leading and trailing spaces when import the csv files into SAS. For example,

hoping SAS treat the below response as different.

 

"ItemKey","Response",
"4615313059","contribution  "
"4615313067","   contribution"

 

however, when I use:

filename xfile "datadir\test.csv" encoding="utf-8" lrecl=532767;
proc import datafile=xfile out=test1 dbms=csv replace;
getnames=yes;
datarow=2;
guessingrows=592767;
run;

 

SAS treat them as the same. Can somebody help me out?

 

1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User

Proc import generates data step code. Copy that code to the editor and modify.

If you remove the option DSD on the infile statement. That might work for you need if your data does not contain any commas in the value of any of your fields. You might have to strip out quote marks though. If your actual data does not have quotes then we will have to go back to the drawing board and require some parsing of your file differently. 

 

Compare the results from these two data steps with and without DSD.

data work.want;
   infile datalines dlm=',' ;
   informat itemkey $12. response $25.;
   input itemkey response;

datalines;
"4615313059","contribution  "
"4615313067","   contribution"
;
run;

data work.want2;
   infile datalines dlm=',' dsd;
   informat itemkey $12. response $25.;
   input itemkey response;

datalines;
"4615313059","contribution  "
"4615313067","   contribution"
;
run;

View solution in original post

8 REPLIES 8
Reeza
Super User

PROC IMPORT will not do this, you'll have to write a custom data step to import the file and specify the information using CHAR to get the leading and trailing spaces. 

 

Trailing spaces always exist in SAS, depending on the length of the variable. 

 


@daisy6 wrote:

Hello,

 

I would like to keep the leading and trailing spaces when import the csv files into SAS. For example,

hoping SAS treat the below response as different.

 

"ItemKey","Response",
"4615313059","contribution  "
"4615313067","   contribution"

 

however, when I use:

filename xfile "datadir\test.csv" encoding="utf-8" lrecl=532767;
proc import datafile=xfile out=test1 dbms=csv replace;
getnames=yes;
datarow=2;
guessingrows=592767;
run;

 

SAS treat them as the same. Can somebody help me out?

 


 

ballardw
Super User

Proc import generates data step code. Copy that code to the editor and modify.

If you remove the option DSD on the infile statement. That might work for you need if your data does not contain any commas in the value of any of your fields. You might have to strip out quote marks though. If your actual data does not have quotes then we will have to go back to the drawing board and require some parsing of your file differently. 

 

Compare the results from these two data steps with and without DSD.

data work.want;
   infile datalines dlm=',' ;
   informat itemkey $12. response $25.;
   input itemkey response;

datalines;
"4615313059","contribution  "
"4615313067","   contribution"
;
run;

data work.want2;
   infile datalines dlm=',' dsd;
   informat itemkey $12. response $25.;
   input itemkey response;

datalines;
"4615313059","contribution  "
"4615313067","   contribution"
;
run;
daisy6
Quartz | Level 8

Thanks for the help. But for the trailing space, how to deal with that? like:

"ItemKey","Response","Iscorret'
"4615313059","contribution","correct"

"4615313067","contribution  ","not correct"

 

because I keep these two different response

 

If using dsd, the import dataset is the same as using proc import.  If not using dsd, the dataset will keep the quote symbol. Is any better way to deal with my case? Thanks a lot.

 

 

ballardw
Super User

@daisy6 wrote:

Thanks for the help. But for the trailing space, how to deal with that? like:

"ItemKey","Response","Iscorret'
"4615313059","contribution","correct"

"4615313067","contribution  ","not correct"

 

because I keep these two different response

 

If using dsd, the import dataset is the same as using proc import.  If not using dsd, the dataset will keep the quote symbol. Is any better way to deal with my case? Thanks a lot.

 

 


Since the code submitted with DSD was generated by Proc Import you should not be surprised that the sets are the same as proc import.

 

As @Reeza mentioned, trailing blanks are a very problematic issue. If you do not want the quotes then before removing the quotes you would have to supply some character other than a blank and then for whatever purpose you consider a trailing blank significant that character would have to suffice.

 

Personally I am having a very hard time coming up with a use case for a significant difference between a trailing blank and not.

Leading blanks, possibly, though even then I would spend a certain amount of time in how one leading blank is actually significantly different than two or more leading blanks for most purposes.

 

Can you provide an explicit example of your process that fails because the trailing blank is "not there"? Or is this hypothetical and no other processing code has been written yet?

daisy6
Quartz | Level 8

I cann't provide you the exact data for private information. However, I believe that you already gave me a good idea to deal with the leading and trailing space problem. Thank you for the reply.

 

 

ballardw
Super User

@daisy6 wrote:

I cann't provide you the exact data for private information. However, I believe that you already gave me a good idea to deal with the leading and trailing space problem. Thank you for the reply.

 

 


Just for the record, I was requesting anything related to sensitive data values, just an example of how processing could be affected.

 

 

Ksharp
Super User

I have to say, if there were two comma next to each other, Your code would generate the error result.

 


data work.want2;
   infile datalines dsd;
   input (itemkey x response) (: ~ $40.);

datalines;
"4615313059",,"contribution  "
"4615313067",,"   contribution"
;
run;

proc print;run;
kulbshar
Calcite | Level 5

Use ~ modifier along with informat $char25. Informat "Char" will preserve the leading spaces no matter if column is in quotes or not.

data work.want3;
   infile datalines dsd;
   input itemkey :$12. response ~ $char25.;
datalines;
"4615313059","contribution  "
"4615313067","   contribution"
"4615313067",   contribution
"4615313067",   contribution
;
run;

 

Thanks.

 

 

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 5612 views
  • 0 likes
  • 5 in conversation