SAS dataset variable values cleansing

Hi,

I am trying to import data from Excel into a SAS dataset, and I am facing issues with data that contains double quotes and ampersands.

Code used:

 

libname xlsFile xlsx "/path/monthly.xlsm";
options validvarname=v7;
options symbolgen mprint;

proc sql;
    create table work.raw_data as
    select * from xlsFile.datal;
quit;

 

Excel data:

 

1. "Online system" and "Mobile data"

2,  Online system & Mobile data

 

SAS dataset:

 

1. "Online system" and "Mobile data"

2,  Online system &amp; Mobile data

 

Expected data:

 

1. Online system and Mobile data

2, Online system & Mobile data

 


Accepted Solutions
09-26-2016 11:42 AM
Super User
Posts: 7,060

Re: SAS dataset variable values cleansing

Posted in reply to jayakumarmm

I cannot recreate your problem.  When I create an Excel spreadsheet with those values SAS reads them in the same as they are in Excel.  What version of SAS and Excel are you using?  Why are you using an XLSM file instead of an XLSX file?

 

Now if you want to remove the quotes from the middle of your string, your best option is to just strip them out using the COMPRESS() function.

 

VAR1 = compress(VAR1,'"');

And if you really do want to translate HTML codes like &amp; back into the characters they represent, then use the HTMLDECODE() function.

VAR1 = htmldecode(VAR1);
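
The two assignments can also be combined in one DATA step. The following is only a minimal sketch: the inline sample rows and the output dataset name work.clean_data are assumptions made up for illustration, standing in for the table read from Excel. Decoding runs before COMPRESS() so that an encoded quote such as &quot; would also be removed.

```sas
/* Hypothetical sample rows standing in for the imported Excel data */
data work.raw_data;
    length VAR1 $60;
    VAR1 = '"Online system" and "Mobile data"'; output;
    VAR1 = 'Online system &amp; Mobile data';   output;
run;

/* Decode HTML entities first, then remove every double quote */
data work.clean_data;
    set work.raw_data;
    VAR1 = compress(htmldecode(VAR1), '"');
run;

proc print data=work.clean_data noobs;
run;
```

With these sample rows the printed values should match the expected data from the question: the first row loses its quotes and the second row shows a plain ampersand.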

All Replies
Super User
Posts: 19,822

Re: SAS dataset variable values cleansing

Posted in reply to jayakumarmm

How are you importing your data?

 

Contributor
Posts: 55

Re: SAS dataset variable values cleansing

Hi,

The code I am using is given below:

libname xlsFile xlsx "/path/monthly.xlsm";
options validvarname=v7;
options symbolgen mprint;

proc sql;
    create table work.raw_data as
    select * from xlsFile.datal;
quit;
Super User
Posts: 19,822

Re: SAS dataset variable values cleansing

[ Edited ]
Posted in reply to jayakumarmm

You can use dequote() to strip quotes that enclose a value, or compress() to remove quotes anywhere in the text.
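
The two functions behave differently on the first sample value, which a small test step can show. This is a sketch only; the dataset name work.quote_demo and its variable names are made up for illustration. As documented, DEQUOTE() acts on a value that begins with a quotation mark and discards everything to the right of the matching closing quote, whereas COMPRESS() deletes each listed character wherever it occurs.

```sas
data work.quote_demo;
    length text stripped removed $60;
    text     = '"Online system" and "Mobile data"';
    stripped = dequote(text);        /* keeps only the first quoted piece */
    removed  = compress(text, '"');  /* drops every double quote */
run;

proc print data=work.quote_demo noobs;
run;
```

For values with quotes in the middle of the string, as in this thread, COMPRESS() is therefore the one that produces the expected data.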

 

If the text has quotes in Excel, it will have them in SAS too, and that seems the correct behaviour to me. I also don't get the & converted to its HTML entity (&amp;), so I think there's something else behind the data in Excel? Or the forum changed the value?

 

I'm using SAS 9.4 TS1M3 and Excel 2010

 

I get the following, which is exactly what I'd expect.

 

"Online system" and "Mobile data"

Online system & Mobile data

 

 

Contributor
Posts: 55

Re: SAS dataset variable values cleansing

The forum is displaying the same & values that I updated.
Super User
Posts: 19,822

Re: SAS dataset variable values cleansing

Posted in reply to jayakumarmm

To remove the quotes use compress() on the field. 

 

 

Super User
Posts: 10,035

Re: SAS dataset variable values cleansing

Posted in reply to jayakumarmm

1) Try using PROC IMPORT.

2)
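data have;
    /* quoted text: htmldecode() leaves the double quotes untouched */
    a = ' "Online system" and "Mobile data" '; b = htmldecode(a); output;
    /* entity-encoded ampersand: htmldecode() turns &amp; back into & */
    a = '  Online system &amp; Mobile data  '; b = htmldecode(a); output;
run;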
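
For suggestion 1), a PROC IMPORT call might look like the sketch below. The path, sheet name, and output table are copied from the original post. Using DBMS=XLSX for the .xlsm workbook is an assumption based on the XLSX libname engine already reading that file, and it needs SAS/ACCESS Interface to PC Files.

```sas
/* Sketch only: reads the sheet from the original post with PROC IMPORT */
proc import datafile="/path/monthly.xlsm"
            out=work.raw_data
            dbms=xlsx
            replace;
    sheet="datal";   /* sheet name as written in the original post */
    getnames=yes;    /* take variable names from the first row */
run;
```

If the imported values still contain quotes or HTML entities, the COMPRESS() and HTMLDECODE() steps still apply afterwards.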
