Desktop productivity for business analysts and programmers

Excel file to SAS , alt 0010 and ACSII

Accepted Solution Solved
Reply
Contributor
Posts: 68
Accepted Solution

Excel file to SAS , alt 0010 and ACSII

Hello,

 

I own a Excel file with several columns to import in SAS .

One column contains a concatenation with several under part with this caracters " ; " follow by " alt 0010" like barrier. To separate severals data in the under bloc I use a " | " :


 blabla1|statut1|numeropersonne1;blabla2|statut2|numeropersonne2;blabla3|statut3|numeropersonne3

 
In Excel , a user adds " alt 0010 "  after " ; " to optimise the vision of data in the screen.
 
When the excel file is import in SAS I use a macro to cut the under part. If there is 1 " ; " I must see 2 lines , 2 " ; " I must see 3 lines ets ...
There is here a problem :
 
After this symbol " ; " there is " alt 0010 " in the starting of next line


After several checking :

 

* with rank function , rank(field,1) dive 13 for the ASCII caracters type 13.

* with the using of substr function in position 4 in the field start really here

* with a count function in the field SAS say that the "alt 0010" is place on 2 caracters ...

 

Can you help me to found a correct function to delete the "alt 0010 " in SAS ?

 

Thanks for your help.


Accepted Solutions
Solution
‎02-12-2018 04:07 AM
Contributor
Posts: 68

Re: Excel file to SAS , alt 0010 and ACSII

Hello,

 

After severals days to search the point of bug . This problem is fixed .

 

I explain :

 

1) The excel files are generated by an other departement of the company. To generate the data and Excel file there is this in the this code before a prox export in excel :

 

/* code sas */

proc transpose data=work.origine out=mytable(drop=_name_);
by firstkey ;
var secondkey ;
run;

data work.mytable;

set work.mytable;

length concat $7500. ;
format concat $7500. ;
informat concat $7500. ;

concat=catx('', of col:);

*drop col:;

run;

For the second key it's impossible to know the number of appearance. Due to this sitution when the proc transpose is run the number of columns is random. When can obtain 10 columns like it's possible to obtain 50 columns.

 

This line :

 

concat=catx('', of col:);

generates a "blank" or a "voide", and under Excel and SAS this it translate by a particular space.

 

To fix this bug it's necessary to write a true space follow by a function tranwrd :

 

concat=catx(' ', of col:);
concat=tranwrd(concat,'; ',';');

 

View solution in original post


All Replies
Super User
Posts: 22,823

Re: Excel file to SAS , alt 0010 and ACSII

Posted in reply to azertyuiop

COMPRESS()  - and look at the modifiers. There's one for blanks or invisible characters that may do what you need. 

 


azertyuiop wrote:

Hello,

 

I own a Excel file with several columns to import in SAS .

One column contains a concatenation with several under part with this caracters " ; " follow by " alt 0010" like barrier. To separate severals data in the under bloc I use a " | " :


 blabla1|statut1|numeropersonne1;blabla2|statut2|numeropersonne2;blabla3|statut3|numeropersonne3

 
In Excel , a user adds " alt 0010 "  after " ; " to optimise the vision of data in the screen.
 
When the excel file is import in SAS I use a macro to cut the under part. If there is 1 " ; " I must see 2 lines , 2 " ; " I must see 3 lines ets ...
There is here a problem :
 
After this symbol " ; " there is " alt 0010 " in the starting of next line


After several checking :

 

* with rank function , rank(field,1) dive 13 for the ASCII caracters type 13.

* with the using of substr function in position 4 in the field start really here

* with a count function in the field SAS say that the "alt 0010" is place on 2 caracters ...

 

Can you help me to found a correct function to delete the "alt 0010 " in SAS ?

 

Thanks for your help.


 

Super User
Super User
Posts: 9,194

Re: Excel file to SAS , alt 0010 and ACSII

Posted in reply to azertyuiop

"In Excel , a user adds " alt 0010 "  after " ; " to optimise the vision of data in the screen."

Sounds like that is where your problem lies.  Why do you have multiple components of data points delimited by a pipe, delimited by a semicolon.  Sounds to me like your going out of your way to make it as difficult as possible.  Excel has more than one cell, use them for separate elements.  More to the point however, why are you using Excel for this in the first place?  What is the benefit you gain from doing things this way.  What is this data your trying to get, and why can it not be in a standard format?  These are all questions I would be asking before I even bothered opening the file.  If you end up having to use Excel, then you need to put some rules on it, otherwise they could just post a picture into the cell, or a hyperlink somewhere else etc.

Contributor
Posts: 68

Re: Excel file to SAS , alt 0010 and ACSII

[ Edited ]

Hello ,

 

You can see the format like on this picture :

 

Sans titre.png

There are 5 departements in my company which work in a even structure of excel file.

 

Each departement check severals informations in physical paper file . For each file the departement own a list of information into the field " comment_statut_idclient " to control the paper file.

 

To reduce the quantity of line each department have decided to compact and group all information about one paper file in the same cell. In one cell it's a CSV format which is group.

 

When I receive all Excel fils in SAS I must place each information about " comment_statut_idclient " , in one line with a copy of the rest of all data in the line.

 

As I 'm not the leader of the process to decide the structure of data in Excel , I must adapt my process in Sas to cut the cell " statut " in several cells when it's necessary.

 

To "help me" in the structure of data , I must locate the symbol ";" but I must to locate " alt 0010" which is just after the symbol ";" . If I don't find a solution to delete " alt 0010" , I can't tranform my data.

Super User
Posts: 22,823

Re: Excel file to SAS , alt 0010 and ACSII

Posted in reply to azertyuiop

In other wording these are known as line breaks. If you search on here there are multiple examples on how to remove them.

And do you have a CSV or EXCEL file. 
If you have a CSV file you can remove them before you import the data. 

 

Contributor
Posts: 68

Re: Excel file to SAS , alt 0010 and ACSII

Hello ,

I my case I can use only an Excel file.
Super User
Super User
Posts: 7,845

Re: Excel file to SAS , alt 0010 and ACSII

Posted in reply to azertyuiop

Normally Excel will embed linefeeds into cells to simulate line breaks within a cell.  It sounds like from your description of how these are used you can just remove them. That is easy to do with the COMPRESS() function.

myvar = compress(myvar,'0A'x);

So you could just just use an ARRAY and a DO loop to do that for every character varaible.

data want;
  set have ;
  array _c _character_;
  do over _c ; _c=compress(_c,'0A'x); end;
run;

It is possible, but not likely, that they have also embedded carriage return characters, which is hex code '0D'x.

 

How are you reading the Excel file into your SAS dataset?  If you are using PROC IMPORT or the XLSX libname engine then the embedded linefeeds will not cause any trouble in the import.  If you are saving the file as a CSV file and then trying to read it with SAS then you can get it work easily IF they do NOT have embedded carriage returns AND you save the CSV using CR+LF as the end of line markers.  If you do that then SAS can read the CSV file without you having to make any modifications to it.

data want ;
  infile 'mydata.csv'  dsd truncover firstobs=2 termstr=crlf ;
  ....
run;

If there are embedded CR+LF character in one or more cells then you will need to pre-process the CSV file and remove them or replace them with something else.  There are examples of how to do that in solutions to other questions on this site.

Contributor
Posts: 68

Re: Excel file to SAS , alt 0010 and ACSII

Hello ,

I have trying your solution with "compresse , and with array do , for Excel with '0D'x and '0A'x , it's the same result. When I see my data , there is again a "empty" caractere after the ";" .

For CSV case I keep a note for a futur case if I have this case in the futur. In my case I'm limit with a Excel format , not a csv format.
Regular Contributor
Posts: 150

Re: Excel file to SAS , alt 0010 and ACSII

Posted in reply to azertyuiop

Ok, I 'hope' you are using Excel 2010 or later. If so, the Excel file is, in fact, zipped XML files. One option is to simply unzip the data and fix it in XML. Easy to do with a text editor , command line, SAS, etc. Then rezip it back up.

 

Solution
‎02-12-2018 04:07 AM
Contributor
Posts: 68

Re: Excel file to SAS , alt 0010 and ACSII

Hello,

 

After severals days to search the point of bug . This problem is fixed .

 

I explain :

 

1) The excel files are generated by an other departement of the company. To generate the data and Excel file there is this in the this code before a prox export in excel :

 

/* code sas */

proc transpose data=work.origine out=mytable(drop=_name_);
by firstkey ;
var secondkey ;
run;

data work.mytable;

set work.mytable;

length concat $7500. ;
format concat $7500. ;
informat concat $7500. ;

concat=catx('', of col:);

*drop col:;

run;

For the second key it's impossible to know the number of appearance. Due to this sitution when the proc transpose is run the number of columns is random. When can obtain 10 columns like it's possible to obtain 50 columns.

 

This line :

 

concat=catx('', of col:);

generates a "blank" or a "voide", and under Excel and SAS this it translate by a particular space.

 

To fix this bug it's necessary to write a true space follow by a function tranwrd :

 

concat=catx(' ', of col:);
concat=tranwrd(concat,'; ',';');

 

Super User
Super User
Posts: 9,194

Re: Excel file to SAS , alt 0010 and ACSII

Posted in reply to azertyuiop

The solution to the "problem" is to fix the "problem.  Nothing about what you have posted in any way a good process, in fact other than sending this file by email, it would appear to use every possible method to make this a process which will break 9/10 runs.

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 10 replies
  • 274 views
  • 3 likes
  • 5 in conversation