DATA Step, Macro, Functions and more

Forcing a numeric type on a predominately character column when importing from excel

Reply
Valued Guide
Posts: 505

Forcing a numeric type on a predominately character column when importing from excel

Forcing a numeric type on a predominately character column when importing from excel

Interesting excel issue on SAS-L

source
https://listserv.uga.edu/cgi-bin/wa?A2=SAS-L;440cf102.1701d

Passthru is the best solution because you can fix things on the excel side before
SAS transforms the data. MS SQL on MS excell.

A dumb example.

Cannot be absolutely sure of the results below because I have Access to PC Files

With Access to PC-Files

1. Use DBSASTYPE option
2. Use mix option (works with and without Access to PC Files
3. Use passthru to excel

* Without Access to PC files

4. SAS Use mix option and proc import as character then change in SAS (do not need Access to PC-Files?)
5. SAS Use addtics VBA macro and proc import to force all columns to be character (if predominately numeric change later)

Without SAS and no limit on number of obs in SAS dataset

6. WPS/R Forces a numeric type so all character values are converted to missing

HAVE excel workbook d:/xls/class.xlsx and sheet=class
========================================================

Up to 40 obs from d:/xls/class.xlsx total obs=19

AGEC NAME SEX AGE HEIGHT WEIGHT

AA Alfred M 14 69 112.5
AA Alice F 13 56.5 84
AA Barbara F 13 65.3 98
AA Carol F 14 62.8 102.5
AA Henry M 14 63.5 102.5
AA James M 12 57.3 83
AA Jane F 12 59.8 84.5
AA Janet F 15 62.5 112.5
AA Jeffrey M 13 62.5 84
AA John M 12 59 99.5
AA Joyce F 11 51.3 50.5
AA Judy F 14 64.3 90
AA Louise F 12 56.3 77
AA Mary F 15 66.5 112
AA Philip M 16 72 150
AA Robert M 12 64.8 128
AA Ronald M 15 67 133
AA Thomas M 11 57.5 85
99 William M 15 66.5 112


WANT SAS DATASET WORK.WANTWPS
==============================

Up to 40 obs from wantwps total obs=19

AGEC NAME SEX AGE HEIGHT WEIGHT

. Alfred M 14 69 112.5
. Alice F 13 56.5 84
. Barbara F 13 65.3 98
. Carol F 14 62.8 102.5
. Henry M 14 63.5 102.5
. James M 12 57.3 83
. Jane F 12 59.8 84.5
. Janet F 15 62.5 112.5
. Jeffrey M 13 62.5 84
. John M 12 59 99.5
. Joyce F 11 51.3 50.5
. Judy F 14 64.3 90
. Louise F 12 56.3 77
. Mary F 15 66.5 112
. Philip M 16 72 150
. Robert M 12 64.8 128
. Ronald M 15 67 133
. Thomas M 11 57.5 85
99 William M 15 66.5 112

WORKING CODE
=============

With Access to PC-Files

1. Use DBSASTYPE option
set xel.class(dbsastype=(agec='numeric'));

2. Use mix option (works with and without Access to PC Files
Comes in as character so you have to change in SAS

3. Use passthru to excel

* Without Access to PC files

4. SAS Use mix option and proc import as character then change in SAS (do not need Access to PC-Files?)
5. SAS Use addtics VBA macro and proc import to force all columns to be character (if predominately numeric change later)

Without SAS and no limit on number of obs in SAS dataset

6. WPS/R Forces a numeric type so all character values are converted to missing


* _ _ _
__| | |__ ___ __ _ ___| |_ _ _ _ __ ___
/ _` | '_ \/ __|/ _` / __| __| | | | '_ \ / _ \
| (_| | |_) \__ \ (_| \__ \ |_| |_| | |_) | __/
\__,_|_.__/|___/\__,_|___/\__|\__, | .__/ \___|
; |___/|_|

1. Use DBSASTYPE option

%utlfkil(d:\xls\class.xlsx);
libname xel "d:\xls\class.xlsx";

data xel.class;
retain agec;
set sashelp.class;
agec='AA';
if _n_=19 then agec='99';
run;quit;

;run;quit;

data class;
set xel.class(dbsastype=(agec='numeric'));
run;

libname xel clear;

14 libname xel "d:/xls/class.xlsx";
NOTE: Libref XEL was successfully assigned as follows:
Engine: EXCEL
Physical Name: d:/xls/class.xlsx
15 data xel.class;
16 set sashelp.class;
17 ;run;

NOTE: There were 19 observations read from the data set SASHELP.CLASS.
NOTE: The data set XEL.class has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.02 seconds
cpu time 0.00 seconds


17 ! quit;
18 data class;
19 set xel.class(dbsastype=(age='numeric'));
20 run;

NOTE: There were 19 observations read from the data set XEL.class.
NOTE: The data set WORK.CLASS has 19 observations and 5 variables.
NOTE: DATA statement used (Total process time):
real time 0.05 seconds
cpu time 0.03 seconds

_ _
_ __ ___ (_)_ _____ __| |
| '_ ` _ \| \ \/ / _ \/ _` |
| | | | | | |> < __/ (_| |
|_| |_| |_|_/_/\_\___|\__,_|


PROC IMPORT OUT= mydata
DATAFILE= "d:/xls/class.xlsx"
DBMS=EXCEL REPLACE;
SHEET="'CLASS'";
GETNAMES=YES;
MIXED=yes;
RUN;


_____ _
_ __ __ _ ___ __|_ _| |__ _ __ _ _
| '_ \ / _` / __/ __|| | | '_ \| '__| | | |
| |_) | (_| \__ \__ \| | | | | | | | |_| |
| .__/ \__,_|___/___/|_| |_| |_|_| \__,_|
|_|

proc sql dquote=ansi;
connect to excel (Path="d:\xls\class.xlsx" mixed=yes);
create
table classsql as
select * from connection to Excel
(
Select
*
,iif(isnumeric(agec),agec,'.') as agecfix
from
class
);
disconnect from Excel;
Quit;

%utl_optlen(inp=classsql, out=classsql);

Up to 40 obs from classsql total obs=19

Obs AGECFIX AGEC NAME SEX AGE HEIGHT WEIGHT

1 . AA Alfred M 14 69 112.5
2 . AA Alice F 13 56.5 84
3 . AA Barbara F 13 65.3 98
4 . AA Carol F 14 62.8 102.5
5 . AA Henry M 14 63.5 102.5
6 . AA James M 12 57.3 83
7 . AA Jane F 12 59.8 84.5
8 . AA Janet F 15 62.5 112.5
9 . AA Jeffrey M 13 62.5 84
10 . AA John M 12 59 99.5
11 . AA Joyce F 11 51.3 50.5
12 . AA Judy F 14 64.3 90
13 . AA Louise F 12 56.3 77
14 . AA Mary F 15 66.5 112
15 . AA Philip M 16 72 150
16 . AA Robert M 12 64.8 128
17 . AA Ronald M 15 67 133
18 . AA Thomas M 11 57.5 85
19 99 99 William M 15 66.5 112


_ _ _____ _ _
__ _ __| | __| |_ _(_) ___| | _____
/ _` |/ _` |/ _` | | | | |/ __| |/ / __|
| (_| | (_| | (_| | | | | | (__| <\__ \
\__,_|\__,_|\__,_| |_| |_|\___|_|\_\___/


/* T000225 CONVERTING ALL EXCEL CELLS TO CHARACTER */
Latest VBA macros Addticks - forces all excel cells to be character
before importing excel sheets to SAS this macro should be used

/* this will stop when cell has TIUQ which is quit backwards */
/* this is probably the best macro */
Sub AddTicks()
Dim LastPlace, Z As Variant, X As Variant
LastPlace = ActiveCell.SpecialCells(xlLastCell).Address
ActiveSheet.Range(Cells(1, 1), LastPlace).Select
Z = Selection.Address 'Get the address
For Each X In ActiveSheet.Range(Z) 'Do while
If Len(X) > 0 Then 'Find cells with something
X.FormulaR1C1 = Chr(39) & Mid(X, 1, Len(X)) '39 is code for tick
Else
X.FormulaR1C1 = "" 'If empty do not put tick
End If
If X = "TIUQ" Then
Exit Sub
End If
Next
End Sub

/* T000226 REMOVES TICKS FROM EXCEL CELLS */
Sub RemoveTicks()
For Each currentcell In Selection
If currentcell.HasFormula = False Then
'Verifies that procedure does not change the
'cell with the active formula so that it contains
'only the value.
currentcell.Formula = currentcell.Value
End If
Next
End Sub

__ ______ ____ ____
\ \ / / _ \/ ___| | _ \
\ \ /\ / /| |_) \___ \ _____| |_) |
\ V V / | __/ ___) |_____| _ <
\_/\_/ |_| |____/ |_| \_\

 

Note you can do this with IML interface to R
%utl_submit_wps64('
options set=R_HOME "C:/Program Files/R/R-3.3.2";
libname wrk "%sysfunc(pathname(work))";
proc r;
submit;
source("C:/Program Files/R/R-3.3.2/etc/Rprofile.site", echo=T);
library(XLConnect);
wb <- loadWorkbook("d:/xls/class.xlsx");
agen<-readWorksheet(wb, "class", keep="AGEC", colTypes="numeric", forceConversion=TRUE);
shet<-readWorksheet(wb, "class", drop="AGEC");
df<-cbind(agen,shet);
class(df);
saveWorkbook(wb)
endsubmit;
import r=df data=wrk.wantwps;
run;quit;
');

 

 

Ask a Question
Discussion stats
  • 0 replies
  • 153 views
  • 0 likes
  • 1 in conversation