Hi all SAS Users,
Because learning CSV is long progress so I still keep continuing with my importing excel along with learning csv at the same time.
I import my data from xlsx file to SAS as below( this file only contains one sheet named Sheet1)
proc import datafile= "E:\Harv 27th_Feb_2021\Report final\exchange_rate_Datastream_adjusted_xlsx"
out= exc_rate
dbms= xlsx
replace;
range= "Sheet1$A:BC";
getnames= yes;
run;
Afterward, I used the PROC CONTENTS to see if there is any variable not in numeric (because my dataset is all about exchange rate, so must be numeric)
proc contents data=exc_rate out=abc;
run;
The result shows that there are some columns is under character type. And of course, it is not what I want, so I want to ask is there any chance to tell SAS from import step that I want to convert all variables to numeric types ?
A part of my data is as below
DATECUR AUD EUR CAD DKK HKD ILS JPY NZD
12/30/1998 1.63132 0.8548 1.5511 6.3655 7.7477 4.1654 114.715 1.90223
12/31/1998 1.63026 0.85157 1.536 6.3645 7.7473 4.1601 112.8 1.89286
01/01/1999 1.63026 0.85157 1.536 6.3645 7.7473 4.1601 112.8 1.89286
And the summary of datatype is as below after using PROC CONTENT
Alphabetic List of Variables and Attributes | ||||||
---|---|---|---|---|---|---|
# | Variable | Type | Len | Format | Informat | Label |
17 | ARS | Num | 8 | BEST. | ARS | |
42 | ATS | Num | 8 | BEST. | ATS | |
2 | AUD | Num | 8 | BEST. | AUD | |
18 | BRL | Num | 8 | BEST. | BRL | |
4 | CAD | Num | 8 | BEST. | CAD | |
14 | CHF | Num | 8 | BEST. | CHF | |
19 | CLP | Num | 8 | BEST. | CLP | |
20 | CNY | Num | 8 | BEST. | CNY | |
21 | COP | Num | 8 | BEST. | COP | |
23 | CZK | Num | 8 | BEST. | CZK | |
1 | DATECUR | Num | 8 | MMDDYY10. | DATECUR | |
43 | DEM | Char | 8 | $8. | $8. | DEM |
5 | DKK | Num | 8 | BEST. | DKK | |
44 | EEK | Num | 8 | BEST. | EEK | |
24 | EGP | Num | 8 | BEST. | EGP | |
45 | ESP | Num | 8 | BEST. | ESP | |
3 | EUR | Num | 8 | BEST. | EUR | |
16 | GBP | Num | 8 | BEST. | GBP | |
46 | GRD | Num | 8 | BEST. | GRD | |
6 | HKD | Num | 8 | BEST. | HKD | |
22 | HRK | Num | 8 | BEST. | HRK | |
25 | HUF | Num | 8 | BEST. | HUF | |
27 | IDR | Num | 8 | BEST. | IDR | |
7 | ILS | Num | 8 | BEST. | ILS | |
26 | INR | Num | 8 | BEST. | INR | |
8 | JPY | Num | 8 | BEST. | JPY | |
28 | KES | Num | 8 | BEST. | KES | |
12 | KRW | Num | 8 | BEST. | KRW | |
38 | LKR | Num | 8 | BEST. | LKR | |
47 | LTL | Num | 8 | BEST. | LTL | |
48 | LVL | Num | 8 | BEST. | LVL | |
31 | MAD | Num | 8 | BEST. | MAD | |
30 | MXN | Num | 8 | BEST. | MXN | |
29 | MYR | Num | 8 | BEST. | MYR | |
49 | NLG | Num | 8 | BEST. | NLG | |
10 | NOK | Num | 8 | BEST. | NOK | |
9 | NZD | Num | 8 | BEST. | NZD | |
33 | PEN | Num | 8 | BEST. | PEN | |
34 | PHP | Num | 8 | BEST. | PHP | |
32 | PKR | Num | 8 | BEST. | PKR | |
35 | PLN | Num | 8 | BEST. | PLN | |
50 | ROL | Num | 8 | BEST. | ROL | |
36 | RON | Char | 8 | $8. | $8. | RON |
13 | SEK | Num | 8 | BEST. | SEK | |
11 | SGD | Num | 8 | BEST. | SGD | |
51 | SIT | Num | 8 | BEST. | SIT | |
52 | SKK | Num | 8 | BEST. | SKK | |
39 | THB | Num | 8 | BEST. | THB | |
53 | TRL | Char | 6 | $6. | $6. | TRL |
40 | TRY | Num | 8 | BEST. | TRY | |
15 | TWD | Num | 8 | BEST. | TWD | |
54 | VEB | Char | 7 | $7. | $7. | VEB |
41 | VES | Char | 8 | $8. | $8. | VES |
37 | ZAR | Num | 8 | BEST. | ZAR | |
55 | ZWD | Char | 8 | $8. | $8. | ZWD |
As can be seen from the Table: RON, TRL, VEB receive the Char type.
Warmest regards.
It would be more helpful to show one of the columns that comes in as Character that should be numeric. Proc Import with Excel files examines very few rows to set properties so you should see what is in your data that is causing problems relatively easily. Hint: If you are getting $1. for the informat and format then the first rows were all blank.
Any field that is manually entered has the issues with what people place in the fields. If a value is supposed to 0.123 and they enter 0.12.3 the second . makes the value character (or possibly a date if the numbers are just right). Sometimes you get people entering a range: 1.2 - 4.7, not a number. Or "null" "N/A" or similar values.
With Proc import the only force I am aware of sets the values as text with the MIXED=YES option.
There are a number of approaches that involve editing the SAS registry to change options. I don't bother as 1) some of these vary by which actual Excel version the file comes from XLS is not the same as XLSX and Office versions change the value settings and 2) I've worked around them for so long I'm not going to try to fight a new set of problems.
Look in the documentation for SAS/Access to PC Files and the various file types (Excel alone has I think 8 that SAS can sort of handle).
Another approach is the LIBNAME EXCEL or LIBNAME XLSX option. But if you have multiple header rows before actual data almost everything will be character and you end up writing a data step to fix it anyway. Which I have done when I had to pull data from 200 spreadsheets each with 50+ sheets. At least that data was consistent in layout and easy to parse.
No, there isn't a way to force specific types in Excel, primarily because Excel does not enforce types within a column itself.
Importing from Excel files is painful and the usual recommendation (in any system) is to use a CSV instead, which allows full control. But as you've seen full control does mean you need to actually specify all the correct options.
@Phil_NZ wrote:
Hi all SAS Users,
Because learning CSV is long progress so I still keep continuing with my importing excel along with learning csv at the same time.
I import my data from xlsx file to SAS as below( this file only contains one sheet named Sheet1)
proc import datafile= "E:\Harv 27th_Feb_2021\Report final\exchange_rate_Datastream_adjusted_xlsx" out= exc_rate dbms= xlsx replace; range= "Sheet1$A:BC"; getnames= yes; run;
Afterward, I used the PROC CONTENTS to see if there is any variable not in numeric (because my dataset is all about exchange rate, so must be numeric)
proc contents data=exc_rate out=abc; run;
The result shows that there are some columns is under character type. And of course, it is not what I want, so I want to ask is there any chance to tell SAS from import step that I want to convert all variables to numeric types ?
A part of my data is as below
DATECUR AUD EUR CAD DKK HKD ILS JPY NZD 12/30/1998 1.63132 0.8548 1.5511 6.3655 7.7477 4.1654 114.715 1.90223 12/31/1998 1.63026 0.85157 1.536 6.3645 7.7473 4.1601 112.8 1.89286 01/01/1999 1.63026 0.85157 1.536 6.3645 7.7473 4.1601 112.8 1.89286
And the summary of datatype is as below after using PROC CONTENT
Alphabetic List of Variables and Attributes # Variable Type Len Format Informat Label 17 ARS Num 8 BEST. ARS 42 ATS Num 8 BEST. ATS 2 AUD Num 8 BEST. AUD 18 BRL Num 8 BEST. BRL 4 CAD Num 8 BEST. CAD 14 CHF Num 8 BEST. CHF 19 CLP Num 8 BEST. CLP 20 CNY Num 8 BEST. CNY 21 COP Num 8 BEST. COP 23 CZK Num 8 BEST. CZK 1 DATECUR Num 8 MMDDYY10. DATECUR 43 DEM Char 8 $8. $8. DEM 5 DKK Num 8 BEST. DKK 44 EEK Num 8 BEST. EEK 24 EGP Num 8 BEST. EGP 45 ESP Num 8 BEST. ESP 3 EUR Num 8 BEST. EUR 16 GBP Num 8 BEST. GBP 46 GRD Num 8 BEST. GRD 6 HKD Num 8 BEST. HKD 22 HRK Num 8 BEST. HRK 25 HUF Num 8 BEST. HUF 27 IDR Num 8 BEST. IDR 7 ILS Num 8 BEST. ILS 26 INR Num 8 BEST. INR 8 JPY Num 8 BEST. JPY 28 KES Num 8 BEST. KES 12 KRW Num 8 BEST. KRW 38 LKR Num 8 BEST. LKR 47 LTL Num 8 BEST. LTL 48 LVL Num 8 BEST. LVL 31 MAD Num 8 BEST. MAD 30 MXN Num 8 BEST. MXN 29 MYR Num 8 BEST. MYR 49 NLG Num 8 BEST. NLG 10 NOK Num 8 BEST. NOK 9 NZD Num 8 BEST. NZD 33 PEN Num 8 BEST. PEN 34 PHP Num 8 BEST. PHP 32 PKR Num 8 BEST. PKR 35 PLN Num 8 BEST. PLN 50 ROL Num 8 BEST. ROL 36 RON Char 8 $8. $8. RON 13 SEK Num 8 BEST. SEK 11 SGD Num 8 BEST. SGD 51 SIT Num 8 BEST. SIT 52 SKK Num 8 BEST. SKK 39 THB Num 8 BEST. THB 53 TRL Char 6 $6. $6. TRL 40 TRY Num 8 BEST. TRY 15 TWD Num 8 BEST. TWD 54 VEB Char 7 $7. $7. VEB 41 VES Char 8 $8. $8. VES 37 ZAR Num 8 BEST. ZAR 55 ZWD Char 8 $8. $8. ZWD
As can be seen from the Table: RON, TRL, VEB receive the Char type.
Warmest regards.
It would be more helpful to show one of the columns that comes in as Character that should be numeric. Proc Import with Excel files examines very few rows to set properties so you should see what is in your data that is causing problems relatively easily. Hint: If you are getting $1. for the informat and format then the first rows were all blank.
Any field that is manually entered has the issues with what people place in the fields. If a value is supposed to 0.123 and they enter 0.12.3 the second . makes the value character (or possibly a date if the numbers are just right). Sometimes you get people entering a range: 1.2 - 4.7, not a number. Or "null" "N/A" or similar values.
With Proc import the only force I am aware of sets the values as text with the MIXED=YES option.
There are a number of approaches that involve editing the SAS registry to change options. I don't bother as 1) some of these vary by which actual Excel version the file comes from XLS is not the same as XLSX and Office versions change the value settings and 2) I've worked around them for so long I'm not going to try to fight a new set of problems.
Look in the documentation for SAS/Access to PC Files and the various file types (Excel alone has I think 8 that SAS can sort of handle).
Another approach is the LIBNAME EXCEL or LIBNAME XLSX option. But if you have multiple header rows before actual data almost everything will be character and you end up writing a data step to fix it anyway. Which I have done when I had to pull data from 200 spreadsheets each with 50+ sheets. At least that data was consistent in layout and easy to parse.
Hi @ballardw
It is very useful knowledge to me, many thanks!!
I've had some success using the Text to Columns feature in Excel. Try highlighting the columns you want to convert to numbers in Excel, select the Data menu, then Text to columns. Use the General data type. Its not guaranteed to work but is worth a shot.
@SASKiwi wrote:
I've had some success using the Text to Columns feature in Excel. Try highlighting the columns you want to convert to numbers in Excel, select the Data menu, then Text to columns. Use the General data type. Its not guaranteed to work but is worth a shot.
One of the OP's other threads shows a CSV with values of NA. So the approach would have to be a document-wide search and replace of "NA" to blank.
Hi @ballardw , @SASKiwi , and @Reeza
I know it is not the optimal solution, but follow what @ballardw suggested, I replace all the "NA" by blank in this sheet (it is a very simple thing, just Ctr+H, fill in the form and enter), and run, now all column becomes Numeric.
I appreciate all your contribution, I am still learning CSV but still need to keep my progress.
Warmest regards.
P/S:
Hi @ballardw
Can I ask what does this mean, I can not read your abbreviation here, warm regards.
One of the OP's other threads shows
@Phil_NZ - OP = Original Poster, that is you!
SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.
Ready to level-up your skills? Choose your own adventure.