DATA Step, Macro, Functions and more

Importing specific Excel .xlsx rows

Reply
Occasional Contributor
Posts: 15

Importing specific Excel .xlsx rows

I posted this in the statistical procedures section by accident, but I think it is more appropriate here. I'm new to the board. I apologize for the duplicate posts.

I'm relatively new to the DATA and PROC IMPORT steps of SAS (am more statistics oriented and have not managed importing data into SAS, specifically) and I'm wondering if there is a way to import specific rows into SAS. The problem with my Excel sheets is that they contain a lot of data in rows that I don't need. Also, the first line of the Excel sheet is not a list of variable names (I didn't design these spreadsheets, obviously!).

Specifically, this is what I need to do:

Import cell B2 (this is study ID#) and then import A10, B10 and A11, B11. Then I would like B12 - Y12 (these are time variables) and then B24 - Y24 (the values that correspond to B12-Y12 variables).

Anything in an A cell is a variable, but I don't necessarily NEED it as I could figure out what variable I'm looking at later on.

There are multiple worksheets in each workbook, and I am willing to read each workbook/worksheet in by hand if necessary.

Ordinarily I would run a macro/VBA in Excel to get this info but our security policies are such that we aren't allowed to run macros in Excel or Access!

All suggestions appreciated!

Respected Advisor
Posts: 4,644

Re: Importing specific Excel .xlsx rows

Assuming each part of your data has a uniform data type (character or numeric), you can read all the parts (ranges) in separate datasets with the LIBNAME Engine :

libname xl Excel "&sasforum\datasets\copony.xlsx" header=no access=readonly;

proc sql;

create table id as select F1 as ID from xl.'Sheet1$B2:B2'n;

create table A10B10 as select F1 as A10, F2 as B10 from xl.'Sheet1$A10:B10'n;

create table A11B11 as select F1 as A11, F2 as B11 from xl.'Sheet1$A11:B11'n;

create table B12Y12 as select * from xl.'Sheet1$B12:Y12'n;

create table B24Y24 as select * from xl.'Sheet1$B24:Y24'n;

quit;

libname xl clear;

When you read Excel data without column names (header=no) the variables take the names F1, F2, etc.

You can then combine the parts (SAS datasets id, A10B10, A11B11, B12Y12, B24Y24) with other SQL or data steps.

PG

PG
Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

Thank you very much PGStats - the import worked. The only thing is that the data in cells B12-Y12 is 24 hour style time (e.g., 0:00, 1:00, 2:00 through 24:00). The time converts to "30Dec1899" for all the values in the SAS Table. Do you know of a code to transform this back into time points?

Trusted Advisor
Posts: 1,300

Re: Importing specific Excel .xlsx rows

You can do this at the input stage using the dbsastype option.

proc sql;

create table b12y12 as select * from xl.'Sheet1$B12:Y12'n (dbsastype=(b12="time" ...));

quit;

Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

@FriedEgg, when I input exactly as you wrote I get "ERROR: Invalid syntax for option DBSASTYPE'

I also tried inputting without the ellipses and got "ERROR: Invalid column name specified DBSASTYPE option: b12'

Am I omitting something?

Respected Advisor
Posts: 4,644

Re: Importing specific Excel .xlsx rows

It might be less trouble to use libname options telling SAS to watch for time values:

libname xl Excel "&sasforum\datasets\copony.xlsx" header=no access=readonly usedate=yes scantime=yes;

proc sql;

create table id as select F1 as ID from xl.'Feuil1$B2:B2'n;

create table A10B10 as select F1 as A10, F2 as B10 from xl.'Feuil1$A10:B10'n;

create table A11B11 as select F1 as A11, F2 as B11 from xl.'Feuil1$A11:B11'n;

create table B12Y12 as select * from xl.'Feuil1$B12:Y12'n;

create table B24Y24 as select * from xl.'Feuil1$B24:Y24'n;

quit;

libname xl clear;

PG

PG
Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

Awesome , that was the trick! Thanks so much!

Now, since there are multiple worksheets, how do I combine all the tables so I can get one table of all the values?

Respected Advisor
Posts: 4,644

Re: Importing specific Excel .xlsx rows

Well, first, how do you want the 5 ranges extracted from a single worksheet to be combined into a dataset? - PG

PG
Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

Well...

--B12Y12 is the clock time 00:00 through 24:00 and it is consistent throughout all sheets

--B2 is a unique ID specific to the following variables:

    --A10:B10 is variable1:string value for variable 1

    --A11:B11 is variable2:string value for variable 2

    --B24:Y24 are numeric values that correspond to the time in B12:Y12

    --B25:Y25 are numeric values that correspond to the time in B12:Y12

    --B35:Y35 are numeric values that correspond to the time in B12:Y12

There are roughly 100 unique IDs (identified in B2 in each sheet), each ID has an average of 3 corresponding worksheets of data. Each sheet has the same layout, but it represents a different day of values recorded for the unique ID. Did I explain this clearly? Please let me know if I can clarify further.

Respected Advisor
Posts: 4,644

Re: Importing specific Excel .xlsx rows

How can you tell apart the dates corresponding to a given ID?

PG
Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

B10 in each sheet is actually a date/time it's a string value due to the way it was entered in the original Excel worksheet, example: 21October2010/2345.

Respected Advisor
Posts: 4,644

Re: Importing specific Excel .xlsx rows

I think something like this should work.

/* Define a macro to read one worksheet */

%macro getdata(file,ds);

libname xl Excel "&sasforum\datasets\&file..xlsx" header=no access=readonly usedate=yes scantime=yes;

proc sql;

create table id as select F1 as ID from xl.'Sheet1$B2:B2'n;

create table A10B10 as select catx(" ", F1, F2) as date from xl.'Sheet1$A10:B10'n;

create table A11B11 as select catx(" ", F1, F2) as var2 from xl.'Sheet1$A11:B11'n;

create table t as

select * from xl.'Sheet1$B12:Y12'n

union all

select * from xl.'Sheet1$B24:Y24'n

union all

select * from xl.'Sheet1$B25:Y25'n

union all

select * from xl.'Sheet1$B35:Y35'n;

quit;

libname xl clear;

proc transpose data=t out=tl; run;

proc sql;

create table &ds as

select

    id,

    date,

    var2,

    col1 as t,

    col2 as val1 format=best.,

    col3 as val2 format=best.,

    col4 as val3 format=best.

from id, A10B10, A11B11, tl;

drop table id, A10B10, A11B11, tl, t;

quit;

%mend;

/* Call the macro */

%getdata(copony,ds001);

/* ... repeat for all data files. Give all output datasets names starting with ds */

/* Assemble all datasets that have names starting with ds */

data allDs;

set ds:;

run;

PG

PG
Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

Thanks for your code (sorry for my delayed response - I had a death in my family last month and am just getting back into this SAS project) conceptually it seems sound but I'm getting many errors... can you (or anyone) help me figure out why I'm getting so many errors when I run the code? Errors are as follows:

65   from id, A10C10, A11B11, tl;

66

67   drop table id, A10C10, A11B11, tl, t;

68

69   quit;

70

71   %mend;

72

73

74

75   /* Call the macro */

76

77   %getdata(copony,ds001);

NOTE: Libref XL was successfully assigned as follows:

      Engine:        EXCEL

      Physical Name: "&sasforum\datasets\&file.xlsx

NOTE: Table WORK.ID created, with 1 rows and 1 columns.

NOTE: Table WORK.A10C10 created, with 1 rows and 1 columns.

NOTE: Table WORK.A11B11 created, with 1 rows and 1 columns.

ERROR: Column 1 from the first contributor of UNION ALL is not the same type as its counterpart

       from the second.

ERROR: Column 2 from the first contributor of UNION ALL is not the same type as its counterpart

       from the second.

ERROR: Column 3 from the first contributor of UNION ALL is not the same type as its counterpart

       from the second.

(and so on through row 24...)

and:

NOTE: The SAS System stopped processing this step because of errors.

NOTE: PROCEDURE SQL used (Total process time):

      real time           2.16 seconds

      cpu time            0.43 seconds

NOTE: Libref XL has been deassigned.

ERROR: File WORK.T.DATA does not exist.

NOTE: The SAS System stopped processing this step because of errors.

WARNING: The data set WORK.TL may be incomplete.  When this step was stopped there were 0

         observations and 0 variables.

NOTE: PROCEDURE TRANSPOSE used (Total process time):

      real time           0.14 seconds

      cpu time            0.04 seconds

ERROR: Table WORK.TL doesn't have any columns. PROC SQL requires each of its tables to have at

       least 1 column.

ERROR: Character expression requires a character format.

ERROR: Character expression requires a character format.

ERROR: Character expression requires a character format.

ERROR: The following columns were not found in the contributing tables: col1, col2, col3, col4.

NOTE: Table WORK.ID has been dropped.

NOTE: Table WORK.A10C10 has been dropped.

NOTE: Table WORK.A11B11 has been dropped.

NOTE: Table WORK.TL has been dropped.

WARNING: File WORK.T.DATA does not exist.

WARNING: Table WORK.T has not been dropped.

NOTE: The SAS System stopped processing this step because of errors.

78


And finally:

86

87   set ds:;

ERROR: The data set list (WORK.dsSmiley Happy does not contain any members.

88

89   run;

NOTE: The SAS System stopped processing this step because of errors.

WARNING: The data set WORK.ALLDS may be incomplete.  When this step was stopped there were 0

         observations and 0 variables.

NOTE: DATA statement used (Total process time):

      real time           0.51 seconds

      cpu time            0.04 seconds



Respected Advisor
Posts: 4,644

Re: Importing specific Excel .xlsx rows

Welcome back. Sorry for your loss.

&sasforum\datasets\ isn't a path that is meaningful in your envirionment. It is defined on my machine to point to a particular folder. You should replace it with the location of your xlsx files; something like


libname xl Excel "C:\temp\&file..xlsx" ...

PG

PG
Occasional Contributor
Posts: 15

Re: Importing specific Excel .xlsx rows

thanks for responding so quickly!

&sasforum\datasets\isn't a path that is meaningful in your envirionment. It is defined on my machine to point to a particular folder. You should replace it with the location of your xlsx files; something like


libname xl Excel "C:\temp\&file..xlsx" ...

Yes - I definitely know this and had replaced that path with one that is meaningful in my environment - I just didn't want to post it on a public forum :-). Even with the correct path, I still got all those errors...

Ask a Question
Discussion stats
  • 23 replies
  • 3258 views
  • 7 likes
  • 4 in conversation