BookmarkSubscribeRSS Feed
LB3
Calcite | Level 5 LB3
Calcite | Level 5

I have an Excel file (a1:vk119) that looks like this:

 

Term1/1/20191/2/20191/3/20191/4/2019...
122.5%2.6%2.3%2.5% 
133.2%3.4%3.6%3.1%...
142.75%3.2%2.7%3.15%...
..................

 

I need to tranpose the file so that I have Term and Date with corresponding values for each Date/Term combination. I've tried two methods for importing the file. I tried Using:

proc import

datafile="\\filepath\excelFile.xlsx"

Out=XLFile

DBMS=Excel

replace;

SHEET="Sheet1";

Getnames=YES;

This gives me some char fields with underscores that I can compress to get the dates, however I don't get all of the columns a:vx. So I used, DBMS=XLSX but I strangely end up with a 2 sets of columns names which I don't know what I can do with. 

TermDay_Name_1Day_Name_2..._43476_43483...
122.5%2.6%2.3%2.4%2.65%...
133.2%3.4%3.6%3.1%3.35%...
.....................

 

I can transpose the table like this:

proc transpose data=XLFile out=XLFileOut;

by TERM;

var _:;

but I end up missing a large number of dates from the file. I'm in a large company and the file is updated frequently and used by other departments, so I can't change the excel file directly. Any thoughts as to why the column names are coming into SAS like this and any ideas for workarounds?

6 REPLIES 6
ballardw
Super User

Show us proc contents of the SAS data set.

SAS will have a variable name for every column that imported. If column VX was imported then ALL of the columns are in the data somewhere. But you may need to read the log about duplicate names or such.

 

And then show the code you used to "compress to get the dates". I suspect that you need to watch your log for the step where you compress things as well.

 

Part of what you are seeing likely comes from "e is updated frequently and used by other departments" and the typical Excel user does not pay any attention to what Excel does for entering values.So you get mixes of column headings with "numeric" and character dates depending on who edits them.

I might suggest saving the xlsx file to CSV and import the file that way. At least then you will see which columns get read how as a data step will be generated.

 

Values like 43476 are likely to be the numeric dates with an Excel date format applied. Excel dates are the number of days since 1 Jan 1900.

 

LB3
Calcite | Level 5 LB3
Calcite | Level 5
 

It's fairly large, but here's a sample of proc contents:

ProcContents.png

 

 

 

LB3
Calcite | Level 5 LB3
Calcite | Level 5
This actually does a fairly good job of transposing the data and formating the dates, but again, I'm missing a number of records/dates:

proc transpose data=XLFileOut out=XLTrans;
by TERM;
var _:;
run;


proc sql;
create table FinalOuput as
select
Term
,COL1
,input(_LABEL_,best.)-21916 as DATE format=date9.
FROM XLTrans;
ballardw
Super User

Proc transpose really does not do well with a mix of numeric and character variables on a VAR statement.

So if any of the variables you want from the VAR are character that would be one cause. Since Proc Import does not guarantee the type of any specific column then you would likely have columns that are character if the first twenty or so rows of values for the column are missing.

 

Second your transpose will not include any of the "day_term" variables because the names do not start with _. So are those the "missing" variables?

Your proc contents tells me that your original XLSX file likely has mixed values in the names. I see that column 9 and 14 apparently have actual date numeric values but the other columns from 2 to 326 do not.

If this file is supposed to have one column per day of the year I would likely look at reading it with using a data step and either a CSV version of the data or PC File server.

If my one column per date idea is correct a CSV could be read with something like:

data want;
  infile "<path>\file.csv" dlm=',' lrecl=32000 obs=2;
  input term @;
  do date= '01Jan2019'd to '31Dec2019'd ; /* or what ever the last column should be*/
   input value @;
   output;
  end;
  format date date9.;
run;

 

LB3
Calcite | Level 5 LB3
Calcite | Level 5
This seems to give me the Term and Date columns ok, but the value column is either null or 0. Am I missing a step?
ballardw
Super User

Yes, you are missing any actual data.

Instructions here: https://communities.sas.com/t5/SAS-Communities-Library/How-to-create-a-data-step-version-of-your-dat... will show how to turn an existing SAS data set into data step code that can be pasted into a forum code box using the </> icon or attached as text to show exactly what you have and that we can test code against.

 

Probably the example data should include about 15 of the "date" columns.

 

First your shown transpose is throwing away every variable that does not start with _. So if you actually have any of those in FinalOutput there is more code you haven't shared. Second, does XLTrans have Any Col2, Col3 (or higher numbered).

 

I suggest, if you haven't already, printing the first 3 rows of the XLFile data set. You may be surprised to find a lot of missing values. This could happen from people entering 2.3% as strings instead of 0.023 and assigning a percent display in Excel. The other thing to check with your 0 values is what Format is currently assigned. If the format is something like f3.0 then values less than 0.05 (less than 5%) will display as 0.

Run this code a look in the log for the result as one example.

data _null_;
   x=0.023;
   put 'F3.0 format: ' x= f3.0  'with Percent8.1 format: ' x= percent8.1;
run;

 

hackathon24-white-horiz.png

The 2025 SAS Hackathon Kicks Off on June 11!

Watch the live Hackathon Kickoff to get all the essential information about the SAS Hackathon—including how to join, how to participate, and expert tips for success.

YouTube LinkedIn

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 1514 views
  • 0 likes
  • 2 in conversation