Hi Guys ! Please help me figure this out
I have a payments dataset which stores the payments collected from different customers (ID column represent each customer) on different dates
The dataset which looks like this (in excel) :-
ID 31-Mar-19 30-Apr-19 31-May-13 30-Jun-13
1 1000 1200 1000 1000
2 2000 2000 2000 2200
3 3000 3000 3000 3030
4 1500 1600 1650 1600
For customer ID "1", the amount collected on 31-May-19 was 1000, on 30-Apr-19 was 1200 and so on....
The problem with the dataset is that it does not have any CHARATCER headers for the payments.
I need to perform some tasks on this dataset in SAS, but for that I need the dataset to be in this form
ID Date1 Date2 Date3 Date4
31-Mar-19 30-Apr-19 31-May-13 30-Jun-13
1 1000 1200 1000 1000
2 2000 2000 2000 2200
3 3000 3000 3000 3030
4 1500 1600 1650 1600
So basically I need to shift the dates to a new row and add custom headers (Date1 to Date4) to the dataset. I have to add these headers in SAS as I am prohibited to do any changes in the excel data.
Please help me out !
There is no need for an extra row to make TRANSPOSE work from your dataset, you just need to retrieve the names of the columns that have to be transposed:
data have;
input ID '31-Mar-19'n '30-Apr-19'n '31-May-13'n '30-Jun-13'n;
datalines;
1 1000 1200 1000 1000
2 2000 2000 2000 2200
3 3000 3000 3000 3030
4 1500 1600 1650 1600
;
proc sql noprint;
select nliteral(name) into :varlist separated by " "
from dictionary.columns
where libname = "WORK" and memname = "HAVE" and upcase(name) ne "ID";
quit;
proc transpose
data=have
out=long (rename=(col1=payment));
;
by id;
var &varlist.;
run;
data want;
set long;
date = input(_name_,date9.);
format date yymmdd10.;
drop _name_;
run;
If ID ends up as numeric in your data, I would also use the last data step to convert it to a character value; if it is already character after import, the transpose would even be simpler, as you could use
var _numeric_;
without a specific list.
Note: You do not want a "header row". Since a very common activity with data sets is sorting anything on a "row" would get shifted. Plus, SAS will not allow you to place character values into a numeric field.
SAS has something called a LABEL that you associate with variables to provide more information.
You can assign labels 1) permanently when the set is created or using Proc Datasets on an existing data set or 2) provide Label statements in almost any procedure to override the default.
Labels can be quite a bit longer , up to 256 characters, to provide more information in output.
Example:
data example; input ID Date1 Date2 Date3 Date4; label Date1='31-Mar-19' Date2='30-Apr-19' Date3='31-May-13' Date4='30-Jun-13' ; datalines; 1 1000 1200 1000 1000 2 2000 2000 2000 2200 3 3000 3000 3000 3030 4 1500 1600 1650 1600 ; proc print data=example label; run; /*providing labels at use*/ proc print data=example noobs label; label date1='Report date: 31 Mar 2019' date2='Sale date: 30 Apr 2019' ; run;
You can, in some procedures control some aspects of the display by adding special characters in telling those procedures to use the special character to "split" text rows:
proc print data=example noobs label split='*'; label date1='Report* date:* 31 Mar 2019' date2='Sale* date:* 30 Apr 2019' ; run;
If I understand what is happening here is you have data in a spreadsheet where the DATE value is stored in the column header.
To use such data for most things in SAS (or really any type of analysis) you probably will want to convert it to a "tall" structure where the DATE value is stored in a variable.
So perhaps something like this:
Obs id col date value
1 1 1 31MAR2019 1000
2 1 2 30APR2019 1200
3 1 3 31MAY2013 1000
4 1 4 30JUN2013 1000
5 2 1 31MAR2019 2000
6 2 2 30APR2019 2000
7 2 3 31MAY2013 2000
8 2 4 30JUN2013 2200
9 3 1 31MAR2019 3000
10 3 2 30APR2019 3000
11 3 3 31MAY2013 3000
12 3 4 30JUN2013 3030
13 4 1 31MAR2019 1500
14 4 2 30APR2019 1600
15 4 3 31MAY2013 1650
16 4 4 30JUN2013 1600
If you wanted to look at the data in a way that was similar to your original spreadsheet you could just make a report from the more useful tall structure.
proc report ;
column id value,date ;
define id / group;
define date / across;
define value / ' ';
run;
Results
Hi Tom ! I do not want the data to look similar to my original spreadsheet. My end goal is to convert this data (the data without date headers) to TALL structure (like you have mentioned). To achive that I want headers on the date columns. I have made the TALL data structure using the "data with date headers" in it by using proc transpose
PROC TRANSPOSE data=.....
by ID
VAR Date1-Date4
and doing some cleaning steps on the obtained data.
The problem is that in the actual data (the data which doesn't have the date headers) there are no headers on dates, and I can not write actual dates in the 'VAR statement' as these dates might change in the next data. So, I need a method which can add these headers for me.
Use
options validvarname=v7;
before importing the excel-file, this will give you something like
In proc transpose you can then use
var _:;
to select all variables starting with an underscore.
There is no need for an extra row to make TRANSPOSE work from your dataset, you just need to retrieve the names of the columns that have to be transposed:
data have;
input ID '31-Mar-19'n '30-Apr-19'n '31-May-13'n '30-Jun-13'n;
datalines;
1 1000 1200 1000 1000
2 2000 2000 2000 2200
3 3000 3000 3000 3030
4 1500 1600 1650 1600
;
proc sql noprint;
select nliteral(name) into :varlist separated by " "
from dictionary.columns
where libname = "WORK" and memname = "HAVE" and upcase(name) ne "ID";
quit;
proc transpose
data=have
out=long (rename=(col1=payment));
;
by id;
var &varlist.;
run;
data want;
set long;
date = input(_name_,date9.);
format date yymmdd10.;
drop _name_;
run;
If ID ends up as numeric in your data, I would also use the last data step to convert it to a character value; if it is already character after import, the transpose would even be simpler, as you could use
var _numeric_;
without a specific list.
Thank You everyone and especially @Kurt_Bremser , This solves the issue !
Do you really have a dataset with variable names that look like dates? Or are those the variable labels?
Anyway just use PROC TRANPOSE and then generate the DATE from the either the name or the label.
proc transpose data=have out=tall ;
by id;
run;
data want;
set tall;
date = input(_name_,date11.);
format date date9.;
run;
If you really converted an EXCEL spreadsheet into a SAS dataset and the header row had actual date values in Excel then most likely the variable names have been converted into the number of days since 1900 (as Excel counts them). So you might need to work a little harder to convert them into actual date values.
date = input(compress(_name_,'_'),32.)+'30DEC1899'd ;
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.