BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
tumul
Calcite | Level 5

Hi Guys ! Please help me figure this out

I have a payments dataset which stores the payments collected from different customers (ID column represent each customer) on different dates

 

The dataset which looks like this (in excel) :-

 

ID 31-Mar-19 30-Apr-19 31-May-13 30-Jun-13

1      1000         1200          1000          1000

2      2000          2000         2000           2200

3      3000          3000         3000           3030

4      1500          1600         1650           1600

 

For customer ID "1", the amount collected on 31-May-19 was 1000, on 30-Apr-19 was 1200 and so on....

 

The problem with the dataset is that it does not have any CHARATCER headers for the payments. 

I need to perform some tasks on this dataset in SAS, but for that I need the dataset to be in this form

 

ID     Date1         Date2            Date3            Date4

      31-Mar-19    30-Apr-19    31-May-13     30-Jun-13

1      1000           1200             1000              1000

2      2000            2000            2000               2200

3      3000            3000            3000               3030

4      1500            1600           1650               1600

 

So basically I need to shift the dates to a new row and add custom headers (Date1 to Date4) to the dataset. I have to add these headers in SAS as I am prohibited to do any changes in the excel data.

 

Please help me out !

1 ACCEPTED SOLUTION

Accepted Solutions
Kurt_Bremser
Super User

There is no need for an extra row to make TRANSPOSE work from your dataset, you just need to retrieve the names of the columns that have to be transposed:

data have;
input ID '31-Mar-19'n '30-Apr-19'n '31-May-13'n '30-Jun-13'n;
datalines;
1      1000         1200          1000          1000
2      2000          2000         2000           2200
3      3000          3000         3000           3030
4      1500          1600         1650           1600
;

proc sql noprint;
select nliteral(name) into :varlist separated by " "
from dictionary.columns
where libname = "WORK" and memname = "HAVE" and upcase(name) ne "ID";
quit;

proc transpose
  data=have
  out=long (rename=(col1=payment));
;
by id;
var &varlist.;
run;

data want;
set long;
date = input(_name_,date9.);
format date yymmdd10.;
drop _name_;
run;

If ID ends up as numeric in your data, I would also use the last data step to convert it to a character value; if it is already character after import, the transpose would even be simpler, as you could use

var _numeric_;

without a specific list.

View solution in original post

8 REPLIES 8
ballardw
Super User

Note: You do not want a "header row". Since a very common activity with data sets is sorting anything on a "row" would get shifted. Plus, SAS will not allow you to place character values into a numeric field.

SAS has something called a LABEL that you associate with variables to provide more information.

 

You can assign labels 1) permanently when the set is created or using Proc Datasets on an existing data set or 2) provide Label statements in almost any procedure to override the default.

Labels can be quite a bit longer , up to 256 characters, to provide more information in output.

 

Example:

data example;
   input ID     Date1         Date2            Date3            Date4;
   label
      Date1='31-Mar-19'    
      Date2='30-Apr-19'
      Date3='31-May-13'     
      Date4='30-Jun-13'
   ;
datalines;
1      1000           1200             1000              1000
2      2000            2000            2000               2200
3      3000            3000            3000               3030
4      1500            1600           1650               1600
;

proc print data=example label;
run;

/*providing labels at use*/
proc print data=example noobs label;
   label date1='Report date: 31 Mar 2019'
         date2='Sale date:  30 Apr 2019'
   ;
run;

You can, in some procedures control some aspects of the display by adding special characters in telling those procedures to use the special character to "split" text rows:

proc print data=example noobs label split='*';
   label date1='Report* date:* 31 Mar 2019'
         date2='Sale* date:*  30 Apr 2019'
   ;
run;
Tom
Super User Tom
Super User

If I understand what is happening here is you have data in a spreadsheet where the DATE value is stored in the column header.

To use such data for most things in SAS (or really any type of analysis) you probably will want to convert it to a "tall" structure where the DATE value is stored in a variable.

So perhaps something like this:

Obs    id    col         date    value

  1     1     1     31MAR2019     1000
  2     1     2     30APR2019     1200
  3     1     3     31MAY2013     1000
  4     1     4     30JUN2013     1000
  5     2     1     31MAR2019     2000
  6     2     2     30APR2019     2000
  7     2     3     31MAY2013     2000
  8     2     4     30JUN2013     2200
  9     3     1     31MAR2019     3000
 10     3     2     30APR2019     3000
 11     3     3     31MAY2013     3000
 12     3     4     30JUN2013     3030
 13     4     1     31MAR2019     1500
 14     4     2     30APR2019     1600
 15     4     3     31MAY2013     1650
 16     4     4     30JUN2013     1600

If you wanted to look at the data in a way that was similar to your original spreadsheet you could just make a report from the more useful tall structure.

proc report ;
  column id value,date ;
  define id / group;
  define date / across;
  define value / ' ';
run;

Results

image.png

 

tumul
Calcite | Level 5

Hi Tom ! I do not want the data to look similar to my original spreadsheet. My end goal is to convert this data (the data without date headers) to TALL structure (like you have mentioned). To achive that I want headers on the date columns. I have made the TALL data structure using the "data with date headers" in it by using proc transpose

 

PROC TRANSPOSE data=.....

by ID

VAR Date1-Date4

 

and doing some cleaning steps on the obtained data.

 

The problem is that in the actual data (the data which doesn't have the date headers) there are no headers on dates, and I can not write actual dates in the 'VAR statement' as these dates might change in the next data. So, I need a method which can add these headers for me.

andreas_lds
Jade | Level 19

Use

options validvarname=v7;

before importing the excel-file, this will give you something like

header.jpg

In proc transpose you can then use

var _:;

to select all variables starting with an underscore.

Kurt_Bremser
Super User

There is no need for an extra row to make TRANSPOSE work from your dataset, you just need to retrieve the names of the columns that have to be transposed:

data have;
input ID '31-Mar-19'n '30-Apr-19'n '31-May-13'n '30-Jun-13'n;
datalines;
1      1000         1200          1000          1000
2      2000          2000         2000           2200
3      3000          3000         3000           3030
4      1500          1600         1650           1600
;

proc sql noprint;
select nliteral(name) into :varlist separated by " "
from dictionary.columns
where libname = "WORK" and memname = "HAVE" and upcase(name) ne "ID";
quit;

proc transpose
  data=have
  out=long (rename=(col1=payment));
;
by id;
var &varlist.;
run;

data want;
set long;
date = input(_name_,date9.);
format date yymmdd10.;
drop _name_;
run;

If ID ends up as numeric in your data, I would also use the last data step to convert it to a character value; if it is already character after import, the transpose would even be simpler, as you could use

var _numeric_;

without a specific list.

tumul
Calcite | Level 5

Thank You everyone and especially @Kurt_Bremser , This solves the issue !

Tom
Super User Tom
Super User

Do you really have a dataset with variable names that look like dates?  Or are those the variable labels?

Anyway just use PROC TRANPOSE and then generate the DATE from the either the name or the label.

 

proc transpose data=have out=tall ;
  by id;
run;

data want;
  set tall;
  date = input(_name_,date11.);
  format date date9.;
run;

 

If you really converted an EXCEL spreadsheet into a SAS dataset and the header row had actual date values in Excel then most likely the variable names have been converted into the number of days since 1900 (as Excel counts them).  So you might need to work a little harder to convert them into actual date values.

date = input(compress(_name_,'_'),32.)+'30DEC1899'd ;
tumul
Calcite | Level 5
The column headers were actually dates only. They were not strings, Yes ! those value were converted innto no_of_days since 1960 (after reading into SAS) and I had to transform them using offset 1900 (as this is what excel takes as reference date)

thank you for you support !

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 8 replies
  • 4500 views
  • 0 likes
  • 5 in conversation