Desktop productivity for business analysts and programmers

How to "compress" a dataset

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 10
Accepted Solution

How to "compress" a dataset

Hi Guru's

 

In Enterprise Guide my dataset looks like this

 

              Date1    Date2    Date3

A                     01/01/17 

A                             01/02/17

A            01/03/17

A            01/04/17

B            01/01/17

B                             02/01/17

 

I want to "compress" the dataset to this:

 

              Date1    Date2    Date3

A             01/03/17 01/01/17 01/02/17

A             01/04/17

B             01/01/17          02/01/17

 

or alternative:

 

              Date1    Date2    Date3

A             01/03/17 01/01/17 01/02/17

A             01/04/17 01/01/17 01/02/17

B             01/01/17          02/01/17

 

Any good ideas?

 

/Ole


Accepted Solutions
Solution
‎12-14-2017 06:05 AM
Super User
Posts: 9,548

Re: How to "compress" a dataset

Using your original structure for an example, the following should do it:

data have;
infile cards dlm=',' dsd truncover;
input customer $ (date1-date3) (:mmddyy8.);
format date1-date3 mmddyy10.;
cards;
A,,01/01/17 
A,,,01/02/17
A,01/03/17
A,01/04/17
B,01/01/17
B,,,02/01/17
;
run;

data
  date1 (keep=customer date1)
  date2 (keep=customer date2)
  date3 (keep=customer date3)
;
set have;
if date1 ne . then output date1;
if date2 ne . then output date2;
if date3 ne . then output date3;
run;

data want;
if _n_ = 0 then set have; * this does nothing except defining the variables, so that "of _all_" in the next line works;
call missing(of _all_); 
merge
  date1
  date2
  date3
;
by customer;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code

View solution in original post


All Replies
Super User
Posts: 5,824

Re: How to "compress" a dataset

What does the dates mean?
Why are they stored like that?
And what would your new structure tell from a business perspective?
Data never sleeps
Occasional Contributor
Posts: 10

Re: How to "compress" a dataset

Hi

 

Here's some business context (very simplified)

 

Date1 = Date for received phone call

Date2 = Date for outgoing phone call

Date3 = Date for received email

A = Customer A

B = Customer B

 

Dates are SAS dates.

 

/Ole

Respected Advisor
Posts: 4,536

Re: How to "compress" a dataset

@obertel

From a data organisation point of view it would be much better to re-structure your data like {customer, date, type}. 

Is that an option for you and if yes do you need help with the code for that?

Super User
Posts: 9,548

Re: How to "compress" a dataset

I'd much rather have the fields customer, date and comm_type, where comm_type has values for incoming/outgoing call, received/sent email, and so on. A new communication type won't need an expansion of the structure, everything is done via data.

As a side-effect, you will always (by sorting by date) have a clear picture of the history of a customer; the wide format will quite effectively prevent that.

---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 10

Re: How to "compress" a dataset

Posted in reply to KurtBremser

Hi

 

Thanks for your suggestions.

 

I have no problem structuring the data in customer, type and date.

The challenge is the presentation layer, where I want to present the dates of three specific types in columns.

 

/Ole

 

 

Solution
‎12-14-2017 06:05 AM
Super User
Posts: 9,548

Re: How to "compress" a dataset

Using your original structure for an example, the following should do it:

data have;
infile cards dlm=',' dsd truncover;
input customer $ (date1-date3) (:mmddyy8.);
format date1-date3 mmddyy10.;
cards;
A,,01/01/17 
A,,,01/02/17
A,01/03/17
A,01/04/17
B,01/01/17
B,,,02/01/17
;
run;

data
  date1 (keep=customer date1)
  date2 (keep=customer date2)
  date3 (keep=customer date3)
;
set have;
if date1 ne . then output date1;
if date2 ne . then output date2;
if date3 ne . then output date3;
run;

data want;
if _n_ = 0 then set have; * this does nothing except defining the variables, so that "of _all_" in the next line works;
call missing(of _all_); 
merge
  date1
  date2
  date3
;
by customer;
run;
---------------------------------------------------------------------------------------------
Maxims of Maximally Efficient SAS Programmers
How to convert datasets to data steps
How to post code
Occasional Contributor
Posts: 10

Re: How to "compress" a dataset

Posted in reply to KurtBremser

Hi Kurt

 

Your solution works perfect. Though, I had to sort date1, date2 and date3 dataset before the merge.

Thank you for your help and thanks to all for your input.

 

/Ole

Super User
Posts: 5,824

Re: How to "compress" a dataset

What is your presentation layer?

Most tools/many SAS PROCs have ability to to cross tabulation using categorical values (but you don't seem to have any measures, this is more of a list?).

Also, even if you are transposing up the values on date category, you'll stil don't have a clear key in the table, and your result table with have an unclear structure/granularity.

 

Data never sleeps
Super User
Posts: 10,609

Re: How to "compress" a dataset

data have;
infile cards dlm=',' dsd truncover;
input customer $ (date1-date3) (:mmddyy8.);
format date1-date3 mmddyy10.;
cards;
A,,01/01/17 
A,,,01/02/17
A,01/03/17
A,01/04/17
B,01/01/17
B,,,02/01/17
;
run;
data want;
 update have(obs=0) have;
 by customer;
 if last.customer or nmiss(of date:)=0 then output;
run;
☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 9 replies
  • 232 views
  • 2 likes
  • 5 in conversation