BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
cxkev
Fluorite | Level 6

Hi,

I have a dataset that contains a record for each account number for each day and what queue the account currently sits in.

Sample Data:

AccNum          Queue          ExtractDate

A001               1                    27APR2015

A002               1                    27APR2015

A003               2                    27APR2015

A001               1                    28APR2015

A002               1                    28APR2015

A003               2                    28APR2015

A001               3                    29APR2015

A002               3                    29APR2015

A003               4                    29APR2015

If possible, I would like to cerate an SCD dataset that would reduce the size of this dataset and would not take as long to query as it would only capture changes and not store an observation for every account every day, I do not have DI studio. see below:

AccNum          Queue          RowStartDate          RowEndDate

A001               1                    27APR2015          28APR2015

A002               1                    27APR2015          28APR2015

A003               2                    27APR2015          28APR2015          

A001               3                    29APR2015          31DEC9999

A002               3                    29APR2015          31DEC9999

A003               4                    29APR2015          31DEC9999

31dec999 indicates current record.

Once the initial dataset is created, I will continue to receive extracts as in the first piece of sample data above and I will need to use them to update the SCD dataset.

Many thanks for any help.

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

You could refer to the paper

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

Me ,Matt and Arthur.T written at this year SGF2015 ,it is about MERGE skill to transpose data.

data have;
input AccNum  $        Queue          ExtractDate : date9.;
cards;
A001               1                    27APR2015
A002               1                    27APR2015
A003               2                    27APR2015
A001               1                    28APR2015
A002               1                    28APR2015
A003               2                    28APR2015
A001               3                    29APR2015
A002               3                    29APR2015
A003               4                    29APR2015
;
run;
data have;
 set have;
 length name $ 20;
 if AccNum='A001' then n+1;
 if mod(n,2)=1 then name='RowStartDate';
  else name='RowEndDate';
drop n;
run;

proc sql;
 select distinct catt('have(where=(name="',name,'") rename=(ExtractDate=',name,'))') into : list separated by ' '
  from have;
quit;

data want;
 merge &list ;
 if missing(RowEndDate) then RowEndDate='31DEC9999'd;
 format RowStartDate  RowEndDate date9.;
 drop name;
run; 

Xia Keshan

View solution in original post

6 REPLIES 6
Ksharp
Super User

You could refer to the paper

http://support.sas.com/resources/papers/proceedings15/2785-2015.pdf

Me ,Matt and Arthur.T written at this year SGF2015 ,it is about MERGE skill to transpose data.

data have;
input AccNum  $        Queue          ExtractDate : date9.;
cards;
A001               1                    27APR2015
A002               1                    27APR2015
A003               2                    27APR2015
A001               1                    28APR2015
A002               1                    28APR2015
A003               2                    28APR2015
A001               3                    29APR2015
A002               3                    29APR2015
A003               4                    29APR2015
;
run;
data have;
 set have;
 length name $ 20;
 if AccNum='A001' then n+1;
 if mod(n,2)=1 then name='RowStartDate';
  else name='RowEndDate';
drop n;
run;

proc sql;
 select distinct catt('have(where=(name="',name,'") rename=(ExtractDate=',name,'))') into : list separated by ' '
  from have;
quit;

data want;
 merge &list ;
 if missing(RowEndDate) then RowEndDate='31DEC9999'd;
 format RowStartDate  RowEndDate date9.;
 drop name;
run; 

Xia Keshan

cxkev
Fluorite | Level 6

Thanks for that Xia, that is really helpful. It works on the sample data, but should it work when adding more data?

For example, I tried the following sample data (adding data for 26APR2015, where accounts were in the same queues):

data have;

input AccNum  $        Queue          ExtractDate : date9.;

cards;

A001               1                    26APR2015

A002               1                    26APR2015

A003               2                    26APR2015

A001               1                    27APR2015

A002               1                    27APR2015

A003               2                    27APR2015

A001               1                    28APR2015

A002               1                    28APR2015

A003               2                    28APR2015

A001               3                    29APR2015

A002               3                    29APR2015

A003               4                    29APR2015

;

run;

But it didn't quite yield the expected results, For example, I would want to see 2 observations for A001, one with the row start date 26APR2015 & Row end date 28APR2015 in queue 1 then the second observation for A001 to show row start date 29APR2015 and row end date 31DEC9999.

Thanks again for your help

LinusH
Tourmaline | Level 20

First, you don't know how to build your SCD type 2 logic. You don't have DI Studio.

My guess is that you wish ti implement this logic for more than one table. And perhaps make it dynamic so you could reuse your code.

Now it's getting a bit complex. And any implemented code needs maintenance.

Bottom line is, there's reason for people to use ETL tools. By the time and money spent at each any any site by developing they own logic, I'm quite certain that an investment in a proven tool will be more cost effective in the long run.

Data never sleeps
cxkev
Fluorite | Level 6

Hi,

Only one table but I take your point. I'll close off the discussion.

Thanks for your help.

Regards,

Ksharp
Super User

Then what is your output ?

santosh_pat69
Quartz | Level 8
Hi all,
Even I'm trying to achieve the same using Base SAS programmer Ksharp I have tried using your code but I'm getting missing values for the start dates.
My Sample data has around 14800 records,a free the first run the count was 7300 which is fine.
Kindly Suggest

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 6 replies
  • 4291 views
  • 4 likes
  • 4 in conversation