BookmarkSubscribeRSS Feed
Ronein
Meteorite | Level 14

Hello

For each customer ID there are columns with information of debt.

The name of the fields is with structure tYYMM.

There is also information of start and end periods .

There are  2 required tasks:

1- Calculate for each customer the total of debt between start and end months

For example: 

ID=999 , start=1904 ,end=1906 so Total=sum(t1904,t1905,t1906)

ID=888 , start=1906 ,end=1908 so Total=sum(t1906,t1907,t1908)

 

2- Create a new data set called "New" with new fields called L1,L2,L3

Each field will contain information of tYYMM between start and end

For example:

ID=999 , start=1904 ,end=1906 so

L1=value in t1904

L2=value in t1905

L3=value in t1906

 

Data Rawtb;
input ID start end t1901 t1902 t1903 t1904 t1905 t1906 t1907 t1908 t1909 t1910 t1911 t1912;
cards;
999 1904 1906 1 3 5 7 9 2 4 6 2 7 9 4
888 1906 1908 3 5 2 1 6 1 2 8 9 1 8 7
444 1901 1903 8 5 9 1 4 1 2 6 9 1 2 3
;
run;

/*Task1:Add a new column called "Total" in data set "Rawtb" .
The new field "Total" will be calculated by sum of tYYMM columns  between start and end */
For ID=999 Total=sum(t1904,t1905,t1906);
For ID=888 Total=sum(t1906,t1907,t1908);
For ID=444 Total=sum(t1901,t1902,t1903);

/*Task2:Create a new data set with columns L1 L2 L3 with information of tYYMM between start and End*/
Data Rawtb;
input ID start end L1 L2 L3;
cards;
999 1904 1906 7 9 2
888 1906 1908 1 2 8
444 1901 1903 8 5 9
;
run;

 
7 REPLIES 7
Ronein
Meteorite | Level 14
The structure of data is in wide because this the the war data that I receive and I need to work in this situation...
Kurt_Bremser
Super User

When you receive the data in a sub-optimal format/layout, your FIRST step has always to be to bring the data into usable shape. This is the "T" in ETL (Extract, Transform, Load).

This involves working on the dataset structure, but also transforming variable attributes to what makes later work easier. Any minute you spend doing this saves hours of wasted time later. This advice comes from 4 decades in IT and 2+ decades of working with SAS, so you might contemplate heeding it.

Kurt_Bremser
Super User

And here's the whole shebang:

/* Extract */
Data Rawtb;
input ID start end t1901 t1902 t1903 t1904 t1905 t1906 t1907 t1908 t1909 t1910 t1911 t1912;
cards;
999 1904 1906 1 3 5 7 9 2 4 6 2 7 9 4
888 1906 1908 3 5 2 1 6 1 2 8 9 1 8 7
444 1901 1903 8 5 9 1 4 1 2 6 9 1 2 3
;

/* Transform */
proc transpose
  data=rawtb
  out=trans
;
by id notsorted;
var t:;
run;

data transaction;
set trans;
period = input(cats("20",substr(_name_,2)),yymmn6.);
format period yymmd7.;
drop _name_;
rename col1=value;
run;

data dimension;
set rawtb (rename=(start=_start end=_end));
start = input(cats("20",_start),yymmn6.);
end = input(cats("20",_end),yymmn6.);
format start end yymmd7.;
keep id start end;
run;

/* Load (done concurrently with sort) */
proc sort data=transaction;
by id period;
run;

proc sort data=dimension;
by id;
run;

With this data (made intelligent by using a superior layout and proper data types (dates!)), the two wanted tables can be created in one step with easy code that will adapt itself seamlessly to the input data:

data
  want1 (keep=id start end total)
  want2 (keep=id start end l value)
;
merge
  dimension
  transaction
;
by id;
if first.id
then do;
  total = 0;
  l = 0;
end;
if start le period le end
then do;
  total + value;
  l + 1;
  output want2;
end;
if last.id then output want1;
run;

Note that I kept a long layout for want2; a wide report for human consumption including export to Excel can easily be done with PROC REPORT:

proc report data=want2;
column id start end l,value;
define id / group;
define start / group;
define end / group;
define l / "" across;
define value / sum;
run;
Kurt_Bremser
Super User

Note that @PeterClemmensen's code will fail as soon as start and end are more than 2 periods apart. The code I posted will be completely independent from this and need no modification. This is what you want to achieve, or you'll spend close to 100% of your future time adapting existing code instead of solving new issues.

andreas_lds
Jade | Level 19

If you want a solution that works even if the t-variables change, you have to transpose the data or write ugly code.

PeterClemmensen
Tourmaline | Level 20
Data Rawtb;
input ID start end t1901 t1902 t1903 t1904 t1905 t1906 t1907 t1908 t1909 t1910 t1911 t1912;
cards;
999 1904 1906 1 3 5 7 9 2 4 6 2 7 9 4
888 1906 1908 3 5 2 1 6 1 2 8 9 1 8 7
444 1901 1903 8 5 9 1 4 1 2 6 9 1 2 3
;
run;

data want1 (keep = ID start end t:) 
     want2 (keep = ID start end L1-l3);
   set Rawtb;
   array t {*} t:;
   array L L1 - L3;
   _I_ = 1;
   do i = mod(start, 10) by 1 while (_I_ < 4);
      Total + t [i];
      L = t [i];
      _I_ + 1;
   end;
run;

SAS Innovate 2025: Call for Content

Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!

Submit your idea!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 880 views
  • 1 like
  • 4 in conversation