BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
adi121
Fluorite | Level 6

I have a dataset spend with numeric values -

 

data spend;
input custid week spends;
cards;
1 1 4365
1 2 457
1 3 124
1 4 35547
2 5 4678
2 6 21253
3 7 32534
3 1 23423
3 2 23567
;
run;

 

i want output to look like

 

1 1 4365
1 2 457
1 3 124
1 4 35547

1 5  .

1 6  .

1 7  .

2 1  .

2 2 .

2 3 .

2 4.
2 5 4678
2 6 21253

2 7 .

3 1 23423

3 2 23567

3 3 .

3 4 .

3 5 .

3 6
3 7 32534

So basically i want to increment the values of week to 7 and those which don't have any values in spend should be missing.

 

data a;
set spend;
do j = 1 to 7;
week = j;
output;
end;
drop j;
run;

 

in above code cust id 1 and 3 are getting repeated 2 times and spend values are also not correct.

 

Please help.

 

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

The right solution depends on how to determine the wanted list of weeks.  

If you know the range of weeks is always 1 to 7 then use that to generate the missing records.

data skeleton;
  set spend(keep=custid week);
  by custid;
  if first.custid then do week=1 to 7;
    output;
  end;
run;
data want;
  merge skeleton spend;
  by custid week;
run;

If the list is more flexible then you will need to do more work to generate the skeleton dataset.  For example you might want to find the maximum values of week.  Or perhaps the min and the max.  Or just the list of weeks that appear for any custid.

View solution in original post

7 REPLIES 7
novinosrin
Tourmaline | Level 20
data spend;
input custid week spends;
cards;
1 1 4365
1 2 457
1 3 124
1 4 35547
2 5 4678
2 6 21253
3 7 32534
3 1 23423
3 2 23567
;
run;
%let num_of_weeks=7;
data want ;
if _n_=1 then do;
	if 0 then set spend;
   dcl hash H (dataset: "spend") ;
   h.definekey  ("custid","week") ;
   h.definedata ("spends") ;
   h.definedone () ;
   end;
set spend(keep=custid);
by custid;
if first.custid;
do week=1 to &num_of_weeks;
if h.find() ne 0 then call missing(spends);
output;
end;
run;
ErikLund_Jensen
Rhodochrosite | Level 12

Hi @adi121 @novinosrin 

A very elegant solution - i am always impressed by people who has hash objects as their first choice in the tool box. This is an interesting problem, because it can be solved in so many ways, and I had great fun experimenting with different techniques. Here are two more traditional solutions:

 

data spend;
input custid week spends;
cards;
1 1 4365
1 2 457
1 3 124
1 4 35547
2 5 4678
2 6 21253
3 7 32534
3 1 23423
3 2 23567
;
run;

/* first method - requires input sorted by custid */
data want1 (drop= _:); 
	set spend; by custid;
	array w _wk1-_wk7;
	array s _sp1-_sp7;
	retain _wk1-_wk7 _sp1-_sp7;

	* initiate arrays for weeks and spends with missing values for given customer;
	if first.custid then do;
		call missing(of _wk1-_wk7);
		call missing(of _sp1-_sp7);
	end;

	* place actual week/spends in array variables;
	w{week} = week;
	s{week} = spends;

	* output all weeks after all observations for given customer are processed;
	if last.custid then do _i = 1 to 7;
		week = w{_i};
		spends = s{_i};
		output;
	end;
run;

/* second method - requires input sorted by custid and week */
proc sort data=spend; by custid week;
run;

data want2 (drop= last week spends rename=(nweek=week nspends=spends)); 
	set spend; by custid;
	retain last;
	last = lag(week);
	if first.custid then last = 0;

	* add missing weeks before/between weeks in input;
	do i = last + 1 to week - 1;
		nweek = i;
		nspends = .;
		output;
	end;

	* actual observation; 
	nweek = week;
	nspends = spends;
	output;

	* add missing weeks after last week in input;
	if last.custid and week < 7 then do i = week + 1 to 7;
		nweek = i;
		nspends = .;
		output;
	end;
run;

 

novinosrin
Tourmaline | Level 20

Thank you Sir @ErikLund_Jensen 

 

here's another one

 


data spend;
input custid week spends;
cards;
1 1 4365
1 2 457
1 3 124
1 4 35547
2 5 4678
2 6 21253
3 7 32534
3 1 23423
3 2 23567
;
run;

proc freq data=spend(keep=custid week) noprint;
tables custid*week/sparse out=temp(keep=custid week);
run;

proc sort data=spend out=_spend;
by custid week;
run;

data want;
merge temp _spend;
by custid week;
run;

This assumes OP wants the maximum week i.e 7 in one of the by group

noling
SAS Employee

Here's a less elegant but simple set of steps:

 

data spend;
input custid week spends;
cards;
1 1 4365
1 2 457
1 3 124
1 4 35547
2 5 4678
2 6 21253
3 7 32534
3 1 23423
3 2 23567
;
run;
proc sort data=spend; by custid week; run; 
proc sql noprint;
	select distinct custid into :custids separated by ","
	from spend;
quit;
/* create empty table*/
data empty (drop=i);
	length custid week 8.;
	do i = 1 to countc("&custids.",",") +1;
		do week=1 to 7;
			custid = scan("&custids.",i,",");
			output;
		end;
	end;
run;
/*populate values into formerly empty table*/
data want;
	merge empty spend;
	by custid week;
run;

Register today and join us virtually on June 16!
sasglobalforum.com | #SASGF

View now: on-demand content for SAS users

Ksharp
Super User
data spend;
input custid week spends;
cards;
1 1 4365
1 2 457
1 3 124
1 4 35547
2 5 4678
2 6 21253
3 7 32534
3 1 23423
3 2 23567
;
proc sql;
create table want as
 select a.*,spends
  from (select * from
(select distinct custid from spend),(select distinct week from spend)
) as a left join spend as b
on a.custid=b.custid and a.week=b.week
order by 1,2;
quit;
Tom
Super User Tom
Super User

The right solution depends on how to determine the wanted list of weeks.  

If you know the range of weeks is always 1 to 7 then use that to generate the missing records.

data skeleton;
  set spend(keep=custid week);
  by custid;
  if first.custid then do week=1 to 7;
    output;
  end;
run;
data want;
  merge skeleton spend;
  by custid week;
run;

If the list is more flexible then you will need to do more work to generate the skeleton dataset.  For example you might want to find the maximum values of week.  Or perhaps the min and the max.  Or just the list of weeks that appear for any custid.

adi121
Fluorite | Level 6

Thanks everyone for quick response. all solutions are working properly.

But if i had to chose one , i would chose the solution given by Tom which was very easy to understand .

 

Thanks to noling ,Eriklund ,novinsorin

 

 

Ready to join fellow brilliant minds for the SAS Hackathon?

Build your skills. Make connections. Enjoy creative freedom. Maybe change the world. Registration is now open through August 30th. Visit the SAS Hackathon homepage.

Register today!
How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 7 replies
  • 5389 views
  • 0 likes
  • 6 in conversation