BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
SK_11
Obsidian | Level 7

I have the following


data have1;
input Grp cum ;
datalines;
1 2
2 4
3 6
4 8
5 10
6 12
;
run;


data have2;
input id ;
datalines;
1
2
3
4
5
6
7
8
9
10
11
12
;
run;

proc sql;
Create table want as
Select a.Id,
b.Grp
from have2 as a
left join have1 as b
on a.id=b.Cum
;

The above gives the output as 

idGrp
1.
21
3.
42
5.
63
7.
84
9.
105
11.
126

 

I would like to get this:

idGrp
11
21
32
42
53
63
74
84
95
105
116
126

 

Thanks for your help in advance.

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

If you want to "join" HAVE1 and HAVE2 then first convert HAVE1 to something with a range of values of ID.

data groups;
  set have1;
  by cum;
  min=sum(lag(cum),1);
  max=cum;
  keep grp min max;
run;

Now you can join HAVE2 with GROUPS and assign the GRP value to each observation in HAVE2.

proc sql ;
create table want as
  select * from have2 a inner join groups b
  on a.id between b.min and b.max
;
quit;

View solution in original post

13 REPLIES 13
qoit
Pyrite | Level 9

SQL is not really the solution for this but here you go:

data have1;
	input Grp cum;
	datalines;
1 2
2 4
3 6
4 8
5 10
6 12
;
run;

data have2;
	input id;
	datalines;
1
2
3
4
5
6
7
8
9
10
11
12
;
run;

proc sql;
	Create table want (keep=id grp) as
		Select a.Id,
			b.Grp as grp_,
		case 
			when b.grp is missing then monotonic() - 1 
		end 
	as col,
		case 
			when b.grp is missing then sum(1,calculated col) 
			else b.grp 
		end 
	as grp
		from have2 as a
			left join have1 as b
				on a.id=b.Cum;
quit;
SK_11
Obsidian | Level 7
Thanks for your quick reply, I have updated the dataset to clear any confusion.
Tom
Super User Tom
Super User

What is the reasoning that says ID=1 should be matched to GRP=1? I do not see the connection.

SK_11
Obsidian | Level 7

There is no relationship between group and id. This is a mere representation of more than 100K rows.  

SK_11
Obsidian | Level 7

To remove the confusion here is the updated data set.


data have1;
input Grp cum ;
datalines;
1 3252
2 6752
3 10252
4 13752
5 17252
6 20752
;
run;


data have2;
do i=1 to 20000;
id=i;output;
end;
drop i;
run;

Tom
Super User Tom
Super User

You must have had some algorithm in your mind when went from your two example input datasets to your desired output dataset.

What was the logic that said ID=1 was part of GRP=1 instead of GRP=2 or 3 or 3457?

SK_11
Obsidian | Level 7

I am trying to fill up the missing group values when I left join. The desired output will be all ids less than cum should have the same group value.

Tom
Super User Tom
Super User

@SK_11 wrote:

I am trying to fill up the missing group values when I left join. The desired output will be all ids less than cum should have the same group value.


There must be more to it than that.  If HAVE1 is 

data have1;
  input Grp cum;
datalines;
1 2
2 4
3 6
4 8
5 10
6 12
;

The ID=1 will be a member of all 6 GRP values since one is less than 2 and also less than 4 and less then 6 etc.

 

Why would you want to try to force a solution to this problem using SQL?

Tom
Super User Tom
Super User

Do you just want to do this?

Given this input:

data have1;
  input Grp cum;
datalines;
1 2
2 4
3 6
4 8
5 10
6 12
;

Run this data step:

data want;
  set have1;
  do id=sum(lag(cum),1) to cum;
     output;
  end;
run;

Result:

Obs    Grp    cum    id

  1     1       2     1
  2     1       2     2
  3     2       4     3
  4     2       4     4
  5     3       6     5
  6     3       6     6
  7     4       8     7
  8     4       8     8
  9     5      10     9
 10     5      10    10
 11     6      12    11
 12     6      12    12
SK_11
Obsidian | Level 7
Thanks for your quick reply. Any solutions would be fine. I have given a SQL example to show the missing values when I left join. I am interested to fill up the missing values of the group column where ids are less than the matched ids. I have updated the example data set as
data have1;
input Grp cum ;
datalines;
1 3252
2 6752
3 10252
4 13752
5 17252
6 20752
;
run;


data have2;
do i=1 to 20000;
id=i;output;
end;
drop i;
run;
Tom
Super User Tom
Super User

What does HAVE2 add to the problem?  Are you just trying to set an upper bound on the value of ID?

data have1;
input Grp cum ;
datalines;
1 3252
2 6752
3 10252
4 13752
5 17252
6 20752
;
data want;
  set have1;
  do id=sum(lag(cum),1) to min(cum,20000);
     output;
  end;
run;
proc freq data=want;
  tables grp ;
run;
The FREQ Procedure

                                Cumulative    Cumulative
Grp    Frequency     Percent     Frequency      Percent
--------------------------------------------------------
  1        3252       16.26          3252        16.26
  2        3500       17.50          6752        33.76
  3        3500       17.50         10252        51.26
  4        3500       17.50         13752        68.76
  5        3500       17.50         17252        86.26
  6        2748       13.74         20000       100.00

 

Tom
Super User Tom
Super User

If you want to "join" HAVE1 and HAVE2 then first convert HAVE1 to something with a range of values of ID.

data groups;
  set have1;
  by cum;
  min=sum(lag(cum),1);
  max=cum;
  keep grp min max;
run;

Now you can join HAVE2 with GROUPS and assign the GRP value to each observation in HAVE2.

proc sql ;
create table want as
  select * from have2 a inner join groups b
  on a.id between b.min and b.max
;
quit;
SK_11
Obsidian | Level 7
Thanks a lot Tom

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 2920 views
  • 1 like
  • 3 in conversation