BookmarkSubscribeRSS Feed
Feefee
Calcite | Level 5

Hi guys,

 

Need some help with this please!

 

If I have the following data 

 

User IDDates
2357608/10/1998
2357608/11/1998
2357608/12/2000
2357609/10/1998
8502807/09/1976
8502807/10/1967
8502810/07/1967
1385719/03/2001
1385718/03/2001

 

What I want is all possible pairs of the dates associated with that user ID. 

 

So for ID 23576 I would want table like this:

User IDDate ADate B
2357608/10/199808/11/1998
2357608/10/199808/12/2000
2357608/10/199809/10/1998
2357608/11/199808/12/2000
2357608/11/1998 09/10/1998
2357608/12/200009/10/1998

 

Hope that makes sense.

 

Thank you so much!

13 REPLIES 13
EyalGonen
Lapis Lazuli | Level 10

See if this helps:

data in;
	userid=1; value=1; output;
	userid=1; value=2; output;
	userid=1; value=3; output;
	userid=2; value=10; output;
	userid=2; value=11; output;
run;

data outds;
	set in;
	by userid;
	array vals(10) _temporary_;
	retain cnt 0;
	if first.userid then do;
		cnt = 0;
	end;
	cnt + 1;
	vals(cnt) = value;
	if last.userid then do;
		do i = 1 to cnt-1;
			do j = i+1 to cnt;
				value1 = vals(i);
				value2 = vals(j);
				output;
				keep userid value1 value2;
			end;
		end;
	end;
run;
EyalGonen
Lapis Lazuli | Level 10

You could also join the table with itself but this would produce more rows than you want (per your example) 

data in;
	userid=1; value=1; output;
	userid=1; value=2; output;
	userid=1; value=3; output;
	userid=2; value=10; output;
	userid=2; value=11; output;
run;

proc sql;
	create table outsql as 
	select a.*, b.value as value2
	from in a, in b
	where a.userid = b.userid and a.value ne b.value
	;
quit;
Feefee
Calcite | Level 5

Hi, 

 

Thank you for your response. How would I do this in data steps instead of proc sql please?

AMSAS
SAS Super FREQ
Nice, although word of warning your array needs to be big enough to hold all the values.
Change the number of array elements to 2 and this example will error.
Feefee
Calcite | Level 5
Hi thanks for the info.

I am getting a java.lang.ArrayIndexoutofboundsexception: 10 error 😕

How can i fix that please?
EyalGonen
Lapis Lazuli | Level 10

You are correct, the size of the array should be set to the maximum number of rows possible for a single userid. I provided just an example to illustrate the solution.

Kurt_Bremser
Super User

SQL is good at building cartesian joins, you just need the correct comparison:

data have;
infile datalines dlm="09"x;
input User_ID $ Dates:ddmmyy10.;
datalines;
23576	08/10/1998
23576	08/11/1998
23576	08/12/2000
23576	09/10/1998
85028	07/09/1976
85028	07/10/1967
85028	10/07/1967
13857	19/03/2001
13857	18/03/2001
;

proc sql;
create table want as
  select
    a.user_id,
    a.dates as date_a format=yymmdd10.,
    b.dates as date_b format=yymmdd10.
  from have a inner join have b
  on a.user_id = b.user_id and a.dates lt b.dates
;
quit;

Also note how data is presented in a data step with datalines, this makes it much easier to develop and test code. Please do so in the future.

Feefee
Calcite | Level 5
Hi,

Thank you for your response. How would I do this in data steps instead of proc sql please?

I am not using sas directly and proc sql isn't working properly unfortunately,
Feefee
Calcite | Level 5
I'm getting a redefinition of local variable error.

Also, I am not using sas but a similar product and the sql isnt very liked. So I am trying to write it in a datastep instead but struggling. Any help would be appreciated please 🙂
Kurt_Bremser
Super User

@Feefee wrote:
I'm getting a redefinition of local variable error.

Also, I am not using sas but a similar product and the sql isnt very liked. So I am trying to write it in a datastep instead but struggling. Any help would be appreciated please 🙂

Buying cheap means buying double, says my grandma 😉

 

One modern data step solution is the use of a hash object, which grows as needed (using the same "have" I used earlier):

data want2;
if 0 then set have;
format date_a date_b yymmdd10.;
if _n_ = 1
then do;
  declare hash h (multidata:"yes");
  declare hiter hi ("h");
  h.definekey("user_id");
  h.definedata("user_id","dates");
  h.definedone();
end;
do until (last.user_id);
  set have;
  by user_id notsorted;
  rc = h.add();
end;
do until (last.user_id);
  set have (rename=(dates=date_a));
  by user_id notsorted;
  rc = hi.first();
  do while (rc = 0);
    date_b = dates;
    if date_a < date_b then output;
    rc = hi.next();
  end;
end;
rc = h.clear();
keep user_id date_a date_b;
run;
Feefee
Calcite | Level 5
data have;
input element$ 1-2 qt;
datalines;
A 1
B 2
Cc3
D 4
Ee5
;

data want;
 set have end=last nobs=nobs;
 array x{999} $ 32 _temporary_;
 x{_n_}=element;
 if last then do;
   do i=1 to nobs-1;
     var1=x{i};
     do j=i+1 to nobs;
       var2=x{j};output;
	 end;
   end;
 end;
 keep var1 var2;
run;

I found this, and it works but I want to apply this to each User_ID and not the entire list. I am trying method with first and last but I am getting null pointer exception.

This is what I have so far:

 

data want;
 set have;
 retain total_rows 0 a 1;
 by User_ID;
 
total_rows = total_rows +1;
a = a +1;

 if first.User_ID then do;
 	a=1;
 	array x{999} $ 32 _temporary_;
	x{_n_}=dates;
end;

if last.User_ID then do;
do i = 1 to total_rows-1;
  dates=x{i};
       do j=rownum+1 to total_rows;
       var2=x{j};
       output;
	 end;
   end;
end;
run;

It works find until the do = i bit..

I have formatted etc its just the last bit I am struggling to get 😞 

EyalGonen
Lapis Lazuli | Level 10

Have you seen my sample code that I posted a while ago at https://communities.sas.com/t5/SAS-Programming/Find-pairs-of-a-single-variable-from-first-and-last/m... ?

Isn't it what you are looking for?

The only thing "wrong" with my code is that you need to increase the "vals" array size (in my example I set to "10") to some big number that will never be reached of events per userid

SAS Innovate 2025: Register Today!

 

Join us for SAS Innovate 2025, our biggest and most exciting global event of the year, in Orlando, FL, from May 6-9. Sign up by March 14 for just $795.


Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 13 replies
  • 1724 views
  • 3 likes
  • 4 in conversation