Solved: Re: Match observations to make new variable

jrleighty · Posted 03-08-2024 10:03 AM

A sample dataset is listed below:

data _null_;
	input V $ W $ X Y date;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;
run;

I am trying to create a new variable by subtracting two variables from one another and then dividing by another variable in the first variables observation (i.e. [x1-x2]/y1). I also need to match the observations using the same date. I'll be using the same set of observations to subtract with but many more observations for the other two numbers in the equation. How can I apply the above equation to create a new variable for my entire dataset? If we look at the second line in the sample dataset, the new variables will equal 0.5 since the date matches the second observation of Z (300-50/500). I will not need the Z observations in the new dataset.

SASJedi · Posted 03-08-2024 10:21 AM

I think I see what you want here. Something like this:

data have;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql;
select a.*, round((a.x-z.x)/a.y,.1) as NewVar
	from 
	(select * from have where v<>'Z') as a
	left  join 
	(select * from have where v='Z') as z
	on a.Date=z.Date
;
quit;

V	W	X	Y	date	NewVar
A	1	400	700	2020-01-01	0.5
B	0	200	600	2020-01-01	0.2
B	1	400	700	2020-01-01	0.5
A	0	200	600	2020-01-01	0.2
B	0	300	500	2020-02-01	0.5
A	0	300	500	2020-02-01	0.5
A	1	100	800	2020-02-01	0.1
B	1	100	800	2020-02-01	0.1

This would have been easier if you had two separate tables to start with:

data a z;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
	if v='Z' then output z;
	else output a;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql;
select a.*, round((a.x-z.x)/a.y,.1) as NewVar
	from a left join z
	on a.Date=z.Date
;
quit;

V	W	X	Y	date	NewVar
A	1	400	700	2020-01-01	0.5
B	0	200	600	2020-01-01	0.2
B	1	400	700	2020-01-01	0.5
A	0	200	600	2020-01-01	0.2
B	0	300	500	2020-02-01	0.5
A	0	300	500	2020-02-01	0.5
A	1	100	800	2020-02-01	0.1
B	1	100	800	2020-02-01	0.1

Check out my Jedi SAS Tricks for SAS Users

View solution in original post

PaigeMiller · Posted 03-08-2024 10:10 AM

There is no x1 or x2 in your data, but your formula contains x1 and x2. There is no Z in your data set, but your description includes a variable Z.

You say you want to "match the observations using the same date", this doesn't really mean anything to me.

Please explain in a lot more detail. Please show us the desired output for this small data set.

--
Paige Miller

ballardw · Posted 03-08-2024 10:17 AM

First thing: you data set does not read the date as shown. You need to provide an informat that will treat the values as a date (and then assign a format so humans can read it as needed). Unfortunately the values you picked for an example we cannot tell if that is month/day/year or day/month/ year (and having worked on Y2K problems I detest any value with 2 digit years). And since the name of the set you provide is _null_ then no set is made anyway so nothing can be used for later steps.

This gives a usable set:

data have;
	input V $ W $ X Y date :mmddyy8.;
   format date date9.;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;
run;

Now, very carefully explain exactly what an X1 X2 and Y1 are as those are not variables in the data as shown.

Does "first" have anything to do with the values of V or W?

Match what on date? That you will have to show some worked example with some very detailed explanation as you only provided 2 dates across 10 observations. The date on your "second line" also matches the date on the 4th line, 6th line and 8th line as well as the one on Z. So the "example" calculation does not really make any sense without a lot more details.

SASJedi · Posted 03-08-2024 10:21 AM

I think I see what you want here. Something like this:

data have;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql;
select a.*, round((a.x-z.x)/a.y,.1) as NewVar
	from 
	(select * from have where v<>'Z') as a
	left  join 
	(select * from have where v='Z') as z
	on a.Date=z.Date
;
quit;

V	W	X	Y	date	NewVar
A	1	400	700	2020-01-01	0.5
B	0	200	600	2020-01-01	0.2
B	1	400	700	2020-01-01	0.5
A	0	200	600	2020-01-01	0.2
B	0	300	500	2020-02-01	0.5
A	0	300	500	2020-02-01	0.5
A	1	100	800	2020-02-01	0.1
B	1	100	800	2020-02-01	0.1

This would have been easier if you had two separate tables to start with:

data a z;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
	if v='Z' then output z;
	else output a;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql;
select a.*, round((a.x-z.x)/a.y,.1) as NewVar
	from a left join z
	on a.Date=z.Date
;
quit;

V	W	X	Y	date	NewVar
A	1	400	700	2020-01-01	0.5
B	0	200	600	2020-01-01	0.2
B	1	400	700	2020-01-01	0.5
A	0	200	600	2020-01-01	0.2
B	0	300	500	2020-02-01	0.5
A	0	300	500	2020-02-01	0.5
A	1	100	800	2020-02-01	0.1
B	1	100	800	2020-02-01	0.1

Check out my Jedi SAS Tricks for SAS Users

Match observations to make new variable

Re: Match observations to make new variable

Re: Match observations to make new variable

Re: Match observations to make new variable

Re: Match observations to make new variable

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away