BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
jrleighty
Fluorite | Level 6

A sample dataset is listed below:

data _null_;
	input V $ W $ X Y date;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;
run;

 

I am trying to create a new variable by subtracting two variables from one another and then dividing by another variable in the first variables observation (i.e. [x1-x2]/y1). I also need to match the observations using the same date. I'll be using the same set of observations to subtract with but many more observations for the other two numbers in the equation. How can I apply the above equation to create a new variable for my entire dataset? If we look at the second line in the sample dataset, the new variables will equal 0.5 since the date matches the second observation of Z (300-50/500). I will not need the Z observations in the new dataset.

 

 

 

1 ACCEPTED SOLUTION

Accepted Solutions
SASJedi
SAS Super FREQ

I think I see what you want here. Something like this:

data have;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql; select a.*, round((a.x-z.x)/a.y,.1) as NewVar from (select * from have where v<>'Z') as a left join (select * from have where v='Z') as z on a.Date=z.Date ; quit;
V W X Y date NewVar
A 1 400 700 2020-01-01 0.5
B 0 200 600 2020-01-01 0.2
B 1 400 700 2020-01-01 0.5
A 0 200 600 2020-01-01 0.2
B 0 300 500 2020-02-01 0.5
A 0 300 500 2020-02-01 0.5
A 1 100 800 2020-02-01 0.1
B 1 100 800 2020-02-01 0.1

 

This would have been easier if you had two separate tables to start with:

data a z;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
	if v='Z' then output z;
	else output a;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql;
select a.*, round((a.x-z.x)/a.y,.1) as NewVar
	from a left join z
	on a.Date=z.Date
;
quit;
V W X Y date NewVar
A 1 400 700 2020-01-01 0.5
B 0 200 600 2020-01-01 0.2
B 1 400 700 2020-01-01 0.5
A 0 200 600 2020-01-01 0.2
B 0 300 500 2020-02-01 0.5
A 0 300 500 2020-02-01 0.5
A 1 100 800 2020-02-01 0.1
B 1 100 800 2020-02-01 0.1

 

Check out my Jedi SAS Tricks for SAS Users

View solution in original post

3 REPLIES 3
PaigeMiller
Diamond | Level 26

There is no x1 or x2 in your data, but your formula contains x1 and x2. There is no Z in your data set, but your description includes a variable Z.

 

You say you want to "match the observations using the same date", this doesn't really mean anything to me. 

 

Please explain in a lot more detail. Please show us the desired output for this small data set.

--
Paige Miller
ballardw
Super User

First thing: you data set does not read the date as shown. You need to provide an informat that will treat the values as a date (and then assign a format so humans can read it as needed). Unfortunately the values  you picked for an example we cannot tell if that is month/day/year or day/month/ year (and having worked on Y2K problems I detest any value with 2 digit years). And since the name of the set you provide is _null_ then no set is made anyway so nothing can be used for later steps.

This gives a usable set:

data have;
	input V $ W $ X Y date :mmddyy8.;
   format date date9.;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;
run;

Now, very carefully explain exactly what an X1 X2 and Y1 are as those are not variables in the data as shown.

Does "first" have anything to do with the values of V or W?

Match what on date? That you will have to show some worked example with some very detailed explanation as you only provided 2 dates across 10 observations. The date on your "second line" also matches the date on the 4th line, 6th line and 8th line as well as the one on Z. So the "example" calculation does not really make any sense without a lot more details.

 

 

 

SASJedi
SAS Super FREQ

I think I see what you want here. Something like this:

data have;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql; select a.*, round((a.x-z.x)/a.y,.1) as NewVar from (select * from have where v<>'Z') as a left join (select * from have where v='Z') as z on a.Date=z.Date ; quit;
V W X Y date NewVar
A 1 400 700 2020-01-01 0.5
B 0 200 600 2020-01-01 0.2
B 1 400 700 2020-01-01 0.5
A 0 200 600 2020-01-01 0.2
B 0 300 500 2020-02-01 0.5
A 0 300 500 2020-02-01 0.5
A 1 100 800 2020-02-01 0.1
B 1 100 800 2020-02-01 0.1

 

This would have been easier if you had two separate tables to start with:

data a z;
	input V $ W $ X Y date:mmddyy8.;
	format date yymmddd10.;
	if v='Z' then output z;
	else output a;
datalines;
	A 0 200 600 01/01/20
	A 0 300 500 02/01/20
	A 1 400 700 01/01/20
	A 1 100 800 02/01/20
	B 0 200 600 01/01/20
	B 0 300 500 02/01/20
	B 1 400 700 01/01/20
	B 1 100 800 02/01/20
	Z 0 80 500 01/01/20
	Z 0 50 600 02/01/20
;

proc sql;
select a.*, round((a.x-z.x)/a.y,.1) as NewVar
	from a left join z
	on a.Date=z.Date
;
quit;
V W X Y date NewVar
A 1 400 700 2020-01-01 0.5
B 0 200 600 2020-01-01 0.2
B 1 400 700 2020-01-01 0.5
A 0 200 600 2020-01-01 0.2
B 0 300 500 2020-02-01 0.5
A 0 300 500 2020-02-01 0.5
A 1 100 800 2020-02-01 0.1
B 1 100 800 2020-02-01 0.1

 

Check out my Jedi SAS Tricks for SAS Users

SAS Innovate 2025: Save the Date

 SAS Innovate 2025 is scheduled for May 6-9 in Orlando, FL. Sign up to be first to learn about the agenda and registration!

Save the date!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 3 replies
  • 473 views
  • 1 like
  • 4 in conversation