Solved: Selecting Variables

thb · Posted 09-27-2018 07:19 PM

Hello,

I have the following data table:

ID	Measure	Number	Date	Flag	Output
1	A	5	11/1/2014	1	3
1	A	1	12/1/2014	1	3
1	A	3	1/1/2015	1	3
1	A	5	10/1/2015	1	5
1	A	6	11/1/2015	1	5
1	A	8	12/1/2015	1	5
1	A	3	1/1/2016	1	5
1	A	6	2/1/2016	1	5
1	A	1	8/1/2017	1	6
1	A	6	9/1/2017	1	6
1	A	10	10/1/2017	1	6
1	A	2	11/1/2017	1	6
1	A	14	12/1/2017	1	6
1	A	5	1/1/2018	1	6

I am trying to select the output variable with the greater output number, but earlier time period. For example, the output '5' is greater than '3', but the date is earlier than the output '6', so I would select that. However, I'm not sure how to write this coding. Eventually, I want to be able to select the following data only from the above table.

ID	Measure	Number	Date	Flag	Output
1	A	5	10/1/2015	1	5
1	A	6	11/1/2015	1	5
1	A	8	12/1/2015	1	5
1	A	3	1/1/2016	1	5
1	A	6	2/1/2016	1	5

Thank you for any help in advance.

PGStats · Posted 09-28-2018 12:36 AM

The only way I can interpret your request is: "Find the OUTPUT group which is not the smallest OUTPUT number and not the latest mean (or min, or max, etc) date." A solution would then be:


proc sql;
select id, measure, number, date, flag, output
from (select *, mean(date) as meanDate from have group by id, measure, output)
group by id, measure
having output > min(output) and meanDate < max(meanDate);
quit;

PG

View solution in original post

ballardw · Posted 09-27-2018 07:51 PM

What role does the variable named NUMBER play in this process?

Why are none of the OUTPUT values of 6 selected? You seem to have an unstated rule other than larger value of output and date sequence.

I can get you desired data from that example with

data want;

set have;

where output=5;

run;

but that isn't exercising any rules. Perhaps you could provide example start and end data sets where the OUTPUT variable is a different value. Best is to provide examples as data step so we can be sure we are using data similar to yours. Paste it into a code box opened with the forum's {I} such as:

data have;
input ID Measure $ Number Date :mmddyy10. Flag Output;
format date mmddyy10.;
datalines;
1 A 5 11/1/2014 1 3 
1 A 1 12/1/2014 1 3 
1 A 3 1/1/2015 1 3 
1 A 5 10/1/2015 1 5 
1 A 6 11/1/2015 1 5 
1 A 8 12/1/2015 1 5 
1 A 3 1/1/2016 1 5 
1 A 6 2/1/2016 1 5 
1 A 1 8/1/2017 1 6 
1 A 6 9/1/2017 1 6 
1 A 10 10/1/2017 1 6 
1 A 2 11/1/2017 1 6 
1 A 14 12/1/2017 1 6 
1 A 5 1/1/2018 1 6 
run;

By the way I am guessing that the first number in the date is the month.

PGStats · Posted 09-28-2018 12:36 AM

The only way I can interpret your request is: "Find the OUTPUT group which is not the smallest OUTPUT number and not the latest mean (or min, or max, etc) date." A solution would then be:


proc sql;
select id, measure, number, date, flag, output
from (select *, mean(date) as meanDate from have group by id, measure, output)
group by id, measure
having output > min(output) and meanDate < max(meanDate);
quit;

PG

Selecting Variables

Re: Selecting Variables

Re: Selecting Variables

Re: Selecting Variables

Catch up on SAS Innovate 2026

Selecting Variables

Re: Selecting Variables

Re: Selecting Variables

Re: Selecting Variables

Catch up on SAS Innovate 2026

SAS Training: Just a Click Away