Solved: Variable subsetting

Anthony_eng · Posted 12-27-2021 08:10 PM

Hello everyone. I have included my SAS program below. The code is working at this point, but I need to add a step that will only keep the ExperimentIDs that contain both experiment types (A42 and AC). So if I have 100 observations for ExperimentID '1' but they all contain only type A42, I don't want that ExperimentID in my data. If there are 100 observations for ExperimentID '2' with some being of type A42 and others being type AC, then I would like all of those observations to show. Is this something that can be done in PROC SQL? Any advice/help regarding how I can make this happen? Thank you.

PROC SQL;
	CREATE TABLE WORK.Experiment_VOLT AS
	SELECT
		t1.ExperimentID,
		t1.STATION,
		t2.TYPE,
		t2.TIMESTAMP,
		t2.VALUE,
		t1.TESTRESULT,
		t2.ADDRESS
	FROM WORK.Experiment t1
		INNER JOIN WORK.DATA_FILTERED t2
			ON (t1.ExperimentID = t2.ExperimentID)
	WHERE TYPE = 'A42' OR TYPE ='AC'
	ORDER BY t1.ExperimentID, t2.TIMESTAMP;  
QUIT;

s_lassen · Posted 12-28-2021 04:28 AM

This will not work in other SQL dialects, but in SAS SQL I think you can get away with this:

PROC SQL;
	CREATE TABLE WORK.Experiment_VOLT AS
	SELECT
		t1.ExperimentID,
		t1.STATION,
		t2.TYPE,
		t2.TIMESTAMP,
		t2.VALUE,
		t1.TESTRESULT,
		t2.ADDRESS
	FROM WORK.Experiment t1
		INNER JOIN WORK.DATA_FILTERED t2
			ON (t1.ExperimentID = t2.ExperimentID)
	WHERE TYPE = 'A42' OR TYPE ='AC'
    group by t1.ExperimentID
    having count(distinct t2.TYPE)=2
	ORDER BY t1.ExperimentID, t2.TIMESTAMP;  
QUIT;

View solution in original post

Patrick · Posted 12-27-2021 08:49 PM

You help us help you if you provide sample data in the form of a data have step as done below.

Is the following giving you the pointers you need? I'd be using option 1.

data have;
  input experiment_id type $;
  datalines;
1 A42
1 XX
1 AC
2 XX
2 XX
3 A42
3 AC
;

/* option 1 */
proc sql;
  select
    o.experiment_id,
    o.type
  from
    have o
    inner join
    (select distinct experiment_id from have where type in ('A42','AC') ) s
    on o.experiment_id=s.experiment_id
  ;
quit;

/* option 2 */
proc sql;
  select
    experiment_id,
    type
  from
    have
 where experiment_id in
    (select experiment_id from have where type in ('A42','AC') )
  ;
quit;

s_lassen · Posted 12-28-2021 04:28 AM

This will not work in other SQL dialects, but in SAS SQL I think you can get away with this:

PROC SQL;
	CREATE TABLE WORK.Experiment_VOLT AS
	SELECT
		t1.ExperimentID,
		t1.STATION,
		t2.TYPE,
		t2.TIMESTAMP,
		t2.VALUE,
		t1.TESTRESULT,
		t2.ADDRESS
	FROM WORK.Experiment t1
		INNER JOIN WORK.DATA_FILTERED t2
			ON (t1.ExperimentID = t2.ExperimentID)
	WHERE TYPE = 'A42' OR TYPE ='AC'
    group by t1.ExperimentID
    having count(distinct t2.TYPE)=2
	ORDER BY t1.ExperimentID, t2.TIMESTAMP;  
QUIT;

Variable subsetting

Re: Variable subsetting

Re: Variable subsetting

Re: Variable subsetting

Registration is open

Variable subsetting

Re: Variable subsetting

Re: Variable subsetting

Re: Variable subsetting

Registration is open

SAS Training: Just a Click Away