BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
sasgorilla
Pyrite | Level 9

I have a data set that has IDs, dates, (other variables excluded below), and a "type" variable:

data A;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
4 2020-01-30 0
5 2018-10-10 1
5 2019-10-11 1
;
RUN;

If an ID ONLYhas a type value of "0", I want those id's observations to go to one dataset. If an id EVER has values of "1" I want those observations to go to another data set. See below: 

data zerotype;
input id date :yymmdd10. type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
4 2020-01-30 0
;
run;

data onetype;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
5 2018-10-10 1
5 2019-10-11 1
;
run;

How could I do this in a datastep?  Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Patrick
Opal | Level 21

And here another option.

data zerotype onetype;
	if _n_=1 then
		do;
			dcl hash h1(dataset:'have(where=(type=1))');
			h1.defineKey('id');
			h1.defineDone();
		end;
	set have;
	if h1.check()=0 then output onetype;
	else output zerotype;
run;

View solution in original post

8 REPLIES 8
SASKiwi
PROC Star

There are probably more elegant ways but you could try this:

data Have;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
4 2020-01-30 0
5 2018-10-10 1
5 2019-10-11 1
;
RUN;
data onetype
     zerotype;
  keep id date type;
  merge Have (in = H1)
        Have (in = H2 
              where = (type2 = 1)
              keep = id type
              rename = (type = type2))
        ;
  by id;
  retain onetype;
  if first.id then do;
     if H2 then onetype = 1;
     else onetype = 0;
  end;
  if onetype then output onetype;
  else output zerotype; 
run;

 

Tom
Super User Tom
Super User

That works, but your data step is too complex. 

 

You can just use the IN= flag (or the TYPE2 copy of TYPE) to drive the decision.

data onetype
     zerotype
;
  keep id date type;
  merge Have (in = H1)
        Have (in = H2 
              keep = id type
              rename = (type = type2)
              where = (type2 = 1)
             )
  ;
  by id;
  if H2 then output onetype;
  else output zerotype;
  drop type2;
run;
Astounding
PROC Star

@Tom ,

 

Will this work with the WHERE= that you have:

 where = (type2 = 1)

or will it require referring to the original variable name?

 where = (type = 1)

 

Tom
Super User Tom
Super User

@Astounding wrote:

@Tom ,

 

Will this work with the WHERE= that you have:

 where = (type2 = 1)

or will it require referring to the original variable name?

 where = (type = 1)

 


Short answer is No.   

 

The Drop=/Keep=,Rename=, and Where= dataset options are processed in alphabetical order.  So if you have the RENAME= option then there is no longer a variable named TYPE to reference in the WHERE= option.   You need to reference it with the new name.

 

And if you do not have the RENAME= option so that the original name of TYPE is still used then the values of TYPE in the resulting dataset will be jumbled. However you could fix that by reordering the datasets in the SET statement so that the copy that has all of the observations is listed last, so the real value of TYPE will overwrite the copy that only has 1's.

data onetype zerotype ;
  merge have(in=type1 keep=id type where=(type=1)) have;
  by id;
  if type1 then output onetype;
  else output zerotype;
run;

 

Reeza
Super User

Sort by ID and DESCENDING type. Then if there's any 1's it will be at the top and you can decide where to output the record.

 

data A;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
4 2020-01-30 0
5 2018-10-10 1
5 2019-10-11 1
;
RUN;

proc sort data=a;
by id descending type;
run;

data zerotype onetype;
set a;
by id descending type;

retain output_type;
length output_type $8.;

if first.id and type=1 then output_type='one';
else if first.id and type=0 then output_type='zero';

if output_type = 'zero' then output zerotype;
if output_type = 'one' then output onetype;

drop output_type;

run;

proc sort data=zerotype; by id date; run;
proc sort data=onetype; by id date; run;

@sasgorilla wrote:

I have a data set that has IDs, dates, (other variables excluded below), and a "type" variable:

data A;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
4 2020-01-30 0
5 2018-10-10 1
5 2019-10-11 1
;
RUN;

If an ID ONLYhas a type value of "0", I want those id's observations to go to one dataset. If an id EVER has values of "1" I want those observations to go to another data set. See below: 

data zerotype;
input id date :yymmdd10. type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
4 2020-01-30 0
;
run;

data onetype;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
5 2018-10-10 1
5 2019-10-11 1
;
run;

How could I do this in a datastep?  Thank you!


 

Patrick
Opal | Level 21

And here another option.

data zerotype onetype;
	if _n_=1 then
		do;
			dcl hash h1(dataset:'have(where=(type=1))');
			h1.defineKey('id');
			h1.defineDone();
		end;
	set have;
	if h1.check()=0 then output onetype;
	else output zerotype;
run;
sasgorilla
Pyrite | Level 9

This worked perfectly and was the briefest way to do. Admittedly, this is the most difficult code for me to understand as I don't recognize things like dcl, hash, h1 and the definekey/definedone. I see it is known as the hash object and I will look into it further. 

 

Thanks again!

Ksharp
Super User

If you are not familiar with Hash Table, try PROC SQL:

 

data A;
input id date :yymmdd10.  type;
format date yymmdd10.;
datalines;
1 2020-01-01 0
1 2018-07-06 0
2 2015-02-15 1
3 2020-02-01 0
3 2021-07-05 1
3 2021-08-09 0
4 2020-01-30 0
5 2018-10-10 1
5 2019-10-11 1
;
RUN;

proc sql;
create table zero_type as
select * from A where id not in (select id from A where type=1);

create table one_type as
select * from A where id  in (select id from A where type=1);
quit;

hackathon24-white-horiz.png

The 2025 SAS Hackathon has begun!

It's finally time to hack! Remember to visit the SAS Hacker's Hub regularly for news and updates.

Latest Updates

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

SAS Training: Just a Click Away

 Ready to level-up your skills? Choose your own adventure.

Browse our catalog!

Discussion stats
  • 8 replies
  • 2128 views
  • 1 like
  • 7 in conversation