I have a data set were few variables are repeating but i only want to keep info for a specific variable
Have:
Name ID Code code_description type
Sarah 0007 A19 DM test mining
Sarah 0007 B45 DM test list
Jack 1111 F56 HPY type 1 mining
Jack 1111 H78 HPY type 1 list
Want
Name ID Code code_description type
.. .. A19 DM test mining
Sarah 0007 B45 DM test list
.. ... F56 HPY type 1 mining
Jack 1111 H78 HPY type 1 list
so i want to delete the name and ID or just place a zero or none for mining when that data is repeated like this.
Repeating , ok. How many repeating or is it just 2 like in your sample? Which record within a by group do you want to keep(some logic for that) plz?
data have;
input Name $ ID $ Code $ code_description & $15. type $;
cards;
Sarah 0007 A19 DM test mining
Sarah 0007 B45 DM test list
Jack 1111 F56 HPY type 1 mining
Jack 1111 H78 HPY type 1 list
;
data want;
set have;
if type='mining' then call missing(name,id);
run;
or
data want;
set have;
if type ne 'list' then call missing(name,id);
run;
thank you replying but i have other 'mining' type data in my set and they dont have 'list' type repeating for them so when that happens i want to keep the info as it is
Does that mean those others have just 1 record for an id? or is it 2 records both with type='mining' ?
Can you modify the sample
data have;
input Name $ ID $ Code $ code_description & $15. type $;
cards;
Sarah 0007 A19 DM test mining
Sarah 0007 B45 DM test list
Jack 1111 F56 HPY type 1 mining
Jack 1111 H78 HPY type 1 list
Bob 1222 F56 HPY type 1 mining
Rob 3333 A19 DM test mining
;so some ids only have mining type but i dont wwant to remove their info. I only want to remover mining info if they have list type and mining type
data have;
input Name $ ID $ Code $ code_description & $15. type $;
cards;
Sarah 0007 A19 DM test mining
Sarah 0007 B45 DM test list
Jack 1111 F56 HPY type 1 mining
Jack 1111 H78 HPY type 1 list
Bob 1222 F56 HPY type 1 mining
Rob 3333 A19 DM test mining
;
proc sort data=have;
by id;
run;
data want;
call missing(f);
do until(last.id);
set have;
by id;
if type='list' then f=1;
end;
do until(last.id);
set have;
by id;
if f and type='mining' then call missing(name,id);
output;
end;
drop f;
run;
It sounds like that you have multiple observations per NAME,ID group and when that happens you want to remove the NAME and ID values from the observations that has TYPE='mining' and keep them for the observations that have TYPE='list'.
This sounds like a bad idea. What is the reason you want to do this? Perhaps you are just trying to produce a report?
If so then perhaps it is better to leave the values in place and use a reporting method that display the id variables just once per group?
data have ;
input Name $ ID $ Code $ code_description &:$15. type $ ;
cards;
Sarah 0007 A19 DM test mining
Sarah 0007 B45 DM test list
Jack 1111 F56 HPY type 1 mining
Jack 1111 H78 HPY type 1 list
;
proc sort; by Name ID type ; run;
proc print ;
by Name Id;
id Name Id;
run;
code_
Name ID Code description type
Jack 1111 H78 HPY type 1 list
F56 HPY type 1 mining
Sarah 0007 B45 DM test list
A19 DM test mining
April 27 – 30 | Gaylord Texan | Grapevine, Texas
Walk in ready to learn. Walk out ready to deliver. This is the data and AI conference you can't afford to miss.
Register now and lock in 2025 pricing—just $495!
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.