Experts,
Below is the output result of proc compare which compares two data sets and tells whether observations have matching values for variables. It the values are matching it gives the difference as 0 otherwise the calculated value.
Now I want to limit this data to only those "C_name" for which all "_OBS_" are having "Value"=0 .
Kindly see c_name "onn" is having one zero for its set of data and I dont want data relating to ONN to be in my new dataset. I only want those C_name for which all obs have 0 values; so the condition need to check whether the "value" is 0 for all the unique "C_names" .
One more thing the _OBS_ reference may change i.e here I have posted only 6 but originally this number may be huge so its better if we have something which is dynamic and automatically COUNTS the number of _OBS_ for unique c_names which will be same for all c_name.
_TYPE_ | _OBS_ | C_name | metric | value |
DIF | 1 | hero | ar_y1 | 88 |
DIF | 2 | hero | ar_y2 | -22 |
DIF | 3 | hero | ar_y3 | -22 |
DIF | 4 | hero | cash_y1 | 79 |
DIF | 5 | hero | cash_y2 | -10 |
DIF | 6 | hero | cash_y3 | -10 |
DIF | 1 | onn | ar_y1 | 0 |
DIF | 2 | onn | ar_y2 | -20 |
DIF | 3 | onn | ar_y3 | -20 |
DIF | 4 | onn | cash_y1 | 0 |
DIF | 5 | onn | cash_y2 | -22 |
DIF | 6 | onn | cash_y3 | 75 |
DIF | 1 | yu | ar_y1 | 0 |
DIF | 2 | yu | ar_y2 | 0 |
DIF | 3 | yu | ar_y3 | 0 |
DIF | 4 | yu | cash_y1 | 0 |
DIF | 5 | yu | cash_y2 | 0 |
DIF | 6 | yu | cash_y3 | 0 |
Requirement in NUTSHELL:
ALL the C_names which have all the values zero for all the obs (here we have only one C_name which fits our criteria and that is "YU").
Thanks a lot...!
or, possibly even better yet:
proc sql;
create table nullNames as
select c_name
from have
group by c_name
having min(value)=0 and max(value)=0 and nmiss(value)=0
;
quit;
Of course there is a function for that, but you have to use proc sql. e.g.:
proc sql;
select distinct c_name
from have
group by c_name
having sum(value) eq 0
;
quit;
You are better off with:
proc sql;
create table nullNames as
select c_name
from have
group by c_name
having min(value)=0 and max(value)=0;
quit;
PG
or, possibly even better yet:
proc sql;
create table nullNames as
select c_name
from have
group by c_name
having min(value)=0 and max(value)=0 and nmiss(value)=0
;
quit;
Thanks I used mix of both options.
That's a bad subject line.
There's a way to do it, you don't know how, big difference.
Apology for that but it is partially true if we seek some datastep functions to the job.
Nevertheless thanks everyone for motivating to participate and correcting newbies where we go wrong.
Or, if your dataset is already sorted by c_name:
data nullNames(keep=c_name);
do until(last.c_name);
set have; by c_name;
if value ne 0 then flag=1;
end;
if not flag then output;
drop flag;
run;
PG
I'd use a different approach all together, using the outstate dataset from proc compare which lists all variables that are equal in both data sets and then isolate the one where the NDIF=0.
Proc Compare also has listequalvars option that will list the variables that are equal though that's partially useless since you can't capture it in a table.
I think SAS should enable a way to capture that information directly
https://communities.sas.com/ideas/1734
data class;
set sashelp.class;
if age=13 then weight=weight+3;
if age=16 then BMI=weight*height;
run;
proc compare data=sashelp.class compare=class outstats=part1 listequalvars;
run;
data want;
set part1;
where _type_='NDIF' and _base_=0 and _comp_=0;
keep _var_;
run;
Sample output:
Variables with All Equal Values
Variable Type Len1 Len2
Name CHAR 8 8
Sex CHAR 1 20
Age NUM 8 8
Height NUM 8 8
Variables with Unequal Values
Variable Type Len1 Len2 Ndif MaxDif
Weight NUM 8 8 3 3.000
data have; input _TYPE_ $ _OBS_ C_name $ metric $ value ; cards; DIF 1 hero ar_y1 88 DIF 2 hero ar_y2 -22 DIF 3 hero ar_y3 -22 DIF 4 hero cash_y1 79 DIF 5 hero cash_y2 -10 DIF 6 hero cash_y3 -10 DIF 1 onn ar_y1 0 DIF 2 onn ar_y2 -20 DIF 3 onn ar_y3 -20 DIF 4 onn cash_y1 0 DIF 5 onn cash_y2 -22 DIF 6 onn cash_y3 75 DIF 1 yu ar_y1 0 DIF 2 yu ar_y2 0 DIF 3 yu ar_y3 0 DIF 4 yu cash_y1 0 DIF 5 yu cash_y2 0 DIF 6 yu cash_y3 0 ; run; proc sql; create table nullNames as select c_name from have group by c_name having sum(value ne 0)=0 ; quit;
Xia Keshan
Are you ready for the spotlight? We're accepting content ideas for SAS Innovate 2025 to be held May 6-9 in Orlando, FL. The call is open until September 25. Read more here about why you should contribute and what is in it for you!
Learn how use the CAT functions in SAS to join values from multiple variables into a single value.
Find more tutorials on the SAS Users YouTube channel.