Building models with SAS Enterprise Miner, SAS Factory Miner, SAS Visual Data Mining and Machine Learning or just with programming

Duplicate corresponding variable within a group

Reply
Regular Contributor
Posts: 215

Duplicate corresponding variable within a group

Can anyone please tell me the code that will return the observations where a corresponding variable occurs more than one time within that group.

Here is my data:

Name  Result

John   pass

John   pass

Albert

Alice   fail

Julia   pass

Nadia

Julia   pass

Jack   pass


The code will return John and Julia as both of those group has 'pass' that occurred more than one time. Output will be:


Name  Result

John   pass

John   pass

Julia   pass

Julia   pass


Thank you,


Respected Advisor
Posts: 3,124

Re: Duplicate corresponding variable within a group

Proc sort data=have out=want NOUNIQUEKEY;

by name result;

run;

Regular Contributor
Posts: 215

Re: Duplicate corresponding variable within a group

Hi hai.Kuo,

My data is actually look like this:

Name  Result Course

John   pass     Math

John   pass     Physics

John   fail        Chemistry

Albert             Math

Alice   fail       Physics

Julia    pass    Physics

Nadia  pass    Chemistry

Julia   pass     Math

Jack   pass    Math


The code should return John and Julia as both of those group has 'pass' that occurred more than one time. Output will be:

Notice that John has 3 observations, but it discarded 'fail' and returned where there is pass within the John group.

Name  Result Course

John   pass     Math

John   pass     Physics

Julia    pass    Physics

Julia   pass     Math

can you help me a bit more please.

Respected Advisor
Posts: 3,124

Re: Duplicate corresponding variable within a group

Then add a dataset option:

Proc sort data=have(where=(result='pass')) out=want NOUNIQUEKEY;

by name result;

run;

Respected Advisor
Posts: 3,124

Re: Duplicate corresponding variable within a group

Tested:

data have;

     input (Name  Result Course) (:$10.);

     cards;

John   pass Math

John   pass Physics

John   fail Chemistry

Albert  . Math

Alice   fail Physics

Julia    pass Physics

Nadia  pass Chemistry

Julia   pass Math

Jack   pass Math

;

Proc sort data=have(where=(result='pass')) out=want NOUNIQUEKEY;

     by name result;

run;

proc print;run;

    

1JohnpassMath
2JohnpassPhysics
3JuliapassPhysics
4JuliapassMath

Super Contributor
Posts: 336

Re: Duplicate corresponding variable within a group

Hi,

I don't remember if we have a quick way to do this in Enterprise Miner. The quick alternative is to use proc sort. This would also work on a SAS Code Node in Enterprise Miner.

Personal favorite, is to use NODUP option, and by _all_. Example:

In a SAS Code in EM you might want to use

proc sort data=mydata NODUP;

by _all_;

run;

In a SAS Code Node in EM you might want to use:

proc sort data=&em_import_data out=&em_export_train NODUP;

by _all_;

run;

I hope this helps!

-Miguel

Regular Contributor
Posts: 215

Re: Duplicate corresponding variable within a group

Hi MiguelMaldonado,

I think you are very close to what I wanted.

My data is actually look like this:

Name  Result Course

John   pass     Math

John   pass     Physics

John   fail        Chemistry

Albert             Math

Alice   fail       Physics

Julia    pass    Physics

Nadia  pass    Chemistry

Julia   pass     Math

Jack   pass    Math


The code should return John and Julia as both of those group has 'pass' that occurred more than one time. Output will be:

Notice that John has 3 observations, but it discarded 'fail' and returned where there is pass within the John group.

Name  Result Course

John   pass     Math

John   pass     Physics

Julia    pass    Physics

Julia   pass     Math

can you help me a bit more please.

Contributor
Posts: 56

Re: Duplicate corresponding variable within a group

May I ask the further issue about duplicate data?
If now, I'd like to select the duplicate data in macro and I do not know how many variables in each dataset. However, proc sort need to indicate the variables... So, how should I do to show up those duplicates?
Ask a Question
Discussion stats
  • 7 replies
  • 664 views
  • 2 likes
  • 4 in conversation