BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
GreggB
Pyrite | Level 9

I’m looking at 1st sem grades and I’ve noticed that occasionally a student has 2 different marks for the same subject in the same term.  I sorted this by student_num (which is unique for each student), crstitle, coursenum and mark.  Is there some way I can find records where all of these variables would be exactly the same except for the mark?  I need to eliminate those kinds of records  Thanks.

The SAS System

Obs

Student_Num

Crstitle

CourseNum

Mark

1

10031

Lang Arts 7 GT

2001G700

91

2

10031

Life Science 7

22210700

97

3

10031

Social Studies 7

23100700

98

4

10031

Advanced Art 7

2501A700

100

5

10031

Strings 7

25500700

100

6

10031

Algebra 1 H

4111MSHW

96

7

10041

Lang Arts 7

20010700

94

8

10041

Pre-Algebra

2110PA00

90

9

10041

Life Science 7

22210700

89

10

10041

Social Studies 7

23100700

93

11

10041

Art 7

2501S700

100

12

10041

Chorus 7

25410700

99

13

10069

Lang Arts 6 GT

1001G600

95

14

10069

Pre-Algebra 6

1110P600

97

15

10069

Science 6

12010600

97

16

10069

Social Studies 6

13010600

97

17

10069

Strings 6

15510600

99

18

10069

PE/Health 6

2460S600

100

1 ACCEPTED SOLUTION

Accepted Solutions
Haikuo
Onyx | Level 15

There are many ways to solve your problem. I assume you only to get rid of the duplicates, not all of obs with duplicates.

Proc sort data=have out=want nodupkey;

by  student_num crstitle coursenum;

run;

Regards,

Haikuo

View solution in original post

3 REPLIES 3
Haikuo
Onyx | Level 15

There are many ways to solve your problem. I assume you only to get rid of the duplicates, not all of obs with duplicates.

Proc sort data=have out=want nodupkey;

by  student_num crstitle coursenum;

run;

Regards,

Haikuo

Linlin
Lapis Lazuli | Level 10

How about:

options nocenter;

data have;

input Student_Num $ Crstitle $ CourseNum Mark;

cards;

1 aa 100 90

1 bb 200 100

1 aa 100 98

2 aa 100 90

2 bb 200 100

2 aa 100 98

3 aa 100 90

3 bb 200 100

4 aa 100 98

;

proc sort;

by Student_Num  Crstitle  CourseNum;

data want;

  set have;

  by Student_Num  Crstitle  CourseNum;

  if first.CourseNum and last.CourseNum then delete;

proc print;

title duplicate records by Student_Num  Crstitle  CourseNum;

run;

title Unique records by Student_Num  Crstitle  CourseNum;

proc sql;

  select *,count(*) as total from have

    group by Student_Num,  Crstitle,  CourseNum

        having total=1;

quit;

title;

duplicate records by Student_Num  Crstitle  CourseNum          

       Student_                Course

Obs      Num       Crstitle      Num     Mark

1        1           aa         100      90

2        1           aa         100      98

3        2           aa         100      90

4        2           aa         100      98

Unique records by Student_Num  Crstitle  CourseNum             

Student_Num  Crstitle  CourseNum      Mark     total

ƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒƒ

1            bb              200       100         1

2            bb              200       100         1

3            aa              100        90         1

3            bb              200       100         1

4            aa              100        98         1

Linlin

LarryWorley
Fluorite | Level 6

Gregg,

How do you want to handle the different values for Mark?

If you do not care,  then HaiKuo's suggestion will work fine.  Proc sort with nodupkeys will return an arbitrary record.

If you want to restrict to largest or smallest then a variation of LinLin's code will work. 

proc sort;

by Student_Num  Crstitle  CourseNum Mark;

data want;

  set have;

  by Student_Num  Crstitle  CourseNum;

  * keeps lowest value of Mark

    if you want largest value then substitute

     last.CourseNum for first.courseNum in the

     locigal portion of the if statement ;

  if not first.CourseNum and last.CourseNum then delete;

LinLin's code could also be modified to handle other criteria.

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 3 replies
  • 903 views
  • 3 likes
  • 4 in conversation