Hi,
I have this example data set (see xlsx file attached) in sheet "have". and want to get the count like in sheet "want".
Can someone please help me with this?
I tried this within a sas data set with "if FIRST.STUDENT_ID and FIRST.COURSE_ID and FIRST.Exam_Date then count = 1; etc.
But I dont seem to get it right with this option in SAS?... 😞
Thanks in advance all!
data test;
infile datalines expandtabs;
input (STUDENT_ID COURSE_ID) ($) GRADE_POINTS EXAM_DATE :yymmdd10.;
format EXAM_DATE yymmdd10.;
datalines;
1111 1000T 6 2017-07-05
1111 2000T 7 2017-10-31
2222 1000T 8 2017-10-30
2222 2000T 8 2017-01-10
2222 3000T 8 2017-01-05
2222 4000T 9 2017-03-22
2222 5000T 9 2017-03-20
2222 6000T 8 2017-05-29
2222 7000T 9 2017-06-06
2222 8000T 7 2017-05-31
3333 4000T 9 2015-10-26
4444 1000T 7.5 2015-10-09
4444 2000T 8 2015-10-06
4444 3000T 7 2016-11-23
4444 4000T 7 2015-11-19
4444 5000T 7.5 2015-12-14
4444 6000T 6 2017-03-02
4444 7000T 7 2016-06-10
4444 8000T 6 2016-04-14
4444 8000T 8 2016-05-14
;
run;
data want;
set test;
by STUDENT_ID COURSE_ID;
if first.COURSE_ID then count=.;
count+GRADE_POINTS>0;
run;
Please post data in usable form: a data step using datalines and explain the logic you want to apply to that data. Then show what you expect as result, again as data step with datalines or as table in the post.
Hi Andreas,
Sorry I don't use datalines, and never had, as I use SAS EG here. Not Base SAS. 🙂
Please just download the attachment and use this code like I do to import the data.
%let path=P:\Example _data.xlsx; /* replace with your path and file name! */
proc import datafile="&path"
out=RES dbms=xlsx replace;
getnames=yes;
run;
proc sort data=res out=test;
by Student_ID COURSE_ID exam_date;
run;
data result;
set test;
by Student_ID COURSE_ID exam_date;
count=0;
if FIRST.Student_ID then do;
if FIRST.COURSE_ID then do;
if FIRST.exam_date then Count=1;
end;
end;
count+1;
if LAST.Student_ID then do;
if LAST.Course_ID then do;
if LAST.exam_date then output;
end;
end;
run;
So what I want (as you can see in the xlsx file on sheet "want" is:
See sceenshot:
I hope this is okay for you to test the data. If you still want datalines, please let me know!
TiA.
EDIT:
is this a good datalines set for this data?
data test;
infile datalines;
input STUDENT_ID COURSE_ID GRADE_POINTS EXAM_DATE;
datalines;
1111 1000T 6 2017-07-05
1111 2000T 7 2017-10-31
2222 1000T 8 2017-10-30
2222 2000T 8 2017-01-10
2222 3000T 8 2017-01-05
2222 4000T 9 2017-03-22
2222 5000T 9 2017-03-20
2222 6000T 8 2017-05-29
2222 7000T 9 2017-06-06
2222 8000T 7 2017-05-31
3333 4000T 9 2015-10-26
4444 1000T 7,5 2015-10-09
4444 2000T 8 2015-10-06
4444 3000T 7 2016-11-23
4444 4000T 7 2015-11-19
4444 5000T 7,5 2015-12-14
4444 6000T 6 2017-03-02
4444 7000T 7 2016-06-10
4444 8000T 6 2016-04-14
4444 8000T 8 2016-05-14
;
run;
data test;
infile datalines expandtabs;
input (STUDENT_ID COURSE_ID) ($) GRADE_POINTS EXAM_DATE :yymmdd10.;
format EXAM_DATE yymmdd10.;
datalines;
1111 1000T 6 2017-07-05
1111 2000T 7 2017-10-31
2222 1000T 8 2017-10-30
2222 2000T 8 2017-01-10
2222 3000T 8 2017-01-05
2222 4000T 9 2017-03-22
2222 5000T 9 2017-03-20
2222 6000T 8 2017-05-29
2222 7000T 9 2017-06-06
2222 8000T 7 2017-05-31
3333 4000T 9 2015-10-26
4444 1000T 7.5 2015-10-09
4444 2000T 8 2015-10-06
4444 3000T 7 2016-11-23
4444 4000T 7 2015-11-19
4444 5000T 7.5 2015-12-14
4444 6000T 6 2017-03-02
4444 7000T 7 2016-06-10
4444 8000T 6 2016-04-14
4444 8000T 8 2016-05-14
;
run;
data want;
set test;
by STUDENT_ID COURSE_ID;
if first.COURSE_ID then count=.;
count+GRADE_POINTS>0;
run;
Thanks @novinosrin it works!!! Why was I thinking so hard?! 🙂 🙂 🙂
If you care to explain this to me:
-what you did in datalines and why
-and what this peace of code means/does:
count+GRADE_POINTS>0;
Then I Will accept your reply as solution to my question!
Thanks again! 🙂
count+GRADE_POINTS>0;
is a boolean expression i.e
when true=1, so count+(1)
when false=0,so count+(0)
And the value is reset to missing when we process records of new courseid making it distinct counts of couseid in sequence when true
Well , WRT datalines, our request like @andreas_lds pointed for members to post in the form that we don't have to type line by line in creating a dataset and then work on it. It can be time consuming. So we thank you for that. All I did was formatted with the right type and date formats to read the datalines correctly. No Biggie.
So going forward, help us with the following 3(all in plain text)
1. Sample of HAVE dataset
2. Sample of WANT dataset
3. A brief explanation of logic relating to the WANT dataset i.e how to derive it.
Of course with the above 3, leave the technical part to us
Thanks guys!
@novinosrin thank you for the explanation of your code! Very neat! I like that! 🙂
And will keep the posting rules in mind for my future questions here! Thanks again all!
Save $250 on SAS Innovate and get a free advance copy of the new SAS For Dummies book! Use the code "SASforDummies" to register. Don't miss out, May 6-9, in Orlando, Florida.
SAS' Charu Shankar shares her PROC SQL expertise by showing you how to master the WHERE clause using real winter weather data.
Find more tutorials on the SAS Users YouTube channel.