dakshu92
Calcite | Level 5

Data have:

ID    Subject
1     Math
1     English
1     Biology
2     Math
2     Math
2     Math
3     English
3     Math
3     French
4     Science
4     English
4     History
5     Math
6     Business
6     French
7     Math
7     Math

Want:

Students taking only math, no other subjects:

Subject    N
Math       3

And how to get this:

ID    Math
1     0
2     1
3     0
4     0
5     1
6     0
7     1

1 ACCEPTED SOLUTION

Quentin
Super User

@dakshu92 wrote:

What I have done till now:

data want;
set have;
if subject=math then math=1;
else math=0;
run;

For count:

PROC SQL;
create table math_only as 
select id
min(math) as onlymath
from have
group by id;
quit;

proc freq data=math_only;
tables only math;
run;

With this I am getting the count, but I am not sure if it is only counting the IDs that take only math.

I see some typos in your code that should be generating errors, but I believe the logic is correct.  

 

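data want;
  set have;
  if subject='Math' then math=1;     /* Math must be a quoted string literal, with matching case */
  else math=0;
run;

PROC SQL;
  create table math_only as 
  select id
  ,min(math) as onlymath  /* a comma is required between columns; the min is 1 only if every row for the id is Math */
  from want               /* read WANT, because HAVE has no MATH variable */
  group by id;
quit;

proc freq data=math_only;
  tables onlymath;        /* the variable name is a single word */
run;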
The Boston Area SAS Users Group is hosting free webinars!
Next up: Joe Madden & Joseph Henry present Putting Power into the Hands of the Programmer with SAS Viya Workbench on Wednesday Nov 6.
Register now at https://www.basug.org/events.


10 REPLIES
Patrick
Opal | Level 21

This feels very much like an exercise and as such we shouldn't just provide the full answer. What have you tried so far? What approaches can you think of?

It's also o.k. to post some not yet working code and ask for help.   

dakshu92
Calcite | Level 5

What I have done till now:

data want;
set have;
if subject=math then math=1;
else math=0;
run;

For count:

PROC SQL;
create table math_only as 
select id
min(math) as onlymath
from have
group by id;
quit;

proc freq data=math_only;
tables only math;
run;

With this I am getting the count, but I am not sure if it is only counting the IDs that take only math.

Tom
Super User

Your IF statement is not going to work right as posted.

if subject=math

is testing if the value of the variable named SUBJECT matches the value of the variable named MATH.

But your dataset does not have a variable named MATH. It does have some values of SUBJECT that contain the string Math. It does not have any values of SUBJECT that would match the lowercase string math, because character comparisons in SAS are case-sensitive.

So code it like this:

if subject='Math' then math=1;
else math=0;

Or since SAS will evaluate boolean expressions to 1 for TRUE and 0 for FALSE you could just use:

math = (subject='Math');
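
If the SUBJECT values might differ in case or carry leading blanks (an assumption; the sample data shown is consistently capitalized), a more forgiving version of the same flag could be sketched as follows. The dataset name WANT_FLAG is hypothetical.

```sas
/* Sketch only: WANT_FLAG is a hypothetical output name.        */
/* UPCASE makes the comparison case-insensitive; STRIP removes  */
/* leading and trailing blanks before the comparison.           */
data want_flag;
  set have;
  math = (upcase(strip(subject)) = 'MATH');  /* 1 if true, 0 if false */
run;
```

With the data as posted, this should give the same flag as the exact comparison to 'Math'.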
dakshu92
Calcite | Level 5

Thank you, everyone! I was able to get the count for the subject. However, how do I make a new dummy variable?

What I tried:

data want;
set have;
if math=max(math) then math_first = 1;
else math_first=0;
run;

Tom
Super User

Your current code is just going to set MATH_FIRST to 1 on every observation.  

 

That is because you are comparing the current value of the variable MATH (which will be created as missing if it does not already exist in HAVE) to itself. In a data step, the MAX() function returns the largest of the values you pass it on the current observation; it does not look across observations. For example, if you called MAX() like this:

biggest = max(10,20,30,40);

then BIGGEST will be set to 40 since it is larger than any of 10, 20 or 30.

Since you only passed in the value of the variable MATH, the largest of that single value is, by definition, that same value.

 

What do you want the new variable to indicate?  Can you describe in words what you want?  Can you create an example input dataset and show the values you want for MATH_FIRST on every observation of that input data?
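
If the intent was a maximum taken across all rows of an ID, rather than across the arguments of one function call, that is a job for an aggregate and not for the data step MAX() function. A minimal sketch, assuming the WANT dataset with the 0/1 MATH flag built earlier in the thread; the names MATH_BY_ID and ANY_MATH are hypothetical:

```sas
/* Sketch only: assumes WANT holds one row per id/subject with a 0/1 MATH flag. */
proc sql;
  create table math_by_id as
  select id,
         max(math) as any_math   /* SQL aggregate: largest MATH value among the rows of each id */
  from want
  group by id;
quit;
```

Here ANY_MATH would be 1 for an ID with at least one Math row, which complements the MIN(math) approach that flags IDs with only Math rows.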

 

dakshu92
Calcite | Level 5

Thanks for your help! I actually figured it out. The code I used earlier worked. 

mkeintz
PROC Star

 

data have;
  input id  Subject :$9. ;
datalines;
1                      Math
1                      English
1                      Biology
2                      Math
2                      Math
2                      Math
3                      English
3                      Math
3                      French
4                      Science
4                      English
4                      History
5                      Math
6                      Business
6                      French
7                      Math
7                      Math
;
run;

/* Merge the Math rows with the non-Math rows by id (HAVE must be sorted by id). */
data math_only;
  merge  have (where=(subject='Math') in=inmath )
         have (where=(subject^='Math') in=notmath)  ;
  by id;
  if last.id;                        /* keep one row per id */
  math= (inmath=1 and notmath=0);    /* 1 only when the id has Math rows and no other rows */
run;

 

 

andreas_lds
Jade | Level 19

There are many ways to solve this problem:

  • one data step with by-group processing and retain
  • @mkeintz suggestion
  • two proc freqs and a merge
  • ....
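proc freq data=have noprint;
   /* Math: number of Math rows per Id; Total: number of all rows per Id */
   table Id * Subject / out=Math(drop= Percent where= (Subject = 'Math') rename= (Count = Math));
   table Id / out=Total(drop= Percent rename= (Count = Total));
run;

data want2;
   merge Total Math;
   by Id;

   /* 1 when every row for the Id is Math; 0 otherwise, including Ids with no Math rows (Math is missing) */
   Math = (Total = Math);

   drop Total Subject;
run;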
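
The first option in the list above (one data step with BY-group processing and RETAIN) is not spelled out in the thread. A minimal sketch of how it might look, assuming HAVE is sorted by id as in the posted sample; the output name WANT_RETAIN is hypothetical:

```sas
/* Sketch only: WANT_RETAIN is a hypothetical name; HAVE must be sorted by id. */
data want_retain;
  set have;
  by id;
  retain math;                          /* carry the flag across the rows of an id */
  if first.id then math = 1;            /* assume math-only until another subject appears */
  if subject ne 'Math' then math = 0;   /* any other subject clears the flag */
  if last.id;                           /* output one row per id */
  keep id math;
run;
```

This reads the data once and needs no merge, at the cost of relying on the sort order.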
Ksharp
Super User
data have;
input  id          Subject $;
cards;
1                      Math
1                      English
1                      Biology
2                      Math  
2                      Math
2                      Math  
3                      English
3                      Math
3                      French
4                      Science
4                      English
4                      History
5                      Math
6                      Business
6                      French
7                      Math
7                      Math
;

proc sql;
create table want as
select id,
       /* 1 when the number of rows for the id equals the number of Math rows */
       (count(Subject)=sum(Subject='Math')) as Math
 from have
  group by id;
quit;
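
To get from the per-ID flag to the single count requested in the original question, one option is to sum the flag. A minimal sketch, assuming the WANT table created just above (one row per id, with MATH equal to 1 or 0); the output name MATH_COUNT is hypothetical:

```sas
/* Sketch only: MATH_COUNT is a hypothetical name. */
proc sql;
  create table math_count as
  select 'Math' as Subject,
         sum(Math) as N      /* number of ids whose only subject is Math */
  from want;
quit;
```

With the sample data, ids 2, 5 and 7 are the math-only students, so N should come out as 3.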
