BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
kmardinian
Quartz | Level 8

Hi, I am working with a dataset that looks like this

 

Subject    Base1 Base2 CHG1 CHG2   VisitID

1                4           5                                  1

1                4           5         2            3         2

2                4           5                                  1

3                4           5                                  1

 

I'd like to create a new dataset from this data where I can isolate the CHG variables without losing Subject 1 VisitID 1 observation, so that it would look like this

 

Subject    Base1 Base2 CHG1 CHG2   VisitID

1                4           5                                  1

1                4           5         2            3         2

 

Any help would be much appreciated. Thank you!

1 ACCEPTED SOLUTION

Accepted Solutions
Tom
Super User Tom
Super User

Still not at all clear what you want. Your words say you just want to include records that have visitid=1 or non-missing CHGx values.

    if CHG1 ne . or CHG2 ne . or visitid=1;

But that is not the output that you posted.

To get that you would change the last condition to be just the VISITID=1 to for Subject 1

   if CHG1 ne . or CHG2 ne . or (subject=1 and visitid=1);

Note you didn't specify if VISITID or SUBJECT are numbers or character strings.  I have coded as if they are numbers, but in most instances such identifiers would be character strings since you never need to run statistics on them.

View solution in original post

13 REPLIES 13
SuryaKiran
Meteorite | Level 14

What is it your trying to do here?

Thanks,
Suryakiran
kmardinian
Quartz | Level 8

I would like to create a dataset with all the subjects that have observations for the CHG variables without losing my VisitID=1 observations.

 

I've been running this code

 

data final;
   set t4;
    if CHG1 ne .;
run;

 

which obviously doesn't work because then I am losing some of the observations for those subjects, but I'm not sure how else to do it

Tom
Super User Tom
Super User

Still not at all clear what you want. Your words say you just want to include records that have visitid=1 or non-missing CHGx values.

    if CHG1 ne . or CHG2 ne . or visitid=1;

But that is not the output that you posted.

To get that you would change the last condition to be just the VISITID=1 to for Subject 1

   if CHG1 ne . or CHG2 ne . or (subject=1 and visitid=1);

Note you didn't specify if VISITID or SUBJECT are numbers or character strings.  I have coded as if they are numbers, but in most instances such identifiers would be character strings since you never need to run statistics on them.

novinosrin
Tourmaline | Level 20
data have;
infile cards truncover;
input Subject    Base1 Base2 CHG1 CHG2   VisitID;
cards;
1                4           5           . .                      1
1                4           5         2            3         2
2                4           5         . .                         1
3                4           5           .  .                       1
;


proc sql;
create table want as
select *
from have
group by subject
having max((n(CHG1 ,CHG2)>0))=1;
quit;

@kmardinian I think the above will work 

kmardinian
Quartz | Level 8

Thank you! Do  you mind explaining what this statement is doing?

max((n(CHG1 ,CHG2)>0))=1;

 

novinosrin
Tourmaline | Level 20

n function returns the number of non missing values, that's a check and the boolean value is returned is 0 or 1 basing on true of false. And then we filter using having clause to output only true (1) records

kmardinian
Quartz | Level 8

I am trying to do it with one CHG variable at a time, but for some reason it is giving me this error 😕


536
537  proc sql;
538  create table final_all as select * from t4
539  group by subnum having max((n(chg1)>0))=1;
ERROR: Summary functions nested in this way are not supported.
540  quit;

novinosrin
Tourmaline | Level 20

Not sure what's your requirement , now you seem to change from your initial. Can you clarify your have and want  plz and make it comprehensive for the community

kmardinian
Quartz | Level 8

Sorry, it's difficult to properly explain what I was looking for, but Tom's line of code below is exactly what I needed.

 

Thank you so much for all your help!

 

   if CHG1 ne . or CHG2 ne . or (subject=1 and visitid=1);
SuryaKiran
Meteorite | Level 14

Maybe a subquery might help

 

proc sql;
select * from have
where subject in (select subject from have where chg1 is not null and chg2 is not null);
quit;
Thanks,
Suryakiran
kmardinian
Quartz | Level 8

Thank you, I will try it this way as well!

FreelanceReinh
Jade | Level 19

@kmardinian wrote:

I am trying to do it with one CHG variable at a time, but for some reason it is giving me this error 😕


536
537  proc sql;
538  create table final_all as select * from t4
539  group by subnum having max((n(chg1)>0))=1;
ERROR: Summary functions nested in this way are not supported.
540  quit;


The reason is: The N function is both a SAS function (to be applied to values within an observation) and an SQL summary function (to be applied to values within a column). As long as there were two arguments it was clear that you meant the SAS function. Now with only one argument it could be the SQL function as well and indeed this is the preferred interpretation by the compiler. Therefore, n(chg1) is already the number of non-missing CHG1 values in the group and the additional SQL summary function MAX is unnecessary and even inappropriate (hence the error).


So, one of several possible corrections is:

having n(chg1);

(Similarly, the original HAVING clause could have been written as having n(CHG1) | n(CHG2).)

Ksharp
Super User
data have;
infile cards truncover;
input Subject    Base1 Base2 CHG1 CHG2   VisitID;
cards;
1                4           5           . .                      1
1                4           5         2            3         2
2                4           5         . .                         1
3                4           5           .  .                       1
;

proc sort data=have out=want nodupkey;
by  Base1 Base2 CHG1 CHG2   VisitID;
run;
proc print;run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 13 replies
  • 1112 views
  • 0 likes
  • 6 in conversation