BookmarkSubscribeRSS Feed
☑ This topic is solved. Need further help from the community? Please sign in and ask a new question.
jcis7
Pyrite | Level 9

G'day,

 

I've a question -- I'm trying to find observations below a certain cutoff value across any four years for the same ID.

 

Here's the data below and what I tried.  I realized it doesn't work.  what i want is to have identify in  a list that has ids which are below 90 for any four years (Yr), something like the table below.  Any help you can give is much appreciated. Or is there a better way to do it in proc sql?  Can this be done in the DATA step? Thank you!

 

id pctvax2000 pctvax2001 pctvax2002 pctvax2003 pctvax2004 pctvax2005 pctLT90_2000 pctLT90_2001 pctLT90_2002 pctLT90_2003 pctLT90_2004 pctLT90_2005 any4
1234567 88 90 57 95 77 62 1 0 1 0 1 1 4
1234568 82 91 53 78 90 95 1 0 1 1 0 0 5

 


*--2017-18 ;
DATA vax;
 INPUT id $ pctvax  yr $;
CARDS;
1234567 88 2000
1234567 90 2001 
1234565 57 2002
1234567 95 2003 
1234567 77 2004
1234567 62 2005 
1234568 82 2000
1234568 91 2001 
1234568 53 2002
1234568 78 2003 
1234568 90 2004
1234568 45 2005 

;
RUN; 

data vax1;
  set vax;
   if yr='2000' then do;
     pctvax2000=pctvax;
     if pctvax2000<90 then pctLT90_2000=1; 
   end;

   if yr='2001' then do;
      pctvax2001=pctvax;
     if pctvax2001<90 then pctLT90_2001=1; 
   end;
   
    if yr='2002' then do;
    pctvax2002=pctvax;
     if pctvax2002<90 then pctLT90_2002=1;      
   end;

      if yr='2003' then do;
     pctvax2003=pctvax;
     if pctvax2003<90 then pctLT90_2003=1; 
   end;

      if yr='2004' then do;
    pctvax2004=pctvax;
     if pctvax2004<90 then pctLT90_2004=1; 
   end;

      if yr='2005' then do;
     pctvax2005=pctvax;
     if pctvax2005<90 then pctLT90_2005=1; 
   end;

   if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)=4 then  any4=1;
   else any4=0;

   run;

 

1 ACCEPTED SOLUTION

Accepted Solutions
Reeza
Super User

You say any of the 4 years but you're looking for all 4 years. Is the criteria, any of the 4 years is less than 90% or all of the four years less than 90%?

Two different approaches below - SQL to generate a single list of IDs or transpose and data step to generate a table more similar to what you have below. 

 

proc sql;
create listID_lt90 as
select distinct ID
from vax where pctvax <90;
quit;

proc transpose data=vax out=vax_wide prefix=pctvax;
by id;
id year;
var pctvax;
run;

data want;
set vax_wide;
if min(of pctvax:) < 90 then any4 = 1;
else any4=0;
run;

Also, you may be able to modify our code (untested) by changing this criteria:

 if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)=4 then  any4=1;
   else any4=0;


 if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)>0 then  any4=1;
   else any4=0;

 


@jcis7 wrote:

G'day,

 

I've a question -- I'm trying to find observations below a certain cutoff value across any four years for the same ID.

 

Here's the data below and what I tried.  I realized it doesn't work.  what i want is to have identify in  a list that has ids which are below 90 for any four years (Yr), something like the table below.  Any help you can give is much appreciated. Or is there a better way to do it in proc sql?  Can this be done in the DATA step? Thank you!

 

id pctvax2000 pctvax2001 pctvax2002 pctvax2003 pctvax2004 pctvax2005 pctLT90_2000 pctLT90_2001 pctLT90_2002 pctLT90_2003 pctLT90_2004 pctLT90_2005 any4
1234567 88 90 57 95 77 62 1 0 1 0 1 1 4
1234568 82 91 53 78 90 95 1 0 1 1 0 0 5

 


*--2017-18 ;
DATA vax;
 INPUT id $ pctvax  yr $;
CARDS;
1234567 88 2000
1234567 90 2001 
1234565 57 2002
1234567 95 2003 
1234567 77 2004
1234567 62 2005 
1234568 82 2000
1234568 91 2001 
1234568 53 2002
1234568 78 2003 
1234568 90 2004
1234568 45 2005 

;
RUN; 

data vax1;
  set vax;
   if yr='2000' then do;
     pctvax2000=pctvax;
     if pctvax2000<90 then pctLT90_2000=1; 
   end;

   if yr='2001' then do;
      pctvax2001=pctvax;
     if pctvax2001<90 then pctLT90_2001=1; 
   end;
   
    if yr='2002' then do;
    pctvax2002=pctvax;
     if pctvax2002<90 then pctLT90_2002=1;      
   end;

      if yr='2003' then do;
     pctvax2003=pctvax;
     if pctvax2003<90 then pctLT90_2003=1; 
   end;

      if yr='2004' then do;
    pctvax2004=pctvax;
     if pctvax2004<90 then pctLT90_2004=1; 
   end;

      if yr='2005' then do;
     pctvax2005=pctvax;
     if pctvax2005<90 then pctLT90_2005=1; 
   end;

   if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)=4 then  any4=1;
   else any4=0;

   run;

 


 

View solution in original post

10 REPLIES 10
Reeza
Super User

You say any of the 4 years but you're looking for all 4 years. Is the criteria, any of the 4 years is less than 90% or all of the four years less than 90%?

Two different approaches below - SQL to generate a single list of IDs or transpose and data step to generate a table more similar to what you have below. 

 

proc sql;
create listID_lt90 as
select distinct ID
from vax where pctvax <90;
quit;

proc transpose data=vax out=vax_wide prefix=pctvax;
by id;
id year;
var pctvax;
run;

data want;
set vax_wide;
if min(of pctvax:) < 90 then any4 = 1;
else any4=0;
run;

Also, you may be able to modify our code (untested) by changing this criteria:

 if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)=4 then  any4=1;
   else any4=0;


 if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)>0 then  any4=1;
   else any4=0;

 


@jcis7 wrote:

G'day,

 

I've a question -- I'm trying to find observations below a certain cutoff value across any four years for the same ID.

 

Here's the data below and what I tried.  I realized it doesn't work.  what i want is to have identify in  a list that has ids which are below 90 for any four years (Yr), something like the table below.  Any help you can give is much appreciated. Or is there a better way to do it in proc sql?  Can this be done in the DATA step? Thank you!

 

id pctvax2000 pctvax2001 pctvax2002 pctvax2003 pctvax2004 pctvax2005 pctLT90_2000 pctLT90_2001 pctLT90_2002 pctLT90_2003 pctLT90_2004 pctLT90_2005 any4
1234567 88 90 57 95 77 62 1 0 1 0 1 1 4
1234568 82 91 53 78 90 95 1 0 1 1 0 0 5

 


*--2017-18 ;
DATA vax;
 INPUT id $ pctvax  yr $;
CARDS;
1234567 88 2000
1234567 90 2001 
1234565 57 2002
1234567 95 2003 
1234567 77 2004
1234567 62 2005 
1234568 82 2000
1234568 91 2001 
1234568 53 2002
1234568 78 2003 
1234568 90 2004
1234568 45 2005 

;
RUN; 

data vax1;
  set vax;
   if yr='2000' then do;
     pctvax2000=pctvax;
     if pctvax2000<90 then pctLT90_2000=1; 
   end;

   if yr='2001' then do;
      pctvax2001=pctvax;
     if pctvax2001<90 then pctLT90_2001=1; 
   end;
   
    if yr='2002' then do;
    pctvax2002=pctvax;
     if pctvax2002<90 then pctLT90_2002=1;      
   end;

      if yr='2003' then do;
     pctvax2003=pctvax;
     if pctvax2003<90 then pctLT90_2003=1; 
   end;

      if yr='2004' then do;
    pctvax2004=pctvax;
     if pctvax2004<90 then pctLT90_2004=1; 
   end;

      if yr='2005' then do;
     pctvax2005=pctvax;
     if pctvax2005<90 then pctLT90_2005=1; 
   end;

   if sum(pctLT90_2000, pctLT90_2001, pctLT90_2002, pctLT90_2003, pctLT90_2004, pctLT90_2005)=4 then  any4=1;
   else any4=0;

   run;

 


 

jcis7
Pyrite | Level 9
There are 6 years in the data (2000, 2001, 2002, 2003, 2004, 2005) and looking for IDs where pctvax<90 in any four years, for example:
Looking for ids with pctvax<90 in the following combination of years
2000, 2001, 2002, 2003
2000, 2002, 2003, 2004
2000, 2001, 2004, 2005
2001, 2002, 2003, 2004
Etc....


Thank you!

PaigeMiller
Diamond | Level 26

Work with VAX which is in a good format to provide the information you want. VAX1 isn't as good.

 

proc sql;
    create table want as select
    id,sum(pctvax<90) as below90
    from vax group by id;
quit;

 

 

--
Paige Miller
jcis7
Pyrite | Level 9
Thanks - I'm looking for school year which is in the results of the code you posted. I'm looking for the school year across the top with the school code listed in the column where the current school year is listed in the results of the code you posted.
PaigeMiller
Diamond | Level 26

@jcis7 wrote:
Thanks - I'm looking for school year which is in the results of the code you posted. I'm looking for the school year across the top with the school code listed in the column where the current school year is listed in the results of the code you posted.

This is a poor data layout for data analysis, don't do this.

 

If you want this data layout in a report or to create an EXCEL spreadsheet, use PROC REPORT from the original data layout in data set VAX.

--
Paige Miller
jcis7
Pyrite | Level 9

proc transpose data=vax out=vax_wide prefix=pctvax;
by id;
id year;
var pctvax;
run;

data vax_wide1;
  set vax_wide;
 
  if pctvax2000 < 90 then pctvax2000LT90=1;
  if pctvax2001 < 90 then pctvax2001LT90=1;
  if pctvax2002 < 90 then pctvax2002LT90=1;
  if pctvax2003 < 90 then pctvax2003LT90=1;
  if pctvax2004 < 90 then pctvax2004LT90=1;
  if pctvax2005 < 90 then pctvax2005LT90=1;

if sum (pctvax2000LT90, pctvax2001LT90, pctvax2002LT90, pctvax2003LT90, pctvax2004LT90, pctvax2005LT90) =4 then any4=1;
  else any4=0;
run;

This seems code above seems work for the simplified dataset I created.  I'm running into a problem where in the actual dataset, it counts missing values as <90. I can post it as a separate post. Thank you everyone for your help!

Reeza
Super User
proc transpose data=vax out=vax_wide prefix=pctvax;
by id;
id year;
var pctvax;
run;

data vax_wide1;
  set vax_wide;
 array pctvax(2000:2005) pctvax2000-pctvax2005;
 array lt90vax(2000:2005) lt90vax2000-lt90vax2005;

*set all to 0;
do year=2000 to 2005;
 lt90vax(year) = 0;
end;

*set to 1 if applicable;
do year=2000 to 2005;
 if not missing(pctvax(year)) and pctvax(year) < 90 then lt90vax(year) = 1;
end;

if sum of lt90vax(*)) =4 then any4=1;
  else any4=0;
run;
jcis7
Pyrite | Level 9
Great - thank you!

* The code worked until:

401
402 if sum of lt90vax(*))=4 then any4=1;
--
22
76
ERROR 22-322: Syntax error, expecting one of the following: (, [, {.

ERROR 76-322: Syntax error, statement will be ignored.

403 else any4=0;
404 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set WANT may be incomplete. When this step was stopped there were 0 observations and 14 variables.
WARNING: Data set WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.08 seconds
cpu time 0.00 seconds





* I tried adding a parenthesis and got the following error:

441 end;
442
443 if sum of (lt90vax(*))=4 then any4=1;
-- -
22 386
68 200
76
ERROR 22-322: Syntax error, expecting one of the following: !, !!, &, *, **, +, -, /, ;, <, <=, <>, =, >, ><, >=, AND, EQ,
GE, GT, LE, LT, MAX, MIN, NE, NG, NL, OR, ^=, |, ||, ~=.

ERROR 386-185: Expecting an arithmetic expression.

ERROR 68-185: The function OF is unknown, or cannot be accessed.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

444 else any4=0;
445 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set N.WANT may be incomplete. When this step was stopped there were 0 observations and 14 variables.
WARNING: Data set N.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.08 seconds
cpu time 0.01 seconds



* Then I tried removing the 'of' and got the following error:

461 end;
462
463 if sum (lt90vax(*))=4 then any4=1;
-
386
200
76
ERROR 386-185: Expecting an arithmetic expression.

ERROR 200-322: The symbol is not recognized and will be ignored.

ERROR 76-322: Syntax error, statement will be ignored.

464 else any4=0;
465 run;

NOTE: The SAS System stopped processing this step because of errors.
WARNING: The data set N.WANT may be incomplete. When this step was stopped there were 0 observations and 14 variables.
WARNING: Data set N.WANT was not replaced because this step was stopped.
NOTE: DATA statement used (Total process time):
real time 0.43 seconds
cpu time 0.03 seconds



Thank you!
Reeza
Super User

It's missing the opening bracket...I think you should be able to figure where that should go for the sum function 🙂

jcis7
Pyrite | Level 9
Thank you! I looked up Sum Function and found: SAS Help Center: SUM Function<>
So now, this works: 😊
if sum (of lt90vax(*))=4 then any4=1;
else any4=0;
run;

sas-innovate-2024.png

Available on demand!

Missed SAS Innovate Las Vegas? Watch all the action for free! View the keynotes, general sessions and 22 breakouts on demand.

 

Register now!

How to Concatenate Values

Learn how use the CAT functions in SAS to join values from multiple variables into a single value.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 10 replies
  • 550 views
  • 8 likes
  • 3 in conversation