Help using Base SAS procedures

query

Accepted Solution Solved
Reply
Frequent Contributor
Posts: 145
Accepted Solution

query

I have following dataset with the following 4 variables...subjid,number,cat,and visitnum..... the number of lesion must remain constant from visit to visit.....i should output where The number of target lesions is not constant from visit to visit....the below sample subject have constant number from visit to visit and we should output discrepancies if there any like those highlighted in blue.....

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02         TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1


Accepted Solutions
Solution
‎12-30-2014 02:09 AM
Super User
Posts: 9,662

Re: query

Did you just say the each NUMBER have the same number of obs in each and every  subjid group ?

data have;
input subjid : $20. number $ cat & $40. visitnum     ;
cards;
100010001           T02         TARGET LESIONS              0
100010001           T01         TARGET LESIONS              0
100010001           T03         TARGET LESIONS              0
100010001           T02         TARGET LESIONS              1
100010001           T03         TARGET LESIONS              1
100010001           T01         TARGET LESIONS              1
100010001           T02         TARGET LESIONS              2
100010001           T01         TARGET LESIONS              2
100010001           T03         TARGET LESIONS              2
100010002           T02         TARGET LESIONS              0
100010002           T01         TARGET LESIONS              0
100010002           T02         TARGET LESIONS              1
100010002           T01         TARGET LESIONS              1
100010002           T02         TARGET LESIONS              2
100010002           T01         TARGET LESIONS              2
100030002           T03         TARGET LESIONS              0
100030002           T04         TARGET LESIONS              0
100030002           T02         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              1
100030002           T04         TARGET LESIONS              1
100030002           T03         TARGET LESIONS              1
100030002           T02         TARGET LESIONS              1
160020001           T02         TARGET LESIONS              0
160020001           T01         TARGET LESIONS              0
160020001           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              0
160020002           T01         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              2
;
run;
proc sql;
 create table temp as
 select * from
 ( select *,count(*) as n
   from have 
    group by subjid     , visitnum)
 group by subjid
  having range(n) ne 0;

quit;

Xia Keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

View solution in original post


All Replies
Respected Advisor
Posts: 4,640

Re: query

What if you had:

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T03         TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

Would you want to flag that discrepancy?

PG

PG
Frequent Contributor
Posts: 145

Re: query

Thanks Pg....We consider Visitnum=0 as baseline visit.....So what ever number present in baseline should be present in next occuring visits.....if any number not present in visitnum=0 but present in next visit or vice verse....we consider them as discrepany and we should output those.....

Solution
‎12-30-2014 02:09 AM
Super User
Posts: 9,662

Re: query

Did you just say the each NUMBER have the same number of obs in each and every  subjid group ?

data have;
input subjid : $20. number $ cat & $40. visitnum     ;
cards;
100010001           T02         TARGET LESIONS              0
100010001           T01         TARGET LESIONS              0
100010001           T03         TARGET LESIONS              0
100010001           T02         TARGET LESIONS              1
100010001           T03         TARGET LESIONS              1
100010001           T01         TARGET LESIONS              1
100010001           T02         TARGET LESIONS              2
100010001           T01         TARGET LESIONS              2
100010001           T03         TARGET LESIONS              2
100010002           T02         TARGET LESIONS              0
100010002           T01         TARGET LESIONS              0
100010002           T02         TARGET LESIONS              1
100010002           T01         TARGET LESIONS              1
100010002           T02         TARGET LESIONS              2
100010002           T01         TARGET LESIONS              2
100030002           T03         TARGET LESIONS              0
100030002           T04         TARGET LESIONS              0
100030002           T02         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              1
100030002           T04         TARGET LESIONS              1
100030002           T03         TARGET LESIONS              1
100030002           T02         TARGET LESIONS              1
160020001           T02         TARGET LESIONS              0
160020001           T01         TARGET LESIONS              0
160020001           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              0
160020002           T01         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              2
;
run;
proc sql;
 create table temp as
 select * from
 ( select *,count(*) as n
   from have 
    group by subjid     , visitnum)
 group by subjid
  having range(n) ne 0;

quit;

Xia Keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

Frequent Contributor
Posts: 145

Re: query

Thanks Xia.... it necessary that number observation in Visit=0 should be same as the number of obseravtions in subsequemt visit....and should output if there is any difference in obseravtion count....Also it would be  great if we can add code where we can output where obeservation count do not match from baseline visit and also if NUMBER variable do not match from baseline visit to next visits......

Super User
Posts: 9,662

Re: query

It is more clear .

data have;
input subjid : $20. number $ cat & $40. visitnum     ;
cards;
100010001           T02         TARGET LESIONS              0
100010001           T01         TARGET LESIONS              0
100010001           T03         TARGET LESIONS              0
100010001           T02         TARGET LESIONS              1
100010001           T03         TARGET LESIONS              1
100010001           T01         TARGET LESIONS              1
100010001           T02         TARGET LESIONS              2
100010001           T01         TARGET LESIONS              2
100010001           T03         TARGET LESIONS              2
100010002           T02         TARGET LESIONS              0
100010002           T01         TARGET LESIONS              0
100010002           T02         TARGET LESIONS              1
100010002           T01         TARGET LESIONS              1
100010002           T02         TARGET LESIONS              2
100010002           T01         TARGET LESIONS              2
100030002           T03         TARGET LESIONS              0
100030002           T04         TARGET LESIONS              0
100030002           T02         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              1
100030002           T04         TARGET LESIONS              1
100030002           T03         TARGET LESIONS              1
100030002           T02         TARGET LESIONS              1
160020001           T02         TARGET LESIONS              0
160020001           T01         TARGET LESIONS              0
160020001           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              0
160020002           T01         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              2
;
run;
data want(drop=n group basic);
group=0;
do until(last.subjid);
 set have;
 by subjid;

 group+1;n=0;
 do until(last.visitnum);
  set have ;
  by subjid visitnum notsorted;
  n+1;
 end;
 if group=1 then basic=n;
 do until(last.visitnum);
  set have ;
  by subjid visitnum notsorted;
  if group ne 1 and n ne basic then output;
 end;
 
end;
run;



Xia Keshan

Respected Advisor
Posts: 3,124

Re: query

I believe you can also do it using Array(), but Hash seems handy:

data have;

     input subjid : $20. number $ cat & $40. visitnum;

     cards;

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02 TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

;

run;

data want;

     if _n_=1 then

           do;

                declare hash h1();

                h1.definekey('number');

                h1.definedone();

                declare hash h2();

                h2.definekey('number');

                h2.definedone();

           end;

     do until(last.subjid);

           set have;

           by subjid;

           if visitnum=0 then

                rc=h1.replace();

           else rc=h2.replace();

     end;

rc=h1.equals(hash:

     'h2', result: flag);

     put flag=;

     do until (last.subjid);

           set have;

           by subjid;

           if flag ne 1 then

                output;

     end;

     rc=h1.clear();

     rc=h2.clear();

     drop rc flag;

run;

Respected Advisor
Posts: 4,640

Re: query

To get a detailed list of discrepansies:

data visit;

length subjId $12 cat $16;

input subjid $ number $ cat $& visitnum;

datalines;

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02         TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100010002           T03         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

;

proc sql;

create table visit0 as

select unique a.subjId, b.visitNum, a.number

from visit as a  inner join visit as b

    on a.subjid = b.subjid

where a.visitnum = 0;


create table newNumbers as

select subjId, visitNum, number

from visit  except corr

select * from visit0;


create table missingNumbers as

select subjId, visitNum, number

from visit0  except corr

select * from visit;


title "Discrepansies";

select "New number", * from newNumbers

union all

select "Missing number", * from missingNumbers

order by subjid, visitNum, number;

quit;

PG

PG
Frequent Contributor
Posts: 115

Re: query

I am novice, unlike the greats PG, Xia and Haikuo, hence the long and ugly way, for learning purpse it helps:

data have;/*Raw data copied from Haikuo*/

     input subjid : $20. number $ cat & $40. visitnum;

     cards;

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02 TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

;

run;

data want;
set have;
by subjid visitnum notsorted;
array n(30) _temporary_; *array subscript is arbitrary;
if first.subjid then call missing(of n{*});
if first.visitnum then count=0;
count+1;
if last.visitnum then n(_n_)=count;
if last.subjid then do;
if max(of n

  • ) ne count then discrepancy="yes";
    else discrepancy="no";
    end;
    if discrepancy="yes" then output;
    drop count;
    run;
  • proc sort data= have out=have1;
    by subjid;
    run;

    data final;
    merge  want have1;
    by subjid;
    if discrepancy="yes" then output;
    drop discrepancy;
    run;

    Regards,

    Naveen Srinivasan

    L&T Infotech

    ☑ This topic is SOLVED.

    Need further help from the community? Please ask a new question.

    Discussion stats
    • 8 replies
    • 475 views
    • 1 like
    • 5 in conversation