BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
rakeshvvv
Quartz | Level 8

I have following dataset with the following 4 variables...subjid,number,cat,and visitnum..... the number of lesion must remain constant from visit to visit.....i should output where The number of target lesions is not constant from visit to visit....the below sample subject have constant number from visit to visit and we should output discrepancies if there any like those highlighted in blue.....

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02         TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

1 ACCEPTED SOLUTION

Accepted Solutions
Ksharp
Super User

Did you just say the each NUMBER have the same number of obs in each and every  subjid group ?

data have;
input subjid : $20. number $ cat & $40. visitnum     ;
cards;
100010001           T02         TARGET LESIONS              0
100010001           T01         TARGET LESIONS              0
100010001           T03         TARGET LESIONS              0
100010001           T02         TARGET LESIONS              1
100010001           T03         TARGET LESIONS              1
100010001           T01         TARGET LESIONS              1
100010001           T02         TARGET LESIONS              2
100010001           T01         TARGET LESIONS              2
100010001           T03         TARGET LESIONS              2
100010002           T02         TARGET LESIONS              0
100010002           T01         TARGET LESIONS              0
100010002           T02         TARGET LESIONS              1
100010002           T01         TARGET LESIONS              1
100010002           T02         TARGET LESIONS              2
100010002           T01         TARGET LESIONS              2
100030002           T03         TARGET LESIONS              0
100030002           T04         TARGET LESIONS              0
100030002           T02         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              1
100030002           T04         TARGET LESIONS              1
100030002           T03         TARGET LESIONS              1
100030002           T02         TARGET LESIONS              1
160020001           T02         TARGET LESIONS              0
160020001           T01         TARGET LESIONS              0
160020001           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              0
160020002           T01         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              2
;
run;
proc sql;
 create table temp as
 select * from
 ( select *,count(*) as n
   from have 
    group by subjid     , visitnum)
 group by subjid
  having range(n) ne 0;

quit;

Xia Keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

View solution in original post

8 REPLIES 8
PGStats
Opal | Level 21

What if you had:

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T03         TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

Would you want to flag that discrepancy?

PG

PG
rakeshvvv
Quartz | Level 8

Thanks Pg....We consider Visitnum=0 as baseline visit.....So what ever number present in baseline should be present in next occuring visits.....if any number not present in visitnum=0 but present in next visit or vice verse....we consider them as discrepany and we should output those.....

Ksharp
Super User

Did you just say the each NUMBER have the same number of obs in each and every  subjid group ?

data have;
input subjid : $20. number $ cat & $40. visitnum     ;
cards;
100010001           T02         TARGET LESIONS              0
100010001           T01         TARGET LESIONS              0
100010001           T03         TARGET LESIONS              0
100010001           T02         TARGET LESIONS              1
100010001           T03         TARGET LESIONS              1
100010001           T01         TARGET LESIONS              1
100010001           T02         TARGET LESIONS              2
100010001           T01         TARGET LESIONS              2
100010001           T03         TARGET LESIONS              2
100010002           T02         TARGET LESIONS              0
100010002           T01         TARGET LESIONS              0
100010002           T02         TARGET LESIONS              1
100010002           T01         TARGET LESIONS              1
100010002           T02         TARGET LESIONS              2
100010002           T01         TARGET LESIONS              2
100030002           T03         TARGET LESIONS              0
100030002           T04         TARGET LESIONS              0
100030002           T02         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              1
100030002           T04         TARGET LESIONS              1
100030002           T03         TARGET LESIONS              1
100030002           T02         TARGET LESIONS              1
160020001           T02         TARGET LESIONS              0
160020001           T01         TARGET LESIONS              0
160020001           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              0
160020002           T01         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              2
;
run;
proc sql;
 create table temp as
 select * from
 ( select *,count(*) as n
   from have 
    group by subjid     , visitnum)
 group by subjid
  having range(n) ne 0;

quit;

Xia Keshan

Message was edited by: xia keshan

Message was edited by: xia keshan

rakeshvvv
Quartz | Level 8

Thanks Xia.... it necessary that number observation in Visit=0 should be same as the number of obseravtions in subsequemt visit....and should output if there is any difference in obseravtion count....Also it would be  great if we can add code where we can output where obeservation count do not match from baseline visit and also if NUMBER variable do not match from baseline visit to next visits......

Ksharp
Super User

It is more clear .

data have;
input subjid : $20. number $ cat & $40. visitnum     ;
cards;
100010001           T02         TARGET LESIONS              0
100010001           T01         TARGET LESIONS              0
100010001           T03         TARGET LESIONS              0
100010001           T02         TARGET LESIONS              1
100010001           T03         TARGET LESIONS              1
100010001           T01         TARGET LESIONS              1
100010001           T02         TARGET LESIONS              2
100010001           T01         TARGET LESIONS              2
100010001           T03         TARGET LESIONS              2
100010002           T02         TARGET LESIONS              0
100010002           T01         TARGET LESIONS              0
100010002           T02         TARGET LESIONS              1
100010002           T01         TARGET LESIONS              1
100010002           T02         TARGET LESIONS              2
100010002           T01         TARGET LESIONS              2
100030002           T03         TARGET LESIONS              0
100030002           T04         TARGET LESIONS              0
100030002           T02         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              0
100030002           T01         TARGET LESIONS              1
100030002           T04         TARGET LESIONS              1
100030002           T03         TARGET LESIONS              1
100030002           T02         TARGET LESIONS              1
160020001           T02         TARGET LESIONS              0
160020001           T01         TARGET LESIONS              0
160020001           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              0
160020002           T01         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              0
160020002           T03         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              1
160020002           T02         TARGET LESIONS              2
;
run;
data want(drop=n group basic);
group=0;
do until(last.subjid);
 set have;
 by subjid;

 group+1;n=0;
 do until(last.visitnum);
  set have ;
  by subjid visitnum notsorted;
  n+1;
 end;
 if group=1 then basic=n;
 do until(last.visitnum);
  set have ;
  by subjid visitnum notsorted;
  if group ne 1 and n ne basic then output;
 end;
 
end;
run;



Xia Keshan

Haikuo
Onyx | Level 15

I believe you can also do it using Array(), but Hash seems handy:

data have;

     input subjid : $20. number $ cat & $40. visitnum;

     cards;

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02 TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

;

run;

data want;

     if _n_=1 then

           do;

                declare hash h1();

                h1.definekey('number');

                h1.definedone();

                declare hash h2();

                h2.definekey('number');

                h2.definedone();

           end;

     do until(last.subjid);

           set have;

           by subjid;

           if visitnum=0 then

                rc=h1.replace();

           else rc=h2.replace();

     end;

rc=h1.equals(hash:

     'h2', result: flag);

     put flag=;

     do until (last.subjid);

           set have;

           by subjid;

           if flag ne 1 then

                output;

     end;

     rc=h1.clear();

     rc=h2.clear();

     drop rc flag;

run;

PGStats
Opal | Level 21

To get a detailed list of discrepansies:

data visit;

length subjId $12 cat $16;

input subjid $ number $ cat $& visitnum;

datalines;

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02         TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100010002           T03         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

;

proc sql;

create table visit0 as

select unique a.subjId, b.visitNum, a.number

from visit as a  inner join visit as b

    on a.subjid = b.subjid

where a.visitnum = 0;


create table newNumbers as

select subjId, visitNum, number

from visit  except corr

select * from visit0;


create table missingNumbers as

select subjId, visitNum, number

from visit0  except corr

select * from visit;


title "Discrepansies";

select "New number", * from newNumbers

union all

select "Missing number", * from missingNumbers

order by subjid, visitNum, number;

quit;

PG

PG
naveen_srini
Quartz | Level 8

I am novice, unlike the greats PG, Xia and Haikuo, hence the long and ugly way, for learning purpse it helps:

data have;/*Raw data copied from Haikuo*/

     input subjid : $20. number $ cat & $40. visitnum;

     cards;

100010001           T02         TARGET LESIONS              0

100010001           T01         TARGET LESIONS              0

100010001           T03         TARGET LESIONS              0

100010001           T02         TARGET LESIONS              1

100010001           T03         TARGET LESIONS              1

100010001           T01         TARGET LESIONS              1

100010001           T02         TARGET LESIONS              2

100010001           T01         TARGET LESIONS              2

100010001           T03         TARGET LESIONS              2

100010002           T02         TARGET LESIONS              0

100010002           T01         TARGET LESIONS              0

100010002           T02 TARGET LESIONS              1

100010002           T01         TARGET LESIONS              1

100010002           T02         TARGET LESIONS              2

100010002           T01         TARGET LESIONS              2

100030002           T03         TARGET LESIONS              0

100030002           T04         TARGET LESIONS              0

100030002           T02         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              0

100030002           T01         TARGET LESIONS              1

100030002           T04         TARGET LESIONS              1

100030002           T03         TARGET LESIONS              1

100030002           T02         TARGET LESIONS              1

160020001           T02         TARGET LESIONS              0

160020001           T01         TARGET LESIONS              0

160020001           T02         TARGET LESIONS              1

;

run;

data want;
set have;
by subjid visitnum notsorted;
array n(30) _temporary_; *array subscript is arbitrary;
if first.subjid then call missing(of n{*});
if first.visitnum then count=0;
count+1;
if last.visitnum then n(_n_)=count;
if last.subjid then do;
if max(of n

  • ) ne count then discrepancy="yes";
    else discrepancy="no";
    end;
    if discrepancy="yes" then output;
    drop count;
    run;
  • proc sort data= have out=have1;
    by subjid;
    run;

    data final;
    merge  want have1;
    by subjid;
    if discrepancy="yes" then output;
    drop discrepancy;
    run;

    Regards,

    Naveen Srinivasan

    L&T Infotech

    sas-innovate-2024.png

    Don't miss out on SAS Innovate - Register now for the FREE Livestream!

    Can't make it to Vegas? No problem! Watch our general sessions LIVE or on-demand starting April 17th. Hear from SAS execs, best-selling author Adam Grant, Hot Ones host Sean Evans, top tech journalist Kara Swisher, AI expert Cassie Kozyrkov, and the mind-blowing dance crew iLuminate! Plus, get access to over 20 breakout sessions.

     

    Register now!

    What is Bayesian Analysis?

    Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

    Find more tutorials on the SAS Users YouTube channel.

    Click image to register for webinarClick image to register for webinar

    Classroom Training Available!

    Select SAS Training centers are offering in-person courses. View upcoming courses for:

    View all other training opportunities.

    Discussion stats
    • 8 replies
    • 1495 views
    • 1 like
    • 5 in conversation