Help using Base SAS procedures

Scanning variable columns in search of specific values for f/u time

Accepted Solution Solved
Reply
Occasional Contributor
Posts: 14
Accepted Solution

Scanning variable columns in search of specific values for f/u time

[ Edited ]

Hi,

Below is longitudinal diabetes (y/n) data, I want to search for the first time a 0 or 1 appears so I can call that a baseline value. And also search for when a 1 appears so I know if the person developed diabetes. There are corresponding days2-days19 variables that are the f/u time (days from visit1) for each diabfield. So for example, obs=1242, I want to say baseline diabetes=0 occurred at the 3rd visit, and that a diabetes diagnosis was recorded at visit7, so total f/u time from baseline to diabetes diagnosis would be days7-days3. For now, we aren't worrying about the missing data between visits. I hope I explained this correctly. Thanks for any help!! Also, I don't use proc SQL very much so if it can be in a data step that would be awesome, otherwise if that's too complicated, SQL works

 

 

Obsdiab1diab2diab3diab4diab5diab6diab7diab8diab9diab10diab11diab12diab13diab14diab15diab16diab17diab18diab19
528...0...............
670.......000.00.0.00.
802....1..............
920......0............
1148...000.............
1242..000.11...........
1327..00000000000.0.1..
1898...................
1981...................
2122...00..............
2124...................
2127...................
2146.......00000.1011..
2936.000001111111......
3059......1............
3101.000000111111......
3192......0000.000..0..
3485.......00..........
3760.0.011111..........

Accepted Solutions
Solution
‎04-03-2018 01:23 PM
Super User
Posts: 13,583

Re: Scanning variable columns in search of specific values for f/u time

data have;
   infile datalines truncover;
   input obs  diab1-diab8 date1-date8;
   array db diab1-diab8;
   array dd date1-date8;
   first0 = whichn(0,of db(*));
   first1 = whichn(1,of db(*));
   if first0 then datefirst0 = dd[first0];
   if first1 then datefirst1 = dd[first1];
datalines;
1 . . 0 1 . . . . . . 2200 2923
2 . 0 . . 0 0 1 0 . 1496 3673 4429 5112 5962
3 . . 0 0 . . 0 1 . . 2195 2921 . . 5110 5803
4 . . 0 . . . . 0 . . 2200 . . . . 5784
5 . 0 0 1 1 1 1 1 . 1479 2200 5000 . . . .
6 . 0 0 0 0 0 . 0 . . 2168 2955 3702 . 5090 5769
7 . 0 0 0 1 1 1 1 . 1599 2199 2927 3634 4369 5118 5824
8 . . . 0 0 . 1 0 . . . 2912 3710 . 5117 5901
run;

I don't think that your date and diab variable align properly but this is an approach to finding which has the first 0 or 1 for the diab and the associated "date".

 

View solution in original post


All Replies
Super User
Posts: 23,776

Re: Scanning variable columns in search of specific values for f/u time

Use the WHICHN() function.  

 

 

Super User
Super User
Posts: 9,599

Re: Scanning variable columns in search of specific values for f/u time

It would make your life a lot easier if you worked with normalised data (data which goes down rather than across) rather than transposed data (data which goes across rather than down).  It just makes most programming tasks far easier.  As for your logic, not following it well, but maybe something on the lines of:

data want;
  set have;
  array diab{19};
  array days{19};
  do i=1 to 19;
    if diab{i} in 0,1 then do;
    ...;
  end;
run;

So array both sets of data which allows you to refer to groups easily.  

Anyways, for more information post test data in the form of a datastep so we can run it, and show what the output should look like, not going to either type that in nor try guessing what the output should look like.

Occasional Contributor
Posts: 14

Re: Scanning variable columns in search of specific values for f/u time

[ Edited ]

data have;
set want;
input obs  diab1-diab8 date2-date8;
datalines;
1 . . 0 1 . . . . . . 2200 2923
2 . 0 . . 0 0 1 0 . 1496 3673 4429 5112 5962
3 . . 0 0 . . 0 1 . . 2195 2921 . . 5110 5803
4 . . 0 . . . . 0 . . 2200 . . . . 5784
5 . 0 0 1 1 1 1 1 . 1479 2200 5000 . . . .
6 . 0 0 0 0 0 . 0 . . 2168 2955 3702 . 5090 5769
7 . 0 0 0 1 1 1 1 . 1599 2199 2927 3634 4369 5118 5824
8 . . . 0 0 . 1 0 . . . 2912 3710 . 5117 5901
run;

 

so I'm using a huge dataset, it's all in wide format. These are people who never had an official visit or diagnosis at baseline (diab1/date1) I'm was hoping to find a quick way to find out the first time (datex ) a person has a diabx of 0/1 so I can call that their baseline diagnosis. Then, if a person has a baseline diagnosis of 0 (no diabetes), and they develop diabetes afterwards, I want to find out at what time (datex ) that happened, so I can calculate their follow-up time (time to diagnosis) hope that helps. The whichn function may work? 

Super User
Super User
Posts: 9,599

Re: Scanning variable columns in search of specific values for f/u time

Yes, you can use arrays as I show in my previous post - am finishing for the day so can't elaborate.  Why is the data not utilising a standard data structure, SDTM for instance.  It would make your life so much easier to learn to work with normalised data, e.g.:

...  DIAGDT  DIAGYN ...

 

Then its simply a matter of sorting and filtering first date where DIAGYN=Y to get baseline.  You can then merge this date back to the original data, and create a change value from baseline with a simple minus, a days since by another minus etc.  It really is so much easier to work with.

Solution
‎04-03-2018 01:23 PM
Super User
Posts: 13,583

Re: Scanning variable columns in search of specific values for f/u time

data have;
   infile datalines truncover;
   input obs  diab1-diab8 date1-date8;
   array db diab1-diab8;
   array dd date1-date8;
   first0 = whichn(0,of db(*));
   first1 = whichn(1,of db(*));
   if first0 then datefirst0 = dd[first0];
   if first1 then datefirst1 = dd[first1];
datalines;
1 . . 0 1 . . . . . . 2200 2923
2 . 0 . . 0 0 1 0 . 1496 3673 4429 5112 5962
3 . . 0 0 . . 0 1 . . 2195 2921 . . 5110 5803
4 . . 0 . . . . 0 . . 2200 . . . . 5784
5 . 0 0 1 1 1 1 1 . 1479 2200 5000 . . . .
6 . 0 0 0 0 0 . 0 . . 2168 2955 3702 . 5090 5769
7 . 0 0 0 1 1 1 1 . 1599 2199 2927 3634 4369 5118 5824
8 . . . 0 0 . 1 0 . . . 2912 3710 . 5117 5901
run;

I don't think that your date and diab variable align properly but this is an approach to finding which has the first 0 or 1 for the diab and the associated "date".

 

Occasional Contributor
Posts: 14

Re: Scanning variable columns in search of specific values for f/u time

thanks this works great!!

☑ This topic is solved.

Need further help from the community? Please ask a new question.

Discussion stats
  • 6 replies
  • 215 views
  • 0 likes
  • 4 in conversation