BookmarkSubscribeRSS Feed
🔒 This topic is solved and locked. Need further help from the community? Please sign in and ask a new question.
jsmall
Calcite | Level 5

Hi,

Below is longitudinal diabetes (y/n) data, I want to search for the first time a 0 or 1 appears so I can call that a baseline value. And also search for when a 1 appears so I know if the person developed diabetes. There are corresponding days2-days19 variables that are the f/u time (days from visit1) for each diabfield. So for example, obs=1242, I want to say baseline diabetes=0 occurred at the 3rd visit, and that a diabetes diagnosis was recorded at visit7, so total f/u time from baseline to diabetes diagnosis would be days7-days3. For now, we aren't worrying about the missing data between visits. I hope I explained this correctly. Thanks for any help!! Also, I don't use proc SQL very much so if it can be in a data step that would be awesome, otherwise if that's too complicated, SQL works

 

 

Obsdiab1diab2diab3diab4diab5diab6diab7diab8diab9diab10diab11diab12diab13diab14diab15diab16diab17diab18diab19
528...0...............
670.......000.00.0.00.
802....1..............
920......0............
1148...000.............
1242..000.11...........
1327..00000000000.0.1..
1898...................
1981...................
2122...00..............
2124...................
2127...................
2146.......00000.1011..
2936.000001111111......
3059......1............
3101.000000111111......
3192......0000.000..0..
3485.......00..........
3760.0.011111..........
1 ACCEPTED SOLUTION

Accepted Solutions
ballardw
Super User
data have;
   infile datalines truncover;
   input obs  diab1-diab8 date1-date8;
   array db diab1-diab8;
   array dd date1-date8;
   first0 = whichn(0,of db(*));
   first1 = whichn(1,of db(*));
   if first0 then datefirst0 = dd[first0];
   if first1 then datefirst1 = dd[first1];
datalines;
1 . . 0 1 . . . . . . 2200 2923
2 . 0 . . 0 0 1 0 . 1496 3673 4429 5112 5962
3 . . 0 0 . . 0 1 . . 2195 2921 . . 5110 5803
4 . . 0 . . . . 0 . . 2200 . . . . 5784
5 . 0 0 1 1 1 1 1 . 1479 2200 5000 . . . .
6 . 0 0 0 0 0 . 0 . . 2168 2955 3702 . 5090 5769
7 . 0 0 0 1 1 1 1 . 1599 2199 2927 3634 4369 5118 5824
8 . . . 0 0 . 1 0 . . . 2912 3710 . 5117 5901
run;

I don't think that your date and diab variable align properly but this is an approach to finding which has the first 0 or 1 for the diab and the associated "date".

 

View solution in original post

6 REPLIES 6
Reeza
Super User

Use the WHICHN() function.  

 

 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

It would make your life a lot easier if you worked with normalised data (data which goes down rather than across) rather than transposed data (data which goes across rather than down).  It just makes most programming tasks far easier.  As for your logic, not following it well, but maybe something on the lines of:

data want;
  set have;
  array diab{19};
  array days{19};
  do i=1 to 19;
    if diab{i} in 0,1 then do;
    ...;
  end;
run;

So array both sets of data which allows you to refer to groups easily.  

Anyways, for more information post test data in the form of a datastep so we can run it, and show what the output should look like, not going to either type that in nor try guessing what the output should look like.

jsmall
Calcite | Level 5

data have;
set want;
input obs  diab1-diab8 date2-date8;
datalines;
1 . . 0 1 . . . . . . 2200 2923
2 . 0 . . 0 0 1 0 . 1496 3673 4429 5112 5962
3 . . 0 0 . . 0 1 . . 2195 2921 . . 5110 5803
4 . . 0 . . . . 0 . . 2200 . . . . 5784
5 . 0 0 1 1 1 1 1 . 1479 2200 5000 . . . .
6 . 0 0 0 0 0 . 0 . . 2168 2955 3702 . 5090 5769
7 . 0 0 0 1 1 1 1 . 1599 2199 2927 3634 4369 5118 5824
8 . . . 0 0 . 1 0 . . . 2912 3710 . 5117 5901
run;

 

so I'm using a huge dataset, it's all in wide format. These are people who never had an official visit or diagnosis at baseline (diab1/date1) I'm was hoping to find a quick way to find out the first time (datex ) a person has a diabx of 0/1 so I can call that their baseline diagnosis. Then, if a person has a baseline diagnosis of 0 (no diabetes), and they develop diabetes afterwards, I want to find out at what time (datex ) that happened, so I can calculate their follow-up time (time to diagnosis) hope that helps. The whichn function may work? 

RW9
Diamond | Level 26 RW9
Diamond | Level 26

Yes, you can use arrays as I show in my previous post - am finishing for the day so can't elaborate.  Why is the data not utilising a standard data structure, SDTM for instance.  It would make your life so much easier to learn to work with normalised data, e.g.:

...  DIAGDT  DIAGYN ...

 

Then its simply a matter of sorting and filtering first date where DIAGYN=Y to get baseline.  You can then merge this date back to the original data, and create a change value from baseline with a simple minus, a days since by another minus etc.  It really is so much easier to work with.

ballardw
Super User
data have;
   infile datalines truncover;
   input obs  diab1-diab8 date1-date8;
   array db diab1-diab8;
   array dd date1-date8;
   first0 = whichn(0,of db(*));
   first1 = whichn(1,of db(*));
   if first0 then datefirst0 = dd[first0];
   if first1 then datefirst1 = dd[first1];
datalines;
1 . . 0 1 . . . . . . 2200 2923
2 . 0 . . 0 0 1 0 . 1496 3673 4429 5112 5962
3 . . 0 0 . . 0 1 . . 2195 2921 . . 5110 5803
4 . . 0 . . . . 0 . . 2200 . . . . 5784
5 . 0 0 1 1 1 1 1 . 1479 2200 5000 . . . .
6 . 0 0 0 0 0 . 0 . . 2168 2955 3702 . 5090 5769
7 . 0 0 0 1 1 1 1 . 1599 2199 2927 3634 4369 5118 5824
8 . . . 0 0 . 1 0 . . . 2912 3710 . 5117 5901
run;

I don't think that your date and diab variable align properly but this is an approach to finding which has the first 0 or 1 for the diab and the associated "date".

 

jsmall
Calcite | Level 5

thanks this works great!!

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 6 replies
  • 1798 views
  • 0 likes
  • 4 in conversation