About rfarmenta

rfarmenta · ‎07-13-2015

For this it is just one question, with about 20,000 open-ended responses, since it is a select-all-that-apply question that includes an open text field for additional responses. In the past I've generally relied on a quick scan of the responses to see which are most frequent and then added new categories to our questionnaires. You mention the two reasons I would like to do this. The first is to fill in the listed categories when a participant missed that the list included something like "asthma" and wrote it in instead. With so many other responses and misspellings, doing this by hand would take a tremendous effort, and unfortunately we don't have the time or resources. The other reason is to determine which other responses have the highest frequencies so that we can prioritize new response options, though, as you mention, this can be done by just looking through the responses. I will try what you suggested to create an update dataset. Thank you!

rfarmenta · ‎07-13-2015

In our research project we have "other" fields for several questions, where participants can write in an answer if it is not listed as an option. As you can imagine, we get a lot of responses that are misspelled or that say the same thing with different wording. I know SAS can do fuzzy matching to combine data sets; can this be done within a character variable? For instance, if the responses are "Back pain", "Back problems", and "back aches", they would all be considered the same. Additionally, some participants list several things, such as "back pain and hip pain" or "back and neck pain", and I would want these to be included in the back pain category as well. Is there a way to do this without manually going through everything? There are about 20,000 responses, so I am trying to streamline the process. Thank you in advance for your help.
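A minimal sketch of one way this kind of keyword and fuzzy coding could be approached in a DATA step, offered as an illustration rather than a tested solution. The data set name `responses`, the variable `other_text`, the sample rows (including the misspelling "astma"), and the `COMPGED` cutoff of 100 are all assumptions made for the example.

```sas
/* Hypothetical sample of write-in responses; other_text is an assumed name */
data responses;
  infile datalines truncover;
  input other_text $60.;
  datalines;
Back pain
Back problems
back aches
back pain and hip pain
back and neck pain
astma
;
run;

data coded;
  set responses;
  length word $60;
  back_pain = 0;
  asthma = 0;
  /* Look at each word so answers that list several conditions are caught */
  do i = 1 to countw(other_text, ' ,;/');
    word = lowcase(scan(other_text, i, ' ,;/'));
    /* Flag any word that begins with "back" */
    if find(word, 'back') = 1 then back_pain = 1;
    /* COMPGED returns a generalized edit distance (0 = exact match).
       The cutoff of 100 is an arbitrary assumption to tune on real data. */
    if compged(strip(word), 'asthma') <= 100 then asthma = 1;
  end;
  drop i word;
run;
```

A keyword rule like this will also flag unrelated uses of a word, so a frequency table of the flagged and unflagged responses is still worth reviewing by eye before the categories are used.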

rfarmenta · ‎07-08-2015

That was much simpler than I thought, thank you both for your input, the code worked.

rfarmenta · ‎07-08-2015

I have a long dataset that includes multiple records per participant. Each person has an overall date range (begin_date and end_date) for trips they have made to conduct business for their job. Additionally, each person has a date1 and a dot1 that indicate shorter-duration trips that fall within their overall begin_date and end_date. What I need is a new variable that gives a trip number for each row of the data. For example, in the small dataset below for participant 1, row 1 would be trip 1, so my new variable, trip, would be equal to 1; rows 2-5 are trip 2, with 4 shorter trips associated with it; and rows 6-9 are trip 3, again with 4 shorter trips associated with the overall date range. You can tell that the date1 and dot1 dates belong to a larger trip if they fall within begin_date and end_date. So basically I want to add a variable called trip, and for the example row 1 would be 1, rows 2-5 would be 2, and rows 6-9 would be 3. I tried using the following code, but it essentially just gives me a count of all the rows. I think the code doesn't work because I have repeated measures and it is hard to distinguish between dates, but I was hoping to be able to do this without transforming the data. Any help would be greatly appreciated.

data want;
  set have;
  trip+1;
  by begin_date;
  if first.begin_date then trip=1;
  else if date1>end_date and dot1<=end_date then trip=trip+1;
run;

ID  begin_date  end_date  date1     dot1
1   02/12/06    04/29/06  02/12/06  04/29/16
1   03/14/07    01/15/08  03/14/07  04/18/07
1   03/14/07    01/15/08  04/18/07  05/16/07
1   03/14/07    01/15/08  05/16/07  09/02/07
1   03/14/07    01/15/08  09/02/07  01/15/08
1   05/15/10    08/05/12  06/26/10  08/15/10
1   05/15/10    08/05/12  01/20/11  01/25/11
1   05/15/10    08/05/12  02/01/11  08/01/11
1   05/15/10    08/05/12  04/01/12  04/03/12
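A minimal sketch of one way to number the trips with BY-group processing, assuming the data are already sorted by ID and begin_date and that each overall trip has its own begin_date within a participant. The sample rows are the ones posted above, with the doubled slash in one dot1 value typed as 01/15/08 so that it can be read as a date. This is an illustration, not necessarily the code suggested in the replies.

```sas
/* Sample rows from the post; dates are read with the MMDDYY8. informat */
data have;
  input id (begin_date end_date date1 dot1) (:mmddyy8.);
  format begin_date end_date date1 dot1 mmddyy8.;
  datalines;
1 02/12/06 04/29/06 02/12/06 04/29/16
1 03/14/07 01/15/08 03/14/07 04/18/07
1 03/14/07 01/15/08 04/18/07 05/16/07
1 03/14/07 01/15/08 05/16/07 09/02/07
1 03/14/07 01/15/08 09/02/07 01/15/08
1 05/15/10 08/05/12 06/26/10 08/15/10
1 05/15/10 08/05/12 01/20/11 01/25/11
1 05/15/10 08/05/12 02/01/11 08/01/11
1 05/15/10 08/05/12 04/01/12 04/03/12
;
run;

data want;
  set have;
  by id begin_date;
  /* Restart the counter for each participant */
  if first.id then trip = 0;
  /* A new begin_date within a participant starts a new overall trip */
  if first.begin_date then trip + 1;
run;
```

Because the sum statement `trip + 1` retains its value across rows, every row that shares a begin_date keeps the same trip number, which gives 1 for row 1, 2 for rows 2-5, and 3 for rows 6-9 in this sample.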

rfarmenta · ‎06-18-2015

Thank you both for your help. They both seem to have worked and give me the same results.

rfarmenta · ‎06-18-2015

Thank you both for your replies. Astounding, can you give me a little more information as to what obs=0 does in this case? From reading the SAS documentation, it tells SAS when to stop processing. So in this case, if a value is missing, will obs=0 make it choose the previous value? Also, why would I drop visit? Mark, I am going to try your suggestion and see if that works as well. In your case it appears that the missing values will be filled with the first observation's value; what if that is missing as well? Thanks!

rfarmenta · ‎06-18-2015

I have a dataset where participants have up to 4 visits. I need to select, for multiple variables, the most recent visit value that is not missing. I know I can use last.studyid to select each participant's most recent observation, but is there a way to select, for each of several variables, the most recent observation that is not missing? A short example of what the data might look like is below. Basically, for study ID 1 I would want the third visit value for var1 and the fourth visit value for var2; for study ID 2 I would want the third visit value for both; and for study ID 3 I would want the fourth visit value for var1 and the third visit value for var2. Any help would be greatly appreciated.

ID  var1  var2  visit
1   1     0     1
1   1     0     2
1   1     0     3
1   .     0     4
2   .     0     1
2   1     0     2
2   1     1     3
3   0     1     1
3   0     1     2
3   0     1     3
3   1     .     4
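A minimal sketch of the UPDATE-statement approach that the obs=0 discussion in this thread appears to refer to, shown as an illustration rather than as the exact code that was posted in the replies. The data set names `have` and `want` are assumptions; the rows are the sample data above.

```sas
/* Sample data from the post, already sorted by id and visit */
data have;
  input id var1 var2 visit;
  datalines;
1 1 0 1
1 1 0 2
1 1 0 3
1 . 0 4
2 . 0 1
2 1 0 2
2 1 1 3
3 0 1 1
3 0 1 2
3 0 1 3
3 1 . 4
;
run;

data want;
  /* have(obs=0) is an empty master with the right variables;
     every row of have is then applied as a transaction.
     By default, missing transaction values do not overwrite
     values already held for the BY group. */
  update have(obs=0) have;
  by id;
  /* visit would only reflect the last row, not the row each value came from */
  drop visit;
run;
```

The result has one row per id holding the most recent non-missing value of each variable. If a variable is missing at every visit for a participant, it stays missing in the output.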

rfarmenta · ‎03-20-2015

I am not sure I understand the code from ballardw, and I tried count distinct, but that doesn't seem to give me what I want either. I can't attach an Excel file right now for some reason, but below is a short example of what I would want to do. I just made up the example data. I would want to know how many people did the behavior at 1 visit, at 2 visits, at 3 visits, etc. Behavior is coded 1=yes, 0=no, so from the data the results would be:

behaviorcount
0 = 1 (did not report)
1 = 1
2 = 3
3 = 1
4 = 0
5 = 1

pid  behavior  visit
1    0         1
1    1         2
1    1         3
1    0         4
1    0         5
2    0         1
2    0         2
2    0         3
2    0         4
3    1         1
3    1         2
3    1         3
4    1         1
4    0         2
4    0         3
4    0         4
4    0         5
5    1         1
5    1         2
5    1         3
5    1         4
5    1         5
6    0         1
6    1         2
6    1         3
6    0         4
6    0         5
7    1         1
7    1         2
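A minimal sketch of a two-step approach to this count: first total the visits with the behavior for each participant, then tabulate those totals. It uses the made-up sample data above; the data set names `have` and `per_person` are assumptions, and this is not necessarily what the replies proposed.

```sas
/* Sample data from the post: pid, behavior (1=yes, 0=no), visit */
data have;
  input pid behavior visit @@;
  datalines;
1 0 1  1 1 2  1 1 3  1 0 4  1 0 5
2 0 1  2 0 2  2 0 3  2 0 4
3 1 1  3 1 2  3 1 3
4 1 1  4 0 2  4 0 3  4 0 4  4 0 5
5 1 1  5 1 2  5 1 3  5 1 4  5 1 5
6 0 1  6 1 2  6 1 3  6 0 4  6 0 5
7 1 1  7 1 2
;
run;

/* Step 1: one row per participant with the number of visits reporting the behavior */
proc sql;
  create table per_person as
  select pid, sum(behavior = 1) as behaviorcount
  from have
  group by pid;
quit;

/* Step 2: how many participants fall at each count */
proc freq data=per_person;
  tables behaviorcount;
run;
```

Grouping by participant rather than by visit is what separates this from a visit-by-outcome cross tab. PROC FREQ lists only the counts that occur, so a count that no participant has, such as 4 in this sample, simply does not appear in the table.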

rfarmenta · ‎03-20-2015

I have data from 5 study visits that assessed health behaviors. The same questions were asked at each visit, and I would like to determine how many unique participants reported one behavior. I have the data sets appended in long format, so I can run a proc freq and see that the behavior was reported 110 times across all visits, but I want to know how many of those are unique participants versus the same participants reporting the behavior at each visit. Ideally I would like to know how many participants reported the behavior at all visits, how many at any visit, and how many reported it at 1, 2, 3, and 4 visits. I previously asked a similar question but was not able to figure this out. Any suggestions would be greatly appreciated.

rfarmenta · ‎02-10-2015

Maybe I am running the code incorrectly, but that code actually just gives me the same result as running a proc freq with a cross tab for visit and the outcome. So it essentially just tells me how many reported the behavior at each visit. My original question maybe doesn't make sense, but I want to try to figure out who reported the behavior at only 1 visit, who reported it at 2 visits, and who reported it at all visits. Does that make sense? Thank you for your help. Here is the code I used:

proc sql;
  select distinct(visitnum), sum(pfsy6mo=1) as pfsycount
  from merged_visits2
  group by visitnum;
quit;

rfarmenta · ‎02-10-2015

I have repeated measures data (long format) and my outcome variable is binary 1=yes, 0=no. I am looking at factors associated with the outcome over time. What I want is how many participants reported the outcome at each visit, so some participants could have reported it at 1 visit, some at 2, 3, 4 visits, etc...I am not sure how to have SAS tell me this. Any advice is appreciated.

rfarmenta · ‎12-03-2014

I think I just figured out the problem, actually: I removed the parentheses from the end of today() and the calculations are correct. Now I am not sure what the parentheses actually do, and I probably didn't need them in the first place. Attached is what I get in my log when I run the code above. Thank you for your help.

rfarmenta · ‎12-03-2014

I believe the values and signs are incorrect for some of the calculations but I could be reading the output incorrectly. In the output I posted today, obs 242, time= -49 but the difference between today12m and today is much greater than that. Is the negative value how SAS calculates time spans greater than 1 year? This new variable will be the time covariate in a longitudinal analysis.

rfarmenta · ‎12-03-2014

There are two separate sets of code, one for the 6 month visit and one for the 12 month visit. I only provided sample output from the first set of code; attached here is the output from the second set, but it isn't much different that I can see.

rfarmenta · ‎12-02-2014

I have 3 visits' worth of data and I need to calculate the time elapsed since the last visit. Each of the visits has a date for when the interview was conducted. I am subtracting those dates to come up with the number of days between the interviews; however, it does not seem to be working correctly. My code is below and attached is a screenshot of the output. Suggestions as to how to fix the calculation would be greatly appreciated.

data visit2clean_sn;
  set visit2clean_s;
  time=today6m-today();
run;

data visit3clean_s;
  set visit3clean_s;
  if today6m^=. then time=today12m-today6m;
  else if today6m=. then time=today12m-today();
run;
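A minimal sketch of the date subtraction, assuming (as the follow-up in this thread suggests) that the baseline interview date is stored in a variable named `today`. In SAS, `today()` with parentheses is the function that returns the current date, while `today` without them refers to the variable in the data set. The data set `visits`, its two rows, and the wide one-row-per-participant layout are hypothetical and used only for illustration.

```sas
/* Hypothetical interview dates; today is the baseline date variable */
data visits;
  input id (today today6m today12m) (:mmddyy10.);
  format today today6m today12m mmddyy10.;
  datalines;
1 01/15/2013 07/20/2013 01/18/2014
2 02/01/2013 . 02/10/2014
;
run;

data visits_time;
  set visits;
  /* SAS dates are day counts, so subtraction gives elapsed days */
  time6m = today6m - today;
  /* Days since the previous completed visit: the 6-month date if present, else baseline */
  time12m = today12m - coalesce(today6m, today);
run;
```

With the function call in place of the variable, each difference is taken against the day the program runs, which is consistent with the unexpected negative values described in the thread.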

Re: Summarize data from long dataset

Summarize data from long dataset

Re: Creating new variables from long data

Creating new variables from long data

Re: Calculate percent missing in long dataset

Calculate percent missing in long dataset

Re: How do I calculate time spent in heart rate zones from heart rate ...

Re: Export SAS file to text or CSV

Re: Combine variables

Re: List all variables in SAS

Re: Merge duplicate records in SAS

Read password protected excel file into SAS

Re: Other fields in questionnaire-match like responses

Other fields in questionnaire-match like responses

Re: Create visit variable

Create visit variable

Re: Choose last non-missing observation for multiple variables

Choose last non-missing observation for multiple variables

Re: Unique outcomes across visits

Unique outcomes across visits

Re: Count Behaviors-Repeated Measures

Count Behaviors-Repeated Measures

Re: Create a time variable

Create a time variable