About hellorc

hellorc · ‎10-03-2022

Closed

hellorc · ‎09-29-2022

Hello Ballardw, thank you so much for your response! Apparently I did not think this through deeply enough, I apologize. After confirming with my teammate: when there are multiple rows for the same subject, at least the city is different on each row. The state can repeat or change back and forth, but city is ALWAYS unique within an id (so for the same id, there will never be two rows with the same state and city). So if there are 4 rows for a subject, the 4 rows have 4 different cities.

Using the example in my question, and considering "state" as the variable of interest, I am hoping the output would in general look something like:

id: 1
unique state: A, B; changed 1 time
unique cities: A, B, C, D; changed 3 times
unique salary: 100, 101, 102; changed 2 times
...

and similarly for the other ids. The example is not perfect, as I realized. Consider the state in your example with id=4 and 6 rows; the output would be:

id: 4
unique state: C, D; changed 5 times
salary: 120, 110, 99, 100, 88, 90; changed 5 times (hopefully the difference for each change can also be computed)

There is also a date variable, which is what the data is originally sorted by; we will stick with this order. I am really stuck on how to generalize the logic, since the dataset contains about 200 subjects. Any advice is welcome, thank you in advance!

hellorc · ‎09-29-2022

Hello SAS community, I have a dataset which looks like:

data have;
  input ID state $ city $ salary @@;
  datalines;
1 A A 100
1 A B 100
1 A C 101
1 B D 102
2 B E 99
2 B F 99
2 B G 99
3 A C 88
4 C H 120
4 D J 110
;
run;

For each id, I would like to compute all unique values of a variable and how many times the variable changes across all rows; if possible, I would also need to compute how much each change is when the variable is numeric.

Let me elaborate using id=1 as an example. From the 4 observations for id=1, the unique salary values are 100, 101, and 102, so salary changed 2 times across all rows for id=1, and the changes are 1 and 2. Using id=4 and city as another example, the output would be H and J, and city changed 1 time.

I am still new to SAS and I am really having difficulty thinking of a logic to work out those outputs for each id. I tried using first.id and last.id to check, but that doesn't include the 'middle' observations. Can this be done via a data step, or is SQL required? Might someone be willing to provide some assistance? Thank you
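One possible way to approach this, as a minimal sketch rather than a definitive answer: count distinct values per id with PROC SQL, and count row-to-row changes in a data step with LAG. It assumes HAVE is already in the order the changes should be counted in (grouped by ID). The names UNIQUES, CHANGES, N_STATES, PREV_SALARY and so on are made up for illustration.

```sas
/* Distinct values per ID (names of the output columns are illustrative) */
proc sql;
   create table uniques as
   select id,
          count(distinct state)  as n_states,
          count(distinct city)   as n_cities,
          count(distinct salary) as n_salaries
   from have
   group by id;
quit;

/* Row-to-row changes per ID, assuming HAVE is grouped by ID
   and already in the intended order within each ID */
data changes;
   set have;
   by id;
   /* LAG must run on every row, so call it outside any IF block */
   prev_salary = lag(salary);
   prev_state  = lag(state);
   if first.id then do;
      salary_changes = 0;
      state_changes  = 0;
   end;
   else do;
      /* the sum statement (var + 1) retains the counter across rows */
      if salary ne prev_salary then salary_changes + 1;
      if state  ne prev_state  then state_changes  + 1;
   end;
   if last.id;     /* keep one summary row per ID */
   keep id salary_changes state_changes;
run;
```

To also see the size of each numeric change, one option would be to output every row where salary differs from PREV_SALARY together with the difference salary - prev_salary, instead of keeping only the last row per ID.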

hellorc · ‎09-22-2022

Closed

hellorc · ‎09-21-2022

Hello SAS community, I have a quick question. Consider the following simple data:

data have;
  input id city $ week money @@;
  datalines;
1 A 1 100
1 A 2 105
1 A 5 108
1 B 5 206
1 B 7 99
1 C 8 101
1 C 10 110
2 B 5 202
2 C 5 100
2 D 8 95
2 E 8 98
;
run;

I would like the observations for each ID to have unique weeks. If a subject has duplicate 'week' values, the row with the larger 'money' value should be kept. So:

data want;
  input id city $ week money @@;
  datalines;
1 A 1 100
1 A 2 105
1 B 5 206
1 B 7 99
1 C 8 101
1 C 10 110
2 B 5 202
2 E 8 98
;
run;

Can this be done via some data steps, or is SQL a must? Might someone be willing to assist? Thank you in advance!
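A minimal sketch of one data-step route, assuming ties on money within the same id and week can be broken arbitrarily: sort so the largest money comes first within each id/week, then keep the first row of each group. SORTED is a hypothetical intermediate data set name.

```sas
/* Put the largest money first within each id/week */
proc sort data=have out=sorted;
   by id week descending money;
run;

/* Keep only the first row of each id/week group */
data want;
   set sorted;
   by id week;
   if first.week;
run;
```

The BY statement in the data step lists only id and week, so FIRST.WEEK marks the start of each id/week combination regardless of the descending sort on money.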

hellorc · ‎09-21-2022

Thank you both, yabwon and Ksharp!! Both worked!

hellorc · ‎09-20-2022

Hello yabwon, thank you so much for your reply. I have an extra question, if you don't mind. Let's say the final week is 20 for everyone, but the last.id row for subject 1 is week=10, and for subject 2 it is week=15, as in your code. How would you adjust the code so that subject 1 would also get the rows:

1 C 10
1 C 11
...
1 C 20
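One way this could be sketched, independent of yabwon's actual code (which is not shown here): after the gaps are filled, repeat the last row of each id up to a common final week. The data set FILLED, its made-up rows, and the macro variable FINAL_WEEK are all assumptions for illustration.

```sas
%let final_week = 20;   /* assumed common final week */

/* Made-up stand-in for the already gap-filled data, sorted by id and week */
data filled;
   input id city $ week;
   datalines;
1 C 9
1 C 10
2 D 15
;
run;

data extended;
   set filled;
   by id;
   output;                               /* write the original row */
   if last.id then do week = week + 1 to &final_week;
      output;                            /* repeat the last city until the final week */
   end;
run;
```

If an id already ends at the final week, the loop range is empty and no extra rows are written.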

hellorc · ‎09-20-2022

Hello SAS community, I have a quick question related to data structure or formatting. My current data looks like:

data have;
  input id city $ week;
  datalines;
1 A 3
1 B 6
1 B 7
1 C 10
;
run;

But I would actually need to fill in the weeks after the first week the person shows up in a city, i.e.:

data want;
  input id city $ week;
  datalines;
1 A 3
1 A 4
1 A 5
1 B 6
1 B 7
1 B 8
1 B 9
1 C 10
;
run;

The week gaps would be filled with the former city value until a city change. There are about 1000 subjects in the data. Might someone be willing to provide assistance on how to code it? Thank you
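A minimal sketch of one possible approach, assuming HAVE is sorted by id and week: merge the data with a copy of itself shifted up by one row (a "look-ahead" merge), then write out the missing weeks between the current row and the next row of the same id. The names NEXT_ID and NEXT_WEEK are made up for illustration.

```sas
data want;
   /* No BY statement: row i of HAVE is paired with row i+1 of the second copy */
   merge have
         have(firstobs=2 keep=id week rename=(id=next_id week=next_week));
   output;                                   /* the original row */
   /* Fill the gap only when the next row belongs to the same id */
   if id = next_id then do week = week + 1 to next_week - 1;
      output;                                /* same city, intermediate week */
   end;
   drop next_id next_week;
run;
```

On the last row of the data the look-ahead values are missing, so the id comparison fails and nothing is filled after it.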

hellorc · ‎09-17-2022

Thank you mkeintz! I will definitely try it out as soon as I am in the office (where SAS is available) Monday! Have a great weekend!

hellorc · ‎09-17-2022

closed

hellorc · ‎08-30-2022

Thank you! It helped, I just figured out the issue!

hellorc · ‎08-30-2022

Thank you for your reply. Nope, after checking in STATA, there are ~600 obs with week=2, but only 24 are omitted in the model. Somehow there is a computed odds ratio for week=2 vs week=0 as 2.08 in STATA, while in SAS it's 10874311.

hellorc · ‎08-30-2022

Hello, might someone please help me with a problem I encountered when attempting to convert a STATA logistic regression to SAS? To keep the question simple, consider a logistic regression with these variables:

response: binary 0/1
week: integer from 0 to 10
gender: M/F
age: continuous

In STATA, the code would be:

logit response i.week i.gender age, or

From the data, I know that for some weeks, let's say week=2, there are 0 observations with response=1. Running the above code in STATA results in a warning like the following:

2.week != 0 predicts failure perfectly
2.week omitted and 24 obs not used

STATA automatically handles the "no event" or very-rare-event issue and computes a meaningful odds ratio. If we fit the same model in SAS using the code:

proc genmod;
  class week(ref='0') gender;
  model response=week gender age / link=logit dist=bin;
  lsmeans week gender / ilink diff exp;
run;

we get the warning "Hessian matrix is not positive definite" due to the "no event" or very-rare-event issue, and the resulting odds ratios for week are ridiculously large numbers. Is there any way to re-create the omission of the 24 obs in SAS, as STATA does? I would really like to obtain the same outputs to help me understand SAS. Thank you.
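A hedged sketch of how the omission might be imitated, not a confirmed match for STATA's behavior: first cross-tabulate week by response to see which week levels really have no events, then leave those rows out with a WHERE statement before fitting. MYDATA is a placeholder data set name, and treating week=2 as the only empty level is an assumption taken from the example above.

```sas
/* Step 1: look for week levels with no response=1 rows */
proc freq data=mydata;              /* mydata is a placeholder name */
   tables week*response / norow nocol nopercent;
run;

/* Step 2 (assumption: week=2 is the only level with no events):
   exclude those rows before fitting the same model */
proc genmod data=mydata;
   where week ne 2;
   class week(ref='0') gender;
   model response = week gender age / link=logit dist=bin;
   lsmeans week gender / ilink diff exp;
run;
```

The estimates may still differ from STATA's if the two programs are modeling different response levels or dropping different rows, so comparing the number of observations used in each fit is a useful first check.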

hellorc · ‎08-05-2022

Hello SAS community, I am relatively new to SAS. I am currently taking a survival analysis course, and I am really stuck on reformatting a dataset into 'counting process' (start, stop) form for fitting a Cox regression model with a time-varying covariate (https://support.sas.com/resources/papers/proceedings12/168-2012.pdf). To keep the question short, below please find a much simplified version of the data I have now, and the format I want it to be in:

data have;
  input id site $ a1-a5 outcome $ @@;
  datalines;
1 a 1 1 0 1 1 y
1 b 0 0 1 0 0 n
2 b 1 1 0 0 0 n
2 c 0 0 1 0 1 n
2 d 0 0 0 1 0 n
3 a 1 0 0 1 0 y
;
run;

id: unique identification for each person
site: working site
a1-a5: attendance for weeks 1-5 (1 = person works at the site in that week, 0 = otherwise)
outcome: binary outcome of interest (y/n; if the person has y, it means the event happens in his/her last attendance week)

There are actual dates for weeks 1-5, for example:

week 1: 2022/07/31 - 2022/08/06
week 2: 2022/08/07 - 2022/08/13
...
week 5: 2022/08/28 - 2022/09/03

Note that not everyone works every week, and some people work at different sites.

I am not sure if it is necessary, but I would like to introduce a week indicator variable named 'week', as well as start and stop variables, so that the data format becomes:

data want;
  input id site $ start :$10. stop :$10. week outcome $ @@;
  datalines;
1 a 2022/07/31 2022/08/06 1 n
1 a 2022/08/07 2022/08/13 2 n
1 b 2022/08/14 2022/08/20 3 n
1 a 2022/08/21 2022/08/27 4 n
1 a 2022/08/28 2022/09/03 5 y
2 b 2022/07/31 2022/08/06 1 n
2 b 2022/08/07 2022/08/13 2 n
2 c 2022/08/14 2022/08/20 3 n
2 d 2022/08/21 2022/08/27 4 n
2 c 2022/08/28 2022/09/03 5 n
3 a 2022/07/31 2022/08/06 1 n
3 a 2022/08/21 2022/08/27 4 y
;
run;

My concern is that some people stay at the same working site throughout the entire time, some switch sites, some switch sites multiple times, some switch back and forth, and some don't show up in some weeks, so I just could not find a proper logic that could apply to the whole data (over 1000 people). Might someone be willing to provide guidance? Thank you, rc
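A minimal sketch of one way to get from HAVE to the layout above, under two assumptions: the weeks are consecutive 7-day blocks starting on 2022/07/31 (as in the listed dates), and a person counts as having the event if any of their rows carries outcome='y'. The names LONG, ATT and ANY_EVENT are made up for illustration, and start/stop are stored as SAS dates rather than text.

```sas
/* Step 1: one row per id/site/week actually worked */
data long;
   set have;
   array att{5} a1-a5;
   do week = 1 to 5;
      if att{week} = 1 then do;
         start = '31JUL2022'd + 7*(week - 1);   /* assumed week 1 start date */
         stop  = start + 6;
         output;
      end;
   end;
   format start stop yymmdds10.;                /* displays as 2022/07/31 */
   drop a1-a5;
run;

proc sort data=long;
   by id week;
run;

/* Step 2: flag the event only on each person's last attendance week */
data want;
   /* first pass over the id: does any row carry outcome='y'? */
   do until (last.id);
      set long;
      by id;
      if outcome = 'y' then any_event = 1;
   end;
   /* second pass over the same id: write the rows */
   do until (last.id);
      set long;
      by id;
      if last.id and any_event then outcome = 'y';
      else outcome = 'n';
      output;
   end;
   drop any_event;
run;
```

Because ANY_EVENT is not read from a data set, it is reset to missing at the top of each data step iteration, and each iteration here processes exactly one id.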

Online Status	Offline
Date Last Visited	‎05-22-2024 08:33 PM

Converting wide to long/weekly format for survival data

Adding rows to long format data

Proc Glimmix Poisson regression model with random intercept

Re: Longitudinal analysis proc mixed

Longitudinal analysis proc mixed

Closed

Closed

data analysis

Just a short question about removing rows

About converting data

Re: Converting wide to long/weekly format for survival data

Re: Adding rows to long format data

Re: Just a short question about removing rows

Re: About converting data from one row to multiple rows for Survival a...

Re: Short question about filling down column with a data value

Filling out missing values in different scenarios

Re: Check uniqueness of a variable in a complicated dataset

Check uniqueness of a variable in a complicated dataset

SAS data - fill missing value with existing observation for same ID

SAS select or keep only one of duplicate observations

Re: data - filling gaps

Re: data - filling gaps

data - filling gaps

Re: Data reformatting question

Data reformatting question

Re: STATA logistic regression to SAS problem

Re: STATA logistic regression to SAS problem

Re: STATA logistic regression to SAS problem

STATA logistic regression to SAS problem

Reformat data problem for Cox model