Right. However, it's the data for the same participants over four years. So, based on my understanding, I will have more variables/columns for the same 1400 participants. Please correct me if I am wrong, or suggest to me what you think is efficient.
We cannot tell without seeing how your study is set up.

Normally, if you are asking the same question, the values should be stored in the same variable. If the question is different, it will require a different variable. For example, if you ask their weight in the first wave/year and again in the follow-up years, then you should store that in the same variable and have a separate observation for each wave/year. But if you asked for their change from baseline weight, that would be a different variable, and it would apply only to the follow-up waves and not to the initial wave. So that variable will have a missing value on the observation that represents the wave where it does not apply.
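
As a rough sketch of that first scenario, something like the following would stack two waves. The dataset names (wave1, wave2), the variable names (pid, weight, wt_change) and the values are all made up for illustration, since we have not seen your actual data.

```sas
/* Hypothetical wave datasets; names and values are illustrative only */
data wave1;
  input pid weight;
datalines;
1 70
2 82
;

data wave2;
  input pid weight wt_change;
datalines;
1 72 2
2 80 -2
;

/* Stack the waves: same question -> same variable,
   one observation per participant per wave */
data all_waves;
  set wave1(in=in1) wave2(in=in2);
  if in1 then wave = 1;
  else if in2 then wave = 2;
run;

proc sort data=all_waves;
  by pid wave;
run;
```

Because wt_change does not exist in wave1, SET leaves it missing on the wave 1 observations, which is the behavior described above.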

Makes sense. It seems like I will have to SET then as it matches your first scenario. Well basically, I have four waves, and within those waves, the same variable is calculated 5 to 6 times at different periods of time.

Depending on how it is stored, you might have a series of variables, X1 - X6 for example. Or you might have multiple observations per wave, with the time period as a secondary key variable. But that is probably not how it is currently set up, based on the observation counts you provided before.

You can always convert from one structure to another structure depending on your reporting/analysis needs.
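
For example, assuming the repeated measure is stored as X1 - X6 on one observation per participant per wave, a sketch of converting in both directions could look like this. The names pid, wave, period and value are placeholders, not your real variables.

```sas
/* Hypothetical wide layout: one row per participant per wave,
   with x1-x6 holding the repeated measure (x6 may be missing) */
data wide;
  input pid wave x1-x6;
datalines;
1 1 5 6 7 8 9 10
1 2 6 7 8 9 10 .
;

/* Wide -> long: one row per participant, wave and period */
data long;
  set wide;
  array x[6] x1-x6;
  do period = 1 to 6;
    value = x[period];
    /* skip periods that were not measured in this wave */
    if not missing(value) then output;
  end;
  drop x1-x6;
run;

/* Long -> wide again; the input must be sorted by pid and wave */
proc transpose data=long out=wide_again(drop=_name_) prefix=x;
  by pid wave;
  id period;
  var value;
run;
```

The long layout is usually easier to use with procedures that expect a BY or CLASS variable for the time period, while the wide layout is convenient for row-wise calculations across X1 - X6.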

