BookmarkSubscribeRSS Feed
geneshackman
Pyrite | Level 9

Hi

Say I have a data set with 100 variables. The set has different sections, like demographics, opinions about pets, dietary habits, and so on. When I download the data from a survey monkey type service as a spreadsheet, the downloaded data set doesn't have any short labels, but just the questions themselves (what's the highest level of education you attained, do you live in the eastern region, do you live in the western region) as the header for each column.

I still need to do some data manipulation, combining vars, summing all the yes's, and so on.

When I make variable names, what's the best practice? Or the most common. Would I name everything v1 to v100? But that way, I couldn't look at a var name and tell what the question is. Or use vdemog_ed, vdemog_age, and so on, where each var name is an abbreviated name of the variable. This way I can look at the var name and figure out what question it represents.

Thanks

4 REPLIES 4
PaigeMiller
Diamond | Level 26

When I make variable names, what's the best practice? Or the most common. Would I name everything v1 to v100? But that way, I couldn't look at a var name and tell what the question is

It depends!!

By the way, if you assign a meaningful label to the variables, the v1 to v100 might be the easiest way to go. Or not!

--
Paige Miller
Reeza
Super User

Both of your methods would work but the v1-v100 results in less typing. On the other hand you can probably reference variables without thinking if you list them as vdemog_ed. 


The key is to keep the prefix the same for all types which both of your methods have. That will allow you to use variable lists when referencing the variables i.e. vdemog_: will refer to all variables starting with vdemog. 

I would use v1-v100 with the questions as the labels.  I think SurveyMonkey has some boxes that you need to check that allows you export your data as SPSS or with the question numbers instead that will help to simplify the renaming process. 

ballardw
Super User

If you are stuck with Excel output here is a trick or two that may be helpful.

Copy the row of text labels and paste transpose into a different sheet/workbook.

Add another column with the variable names.  <good idea to save at this step!>

Use Excel formula to create text result of variablename = "label text". The results become the core of label statements for the variables. Mine typically look like =IF(C2="","",A2&"="""&""&C2&"""") . The if is because you MIGHT get some columns without a corresponding heading. In this case column C has the label text and A the variable name.

You can now copy the column of variable names and paste transpose over the line in the data with the questions for use in import.

I usually save as CSV to import as there is more control though you may need to remove some header lines before the data.

TomKari
Onyx | Level 15

Just one word of caution if you go with the v1-v100; if you're ever going to redo this survey, questions may be added and deleted, and your variables will start to look like

v1, v3, v5.1, v5.15, v5.2, v19...

Tom

sas-innovate-2024.png

Join us for SAS Innovate April 16-19 at the Aria in Las Vegas. Bring the team and save big with our group pricing for a limited time only.

Pre-conference courses and tutorials are filling up fast and are always a sellout. Register today to reserve your seat.

 

Register now!

What is Bayesian Analysis?

Learn the difference between classical and Bayesian statistical approaches and see a few PROC examples to perform Bayesian analysis in this video.

Find more tutorials on the SAS Users YouTube channel.

Click image to register for webinarClick image to register for webinar

Classroom Training Available!

Select SAS Training centers are offering in-person courses. View upcoming courses for:

View all other training opportunities.

Discussion stats
  • 4 replies
  • 704 views
  • 9 likes
  • 5 in conversation