Astounding
PROC Star

Ah, now the plot thickens.  The code looks fine, so the number one suspect is the data.  What looks like blanks for character values may not actually be blanks.  Take a few variables that appear to contain blanks, and print them in hex form.  For example, if a variable has a length of $ 5, print it (for just one observation):

 

put varname varname $hex10.;

 

Dollars to doughnuts there will be some strange characters in there (hex nulls, carriage returns .... we'll find out).
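To make that check concrete, here is a minimal sketch of the full step. The dataset name HAVE is an assumption for illustration, and VARNAME stands in for whichever $5 variable you pick:

```sas
/* Hypothetical names: HAVE is your dataset, VARNAME a character variable of length 5 */
data _null_;
  set have (obs=1);                /* read just the first observation */
  /* write the value as stored, then the same value as 10 hex digits to the log */
  put varname= varname= $hex10.;
run;
```

On an ASCII system a genuine blank shows up as 20 for each byte. Anything else in a value that looks empty (00 for a null, 09 for a tab, 0D for a carriage return) means the value is not really blank.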

ballardw
Super User

@Dbynoe wrote:

Thank you so much for the response! I really do appreciate it. The dataset I'm working with has more than 4000 variables. Do you perhaps know of a more time-efficient way of achieving the same goal?


That many variables is often a symptom of a poor data structure or process design. If by any chance you have a process that keeps adding new variables every week, month, or other period, then the process is flawed and should be reconsidered. It is much easier to work with data that keeps the same variables throughout and adds one variable to indicate the processing period, date, or source, and then to use BY group processing. If you need a REPORT for people to read (4000 columns, really?) then report procedures such as Proc Report and Proc Tabulate are very good at creating such things.
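As a rough sketch of what that looks like, assume a dataset LONG with a PERIOD variable and a numeric variable SCORE (all three names are made up for illustration):

```sas
/* Hypothetical names: LONG, PERIOD, and SCORE are placeholders */
proc sort data=long out=long_sorted;
  by period;                       /* BY-group processing needs sorted (or indexed) data */
run;

proc means data=long_sorted n mean;
  by period;                       /* one set of statistics per period */
  var score;
run;
```

The same variables are reused for every period, so a new week or month only adds rows and no code has to change.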

Dbynoe
Fluorite | Level 6

I didn't design the dataset, I just received it from a data coordination center my boss contracted. I'm trying to create 4 subsets of data from the larger dataset that are more accessible. 

 

This particular dataset is a Social Network Analysis, and provides information on many subjects and up to 48 alters in each subject's network. For example, r1 is the question: "is this person male or female". There are 48 potential responses per subject. The person who created the dataset provided one row per subject, so they created r1_1-r1_48 as variables that indicate the alter's sex, with each variable linked to a unique alter. There are a lot of questions, so that's one reason why there are so many variables in this dataset. I rearranged the dataset so that r1 now holds all the data from r1_1-r1_48, with one row per alter...hence the repeated ID measures. It is a very messy dataset, so it's definitely forced me to learn a lot more about SAS than I expected!
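For anyone following along, the rearrangement I describe can be sketched roughly like this. WIDE, LONG, ID, and ALTER are placeholder names, not the real ones, and the sketch assumes r1_1-r1_48 are numeric (a character set would need a $ and a length in the ARRAY statement):

```sas
/* Hypothetical names: WIDE has one row per subject, ID identifies the subject */
data long;
  set wide;
  array r1_w {48} r1_1-r1_48;      /* the 48 alter-level answers to question r1 */
  do alter = 1 to 48;
    r1 = r1_w{alter};              /* move alter k's answer into a single variable */
    output;                        /* one output row per subject per alter */
  end;
  keep id alter r1;
run;
```

Each subject ends up with 48 rows, which is why ID repeats. Rows for alters that do not exist could be dropped by only writing the row when r1 is not missing.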
