The sample data set I include is representative of what I have (there are other variables in it, but they aren't used for this analysis). The variable 'check' is the list I want to count (~650,000 observations) and the variables a1a, a2a, etc. are the search terms (~500 variations). I want to know how many times each of those terms occurs over the observations in 'check'. What does the list of search terms you want to find look like? Is it essentially a single variable dataset with 500 observations? Yes. I then transpose and merge it to get the example data set I provided. What does the data where you are going to search for the terms look like? Is it also a tall table with a single variable to try to locate the values within? Or something else? If it has multiple variables that need to be searched, why? What is the meaning of the multiple variables? The data I am searching is also tall, but has a lot of related and unrelated variables. They could be dropped to accomplish this, in which case it could be a single variable set as well. There are 12 variables I need to search; each one corresponds to a ranked choice and the search terms are coded values for those choices. Since they are ranked, I will look at them separately, so I can just search over the single column (as in the example). What is the criteria for finding the search term? Is it anywhere in the string or does it need to match a word in the string? Does CASE of the letter matter? The strings must match, case does not matter (but the set is clean and that should not be an issue). There is very little chance of strings that don't match one of the search terms showing up. What is the output you want? Is it a count of the number of observations matching each string? Can an single observation match multiple strings? If the data to be searched has multiple variables what does it mean if the same term appears in more than one of the variables? Does that count as one observation found, or multiple hits found? The sample output is what I'm looking for - just a simple count of how many times each search term shows up. A single observation is only able to match one string (by the nature of the search terms - they are distinct and match the observations exactly), and thus can only be counted once. The expected result above was just the last line, but doesn't correctly represent the code. Here is what the output set looks like: check a1a a2a a3a a1b a2b x1 x2 x3 x4 x5
a1a 1 . . . . 1 . . . .
a1a 1 . . . . 2 . . . .
a2a . 1 . . . 2 1 . . .
a1b . . . 1 . 2 1 . 1 .
a1b . . . 1 . 2 1 . 2 . I can't provide the actual data (NDA... also the 500 variable width would make it difficult), but this sample data is very close to the actual data.As single variable lists, they look more like this: check progs
a1a a1a
a1a a2a
a2a a3a
a1b a1b
a1b a2b ... ... Thanks for taking a look!
... View more