The Basics
- Type "STOPWORD" as a concept to EXCLUDE records that contain the words in the STOPWORD concept.
Example: A80 historical society rule: CONCEPT1=history, CONCEPT2 (STOPWORD)=museum. Result: history museums are excluded from A80.
- Words are NOT case-sensitive.
- Double-slash (//) after a word indicates that NO wildcards are permitted.
Thus, while "ART" includes "artistic", "artists", "arts", etc., "ART//" is limited to "art" only.
- Begin a word with an exclamation mark (!) if the word is allowed to appear with a prefix attached:
E.g., '!enviro' retrieves 'Pennenvironment'.
- A single space is automatically added to the start of every word so that it cannot match in the middle of another word (e.g., 'art' in 'smart'). Any extra spaces at the start of a word are trimmed off before the single space is added.
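The matching conventions above can be sketched in a few lines of Python. This is only an illustration, not the actual rule engine: the function names (compile_word, matches, rule_applies) are hypothetical, the automatically added leading space is approximated with a "start of name or after whitespace" check, and "//" is assumed to mean that no further letters or digits may follow the word.

```python
import re
from typing import List

def compile_word(word: str) -> re.Pattern:
    """Translate one rule word into a case-insensitive regex (hypothetical helper)."""
    term = word.lstrip()                 # extra leading spaces are trimmed off
    allow_prefix = term.startswith("!")  # '!' = the word may have a prefix
    if allow_prefix:
        term = term[1:]
    exact = term.endswith("//")          # '//' = no wildcard after the word
    if exact:
        term = term[:-2]
    pattern = re.escape(term)
    if not allow_prefix:
        # Stands in for the automatically added leading space:
        # the word must start the name or follow whitespace.
        pattern = r"(?<!\S)" + pattern
    if exact:
        # Assumption: no further letters or digits may follow the word.
        pattern += r"(?!\w)"
    return re.compile(pattern, re.IGNORECASE)

def matches(word: str, name: str) -> bool:
    """True if the rule word is found in the organization name."""
    return compile_word(word).search(name) is not None

def rule_applies(name: str, concept: List[str], stopwords: List[str]) -> bool:
    """A record is a hit if it matches the concept and no STOPWORD word."""
    hit = any(matches(w, name) for w in concept)
    excluded = any(matches(w, name) for w in stopwords)
    return hit and not excluded
```

For instance, under these assumptions matches("ART//", "Arts Council") is false while matches("ART", "Arts Council") is true, and rule_applies("Ohio History Museum", ["history"], ["museum"]) is false because the STOPWORD concept excludes the record.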
Adding "Residual"/"Desperation" Rules
When should I add a new word to an existing rule, and when should I create a new rule?
- In general, we are creating 'residual/desperation' rules for most categories, which will be untested (at least initially) and have ratings between 5 and 9, and use only ONE concept (of course, there may be an unlimited number of words within the concept).
- The advantage of creating a new rule is that it can be tested precisely. If you add one new word to a lengthy list of words in an existing rule, you may make a mistake but not have enough hits in your testing to see the problem. This is especially dangerous if the rule has a high confidence level and you add a word that generates many false positives.
- The disadvantages of adding a new rule are twofold:
  - Each rule takes approximately 30 seconds to run.
  - With the proliferation of rules, catching overlapping and duplicative rules becomes more difficult.
- Bottom line: When in doubt, create a new rule!
If untested and suspect:
- Give a rating (number correct) between 5 and 9
- Type "untested" in comments
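As a rough sketch, the conventions for an untested residual rule (a single concept, a rating between 5 and 9, and "untested" in the comments) could be captured in a small record type. The class and field names below are hypothetical and do not reflect how rules are actually stored.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ResidualRule:
    """Hypothetical record for an untested residual/desperation rule."""
    code: str                   # e.g., "A80"
    words: List[str]            # the words of the rule's ONE concept
    rating: int                 # number correct; 5 to 9 for untested rules
    comments: str = "untested"  # flag the rule as untested by default

    def __post_init__(self) -> None:
        if not self.words:
            raise ValueError("a rule needs at least one word in its concept")
        if not 5 <= self.rating <= 9:
            raise ValueError("untested rules take a rating between 5 and 9")
```

Creating ResidualRule("A80", ["history"], 7) yields a rule whose comments field already reads "untested", while a rating outside the 5 to 9 range is rejected.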
If a rule overlaps codes and nothing more can be done for the rule:
- Assign multiple codes separated by commas, with the best or most likely code first. E.g., "The Latino Center" could, with equal plausibility, fall into A23 (cultural/ethnic awareness) or P84 (ethnic or immigrant center providing broad range of services