~ Hints & Tips ~
Version March 2001

FRAVIA'S 16 QUICK TIPS

I mostly use altavista and google for my examples, but you'll find an Infoseek form, for proximity searches, [at the bottom] of this page. See elsewhere on my site WHICH search engines or bots you should use for your specific queries. The number of retrieved pages, given in parentheses, will of course vary each time you repeat an example search. Note also how some of the examples given here are quite useful "link factories" per se.

1) USE VARIOUS RESOURCES!
If I had to give only ONE piece of advice, it would be this one. It is even more important than "keeping on track" (see below). Never, never, never overestimate your search tool of choice. EACH search engine, [main], [regional] or [local], has its own quirks and its own blindness patterns ("shadows"). Every time I 're-play' a given search on some specific "free-pages depository" local search engine (à la geocities) I get amazing results... Thus you should NEVER 'stick' to a 'given' search engine 'of choice'. Learn how much they differ and - even more important - understand how much the result set of each given s.e. changes over time! The web is a quicksand, and the search engines' databases AND POLICIES are continuously changing as well. Altavista and ftpsearch, for instance, are now actively 'censoring' results. Try a good search there for MP3, DVD reversing, Napster, Gnutella or Infraseek and you'll quickly see what I mean.


2) KEEP ON TRACK!
Nothing is easier than losing your thread when you are working on the web. The examples used on my site are also links to interesting (I hope) searches / places / startpoints. As you'll soon realize, the examples and links offer you continuous opportunities to leave this site in order to browse to other very promising ones. This is done on purpose: the hyper bastard approach to web page building is - on most sites - to restrict click-away opportunities to a bare minimum. Even when a reference demands a link, new methods hide or reduce the visibility of that same link. Everything in order to keep a visitor 'caged' or 'trapped' in a given site. I do the exact opposite, since you must learn some discipline if you're going to be a good seeker. You leave my site for good while searching for a target? Good riddance. My links will offer you a lot of added knowledge AND will at the same time test your capability to keep on track :-)


3) LOWERCASE
Always enter your search terms in lower case (unless you want to limit your search). Most search engines will then find both upper and lower case occurrences of your searchstring. "How to Search" (18132) is NOT the same as "how to search" (32607).
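
The rule above can be sketched in a few lines of Python. This is only a toy model of the behaviour described here, not the code of any real search engine; the function name `matches_case_rule` and the sample pages are made up for illustration.

```python
def matches_case_rule(query: str, text: str) -> bool:
    """Toy model: an all-lowercase query matches any capitalisation,
    while a query containing capitals must match exactly as typed."""
    if query == query.lower():
        return query in text.lower()
    return query in text


# Invented sample pages, for illustration only.
pages = ["How to Search the web", "how to search the web", "HOW TO SEARCH"]

# Lowercase query: finds every capitalisation.
print([p for p in pages if matches_case_rule("how to search", p)])
# Mixed-case query: finds only the exact capitalisation.
print([p for p in pages if matches_case_rule("How to Search", p)])
```

The lowercase query casts the wider net, which is why the capitalised example retrieves fewer pages.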


4) EXACT SEQUENCE [""]
Enclose terms in double quotation marks if you want to retrieve those exact terms in that exact sequence. This may be very useful in order to find a specific page. Thus "searchengines" will give you (22351) pages with the two terms 'glued' together. Similarly "saerch engine" will retrieve some (11) pages WITH THIS SAME MISSPELLING.


5) NARROW DOWN [ AND | & | + ] and ELIMINATE MERCILESSLY [ AND NOT | ! | - ]
Narrow your searches by linking your search terms with AND or &, or simply use the plus sign [+]. The search engine will then find only those pages that contain all of your search terms. Similarly, exclude pages that are not relevant to your search by preceding the unwanted term with AND NOT or !, or simply use the minus sign [-]. Thus +"search engines" +hints +tips +techniques -tits -sex -"make money" (933) is better than the simpler +"search engines" +hints +tips +techniques (1233).


6) DOWNSIDE OF THE + & - SIGN
With the + sign you may miss related documents that don't have the words you specify as required. For example, the search "searching tips" +searchlores would not include documents that have the words "searching tips", but not searchlores.
With the - sign it's easy to exclude too much. For example, if you were looking for information on "bots scripts" but not in javascript, the search +"bots scripts" -javascript would exclude a document that was all about bots scripts, but that had the sentence "this kind of bot would be impossible in javascript".
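
Both downsides can be reproduced with a toy filter. The sketch below assumes plain substring matching, which is much cruder than a real engine; `matches` and the three sample documents are invented for the occasion.

```python
def matches(doc: str, required: list[str], excluded: list[str]) -> bool:
    """Toy +/- filter: every required term must occur, no excluded term may."""
    text = doc.lower()
    return (all(term in text for term in required)
            and not any(term in text for term in excluded))


# Invented sample documents.
docs = {
    "a": "Searching tips collected on searchlores.",
    "b": "Searching tips from somewhere else.",
    "c": "All about bots scripts. This kind of bot would be impossible in javascript.",
}

# "searching tips" +searchlores : misses "b", although it is clearly related.
print([k for k, d in docs.items() if matches(d, ["searching tips", "searchlores"], [])])
# +"bots scripts" -javascript : drops "c", although it is all about bots scripts.
print([k for k, d in docs.items() if matches(d, ["bots scripts"], ["javascript"])])
```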


7) DOWNSIDE OF THE BOOLEAN operators
It's often difficult to specify exactly what you want to include or exclude. You can also get unexpected results if you are not careful about your use of operators and parentheses. For example, the search seeking OR searching AND finding is the same as the search seeking OR (searching AND finding). Both queries will find documents that contain both searching and finding, together with documents that contain the word seeking. However, the query (seeking OR searching) AND finding is not the same. It will find documents containing the word finding and, in the same document, either seeking or searching. Be careful with the boolean operators!
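
Python's own `and` / `or` happen to follow the same precedence described above (AND binds tighter than OR), so the difference is easy to check. This is only an illustration of the grouping, assuming a made-up one-line document; it says nothing about how a given engine parses its queries.

```python
def has(doc: str, word: str) -> bool:
    """True if the word occurs in the document (naive whitespace split)."""
    return word in doc.lower().split()


doc = "seeking without success"  # invented sample document

seeking = has(doc, "seeking")      # True
searching = has(doc, "searching")  # False
finding = has(doc, "finding")      # False

unparenthesised = seeking or searching and finding   # AND is applied first
explicit_same = seeking or (searching and finding)   # same grouping, spelled out
different = (seeking or searching) and finding       # finding is now mandatory

print(unparenthesised, explicit_same, different)  # True True False
```

The document is retrieved by the first two queries only because it contains seeking; the third query rejects it because finding is missing.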


8) "PECULIAR" strings
You should always strive to use differentiating keywords when searching the web. Words that are commonly used will not help you much, and extremely common words like articles and prepositions are so worthless that they are completely ignored. Try to use words which underline the peculiarity of your target. Even common words, though, can be very effective when combined with boolean qualifiers. You must identify the main concepts in your topic and determine any synonyms, alternate spellings, or variant word forms for those concepts. Remember that the more "peculiar" a word, the more useful it will be for sharpening your search.
+title:"search strateg*" +hints +tips
In this case we put the "search strateg*" string (which already has an elevated PEC) under the title: keyword.


9) SPECIAL KEYWORDS
Note the use of a keyword in the previous example. Here is a short list of the main keywords (for altavista):


10) ASTERISK [*]
Note also the use of the asterisk [*] in the previous example: it MUST follow at least 3 characters, it stands for up to 5 further characters, and it can also be used as an element of a phrase.
For Altavista:
  1. Asterisk (*): After 3 specified characters, will search for matches in up to 5 trailing letters.
  2. Question Mark (?): After 3 specified characters, will match exactly one more character.
  3. Double Asterisk (**): More flexible, as it will search for matches for an unlimited number of trailing characters.
You also have the ability to use the wildcards interchangeably and more than once in the same search string.
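
The three wildcard rules above can be approximated with regular expressions. What follows is a rough sketch, not Altavista's implementation: the helper name `wildcard_to_regex` is hypothetical, and treating "letters" as a-z and "characters" as `\w` is my own simplification.

```python
import re

# Assumed translation of each wildcard, following the three rules listed above.
WILDCARDS = {
    "**": r"\w*",        # unlimited number of trailing characters
    "*": r"[a-z]{0,5}",  # up to 5 trailing letters
    "?": r"\w",          # exactly one more character
}


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Turn a search term with *, ? or ** into a compiled regex."""
    parts = re.split(r"(\*\*|\*|\?)", pattern)
    if len(parts) > 1 and len(parts[0]) < 3:
        raise ValueError("a wildcard must follow at least 3 characters")
    body = "".join(WILDCARDS.get(part, re.escape(part)) for part in parts)
    return re.compile(rf"\b{body}\b", re.IGNORECASE)


single = wildcard_to_regex("strateg*")
double = wildcard_to_regex("strateg**")
for word in ("strategy", "strategies", "strategically"):
    print(word, bool(single.fullmatch(word)), bool(double.fullmatch(word)))
```

Under these assumptions strateg* reaches strategy and strategies but not strategically (six trailing letters), while strateg** reaches all three.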


11) ARCHIVE
You should archive your useful queries and repeat them over time. Any query whose result URL contains the "cgi-bin" snippet can be saved and used again later. Since the results of all queries VARY OVER TIME (when traffic is particularly heavy the search engines "cut" the results), you would be well advised to repeat your important queries again and again.
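
One possible way to keep such an archive is a small dated log per query URL. The sketch below is only a suggestion: `record_run` is a hypothetical helper, and the URL, the dates and the hit counts are invented placeholders, not real measurements.

```python
import json
from datetime import date


def record_run(archive: dict, query_url: str, hits: int, when: date) -> None:
    """Append one dated observation for a saved query URL."""
    archive.setdefault(query_url, []).append(
        {"date": when.isoformat(), "hits": hits}
    )


archive = {}
# Hypothetical saved query URL and made-up figures, for illustration only.
url = "http://www.example.com/cgi-bin/query?q=%22how+to+search%22"
record_run(archive, url, 100, date(2001, 3, 1))
record_run(archive, url, 80, date(2001, 4, 1))

# json.dumps gives a text form that can be written to a file and reloaded later.
print(json.dumps(archive, indent=2))
```

Comparing the dated entries for the same URL shows at a glance how much a result set has drifted between two runs.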


12) STOP WORDS
Stop words are words such as "and", "the" and "or" which search engines exclude from their searches to make them more effective. These terms are excluded because they are either extremely common or used by the search engine itself for performing more specialized searches. Just think about how many documents on the Web contain the word "the" and you'll understand how important a good stop words list is for all search engines.
If you really do want to search for one of these terms, there is an easy way to work around stop words. By bracketing words in quotation marks, search engines will look for every word inside the quotes, in the sequence you specify. Thus, if you wanted to look for sites with the words search the web you would use the searchstring "search the web".
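
The effect of the quotation marks can be sketched as follows. The stop list here contains just the three words named above and `effective_terms` is a made-up helper; real engines use much longer lists and their own parsing.

```python
import re

# Tiny illustrative stop list (only the three words mentioned in the text).
STOP_WORDS = {"and", "the", "or"}


def effective_terms(query: str) -> list[str]:
    """Return the terms a stop-word-filtering engine would actually look for."""
    terms = []
    # Each match is either a "quoted phrase" or a single bare word.
    for phrase, word in re.findall(r'"([^"]+)"|(\S+)', query):
        if phrase:
            terms.append(phrase)   # quoted phrases are kept whole
        elif word.lower() not in STOP_WORDS:
            terms.append(word)     # bare stop words are silently dropped
    return terms


print(effective_terms('search the web'))    # ['search', 'web']
print(effective_terms('"search the web"'))  # ['search the web']
```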


13) SNOOPING BOLDLY AROUND -1
As you'll learn elsewhere on this site, there are many methods to access some 'non public' portions of the web.
A quick tip is to look for a file called ROBOTS.TXT in the main directory of your target site, typing the URL by hand with the following pattern:
http://www.targetsite.com/robots.txt
This file is used to tell search engines which directories and files they should not index on a specific site. Thus anything that has been put inside a 'robots.txt' file will not be found by your searchqueries. However, once you have seen the names, you can still type them directly into your browser in order to access the various subdirectories and pages.
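
Once you have the text of such a file, pulling out the excluded paths is a few lines of work. The sketch below runs on an invented sample (the paths in `SAMPLE_ROBOTS_TXT` are made up, and `disallowed_paths` is a hypothetical helper); it reads only the Disallow lines and ignores which User-agent they belong to.

```python
# Invented sample; a real file would be fetched from the target site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /drafts/index.htm   # not ready yet
Allow: /public/
"""


def disallowed_paths(robots_txt: str) -> list[str]:
    """Collect the path of every non-empty Disallow line."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()   # strip trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() == "disallow" and value.strip():
            paths.append(value.strip())
    return paths


for path in disallowed_paths(SAMPLE_ROBOTS_TXT):
    print("http://www.targetsite.com" + path)
```

The printed URLs are the ones to try by hand in the browser, following the pattern shown above.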


14) SNOOPING BOLDLY AROUND -2
Another good idea may be to index (after having "registered" it) a site you are interested in with a search bot that does not pay too much respect to the robots exclusion rules. For instance atomz... you can try registering [there] a target site you are interested in, or else try your luck on my own site using the form below :-)
[atomz search form for searchlores.org]

Now check the difference comparing the results you got with atomz with those you'll get using my own namazu searchengine:
[search @ fravia]



15) DOWNLOADING FILES FROM BUSY SERVERS
If you are trying to download some (ahem) popular files, you are probably competing with many other people for access. If you have the option, pick a server in a country where it is very early in the morning. Alternatively, schedule the download so that it runs when it is early in the morning IN THE STATES or in EUROPE (GMT 05.00 or GMT 12.00) or, MUCH MUCH better, use an automatic email downloader like downloadslave instead (see the accmail section) and spare yourself the hassle :-)


16) PROXIMITY SEARCHES... HIT PAYLOAD EVERY TIME YOU SEARCH!
Real ~S~eekers use proximity operators quite a lot (for obvious - ahem - reasons) as you'll learn in the advanced sections of my site.
Altavista uses the NEAR command in order to select keywords within 10 words of each other, useful but quite limited.
When you seriously work using proximity searches THERE IS ONLY ONE SEARCH ENGINE FOR YOU: Infoseek, which will allow you to choose any of the following options... or to combine them... :-)
ADJ, ADJ/#, OADJ, OADJ/#, NEAR, NEAR/#, ONEAR, ONEAR/#, FAR, FAR/#, OFAR, OFAR/#
  • ADJ (adjacent words in any order)
  • ADJ/# (# number of words apart - exact: no more, no less)
  • OADJ (adjacent words in specified order)
  • OADJ/# (# number of words apart - exact - in specified order)
  • NEAR (within 25 words)
  • NEAR/# (within # words)
  • ONEAR (within 25 words in specified order)
  • ONEAR/# (within # words in specified order)
  • FAR (more than 25 words from each other in at least ONE instance)
  • FAR/# (more than # words from each other in at least ONE instance)
  • OFAR (more than 25 words from each other in at least ONE instance in specified order)
  • OFAR/# (more than # words from each other in at least ONE instance in specified order)
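
To get a feeling for what these operators select, here is a small sketch of NEAR, ONEAR, ADJ and FAR over a plain word list. It is my own approximation, not Infoseek's code: distance is taken as the difference between word positions, punctuation is not handled, and the function names and the sample sentence are invented.

```python
def positions(words: list[str], term: str) -> list[int]:
    """Indexes at which the term occurs in the word list."""
    return [i for i, w in enumerate(words) if w == term]


def gaps(text: str, a: str, b: str) -> list[int]:
    """Signed distances (position of b minus position of a) for every pair."""
    words = text.lower().split()
    return [j - i for i in positions(words, a) for j in positions(words, b)]


def near(text: str, a: str, b: str, n: int = 25, ordered: bool = False) -> bool:
    """NEAR/# (or ONEAR/# when ordered): a and b within n words of each other."""
    return any(0 < (g if ordered else abs(g)) <= n for g in gaps(text, a, b))


def adj(text: str, a: str, b: str, ordered: bool = False) -> bool:
    """ADJ (or OADJ when ordered): a and b directly next to each other."""
    return near(text, a, b, n=1, ordered=ordered)


def far(text: str, a: str, b: str, n: int = 25) -> bool:
    """FAR/#: more than n words apart in at least one instance."""
    return any(abs(g) > n for g in gaps(text, a, b))


text = "search engines hide many things that seekers will find with patience"

print(near(text, "search", "seekers", n=10))                # True  (6 words apart)
print(near(text, "search", "seekers", n=3))                 # False
print(near(text, "seekers", "search", n=10, ordered=True))  # False (wrong order)
print(adj(text, "engines", "search"))                       # True  (any order)
print(adj(text, "engines", "search", ordered=True))         # False
print(far(text, "search", "patience", n=5))                 # True  (10 words apart)
```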

INFOSEEK
Only 500 results viewable! ~ Note the "Search within results" option at the bottom.


[Infoseek search form]


(c) III Millennium: [fravia+], all rights reserved and reversed