Remote-Hosted Services for Public and Private Websites
Customized Search Engines and Auditing for Broken Links
Creation and Hosting of Dynamic Websites

Frequently Asked Questions

Does the Blossom Search output ever contain advertisements?


Can I search with Boolean expressions? What about regular expressions?

Simplicity of usage has been the guiding principle for Blossom Search. Most people don't know Boolean from Bully Inn and would assume "Hello, how are you?" to be as regular an expression as any.

Accordingly, Blossom Search accepts neither Boolean nor pattern operators, but that is not to say that the searches it performs are simple-minded. The default phrase search would be entered on some search engines by inserting AND NEAR between the words in the phrase. Thus a phrase will match a Web page if all the words being searched for are located within about 20 words of one another on the page. The default word search would be entered as word* in some search engines. That is, a search string matches any word that begins with the string. (You can remove the NEAR operator or force whole-word matches, as described in the Search Guide Section "Search Options".)
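The matching rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Blossom's actual implementation: the function names, the tokenizer, and the exact window of 20 words are hypothetical stand-ins for "within about 20 words" and "any word that begins with the string".

```python
import re

WINDOW = 20  # assumed value for "within about 20 words"


def tokenize(text):
    """Split text into lowercase words (a simplistic, assumed tokenizer)."""
    return re.findall(r"\w+", text.lower())


def prefix_positions(tokens, term):
    """Positions of every word that begins with the search term (like word*)."""
    return [i for i, tok in enumerate(tokens) if tok.startswith(term)]


def phrase_matches(page_text, query, window=WINDOW):
    """True if every query term prefix-matches a word inside one window."""
    tokens = tokenize(page_text)
    terms = tokenize(query)
    if not terms:
        return False
    position_lists = [prefix_positions(tokens, t) for t in terms]
    if not all(position_lists):
        return False  # at least one term matches nothing on the page
    # Try each matched position as the left edge of a window.
    starts = sorted({p for plist in position_lists for p in plist})
    for start in starts:
        if all(any(start <= p <= start + window for p in plist)
               for plist in position_lists):
            return True
    return False
```

With this sketch, the query "search engine" matches a page containing "customized search engines", but not a page where "searching" and "engines" are separated by more than the window.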

How often will my site be spidered? Will my server slow down during spidering?

The default update schedule for Site and Affinity Search is once per week; for Enterprise Search it is twice per week. For all services you can request daily updates or trigger an update on demand.

Spider traffic probably won't be noticeable on your server, since the spider requests at most one page every few seconds.

Will the search results page look like the rest of my site?

You have complete control over the search form and the search results page. By supplying HTML before and after the search results you can set the page layout to look like the rest of your site. There are many options for controlling the search output including the use of cascading style sheets.
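As a rough picture of how the "HTML before and after" approach works, the sketch below wraps a list of results between a supplied header and footer. The function name, the result format, and the "search-results" class name are assumptions made for illustration; the real service has its own options for this.

```python
from html import escape


def render_results_page(header_html, footer_html, results):
    """Place a results list between caller-supplied header and footer HTML.

    results is a list of (title, url) pairs; titles and URLs are escaped,
    while the header and footer are inserted as-is because they are your
    own site's markup.
    """
    items = "\n".join(
        f'<li><a href="{escape(url, quote=True)}">{escape(title)}</a></li>'
        for title, url in results
    )
    return (
        f"{header_html}\n"
        f'<ul class="search-results">\n{items}\n</ul>\n'
        f"{footer_html}"
    )
```

Because the results carry a class name in this sketch, a site's existing cascading style sheet could style them like any other list on the site.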

Where are the Blossom Search servers? How reliable are they?

Blossom's servers are housed in specialized hosting facilities near Washington, D.C., San Antonio, and Los Angeles. The servers run the Red Hat Linux operating system and the Apache Web server. The servers are dedicated to searching, making them relatively easy to protect from Internet-based attacks.

The search service is fully mirrored across all our servers. Not only does this distribute the load and improve response time, it enhances reliability as the servers operate independently.

Will Blossom Search index Adobe PDF files? What about MS/Word and WordPerfect files?

Blossom Search will index PDF and word processing files, but only if you ask it to. See the Search Guide Section "Indexing Options" for details on how to ask.

What happens when I delete a document from my site?

The spider tests each page in the search index to see if it has changed or been removed. When the spider requests information about a page that has been removed, your Web server will report that the page isn't found (a 404 status code in HTTP-speak) and the page will be removed from the index. Note that just removing the links to a page won't remove the page from your index; you must also remove the file.

Some Web servers are set up to redirect requests for missing pages to a site index or some other page. As a result, the Web server may return success (a 200 status code) for requests for deleted pages. In this case the page may not be removed from the index, but its indexed contents will be those of the page it was redirected to. If more than one page has been deleted on your site and all are redirected to the same page, then all but one will be deleted from the index when our indexer checks for duplicates.
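The two cases above can be illustrated with a small sketch. It is not Blossom's indexer: the function name and the `fetch` callback are hypothetical, duplicate detection by content hash is an assumption (the text does not say how duplicates are found), and which duplicate survives is arbitrary here.

```python
import hashlib


def refresh_index(index, fetch):
    """Re-check every indexed URL and return the updated index.

    index maps URL -> previously indexed content.
    fetch is a caller-supplied function: URL -> (status_code, body).
    """
    refreshed = {}
    seen_digests = set()
    for url in sorted(index):
        status, body = fetch(url)
        if status == 404:
            continue  # the server says the page is gone: drop it
        if status != 200:
            body = index[url]  # assumption: keep the old copy on other errors
        # Assumed duplicate check: identical content hashes to the same digest.
        digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
        if digest in seen_digests:
            continue  # same content as an earlier page: keep only one
        seen_digests.add(digest)
        refreshed[url] = body
    return refreshed
```

In this sketch a deleted page that returns 404 disappears at once, while several deleted pages that all return 200 with the same redirected content collapse to a single entry.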

Does Blossom's spider follow the Robot Exclusion commands?

Yes. The spider follows instructions in "robots.txt", if it exists, as well as commands in any "robots" meta tag. For more information about Robot Exclusion, please see Wikipedia.
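To see how a "robots.txt" file is interpreted, you can experiment with Python's standard `urllib.robotparser` module. The example below is only a demonstration of the Robot Exclusion rules in general; the spider name "ExampleSpider" and the example.com URLs are placeholders, not the name or behavior of Blossom's spider.

```python
from urllib.robotparser import RobotFileParser

# A sample robots.txt that keeps all spiders out of /private/.
robots_txt = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

# "ExampleSpider" is a placeholder user-agent name.
allowed = parser.can_fetch("ExampleSpider", "https://www.example.com/index.html")
blocked = parser.can_fetch("ExampleSpider", "https://www.example.com/private/report.html")

print(allowed)  # True: nothing disallows /index.html
print(blocked)  # False: the path falls under /private/
```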

As an alternative to robot exclusion, Blossom's spider also looks for special comments to control spidering and indexing. The comments allow more flexibility since they can be turned on and off inside a document. See the Search Guide Sections on "Files and Directories" and "Headers and Footers" for more details.

How are accented letters handled?

Letters with accent marks (e.g., à, ñ, and ü) are treated as though the accent were not there. Thus, when searching, a letter without an accent will match the same letter with any accent. That is, an "a" will match an "à". Similarly, if an accented letter is entered in a search form, it will match the same letter unaccented.
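One common way to get this behavior is to strip accents from both the indexed text and the search string before comparing them. The sketch below uses Python's standard `unicodedata` module; it shows the general technique and is an assumption about the approach, not a description of Blossom's code.

```python
import unicodedata


def fold_accents(text):
    """Return text with accent marks removed from its letters."""
    # NFD splits an accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks and keep everything else.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def same_ignoring_accents(a, b):
    """Compare two strings as if neither had any accents."""
    return fold_accents(a) == fold_accents(b)
```

Applying the same folding to both sides is what makes the match symmetric: "a" finds "à", and "à" finds "a".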

How do I index a password-protected site?

If your site uses Basic Authentication, you can establish a user name and password for the spider to use when it visits the site. To set the user name and password, log on to the search configuration page for your index and follow the "Spidering, Indexing, and Reporting" link. After selecting the appropriate index, you'll see a section titled "For Password-Protected Sites" at the bottom of the form.
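For background, Basic Authentication means the client sends the user name and password in an "Authorization" request header, encoded with Base64. The sketch below shows how such a header value is built in general; the function name is hypothetical, and the credentials used in the usage note are the well-known example from the HTTP Basic Authentication specification, not anything specific to Blossom.

```python
import base64


def basic_auth_header(username, password):
    """Build the value of an HTTP Authorization header for Basic Authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(credentials).decode("ascii")
```

For example, `basic_auth_header("Aladdin", "open sesame")` returns "Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==". Base64 is an encoding, not encryption, so Basic Authentication protects the password only when the connection itself is encrypted.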