http://www.co-internet.net/net/articles/generatefillertext.txt.html

How to Generate Fillertext Content For Your Phantom
Pages and Shadow Domains
------------------------------------------------------
by
Ralph Tegtmeier aka fantomaster
------------------------------------------------------
Phantom pages are web pages offering highly optimized
content intended for search engine spiders only. Shadow
Domains are dedicated web properties focused on
offering optimized content to search engines while
redirecting human visitors to another site, typically a
company's Core or main domain. So, technically
speaking, Shadow Domains are web sites consisting
entirely of cloaked pages not intended for human
perusal.

More often than not, cloaking or IP delivery is a must
for those companies whose web sites offer very little
of what search engine spiders have always preferred
ever since they were invented: text content.

Say you have a web site offering sports merchandise.
Basically, your setup consists of dynamically generated
catalogue pages featuring short product descriptions,
price tags and a shopping cart system. Beyond this,
your textual content is limited to category headers
("baseball caps", "golf club", "sports apparel", etc.)

In other words: search engine spiders have precious
little to go on in their quest to determine what your
web site is actually about - and as a webmaster, you
for your part don't have a lot of leeway to inoculate
your pages with the keywords and search phrases you are
targeting. Considering that you're in a highly
competitive environment with thousands of other sites
offering very similar programs, it doesn't take a
genius to figure out that your chances of achieving
good-to-excellent search engine rankings are pretty
slim, and that's putting it very mildly.

Within this scenario, what you will want to do is to
set up an industrial-strength cloaking outfit to feed
the spiders what they need. You want to offer pages to
the crawlbots they will be really happy with: rich in
relevant content, with your targeted keywords or search
phrases included at a good keyword density, featured in
the page titles, in meta tags, site links, etc.

As these pages aren't intended for human consumption
anyway, the actual text used needn't really make
"sense" in any grammatical meaning of the word as long
as its semantic content is fairly relevant to your
pages' focus. Thus, you will, for example, want to
avoid using text highly biased towards an in-depth
discussion of web server security or hospital hygiene
if you're actually targeting searchers interested in
Dale Earnhardt sport socks, SF Giants World Series 2002
Custom Road Replica Jerseys or National Hockey League
collectibles, to name but a few typical sports items.

Which, of course, places you squarely between a rock
and a hard place: if you had all that content at your
fingertips, you wouldn't have to go for IP delivery in
the first place, right? Right! However, there are ways
out of this predicament. Indeed, there's a lot of freely
available content out there on the Web you can make use
of anytime.

Getting started
---------------
Say you are targeting golf-related search phrases in
your search engine optimization efforts because you are
selling PGA-related merchandise on your web site.

Step 1: Selecting Relevant Content
----------------------------------
Select a major search engine and enter a general search
phrase related to your pages' overall PGA theme, e.g.
"golf", "golf tournaments", "golf rules", "PGA", and,
even better: "golf glossary".

Now visit, say, the top 10 web sites featured in the
SERPs (search engine results pages), select any pages
rich in relevant textual content and download (or
"whack") them.

Step 2: Generating Relevant Fillertext
--------------------------------------
2.1 Next, concatenate the whacked pages into a single
plain text file. You will now have a raw fillertext
file for your phantom pages which, however, will still
require a fair bit of processing to be truly useful.

Hold it! Isn't this illegal copyright violation?
------------------------------------------------
It might very well be if you simply used copyright
protected content whacked from competitors' or any other
web sites as is, i.e. without further processing. This
would be plain piracy, and we're certainly not going to
advise you to go for it: ethics aside, the Web would
make it really very easy to find you out if you were
careless (or dumb) enough to simply use other people's
content without their permission.

However, this isn't required at all. Instead, what you
will have to convert your whacked content into is a
topically focused "text corpus":

2.2 Strip the content of all HTML tags, e-mail
addresses, links or URLs, JavaScript code, SSI code,
and similar.

2.3 Now, sanitize it by eliminating all trademarks,
company names, personal names and copyright notices.
(Obviously you will not want to exclude those
trademarks and product names directly related to your
catalogue of products.)

2.4 Next, delete any other stuff you don't want to see
your products associated with: this could be web page
navigation code or system messages (e.g. "Go to top",
"Your browser does not support frames", etc.), sexually
explicit language, or anything else deemed highly
irrelevant to a "natural language" environment focused
on your targeted search terms.

2.5 Finally, juggle the text file, e.g. by chopping
sentences in half and sorting the result by alphabet,
by size, or whatever algorithm you prefer.
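
Steps 2.2 and 2.5 lend themselves to a short script. The
following is only a minimal sketch in Python using the
standard library; the names strip_markup and juggle and
the two regular expressions are assumptions made for this
illustration, and the patterns are deliberately crude.
Steps 2.3 and 2.4 are left out because they depend on
word lists you will have to compile yourself.

```python
import re
from html.parser import HTMLParser


class _TextOnly(HTMLParser):
    """Collects text nodes and skips <script>/<style> bodies.

    HTML comments (which is how SSI directives appear in a page)
    are dropped because handle_comment is not overridden.
    """

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


# Crude illustrative patterns (assumptions, not exhaustive).
EMAIL = re.compile(r"\S+@\S+\.\S+")
URL = re.compile(r"(?:https?://|www\.)\S+", re.IGNORECASE)


def strip_markup(raw_html):
    """Step 2.2: remove tags, scripts, e-mail addresses and URLs."""
    parser = _TextOnly()
    parser.feed(raw_html)
    parser.close()
    text = " ".join(parser.parts)
    text = EMAIL.sub(" ", text)
    text = URL.sub(" ", text)
    return re.sub(r"\s+", " ", text).strip()


def juggle(text):
    """Step 2.5: chop each sentence in half, then sort alphabetically."""
    halves = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        words = sentence.split()
        mid = len(words) // 2
        for chunk in (words[:mid], words[mid:]):
            if chunk:
                halves.append(" ".join(chunk))
    return sorted(halves, key=str.lower)
```

Sorting by size instead of by alphabet would only require
replacing key=str.lower with key=len in the last line.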

Seeing that your typical text will have been culled
from several, probably hundreds, of relevant web pages,
then processed with lots of material deleted and the
results re-sorted, this will effectively render the
original copyrighted material quite
unrecognizable. In fact, there will be no "copyrighted"
(nor, for that matter, copyrightable) material left at
all, and bingo - you're perfectly legal! (If in doubt,
consult a lawyer versed in local and international
copyright matters - this article refuses to be
construed as binding legal advice!)

You have now created a highly relevant, topical fillertext
file for all your golf related merchandise. Let's say you
have named it "filler_golf.txt".

Obviously, you will proceed in a similar manner for all
other themes or product categories you want to optimize
for, thus generating several different fillertext files.
These would be stored under different names, e.g.
"filler_baseball.txt", "filler_football.txt", etc.

A note on size
--------------
If you plan on creating a large number of phantom
pages, as you actually should (after all, search engine
optimization is really a numbers game), make sure you
create a fairly large fillertext file per set of
targeted topics. The reason for this is that you will
want to generate only unique pages with as great a
variety of content as possible. This will also
dramatically increase the traffic your phantom pages
will attract for search term combinations you may not
have thought of initially. (Check your current logs for
search engine generated traffic and you will very
probably find lots of search phrases - some more, some
less weird - you wouldn't even have dared targeting!)

As a rule of thumb, a good Shadow Domain should
work from a fillertext file of at least 2-5 MB of
sanitized content to ensure variety and uniqueness of
pages. This will normally entail downloading around
20-50 MB of unprocessed, raw HTML pages.
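
The two ranges above imply that roughly one tenth of the
raw HTML survives sanitization. A back-of-the-envelope
helper in Python, assuming that 10:1 ratio holds for your
sources (it is merely inferred from the figures quoted
here and will vary with how markup-heavy the pages are;
the function name is hypothetical):

```python
def raw_download_needed(target_mb, raw_to_clean_ratio=10.0):
    """Estimate the MB of raw HTML to fetch for target_mb of clean text.

    raw_to_clean_ratio is an assumed average, not a measured value.
    """
    return target_mb * raw_to_clean_ratio
```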

Step 3: Creating Phantom Pages
------------------------------
Based on your newly created fillertext file, you can
now proceed with creating your phantom pages proper:
simply cut and paste parts from your fillertext into
your HTML page template and sprinkle it with your
targeted search phrases (a maximum of two distinct
search phrases per page has proven the most effective
strategy) until you achieve your targeted keyword
density.
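
Keyword density has no single agreed definition. The
Python sketch below assumes one common convention: the
number of words belonging to occurrences of the search
phrase, divided by the total number of words on the page.
The function names are hypothetical, and the formula in
insertions_needed simply solves that definition for the
number of extra copies of the phrase required.

```python
import math
import re


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def count_phrase(words, phrase_words):
    """Count occurrences of the phrase (overlapping matches included)."""
    k = len(phrase_words)
    if k == 0:
        return 0
    return sum(
        1 for i in range(len(words) - k + 1) if words[i:i + k] == phrase_words
    )


def keyword_density(text, phrase):
    """Assumed definition: phrase words / total words, as a fraction."""
    words = tokenize(text)
    phrase_words = tokenize(phrase)
    if not words:
        return 0.0
    return count_phrase(words, phrase_words) * len(phrase_words) / len(words)


def insertions_needed(text, phrase, target):
    """Extra copies of phrase needed to reach at least the target density.

    Each inserted copy adds k words to both the phrase-word count and
    the page length, so we need the smallest n with
    (have + n*k) / (total + n*k) >= target.
    """
    words = tokenize(text)
    phrase_words = tokenize(phrase)
    k = len(phrase_words)
    if k == 0 or not 0 < target < 1:
        raise ValueError("need a non-empty phrase and 0 < target < 1")
    have = count_phrase(words, phrase_words) * k
    shortfall = target * len(words) - have
    if shortfall <= 0:
        return 0
    return math.ceil(shortfall / (k * (1 - target)))
```

For instance, calling insertions_needed(page_text,
"golf glossary", 0.05) tells you how many more times the
phrase would have to appear to reach a 5% density under
this definition; the 5% figure is purely illustrative.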

Don't forget to add your search phrases in your page
titles, and do crosslink your phantom pages as well, as
search engines notoriously dislike orphaned pages.


If you feel that the above seems like a daunting chore,
you're right: depending on the type of site you are
optimizing for, it may cost you quite a bit of effort
to set up a decent, effective IP delivery
infrastructure.

Thankfully, there's some software around which can help
you save tons of time in the process.

Splitting and sorting the text is a task most word
processors and the more powerful text editors can perform
for you.

If you want to automate the process of keyword density
generation, we recommend our own fantomas keyMixer(TM).
More info on this program is available here:
< http://fantomaster.com/fakeymixer0.html >

The fantomas keyMixer(TM) is part and parcel of both our
fantomas Webmaster Suite(TM):
< http://fantomaster.com/fawmsuite0.html >
and our fantomas Super Suite(TM):
< http://fantomaster.com/fasupersuite0.html >

And finally, if you want to automate the whole process
from A to Z, we suggest you take a gander at our new
flagship product, the fantomas shadowMaker(TM): This
powerful server based application lets you generate an
unlimited number of highly optimized Shadow Domains in
a jiffy.

The fantomas shadowMaker(TM) offers the following
selection of features (and then some):
* Automatic selection of relevant URLs.
* Whacking of relevant fillertext raw content.
* Automatic sanitization and processing of raw content to
create fully usable fillertext files.
* Generation of an unlimited number of crosslinked
phantom pages (10,000 pages per hour and more!) with
predefined keyword density and page weight (both
fully customizable).
* Automatic submission of phantom pages to the search
engines.
* Automatic recognition of search engine spiders, working
from the world's largest spider database, with human
visitors being reliably redirected by search term to
any target URL you care to define.

What's more, it is fully customizable - once you have
installed it, you will never require any other cloaking
software again.

Read all about it here:
< http://fantomaster.com/fashadowmaker0.html >



This text may freely be republished or distributed in unmodified form provided the following resource box is included intact either at the beginning or the end of the article and a complimentary copy or notice (link) is sent to the author at the address specified below:

Ralph Tegtmeier is the co-founder and principal of fantomaster.com GmbH (Belgium), < http://fantomaster.com/ >, a company specializing in webmasters software development, industrial-strength cloaking and search engine positioning services.

He has been a web marketer since 1994 and is editor-in-chief of fantomNews, a free newsletter focusing on search engine optimization, available at: < http://fantomaster.com/fantomnews-sub.html > You can contact him at mailto:fneditor@fantomaster.com
(c) copyright 2002 by fantomaster.com
All rights reserved.
Downloaded at: < http://fantomaster.com/ >