This has been a common problem in the past few months, since Google introduced some new filters, and since the Sitemaps team screwed up the site: parameter, among other issues.
What can you do to get your indexed pages back?
1. Remove all the site-wide links on your website (or at least most of them, leaving a maximum of one or two). These are the links on YOUR website, pointing to other websites. Try to explain to your link exchange partners that a single link will offer them approximately the same benefits anyway (as far as keyword weight and PR go).
2. Try to minimize any site-wide links pointing towards your website. These are the links on OTHER websites, pointing to your website.
One strong piece of advice (although some will disagree): if you have the opportunity to receive a site-wide link, just refuse it. Ask the webmaster to give you one, two, or three unique links (from different pages of the website) instead. If he can't offer you a few unique links (this can happen with most CMSs), just decline.
I would only advise webmasters to acquire site-wide links if the total number of IBLs gained does not exceed 20% (a rough estimate) of their total other unique links (non-site-wides).
In plain English: if you have a website with 5,000 "unique" links (2-3 links from the same domain still count as roughly unique), coming from 2,000 to 5,000 unique domains, and you want to acquire a site-wide link from a website with 100,000 indexed pages, that's not good at all. If the website had 800 total indexed pages, that would have been acceptable.
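The 20% rule above can be sketched as a quick check. This is just my rough rule of thumb expressed in code; the function name and the 20% threshold are illustrative, not anything official:

```python
def sitewide_link_acceptable(unique_inbound_links, sitewide_pages, max_ratio=0.20):
    """Rough check: a site-wide link is only worth taking if the pages
    it would add stay under ~20% of your existing unique inbound links.
    The 20% figure is a rough estimate, not an exact science."""
    return sitewide_pages <= max_ratio * unique_inbound_links

# 5,000 unique links vs. a 100,000-page site-wide: way over budget
print(sitewide_link_acceptable(5000, 100000))   # False

# Same profile vs. an 800-page site: fine (800 <= 1,000)
print(sitewide_link_acceptable(5000, 800))      # True
```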
And one other important aspect: since Google doesn't list all the IBLs you will gain from a site-wide (although they are counted internally), you can assume a 5/1 ratio (indexed pages to IBLs) when calculating how many IBLs you will gain from a website (just to be even safer).
So if a website has 100,000 pages, you can expect to receive about 20,000 IBLs that will actually show in Google.
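That 5/1 ratio is a one-line calculation; here it is as a tiny helper, using a hypothetical function name of my own (the ratio itself is my conservative guess, as noted above):

```python
def estimated_visible_ibls(indexed_pages, ratio=5):
    """Conservative estimate of how many inbound links Google will
    actually show from a site-wide link: indexed pages divided by 5."""
    return indexed_pages // ratio

# A 100,000-page site-wide should show roughly 20,000 IBLs in Google
print(estimated_visible_ibls(100_000))   # 20000
```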
These are all approximate calculations, based solely on my own experience.
3. Reduce duplicate content. I think the duplicate filter has become much more aggressive in the past few months. As Google puts it:
"Occasionally, documents fetched by Googlebot won't be included for various reasons (e.g. they appear to be duplicates of other pages on the web)."
4. Try to add a few more pages to the website. By this, I mean pages that were never on the website before, and so were never indexed.
After doing all of the above, wait one to two weeks for all the pages on your website to get spidered/indexed again, and see whether you get any improvement.
If you do, drop a note here. If you don’t, drop a note here :)