Good post from Joel Oleson’s blog – I found it (and the links inside) very useful.
2) Put your Search database and its transaction logs on separate disks, both on the fastest, most write-optimized spindles you have (dedicated disks are essential)
3) Optimize your tempdb: pre-grow it, give it space, and consider splitting it into multiple data files; make sure it sits on your most optimized disks (dedicated disks are essential). Don’t forget to optimize the tempdb transaction log as well!
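As a minimal sketch of the tempdb step: the drive letters, file sizes, and file names below are assumptions — match them to your own dedicated disks and workload.

```sql
-- Pre-size tempdb and move it to dedicated disks (T: for data, L: for log).
-- Drive letters and sizes here are placeholders, not recommendations.
ALTER DATABASE tempdb
    MODIFY FILE (NAME = tempdev, FILENAME = N'T:\tempdb.mdf', SIZE = 4096MB);
ALTER DATABASE tempdb
    MODIFY FILE (NAME = templog, FILENAME = N'L:\templog.ldf', SIZE = 1024MB);

-- Splitting tempdb into multiple data files (one per CPU core is a common
-- starting point) spreads allocation contention:
ALTER DATABASE tempdb
    ADD FILE (NAME = tempdev2, FILENAME = N'T:\tempdb2.ndf', SIZE = 4096MB);
```

Note that file moves only take effect after the SQL Server service restarts.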
4) Optimize the network between the servers you are indexing and the index server (gigabit NIC speed is preferred within a farm)
5) Consider topology changes to optimize network throughput and eliminate double hops (e.g., the index server crawling a separate front end, shared by user traffic, to pull changes). Adding the WFE role to your index server and adding the applicable hosts file entries is a great way to optimize indexing and your user traffic at the same time.
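Once the WFE role is on the index server, a hosts file entry points the crawl at the local machine instead of a user-facing front end. The host name below is a placeholder for your own web application URL:

```
# %SystemRoot%\System32\drivers\etc\hosts on the index server
# portal.contoso.com is a hypothetical web application host name
127.0.0.1    portal.contoso.com
```

This keeps crawl traffic off the front ends that serve users, since name resolution on the index server now loops back to its own local WFE.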
6) Increase the RAM on your x64 SQL Servers (8 GB is really a good place to start; 16 GB or more is looking better and better.)
7) Defragment your databases and applicable drives (if fragmented), and run the relevant DBCC consistency checks – refer to the KB on SharePoint-safe DBCC commands
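A hedged sketch of the consistency-check step — the database name below is a placeholder for your own content database, and you should confirm any command against the SharePoint-safe DBCC list before running it:

```sql
-- Read-only consistency check (WSS_Content is a placeholder name):
DBCC CHECKDB (N'WSS_Content') WITH NO_INFOMSGS;

-- Report index fragmentation for tables in the database:
USE [WSS_Content];
DBCC SHOWCONTIG;
```

Stick to read-only or online operations like these; commands that modify SharePoint database schema or data directly are unsupported.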
8) Increase the number of crawl threads (you’ll have to watch this one: it is the easiest way to speed up your crawls, but keep an eye on the box it is “attacking” – it can be heavy-handed.)
9) Reduce the maximum index file size (optional)
10) Remove any unused, single-threaded, or poorly performing IFilters
11) Reduce the number of full crawls, run incremental crawls on a schedule where they can complete, and remove non-essentials such as every-5-minute incremental jobs; these will simply cause unnecessary churn.
Bonus: Install the public update or the service pack when it comes out (it includes a few SharePoint indexing-related fixes).
More on disk optimization in a post I did a while back… Also, a great paper on storage and performance optimization just got posted. It is a MUST READ: Performance recommendations for storage planning and monitoring.
Getting crawled and you don’t want to be? Here’s a recent KB on how to configure robots.txt in your SharePoint deployment. There is some more info in this post from the field. It is very easy for the crawl account to generate 50% of your traffic, so optimizing indexing by reducing even the authentication traffic is a big deal. If using NTLM, use accounts that are in the same domain, with DCs that are fast and local. Kerberos might end up being slightly faster, but it does add complexity.
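For the robots.txt case, the simplest form tells all compliant crawlers to stay out entirely — serve it from the root of the web application:

```
# robots.txt at the web application root
# Blocks all compliant crawlers from the entire site
User-agent: *
Disallow: /
```

If you only want to exclude specific paths rather than the whole site, list individual `Disallow:` lines for those paths instead. Keep in mind this only deters crawlers that honor robots.txt.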