Wednesday 19 May 2010

The search service instance on this server is not online

I came across this tickler today:

Problem:
1) I configured the Office Server Search Service (index role) on an application server without incident
2) I ran the following command on both WFE servers:
stsadm -o osearch -action start -f -role query -propagationlocation D:\Index
3) This returned the error
The search service instance on this server is not online
4) Neither the Windows event logs nor the ULS logs showed any useful information. Based on various blog posts I did the following on WFE1 (not necessarily in this order):
a. Added the -farmcontactemail, -farmperformancelevel, -farmserviceaccount and -farmservicepassword switches in various combinations (although I knew none of these should be necessary, as all this information had been populated when I created the index)
b. Added the search service account to the local administrators group (even though I knew this should not be necessary, I found a couple of bogus references to this in blogs)
c. Changed the search service account to the farm admin account
d. Created the SSP (it had not been created before my first attempts)
e. Checked that I could access the search admin web service at
https://[indexserver]:56738/[sspname]/Search/SearchAdmin.asmx
f. Started the SPSearch service (previously stopped, not that this should have any impact)
g. Deleted D:\Index, which must have been created by one of the previous failed attempts
h. Tried starting the query service via Central Admin - this didn't display any errors but the service did not start
i. Rebooted WFE1
5) I eventually found that I was able to start the query service on WFE2 by omitting the -propagationlocation switch. At least I assume it was because I omitted the switch, but it may have been due to one of the changes above (creation of the SSP perhaps).
6) I tried running the command without -propagationlocation on WFE1 and this returned a new error
Method failed with unexpected error code 3
7) After a few more attempts I removed WFE1 from the farm and rejoined it. Even this did not resolve the error. I did however notice that during the rejoin (which was scripted), the command psconfig -cmd secureresources produced an error because D:\Index did not exist. Now at this point WFE1 should have been a fresh server in the farm and I had not tried to start the search service, so why was it trying to secure the index location?

Solution:
The solution was to rebuild an empty shell of the index file location directory structure (D:\Index\Office Server\Config) in Windows Explorer on WFE1, and re-run psconfig -cmd secureresources. Once this was done the query service could be started and hey presto!
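
For anyone who would rather script the folder rebuild than click through Windows Explorer, here is a minimal Python sketch. It only creates the empty Office Server\Config skeleton under whatever index root you pass in and builds the argument list for the secureresources command quoted above. The function names are my own, and it assumes psconfig is on the PATH; it does not run psconfig itself.

```python
import tempfile
from pathlib import Path

# Sub-folders named in the post; the index root itself varies per farm.
SKELETON = Path("Office Server") / "Config"


def rebuild_index_skeleton(index_root):
    """Create <index_root>/Office Server/Config if missing and return that path."""
    target = Path(index_root) / SKELETON
    target.mkdir(parents=True, exist_ok=True)  # harmless if it already exists
    return target


def secureresources_command(psconfig="psconfig"):
    """Argument list for the command quoted in the post (built, not executed)."""
    return [psconfig, "-cmd", "secureresources"]


if __name__ == "__main__":
    # Demonstrate against a throwaway directory rather than a real D:\Index.
    with tempfile.TemporaryDirectory() as tmp:
        created = rebuild_index_skeleton(Path(tmp) / "Index")
        print(created.is_dir())
        print(" ".join(secureresources_command()))
```

On the affected server you would pass the real index root (D:\Index in my case) and then run the printed command from an elevated prompt.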

I don’t know for certain why the original problem occurred. Most likely the very first attempt failed and left the service in a state that was neither stopped nor started. It may be that I should not have deleted D:\Index during my troubleshooting, but I found a blog post recommending just that!


In any case, a new server joined to the farm should not be looking for the index file location unless and until the search service is started, so that part feels like a bug to me.

1 comment:

  1. Update on this - there seems to be a bug in psconfig -cmd configdb -disconnect (or the equivalent process of removing a server from a farm using the configuration wizard).

    When a server is removed from a farm you'd expect all traces of the farm configuration to be removed, right? Wrong... these entries (among others) are left behind in the registry:

    HKLM\Software\Microsoft\Office Server\12.0\Search\Applications\[GUID]\Gathering Manager\ApplicationPath

    HKLM\Software\Microsoft\Office Server\12.0\Search\Applications\[GUID]\Gathering Manager\DefaultProjectPath

    HKLM\Software\Microsoft\Office Server\12.0\Search\Applications\[GUID]\Gathering Manager\GatherLogsPath

    HKLM\Software\Microsoft\Office Server\12.0\Search\Applications\[GUID]\Gathering Manager\AnchorProject\WorkingDirectory

    HKLM\Software\Microsoft\Office Server\12.0\Search\Applications\[GUID]\Gathering Manager\Portal_Content\WorkingDirectory

    HKLM\Software\Microsoft\Office Server\12.0\Search\Global\Gathering Manager\DefaultApplicationsPath

    HKLM\Software\Microsoft\Office Server\12.0\Search\Setup

    These contain references to the index file location, and this is why psconfig -cmd secureresources tries to set permissions on the index folder, and fails if the folder does not exist.

    I would NOT advise changing or removing any of these entries... just rebuild the folder tree, secure the resources and run your osearch command to reconfigure your search service. That should update the registry and you can then delete the folder tree you created.
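
To work out which folders need rebuilding before running secureresources, something like the following Python sketch may help. It takes the paths you have read from the registry values listed above (copied by hand from regedit here; the dictionary keys are just labels I chose) and reports which ones do not exist on disk. It reads nothing from the registry and changes nothing.

```python
import os
import tempfile


def missing_directories(paths_by_value):
    """Return {label: path} for every entry whose directory does not exist."""
    return {
        name: path
        for name, path in paths_by_value.items()
        if not os.path.isdir(path)
    }


if __name__ == "__main__":
    # Stand-in data: one folder that exists and one that does not.
    with tempfile.TemporaryDirectory() as tmp:
        present = os.path.join(tmp, "Index")
        os.mkdir(present)
        sample = {
            "ApplicationPath": present,
            "GatherLogsPath": os.path.join(tmp, "Gone"),
        }
        print(sorted(missing_directories(sample)))
```

Any path it reports is a candidate for the empty folder rebuild described in the post.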
