Ticket #1857 (closed defect: worksforme)

Opened 1 year ago

Last modified 8 months ago

After a wild machine shutdown NXLucene doesn't restart at the next machine reboot

Reported by: madarche Assigned to: madarche
Priority: P1 Milestone: CPS 3.4.7
Component: Lucene Version: TRUNK
Severity: normal Keywords:
Cc:

Description (Last modified by madarche)

If the server on which NXLucene is running suffers a wild (unclean) shutdown, NXLucene won't start again at the next machine reboot when "nxlucened start" is run.

This is tedious because it requires an admin to log into the machine and run:

$ bin/nxlucened stop
$ bin/nxlucened start

This is a pure integration problem and only requires shell skills to fix.

Change History

10/05/07 11:49:43 changed by madarche

  • type changed from enhancement to defect.

04/24/08 11:40:30 changed by madarche

  • summary changed from After a wild machine shutdown NXLucene don't start at the next machine reboot to After a wild machine shutdown NXLucene doesn't restart at the next machine reboot.

(follow-up: ↓ 4 ) 04/24/08 22:52:32 changed by rspivak

Is there any log message indicating the reason for the failure to start?

(in reply to: ↑ 3 ) 04/25/08 09:53:36 changed by madarche

  • description changed.

Replying to rspivak:

> Is there any log message indicating the reason for the failure to start?

There isn't any log message. It's a pure integration problem that only requires shell skills to fix. I should have written that, and I'm updating the description of the ticket accordingly.

The problem comes from the way bin/nxlucened is written: it refuses to start if the PID file from a previous run is still present. When a wild shutdown happens, the stop function is never called and the PID file isn't removed, which makes it impossible for NXLucene to restart at the next reboot.

To reproduce:

  1. start nxlucene
  2. kill the nxlucene process
  3. try to start nxlucene again; it won't start
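
The failure mode reported in this comment can be modelled in a few lines of Python. This is only an illustrative sketch: `naive_start` and the temporary PID file path are hypothetical names, not taken from bin/nxlucened, whose source is not quoted in this ticket.

```python
import os
import tempfile


def naive_start(pidfile):
    """Hypothetical start guard: refuse to start whenever a PID file exists.

    Returns True if the "service" was started, False if startup was refused.
    """
    if os.path.exists(pidfile):
        # No check that the recorded process is still alive.
        return False
    with open(pidfile, "w") as f:
        f.write(str(os.getpid()))
    return True


workdir = tempfile.mkdtemp()
pidfile = os.path.join(workdir, "nxlucene.pid")  # hypothetical path

first_start = naive_start(pidfile)      # step 1: start
# Step 2: the process is killed; no stop function runs, so nothing
# removes the PID file.
left_behind = os.path.exists(pidfile)
second_start = naive_start(pidfile)     # step 3: start again, refused
print(first_start, left_behind, second_start)
```

With a guard like this, the only way out is to delete the PID file by hand, which is what the "stop" then "start" workaround in the description achieves.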

04/25/08 21:03:55 changed by rspivak

I should have elaborated. I tested on my local instance and didn't run into any trouble starting NXLucene, even after a manual kill -9. The stale PID file was removed automatically by the Twisted server on the next nxlucened startup.

Earlier I ran into some issues starting up after an abrupt stop of nxlucene when the /tmp folder contained cpslucene's lock file, but this happened only when nxlucene was abruptly stopped while it was indexing the catalog. After that I had to remove the stale lock file manually for nxlucene to start properly.

Currently I'm using NXLucene trunk and Twisted-2.2.0. Could you please check your version of the twistd server? You should be able to find it in the file __init__.py in your Python's Twisted directory, site-packages/twisted.
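
As an alternative to reading __init__.py by hand, the version can be queried from the interpreter that runs NXLucene. A minimal sketch, assuming the installed Twisted package exposes a `__version__` attribute (the helper name `twisted_version` is made up for this example):

```python
def twisted_version():
    """Return the installed Twisted version string, or None if unavailable."""
    try:
        import twisted
    except ImportError:
        # Twisted is not importable from this interpreter.
        return None
    # Assumption: the package exposes __version__; return None otherwise.
    return getattr(twisted, "__version__", None)


print(twisted_version())
```

Run it with the same Python that launches nxlucened, since several Python installations on one machine may each carry a different Twisted.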

In my case, the file site-packages/twisted/scripts/twistd.py contains the line:

log.msg('Removing stale pidfile %s' % pidfile, isError=True)

The related part of the code actually removes the stale PID file and allows nxlucene to start correctly.
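
The stale PID file handling described above can be sketched as follows. This is not the twistd source, only an approximation of the behaviour in modern Python for POSIX systems. The function name `check_pidfile` and its return values are invented for the example.

```python
import os


def check_pidfile(pidfile):
    """Classify a PID file and remove it if the recorded process is gone.

    Returns "absent", "stale-removed" or "running".
    A non-numeric PID file raises ValueError.
    """
    if not os.path.exists(pidfile):
        return "absent"
    with open(pidfile) as f:
        pid = int(f.read().strip())
    try:
        # Signal 0 delivers nothing; it only checks that the PID exists.
        os.kill(pid, 0)
    except ProcessLookupError:
        # No such process: the file is left over from an unclean stop.
        os.remove(pidfile)
        return "stale-removed"
    except PermissionError:
        # The process exists but belongs to another user.
        return "running"
    return "running"
```

A start script built on such a check would only refuse to start in the "running" case, so a PID file left behind by kill -9 would no longer block the next startup.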

04/30/08 18:14:23 changed by madarche

  • status changed from new to closed.
  • resolution set to worksforme.

The installation at the client location was done with Twisted-2.5.0.

I have no means to test on the client machine at this time, but I have just run the same tests as you did and came to the same conclusion.

The only test I haven't done is a wild reboot (pulling out the cable), and I don't plan to.

So closing.

Thanks a lot Ruslan.