Solr on Jetty on Ubuntu

October 14th, 2011

This article explains the steps involved in deploying the Apache Solr search engine as a system service on the Jetty servlet container on Ubuntu. It is based on information from the Solr Jetty wiki page and on the troubleshooting experiences of others.

Prerequisites:

  • The target system should have at least Java 6 installed (in my case, OpenJRE 6 is installed)

Steps:

1. In this description, /opt/solr will be the target directory where Solr will be deployed.

 

2. The /example directory in the solr package forms the basis of the installation on the target system. It contains multiple configurations, each suitable for a different use case:

/example-DIH : a multicore configuration with each core demonstrating a different data importing configuration

/multicore : a simple multicore installation

/solr : a basic single core configuration.

Copy the configuration suitable for your application into /example/solr (replacing the one already there if necessary) and discard the rest. A configuration typically consists of /conf and /data subdirectories, and sometimes also /bin and /lib.

 

Additionally, the /dist and /contrib package directories contain important jars required by some of these configurations:

/dist/apache-solr-dataimporthandler*.jar – if you require data importing capabilities.

/dist/apache-solr-cell-*.jar, /contrib/extraction/lib/*.jar – if you require content extraction from PDF, MS Office and other document files.

These jars should also be deployed on the target system.

 

3. Copy these files to the target system and create the directory structure suggested below under /opt/solr:

|-- dist - All required jars, including additional jars from /contrib
|-- etc - this should probably go into the root /etc directory, as per conventions
|   |-- jetty.xml
|   `-- webdefault.xml
|-- lib
|-- solr
|   |-- bin
|   |-- conf
|   |   |-- admin-extra.html
|   |   |-- dataimport.properties
|   |   |-- elevate.xml
|   |   |-- protwords.txt
|   |   |-- schema.xml
|   |   |-- scripts.conf
|   |   |-- solrconfig.xml
|   |   |-- stopwords.txt
|   |   |-- synonyms.txt
|   |   `-- xml-data-config.xml
|   |-- data
|-- start.jar
|-- webapps
|   `-- solr.war
`-- work
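The empty skeleton of this layout can be created with a few mkdir calls. The sketch below is only an illustration: the function name make_solr_layout is made up here, and copying the actual files (start.jar, solr.war, the jetty XML files, the jars and your configuration) is left as a comment because the location of the unpacked Solr package depends on your download.

```shell
#!/bin/sh
# Create the empty directory skeleton shown in the tree above.
# The function name is illustrative; the target defaults to /opt/solr.
make_solr_layout() {
    target="${1:-/opt/solr}"
    for d in dist etc lib solr/bin solr/conf solr/data webapps work; do
        mkdir -p "$target/$d" || return 1
    done
}

# Afterwards, copy start.jar, webapps/solr.war, etc/jetty.xml,
# etc/webdefault.xml, the required jars and your chosen configuration
# from the unpacked Solr package into the matching directories.
```

Calling make_solr_layout with a scratch path such as /tmp/solr-test is a safe way to try it out; creating /opt/solr itself needs root.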

 

4. The solr process should run with its own dedicated credentials, so that authorizations can be administered at a fine granularity. Create a system user and group named ‘solr’.

$ sudo adduser --system solr
$ sudo addgroup solr
$ sudo adduser solr solr

5. Create a log directory /var/log/solr for solr and jetty logs.

6. Jetty writes its errors to STDERR by default. Redirect them to a rolling log file by adding this section to /opt/solr/etc/jetty.xml.

    <!-- =========================================================== -->
    <!-- configure logging                                           -->
    <!-- =========================================================== -->
    <New id="ServerLog" class="java.io.PrintStream">
      <Arg>
        <New class="org.mortbay.util.RolloverFileOutputStream">
          <Arg><SystemProperty name="jetty.logs" default="/var/log/solr"/>/yyyy_mm_dd.stderrout.log</Arg>
          <Arg type="boolean">false</Arg>
          <Arg type="int">90</Arg>
          <Arg><Call class="java.util.TimeZone" name="getTimeZone"><Arg>GMT</Arg></Call></Arg>
          <Get id="ServerLogName" name="datedFilename"/>
        </New>
      </Arg>
    </New>
    <Call class="org.mortbay.log.Log" name="info"><Arg>Redirecting stderr/stdout to <Ref id="ServerLogName"/></Arg></Call>
    <Call class="java.lang.System" name="setErr"><Arg><Ref id="ServerLog"/></Arg></Call>
    <Call class="java.lang.System" name="setOut"><Arg><Ref id="ServerLog"/></Arg></Call>

 

7. Now we need to set file and directory permissions so that the solr process user can work correctly.

Use chown to make solr:solr the owner and group.

 

$ sudo chown -R solr:solr /opt/solr
$ sudo chown -R solr:solr /var/log/solr


Use chmod to give write permissions to solr:solr for the following directories:

/opt/solr/solr/data

/opt/solr/work

/var/log/solr
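One way to grant those write permissions is sketched below. The helper name grant_write is hypothetical, and the mode u+rwX,g+rwX is an assumption about what write permission for solr:solr should mean on your system; adjust it to your own policy.

```shell
#!/bin/sh
# Give owner and group read/write access (plus directory traversal)
# to each directory passed as an argument, recursively.
grant_write() {
    for dir in "$@"; do
        chmod -R u+rwX,g+rwX "$dir" || return 1
    done
}

# Example (run as root): pass the directories listed above, e.g.
#   grant_write /opt/solr/work /var/log/solr
```

Because the directories are passed as arguments, the helper can be tried on a scratch directory before it is run against the real installation.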

 

8. The basic installation should work now. Try it by launching jetty as a regular process:

 

/opt/solr$ sudo java -Dsolr.solr.home=/opt/solr/solr -jar start.jar

 

This should start solr.

Verify that logs are getting generated under /var/log/solr.

Test it by sending a query to http://localhost:8983/solr/select?q=something using curl.
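A minimal sketch of that check follows. The helper names solr_select_url and check_solr are made up for this example; the host, port and path are the ones used above, and the query term is assumed to be URL-safe.

```shell
#!/bin/sh
# Build the select URL for a query term.
# Arguments: term [host] [port]; host and port default to the values above.
solr_select_url() {
    printf 'http://%s:%s/solr/select?q=%s\n' "${2:-localhost}" "${3:-8983}" "$1"
}

# Succeed only if Solr answers the query without an HTTP error.
# curl flags: -f fail on HTTP errors, -s silent, -S still print errors.
check_solr() {
    curl -fsS "$(solr_select_url "$1")" > /dev/null
}

# Example:
#   check_solr something && echo "Solr is answering queries"
```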

 

9. Now we need to install solr as a system daemon so that it can start automatically. Download the jetty.sh startup script (link courtesy http://wiki.apache.org/solr/SolrJetty) and save it as /etc/init.d/solr. Give it executable rights.

The following environment variables need to be set. They can either be inserted in this /etc/init.d/solr script itself, or they can be stored in /etc/default/jetty, which is read by the script.

 

JAVA_HOME=/usr/lib/jvm/default-java

JAVA_OPTIONS="-Xmx64m -Dsolr.solr.home=/opt/solr/solr"

JETTY_HOME=/opt/solr

JETTY_USER=solr

JETTY_GROUP=solr

JETTY_LOGS=/var/log/solr

 

Set the -Xmx parameters as per your requirements.
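If you go the /etc/default/jetty route, the file can be written in one step. This is only a sketch: the helper name write_jetty_defaults is made up, the values are the ones listed above, and the quoted here-document keeps the shell from expanding anything while writing.

```shell
#!/bin/sh
# Write the settings listed above to the given file
# (default: /etc/default/jetty, which requires root).
write_jetty_defaults() {
    cat > "${1:-/etc/default/jetty}" <<'EOF'
JAVA_HOME=/usr/lib/jvm/default-java
JAVA_OPTIONS="-Xmx64m -Dsolr.solr.home=/opt/solr/solr"
JETTY_HOME=/opt/solr
JETTY_USER=solr
JETTY_GROUP=solr
JETTY_LOGS=/var/log/solr
EOF
}
```

Keeping the settings in a separate file leaves the init script itself untouched, apart from the fix described in the next step.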

 

10. Additionally, this startup script has a problem that prevents it from running in Ubuntu. If you try running this right now using

 

$ sudo /etc/init.d/solr start

 

you’ll get a

Starting Jetty: FAILED

error.

 

The problem – as explained well in this troubleshooting article – is in this line that attempts to start the daemon:

 

if start-stop-daemon -S -p"$JETTY_PID" $CH_USER -d"$JETTY_HOME" -b -m -a "$JAVA" -- "${RUN_ARGS[@]}" --daemon

 

In Ubuntu, --daemon is not a valid option for start-stop-daemon. Remove that option from the script:

if start-stop-daemon -S -p"$JETTY_PID" $CH_USER -d"$JETTY_HOME" -b -m -a "$JAVA" -- "${RUN_ARGS[@]}"
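The same edit can be made non-interactively. The sketch below assumes the offending line looks exactly like the one quoted above, with --daemon as the last word on the line; the helper name strip_daemon_flag is made up. It keeps a backup copy with a .bak suffix next to the script.

```shell
#!/bin/sh
# Remove a trailing --daemon from the start-stop-daemon line of the
# given init script, keeping the original as <file>.bak.
strip_daemon_flag() {
    sed -i.bak '/start-stop-daemon -S/ s/[[:space:]]*--daemon[[:space:]]*$//' "$1"
}

# Example (run as root):
#   strip_daemon_flag /etc/init.d/solr
```

Once the script is confirmed to work, you may want to move the .bak copy out of /etc/init.d.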

 

If you try starting it now, it should work:

$ sudo /etc/init.d/solr start

 

It should give a

Starting Jetty: OK

message, and ps -ef |grep java should show the "java -jar start.jar" process.

 

11. Finally, it’s time to configure this as an init script. Read this article if you want background on Ubuntu runlevels and init scripts.

Insert these lines at the top of /etc/init.d/solr to make it an LSB (Linux Standard Base) compliant init script. Without these lines, it’s not possible to configure the run level scripts.

### BEGIN INIT INFO
# Provides:          solr
# Required-Start:    $local_fs $remote_fs $network
# Required-Stop:     $local_fs $remote_fs $network
# Should-Start:      $named
# Should-Stop:       $named
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start Solr.
# Description:       Start the solr search engine.
### END INIT INFO

 

Now run the following command:

$ sudo update-rc.d solr defaults
 Adding system startup for /etc/init.d/solr ...
   /etc/rc0.d/K20solr -> ../init.d/solr
   /etc/rc1.d/K20solr -> ../init.d/solr
   /etc/rc6.d/K20solr -> ../init.d/solr
   /etc/rc2.d/S20solr -> ../init.d/solr
   /etc/rc3.d/S20solr -> ../init.d/solr
   /etc/rc4.d/S20solr -> ../init.d/solr
   /etc/rc5.d/S20solr -> ../init.d/solr

As you can see, the run levels 2-5 (they are equivalent in Ubuntu) are now configured to start solr.


Ubuntu startup – init scripts, runlevels, upstart jobs explained

September 25th, 2011

Ubuntu has 2 different mechanisms for starting system services:

  • The traditional mechanism based on run levels, and scripts in /etc/init.d and /etc/rcn.d directories
  • A new mechanism known as upstart.

Some services are started using one mechanism and others using the other. If you want to control the services, it’s necessary to understand these mechanisms.


Run levels and init.d scripts – the traditional mechanism

Linux has the concept of run levels, which are described in the Linux Standard Base (LSB) specification. They can be considered “modes” in which Linux runs.

Run level  Name                                                Description
0          Halt                                                Shuts down the system
1          Single-user mode                                    Mode for administrative tasks.
2          Multi-User Mode                                     Does not configure network interfaces and does not export network services
3          Multi-User Mode with Networking                     Starts the system normally
4          Not used / user definable                           For special purposes
5          Start the system normally with GUI display manager  Run level 3 + display manager
6          Reboot                                              Reboots the system
s or S     Single-user mode                                    Does not configure network interfaces, or start daemons.

In Ubuntu (and Debian), run levels 2 to 5 are equivalent and configured with the same set of services.

Get Current run level

Use the runlevel command to get the current run level. runlevel is available in Ubuntu as well as in Red Hat based distros like CentOS (not sure about other distros).

karthik@ubuntuLynx:~$ runlevel
N 2

/etc/init.d directory

The /etc/init.d directory contains scripts, which can start / stop / restart services. These are invoked with a start|stop argument at startup and shutdown.

/etc/rcn.d directories

The /etc/rcn.d directories specify which scripts in /etc/init.d are enabled for run level n.

For example, /etc/rc2.d specifies which scripts in /etc/init.d are enabled for run level 2. At startup and shutdown, only these enabled scripts are invoked.

Entries in /etc/rcn.d directories are symlinks to scripts in /etc/init.d, but with a special prefix of the format

[S|K]nn

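S means the script is enabled for this run level: it is invoked with start when the run level is entered.

K means the script is disabled for this run level: it is invoked with stop when the run level is entered.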

nn is a sequence number that can be used to control the sequence of starting services, so that services which depend on other services are started only after those other services are started.

Below is a listing of /etc/rc2.d. It shows that tomcat6, dovecot and postfix are not automatically started in run level 2. However, they can be started manually.

K08tomcat6
K76dovecot
K80postfix
S20gpm
S20winbind
S50rsync
S70dns-clean
S70pppd-dns
S91apache2
S99grub-common
S99ondemand
S99rc.local
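The naming convention can be decoded mechanically. The following sketch uses a made-up helper, describe_rc_link, that splits an entry name into its three parts with plain shell parameter expansion.

```shell
#!/bin/sh
# Split an /etc/rcN.d entry name such as "S91apache2" into
# state (enabled/disabled), sequence number and service name.
describe_rc_link() {
    name="$1"
    case "$name" in
        S[0-9][0-9]?*) state=enabled ;;
        K[0-9][0-9]?*) state=disabled ;;
        *) echo "not an rc link name: $name" >&2; return 1 ;;
    esac
    rest="${name#?}"             # drop the leading S or K
    service="${rest#??}"         # what follows the two digits
    seq="${rest%"$service"}"     # the two-digit sequence number
    echo "$state $seq $service"
}

# Example: describe every entry of run level 2
#   for f in /etc/rc2.d/*; do describe_rc_link "${f##*/}"; done
```

Names that do not follow the [S|K]nn pattern are rejected with a message on stderr and a non-zero return code.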

Enabling and disabling run level services

Use the chkconfig --list command to get an overview of all services and their status. If it is not installed, install it using sudo apt-get install chkconfig. It gives a status listing like this:

karthik@ubuntukarmic:~$ chkconfig --list
acpi-support              0:off  1:off  2:on   3:on   4:on   5:on   6:off
acpid                     0:off  1:off  2:off  3:off  4:off  5:off  6:off
alsa-utils                0:off  1:off  2:off  3:off  4:off  5:off  6:off
...

Use the update-rc.d command to enable or disable a service at a run level:

Syntax: sudo     update-rc.d     name    enable|disable    runlevel

Example: sudo update-rc.d dovecot disable 2

or

sudo update-rc.d dovecot defaults

 

When creating new init scripts, ensure that the script has the following section at the top (this is an example; change the values appropriately) to make it LSB (Linux Standard Base) compliant. Without this section, update-rc.d won’t work and will give a “missing LSB information” warning.

### BEGIN INIT INFO
# Provides:          solr
# Required-Start:    $local_fs $remote_fs $network
# Required-Stop:     $local_fs $remote_fs $network
# Should-Start:      $named
# Should-Stop:       $named
# Default-Start:     2 3 4 5
# Default-Stop:      0 1 6
# Short-Description: Start Solr.
# Description:       Start the solr search engine.
### END INIT INFO

 


Upstart

Upstart jobs are configured in .conf files in the /etc/init directory.

Use the service command to start and stop upstart services:

sudo service <servicename> start|stop

To prevent an upstart service from starting at boot, open the respective /etc/init/[service].conf file and comment out the lines that begin with start on.

example:

...
#start on (net-device-up
#          and local-filesystems
#         and runlevel [2345])

...

This will prevent the service from starting at boot, but still allow manual starts using the sudo service <servicename> start command.

For completely disabling a service – both from automatic and manual starts – it’s better to uninstall the package, but it’s also possible to just rename the .conf file to .conf.disabled.


Resources for further reading