Sunday, March 30, 2014

Working With Apache Solr

Background

Apache Solr is a search server built on Apache Lucene. You index data in Solr and then run queries against it.

Installing Apache Solr


Download Apache Solr from an Apache mirror. At the time of writing, the version was 4.7.0.

Start Apache Solr

You can start Solr with the default embedded Jetty instance by going to the example directory of your Solr installation and running start.jar.


$> java -jar start.jar

If there are no errors, your Solr instance is available at http://localhost:8983/solr/#/

You should see the default Solr Welcome Screen.
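
If you would rather check from a script than from the browser, a minimal sketch along these lines may help. It only assumes the URL given above; the function name solr_is_up and the 5 second timeout are my own choices for illustration.

```python
import urllib.request

# Default address from this post: embedded Jetty on port 8983.
SOLR_BASE_URL = "http://localhost:8983/solr/"


def solr_is_up(base_url=SOLR_BASE_URL, timeout=5):
    """Return True if an HTTP GET on base_url answers with status 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses,
        # so a refused connection or an error page both count as "not up".
        return False
```

Calling solr_is_up() before the indexing steps below gives a quick yes/no answer without opening the admin screen.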

Exploring the collections


In the left-hand column, use the drop-down to choose the default collection "collection1".

You should see a screen for the collection.

Click on "Query" on the left-hand side.

You should see the query screen.

Press the "Execute Query" button.

You should see the query response in JSON format as follows:

{
  "responseHeader": {
    "status": 0,
    "QTime": 2,
    "params": {
      "indent": "true",
      "q": "*:*",
      "_": "1396205101116",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 0,
    "start": 0,
    "docs": []
  }
}

The response shows that we have no data: "numFound" is 0 and "docs" is an empty list.

This is because we have not fed Apache Solr any data to index.
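
The same check can be done in code. The sketch below parses a trimmed copy of the response above (I left out the "params" echo to keep it short) and pulls out the three values worth looking at. The helper name summarize is just for illustration.

```python
import json

# Trimmed copy of the response shown above; the "params" echo is omitted.
EMPTY_RESPONSE = """
{
  "responseHeader": {"status": 0, "QTime": 2},
  "response": {"numFound": 0, "start": 0, "docs": []}
}
"""


def summarize(raw_json):
    """Return (status, numFound, number of docs on this page)."""
    body = json.loads(raw_json)
    status = body["responseHeader"]["status"]
    response = body["response"]
    return status, response["numFound"], len(response["docs"])


print(summarize(EMPTY_RESPONSE))  # prints (0, 0, 0)
```

A status of 0 means the query itself succeeded; the zeros after it mean there is simply nothing in the index yet.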

Index some data


We have Apache Solr running. We will try to index some data.

In another command window, go to the solr-install-dir/example/exampledocs directory and run post.jar on it.


solr-4.7.0/example/exampledocs$ java -jar post.jar .
SimplePostTool version 1.5
Posting files to base url http://localhost:8983/solr/update using content-type application/xml..
Indexing directory . (16 files, depth=0)
POSTing file books.csv
SimplePostTool: WARNING: Solr returned an error #400 Bad Request
SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update
POSTing file books.json
SimplePostTool: WARNING: Solr returned an error #400 Bad Request
SimplePostTool: WARNING: IOException while reading response: java.io.IOException: Server returned HTTP response code: 400 for URL: http://localhost:8983/solr/update
POSTing file gb18030-example.xml
POSTing file hd.xml
POSTing file ipod_other.xml
POSTing file ipod_video.xml
POSTing file manufacturers.xml
POSTing file mem.xml
POSTing file money.xml
POSTing file monitor.xml
POSTing file monitor2.xml
POSTing file mp500.xml
POSTing file sd500.xml
POSTing file solr.xml
POSTing file utf8-example.xml
POSTing file vidcard.xml
16 files indexed.
COMMITting Solr index changes to http://localhost:8983/solr/update..
Time spent: 0:00:00.413
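anil@anil:~/solr/solr-4.7.0/example/exampledocs$

Apache Solr has now indexed the XML files in the example/exampledocs directory. The two "400 Bad Request" warnings are for books.csv and books.json: the tool posted every file with content-type application/xml, so Solr could not read those two as XML.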
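
The log above shows books.csv and books.json being rejected with a 400 while everything was posted as application/xml. One way around that is to post each file with a content type that matches it. Here is a rough sketch using only the Python standard library. It assumes that the /update handler picks its parser from the Content-Type header and accepts commit=true as a request parameter; the function names are made up for this example.

```python
import urllib.request

# Update URL taken from the post.jar log, plus an immediate commit.
# Assumption: /update chooses a parser based on the Content-Type header.
UPDATE_URL = "http://localhost:8983/solr/update?commit=true"


def build_update_request(payload, content_type, url=UPDATE_URL):
    """Build (but do not send) a POST request carrying payload bytes."""
    return urllib.request.Request(
        url,
        data=payload,
        headers={"Content-Type": content_type},
        method="POST",
    )


def post_file(path, content_type):
    """Send one file to Solr and return the HTTP status code."""
    with open(path, "rb") as handle:
        request = build_update_request(handle.read(), content_type)
    with urllib.request.urlopen(request) as response:
        return response.status
```

With Solr running, post_file("books.csv", "text/csv") and post_file("books.json", "application/json") from the exampledocs directory would be the two calls to try.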

Testing the Indexed Data


Now that Solr is indexed with some data, we can send queries.

In the Solr admin screen in the browser, click the "Execute Query" button again. The response reports the total number of matches in "numFound" (32 here), but only the first 10 documents are returned by default.

{
  "responseHeader": {
    "status": 0,
    "QTime": 5,
    "params": {
      "indent": "true",
      "q": "*:*",
      "_": "1396205796896",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 32,
    "start": 0,
    "docs": [
      {
        "id": "GB18030TEST",
        "name": "Test with some GB18030 encoded characters",
        "features": [
          "No accents here",
          "这是一个功能",
          "This is a feature (translated)",
          "这份文件是很有光泽",
          "This document is very shiny (translated)"
        ],
        "price": 0,
        "price_c": "0,USD",
        "inStock": true,
        "_version_": 1464027600530702300
      },
      {
        "id": "SP2514N",
        "name": "Samsung SpinPoint P120 SP2514N - hard drive - 250 GB - ATA-133",
        "manu": "Samsung Electronics Co. Ltd.",
        "manu_id_s": "samsung",
        "cat": [
          "electronics",
          "hard drive"
        ],
        "features": [
          "7200RPM, 8MB cache, IDE Ultra ATA-133",
          "NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor"
        ],
        "price": 92,
        "price_c": "92,USD",
        "popularity": 6,
        "inStock": true,
        "manufacturedate_dt": "2006-02-13T15:26:37Z",
        "store": "35.0752,-97.032",
        "_version_": 1464027600570548200
      },
      {
        "id": "6H500F0",
        "name": "Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300",
        "manu": "Maxtor Corp.",
        "manu_id_s": "maxtor",
        "cat": [
          "electronics",
          "hard drive"
        ],
        "features": [
          "SATA 3.0Gb/s, NCQ",
          "8.5ms seek",
          "16MB cache"
        ],
        "price": 350,
        "price_c": "350,USD",
        "popularity": 6,
        "inStock": true,
        "store": "45.17614,-93.87341",
        "manufacturedate_dt": "2006-02-13T15:26:37Z",
        "_version_": 1464027600579985400
      },
      {
        "id": "F8V7067-APL-KIT",
        "name": "Belkin Mobile Power Cord for iPod w/ Dock",
        "manu": "Belkin",
        "manu_id_s": "belkin",
        "cat": [
          "electronics",
          "connector"
        ],
        "features": [
          "car power adapter, white"
        ],
        "weight": 4,
        "price": 19.95,
        "price_c": "19.95,USD",
        "popularity": 1,
        "inStock": false,
        "store": "45.18014,-93.87741",
        "manufacturedate_dt": "2005-08-01T16:30:25Z",
        "_version_": 1464027600588374000
      },
      {
        "id": "IW-02",
        "name": "iPod & iPod Mini USB 2.0 Cable",
        "manu": "Belkin",
        "manu_id_s": "belkin",
        "cat": [
          "electronics",
          "connector"
        ],
        "features": [
          "car power adapter for iPod, white"
        ],
        "weight": 2,
        "price": 11.5,
        "price_c": "11.50,USD",
        "popularity": 1,
        "inStock": false,
        "store": "37.7752,-122.4232",
        "manufacturedate_dt": "2006-02-14T23:55:59Z",
        "_version_": 1464027600592568300
      },
      {
        "id": "MA147LL/A",
        "name": "Apple 60 GB iPod with Video Playback Black",
        "manu": "Apple Computer Inc.",
        "manu_id_s": "apple",
        "cat": [
          "electronics",
          "music"
        ],
        "features": [
          "iTunes, Podcasts, Audiobooks",
          "Stores up to 15,000 songs, 25,000 photos, or 150 hours of video",
          "2.5-inch, 320x240 color TFT LCD display with LED backlight",
          "Up to 20 hours of battery life",
          "Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video",
          "Notes, Calendar, Phone book, Hold button, Date display, Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware, USB 2.0 compatibility, Playback speed control, Rechargeable capability, Battery level indication"
        ],
        "includes": "earbud headphones, USB cable",
        "weight": 5.5,
        "price": 399,
        "price_c": "399.00,USD",
        "popularity": 10,
        "inStock": true,
        "store": "37.7752,-100.0232",
        "manufacturedate_dt": "2005-10-12T08:00:00Z",
        "_version_": 1464027600599908400
      },
      {
        "id": "adata",
        "compName_s": "A-Data Technology",
        "address_s": "46221 Landing Parkway Fremont, CA 94538",
        "_version_": 1464027600616685600
      },
      {
        "id": "apple",
        "compName_s": "Apple",
        "address_s": "1 Infinite Way, Cupertino CA",
        "_version_": 1464027600618782700
      },
      {
        "id": "asus",
        "compName_s": "ASUS Computer",
        "address_s": "800 Corporate Way Fremont, CA 94539",
        "_version_": 1464027600619831300
      },
      {
        "id": "ati",
        "compName_s": "ATI Technologies",
        "address_s": "33 Commerce Valley Drive East Thornhill, ON L3T 7N6 Canada",
        "_version_": 1464027600620880000
      }
    ]
  }
}

The query above (*:*) matches every document in the index.

Let us try to be more specific with our queries.

In the edit box named "q", enter the word ipod and click the "Execute Query" button. You should see the following JSON response.

{
  "responseHeader": {
    "status": 0,
    "QTime": 8,
    "params": {
      "indent": "true",
      "q": "ipod",
      "_": "1396206251386",
      "wt": "json"
    }
  },
  "response": {
    "numFound": 3,
    "start": 0,
    "docs": [
      {
        "id": "IW-02",
        "name": "iPod & iPod Mini USB 2.0 Cable",
        "manu": "Belkin",
        "manu_id_s": "belkin",
        "cat": [
          "electronics",
          "connector"
        ],
        "features": [
          "car power adapter for iPod, white"
        ],
        "weight": 2,
        "price": 11.5,
        "price_c": "11.50,USD",
        "popularity": 1,
        "inStock": false,
        "store": "37.7752,-122.4232",
        "manufacturedate_dt": "2006-02-14T23:55:59Z",
        "_version_": 1464027600592568300
      },
      {
        "id": "F8V7067-APL-KIT",
        "name": "Belkin Mobile Power Cord for iPod w/ Dock",
        "manu": "Belkin",
        "manu_id_s": "belkin",
        "cat": [
          "electronics",
          "connector"
        ],
        "features": [
          "car power adapter, white"
        ],
        "weight": 4,
        "price": 19.95,
        "price_c": "19.95,USD",
        "popularity": 1,
        "inStock": false,
        "store": "45.18014,-93.87741",
        "manufacturedate_dt": "2005-08-01T16:30:25Z",
        "_version_": 1464027600588374000
      },
      {
        "id": "MA147LL/A",
        "name": "Apple 60 GB iPod with Video Playback Black",
        "manu": "Apple Computer Inc.",
        "manu_id_s": "apple",
        "cat": [
          "electronics",
          "music"
        ],
        "features": [
          "iTunes, Podcasts, Audiobooks",
          "Stores up to 15,000 songs, 25,000 photos, or 150 hours of video",
          "2.5-inch, 320x240 color TFT LCD display with LED backlight",
          "Up to 20 hours of battery life",
          "Plays AAC, MP3, WAV, AIFF, Audible, Apple Lossless, H.264 video",
          "Notes, Calendar, Phone book, Hold button, Date display, Photo wallet, Built-in games, JPEG photo playback, Upgradeable firmware, USB 2.0 compatibility, Playback speed control, Rechargeable capability, Battery level indication"
        ],
        "includes": "earbud headphones, USB cable",
        "weight": 5.5,
        "price": 399,
        "price_c": "399.00,USD",
        "popularity": 10,
        "inStock": true,
        "store": "37.7752,-100.0232",
        "manufacturedate_dt": "2005-10-12T08:00:00Z",
        "_version_": 1464027600599908400
      }
    ]
  }
}

Solr now returns only the documents that contain the word "ipod". Note that the match ignores case: the documents themselves spell it "iPod".
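
The admin screen is only a front end for an HTTP request, and the "params" section of each response shows what was sent. Below is a small sketch that builds the same kind of request from Python. The select URL assumes the default core is reachable as "collection1"; build_query_url and search are names I picked for illustration.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the example core is reachable under the name "collection1".
SELECT_URL = "http://localhost:8983/solr/collection1/select"


def build_query_url(query, rows=10, start=0):
    """Return a select URL using the parameters echoed in the responses."""
    params = {
        "q": query,
        "wt": "json",
        "indent": "true",
        "rows": rows,
        "start": start,
    }
    # urlencode takes care of escaping, e.g. "*:*" becomes "%2A%3A%2A".
    return SELECT_URL + "?" + urllib.parse.urlencode(params)


def search(query, rows=10, start=0):
    """Run the query against a live Solr and return its "response" section."""
    with urllib.request.urlopen(build_query_url(query, rows, start)) as reply:
        return json.loads(reply.read().decode("utf-8"))["response"]


print(build_query_url("ipod"))
```

With Solr running, search("ipod")["numFound"] should then agree with the count shown in the admin screen.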


Tips

1. By default, a Solr search returns 10 results. If you want to return all of them, add the rows parameter to the query, for example "&rows=100000" or some other value larger than the number of matches.
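
As an alternative to one very large rows value, you can walk through the results a page at a time using the start value that appears in every response. This is only a sketch of the arithmetic, with page_offsets being a hypothetical helper name. Each offset would be sent as the start parameter together with the same rows value.

```python
def page_offsets(num_found, rows=10):
    """Yield the start offset of each page needed to cover num_found hits."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    yield from range(0, num_found, rows)


# 32 is the numFound value from the match-all query earlier in this post.
print(list(page_offsets(32)))  # prints [0, 10, 20, 30]
```

For the 32 documents indexed here, that means four requests of up to 10 documents each.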
