Elasticsearch

Python Interface

Parameters

Parameter: Details
hosts: Array of host objects, each containing the keys host and port. The default host is ‘localhost’ and the default port is 9200. A sample entry looks like [{"host": "ip of es server", "port": 9200}].
sniff_on_start: Boolean. If true, the client sniffs nodes on startup; sniffing means fetching the list of nodes in the Elasticsearch cluster.
sniff_on_connection_fail: Boolean. If true, a connection failure while the client is active triggers sniffing.
sniffer_timeout: Time in seconds between successive sniffs.
sniff_timeout: Timeout in seconds for a single sniff request.
retry_on_timeout: Boolean. If true, a timeout makes the client retry on a different Elasticsearch node instead of raising an error.
http_auth: Basic HTTP authentication credentials, given in the form username:password.
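The parameters above can be collected into a plain dictionary before the client is created. The following is a minimal sketch, not part of the library: parse_hosts and client_options are hypothetical helper names, and the addresses and credentials are placeholders. It only builds the data structures, so it runs without a cluster.

```python
DEFAULT_PORT = 9200  # default Elasticsearch HTTP port


def parse_hosts(addresses):
    """Turn 'host' or 'host:port' strings into the list-of-dicts form used by `hosts`."""
    hosts = []
    for address in addresses:
        host, sep, port = address.partition(":")
        hosts.append({"host": host, "port": int(port) if sep else DEFAULT_PORT})
    return hosts


def client_options(addresses, username=None, password=None):
    """Collect the keyword arguments from the parameter list (hypothetical helper)."""
    options = {
        "hosts": parse_hosts(addresses),
        "sniff_on_start": True,
        "sniff_on_connection_fail": True,
        "sniffer_timeout": 60,  # seconds between sniffs
        "sniff_timeout": 10,    # seconds allowed for a single sniff request
        "retry_on_timeout": True,
    }
    if username is not None:
        # Basic HTTP authentication in the username:password form
        options["http_auth"] = "%s:%s" % (username, password)
    return options


options = client_options(["10.0.0.1:9201", "10.0.0.2"], "user", "secret")
print(options["hosts"])
```

Assuming a client version that accepts these keyword arguments, the dictionary could then be unpacked into the constructor as Elasticsearch(**options).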

Indexing a Document (i.e. Adding a Sample)

Install the necessary Python library via:

$ pip install elasticsearch

Connect to Elasticsearch, create a document (e.g. a data entry) and “index” the document using Elasticsearch.

from datetime import datetime
from elasticsearch import Elasticsearch

# Connect to Elasticsearch using default options (localhost:9200)
es = Elasticsearch()

# Define a simple Dictionary object that we'll index to make a document in ES
doc = {
    'author': 'kimchy',
    'text': 'Elasticsearch: cool. bonsai cool.',
    'timestamp': datetime.now(),
}

# Write a document
res = es.index(index="test-index", doc_type='tweet', id=1, body=doc)
# 'result' is "created" for a new document and "updated" when it replaces one
print(res['result'])

# Fetch the document
res = es.get(index="test-index", doc_type='tweet', id=1)
print(res['_source'])

# Refresh the specified index (or indices) to guarantee that the document
# is searchable (avoids race conditions with near real-time search)
es.indices.refresh(index="test-index")

# Search for the document
res = es.search(index="test-index", body={"query": {"match_all": {}}})
print("Got %d Hits:" % res['hits']['total'])

# Show each "hit" or search response (max of 10 by default)
for hit in res['hits']['hits']:
    print("%(timestamp)s %(author)s: %(text)s" % hit["_source"])
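The shape of the search response depends on the server version: the example above treats res['hits']['total'] as an integer, whereas Elasticsearch 7 and later return an object with value and relation keys. The sketch below handles both shapes. The names total_hits and format_hits are made up for illustration, and the sample response is hand-written to mirror the structure used above; it is not output captured from a server.

```python
def total_hits(response):
    """Return the hit count whether 'total' is an int or a {'value': ...} object."""
    total = response["hits"]["total"]
    if isinstance(total, dict):
        return total["value"]
    return total


def format_hits(response):
    """Format each hit the same way as the loop in the example above."""
    lines = []
    for hit in response["hits"]["hits"]:
        lines.append("%(timestamp)s %(author)s: %(text)s" % hit["_source"])
    return lines


# Hand-written stand-in for a search response (assumed shape, not real output)
sample = {
    "hits": {
        "total": 1,
        "hits": [
            {
                "_source": {
                    "author": "kimchy",
                    "text": "Elasticsearch: cool. bonsai cool.",
                    "timestamp": "2017-01-01T00:00:00",
                }
            }
        ],
    }
}

print("Got %d Hits:" % total_hits(sample))
for line in format_hits(sample):
    print(line)
```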

Connection to a cluster

# hosts is a list of nodes, e.g. [{"host": "localhost", "port": 9200}]
es = Elasticsearch(hosts=hosts, sniff_on_start=True, sniff_on_connection_fail=True, sniffer_timeout=60, sniff_timeout=10, retry_on_timeout=True)

Creating an empty index and setting the mapping

In this example, we create an empty index (we index no documents in it) by defining its mapping.

First, we create an Elasticsearch instance and define the mapping of our choice. Next, we check whether the index exists; if it does not, we create it by passing the index and body parameters, which contain the index name and the body of the mapping, respectively.

from elasticsearch import Elasticsearch

# create an Elasticsearch instance
es = Elasticsearch()
# name the index
index_name = "my_index"
# define the mapping
mapping = {
    "mappings": {
        "my_type": {
            "properties": {
                "foo": {'type': 'text'},
                "bar": {'type': 'keyword'}
            }
        }
    }
}

# create an empty index with the defined mapping - no documents added
if not es.indices.exists(index=index_name):
    res = es.indices.create(
        index=index_name,
        body=mapping
    )
    # check the response of the request
    print(res)
    # check the result of the mapping on the index
    print(es.indices.get_mapping(index=index_name))
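Mapping types such as my_type belong to older Elasticsearch releases; from version 7 on, the properties block sits directly under mappings by default. As a hedged sketch, a small builder can produce either layout. The function name build_mapping is an assumption for illustration, not a library call.

```python
def build_mapping(properties, doc_type=None):
    """Build an index-creation body, with or without a mapping type level."""
    if doc_type is None:
        # Typeless layout: properties directly under "mappings"
        return {"mappings": {"properties": dict(properties)}}
    # Typed layout, as in the example above
    return {"mappings": {doc_type: {"properties": dict(properties)}}}


fields = {"foo": {"type": "text"}, "bar": {"type": "keyword"}}

typed_body = build_mapping(fields, doc_type="my_type")
typeless_body = build_mapping(fields)
print(typed_body)
print(typeless_body)
```

Either dictionary would be passed as the body argument of es.indices.create, depending on which layout the target cluster accepts.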

Partial Update and Update by query

Partial Update: Used when only part of a document needs to change. In the following example, the field name of the document with id doc_id is set to ‘John’. Note that if the field is missing, it is simply added to the document.

doc = {
    "doc": {
        "name": "John"
    }
}
es.update(index='index_name',
          doc_type='doc_name',
          id='doc_id',
          body=doc)
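To make the merge behaviour concrete, the sketch below imitates it locally on plain dictionaries: existing fields are overwritten and missing fields are added. It also assumes that nested objects are merged recursively. This is an illustration of the semantics only, not the server's implementation, and merge_partial is a hypothetical name.

```python
import copy


def merge_partial(source, partial):
    """Return a copy of `source` with the fields of `partial` merged in."""
    merged = copy.deepcopy(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects are merged field by field
            merged[key] = merge_partial(merged[key], value)
        else:
            # Scalars and lists replace the old value; missing fields are added
            merged[key] = copy.deepcopy(value)
    return merged


stored = {"age": 30, "address": {"city": "Paris"}}
update_body = {"doc": {"name": "John", "address": {"zip": "75001"}}}

print(merge_partial(stored, update_body["doc"]))
```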

Update by query: Used to update all documents that satisfy a condition. In the following example, we update the age of every document whose name field matches ‘John’.

q = {
  "script": {
    "inline": "ctx._source.age=23",
    "lang": "painless"
  },
  "query": {
    "match": {
        "name": "John"
    }
  }
}

es.update_by_query(body=q, 
                   doc_type='doc_name', 
                   index='index_name')
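When the new value comes from a variable, it can be passed through the script's params rather than written into the script text. The sketch below builds such a request body. build_age_update is a hypothetical helper, and the script_key argument exists because newer Elasticsearch releases name the script text key source instead of inline; check which one your version expects.

```python
def build_age_update(name, age, script_key="inline"):
    """Build an update-by-query body that sets `age` on documents matching `name`."""
    return {
        "script": {
            # The value is read from params instead of being embedded in the script
            script_key: "ctx._source.age = params.age",
            "lang": "painless",
            "params": {"age": age},
        },
        "query": {
            "match": {
                "name": name
            }
        },
    }


q = build_age_update("John", 23)
print(q)
```

The resulting dictionary is passed as the body argument, exactly like q in the example above.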

This modified text is an extract of the original Stack Overflow Documentation created by the contributors and released under CC BY-SA 3.0.