ELK

2021/01/26

Use Kibana is very convinient to Discover and Visualize data, but sometimes we need to grab all data to process by machine. Here is a quick example for simple case:

import logging
from elasticsearch import Elasticsearch

def main():
    logging.basicConfig(level=logging.INFO)
    es = Elasticsearch(hosts="real-host",
                       http_auth=("es-user-name", "es-password"), # if have
                       timeout=120)
    size = 10000
    data = {
        "size": size,
        "query": {
          "match": {
            "key1": "value1"
          }
        }
    }
    res = es.search(index="index-name", body=data, scroll="5m")
    log.info("total=%s", res["hits"]["total"])
    # for i in range(total/size):
    #     res = es.scroll(scroll_id=res["_scroll_id"], scroll="5m")

if __name__ == "__main__":
    main()

if you don’t want all fields, you can add filter_path parameter for es.search and es.scroll, but make sure the value is specified like:

    fields = [
        "_scroll_id",
        "hits.total", 
        "hits.hits._source.@timestamp",
        "hits.hits._source.key1",
        "hits.hits._source.key2",
        "hits.hits._source.keyn",
    ]

if result total is larger than size, you should use es.scroll to do pagination.

references:

  • https://www.cnblogs.com/Serverlessops/articles/11747962.html

License: (CC 4.0) BY-NC-SA

Search

    Table of Contents