Using IP address data in Elasticsearch

IP addresses and IP networks are straightforward to store and search in Elasticsearch. Here are a few examples of what can be done.

The two field types commonly used for storing IP address data are:

  • ip for storing a single IP address
  • ip_range for storing IP networks or other ranges of IP addresses

Let’s create a mapping to use both of these field types:

PUT routes
{
  "settings": {
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "destination": {
        "type": "ip_range"
      },
      "nextHop": {
        "type": "ip"
      }
    }
  }
}

Each route has a name, a range of destination IP addresses, and the IP of the next hop for matching traffic.

We can add some data to test with. The requests below create two routes: Route 1 covers 192.168.1.0 to 192.168.1.127, and Route 2 covers 192.168.1.128 to 192.168.1.255. Elasticsearch works out the actual range from the CIDR notation for us:

POST routes/_doc
{
  "name": "Route 1",
  "destination": "192.168.1.0/25",
  "nextHop": "192.168.2.1"
}

POST routes/_doc
{
  "name": "Route 2",
  "destination": "192.168.1.128/25",
  "nextHop": "192.168.2.2"
}
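
To double-check which addresses those two CIDR blocks cover, here is a minimal sketch using Python's standard ipaddress module. It runs entirely outside Elasticsearch and only illustrates the arithmetic; cidr_bounds is a helper name made up for this example.

```python
import ipaddress


def cidr_bounds(cidr):
    """Return the first and last address of a CIDR block as strings."""
    network = ipaddress.ip_network(cidr)
    # Indexing a network yields its addresses in order:
    # [0] is the first address, [-1] is the last.
    return str(network[0]), str(network[-1])


if __name__ == "__main__":
    print(cidr_bounds("192.168.1.0/25"))    # ('192.168.1.0', '192.168.1.127')
    print(cidr_bounds("192.168.1.128/25"))  # ('192.168.1.128', '192.168.1.255')
```

The two /25 blocks sit back to back, so together they cover the whole of 192.168.1.0/24.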

To find the next hop for traffic destined for a given IP address, we can match the IP against the destination field using a regular term query. On an ip_range field, a term query finds the documents whose range contains the IP address:

GET routes/_search
{
  "query": {
    "term": {
      "destination": "192.168.1.15"
    }
  }
}

...

{
  ...
  "hits" : {
    ...
    "hits" : [
      {
        ...
        "_source" : {
          "name" : "Route 1",
          "destination" : "192.168.1.0/25",
          "nextHop" : "192.168.2.1"
        }
      }
    ]
  }
}
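
The matching rule is "does this range contain the address?". A rough Python sketch of the same check, using the standard ipaddress module, is below. It is not how Elasticsearch evaluates the query internally; ROUTES and matching_routes are hypothetical names that simply mirror the two test documents.

```python
import ipaddress

# In-memory copies of the two test documents (illustration only).
ROUTES = [
    {"name": "Route 1", "destination": "192.168.1.0/25", "nextHop": "192.168.2.1"},
    {"name": "Route 2", "destination": "192.168.1.128/25", "nextHop": "192.168.2.2"},
]


def matching_routes(ip, routes=ROUTES):
    """Return the routes whose destination network contains the given IP."""
    address = ipaddress.ip_address(ip)
    return [
        route
        for route in routes
        if address in ipaddress.ip_network(route["destination"])
    ]


if __name__ == "__main__":
    for route in matching_routes("192.168.1.15"):
        print(route["name"], "->", route["nextHop"])  # Route 1 -> 192.168.2.1
```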

A term query will also match against an ip field. To find routes with a given nextHop address:

GET routes/_search
{
  "query": {
    "term": {
      "nextHop": "192.168.2.2"
    }
  }
}

...

{
  ...
  "hits" : {
    ...
    "hits" : [
      {
        ...
        "_source" : {
          "name" : "Route 2",
          "destination" : "192.168.1.128/25",
          "nextHop" : "192.168.2.2"
        }
      }
    ]
  }
}

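An ip field can also be searched using a range query to find documents with an IP between two addresses, such as the first and last address of a network. The range query needs explicit bounds and won’t accept CIDR notation (a term query on an ip field does accept it, for example 192.168.2.0/24):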

GET routes/_search
{
  "query": {
    "range": {
      "nextHop": {
        "gte": "192.168.2.0",
        "lte": "192.168.2.4"
      }
    }
  }
}

...

{
  ...
  "hits" : {
    ...
    "hits" : [
      {
        ...
        "_source" : {
          "name" : "Route 1",
          "destination" : "192.168.1.0/25",
          "nextHop" : "192.168.2.1"
        }
      },
      {
        ...
        "_source" : {
          "name" : "Route 2",
          "destination" : "192.168.1.128/25",
          "nextHop" : "192.168.2.2"
        }
      }
    ]
  }
}
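
If the bounds we want really are a whole network, one option is to compute them from the CIDR block in client code before sending the request. The sketch below does that with Python's standard ipaddress module. It only builds the request body and doesn't talk to a cluster; range_query_for_network is a hypothetical helper name.

```python
import ipaddress
import json


def range_query_for_network(field, cidr):
    """Build a range query body covering every address in a CIDR block."""
    network = ipaddress.ip_network(cidr)
    return {
        "query": {
            "range": {
                field: {
                    # First and last address of the block, both inclusive.
                    "gte": str(network.network_address),
                    "lte": str(network.broadcast_address),
                }
            }
        }
    }


if __name__ == "__main__":
    body = range_query_for_network("nextHop", "192.168.2.0/24")
    print(json.dumps(body, indent=2))
```

For 192.168.2.0/24 this produces bounds of 192.168.2.0 and 192.168.2.255, which can be sent as the body of GET routes/_search.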

One edge case that cropped up recently on the Elastic discussion forum is searching for documents by their exact network, written in CIDR notation. In this example, that means looking up a specific route by its destination.

Using a term query doesn’t work in this case, because on an ip_range field it expects a single IP address rather than a network:

GET routes/_search
{
  "query": {
    "term": {
      "destination": "192.168.1.0/25"
    }
  }
}

...

{
  "error" : {
    "root_cause" : [
      {
        "type" : "query_shard_exception",
        "reason" : "failed to create query: '192.168.1.0/25' is not an IP string literal.",
        "index_uuid" : "FfFYQWtfTQOVxlJM4lUJXg",
        "index" : "routes"
      }
    ],
    ...
  }
}

To allow this type of query, add a keyword multi-field to the ip_range field and reindex the data. The keyword sub-field holds the value exactly as it was sent, so the route can then be found using the original CIDR notation:

PUT routes_fixed
{
  "settings": {
    "number_of_replicas": 0
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "keyword"
      },
      "destination": {
        "type": "ip_range",
        "fields": {
          "raw": {
            "type": "keyword"
          }
        }
      },
      "nextHop": {
        "type": "ip"
      }
    }
  }
}

POST _reindex
{
  "source": {
    "index": "routes"
  },
  "dest": {
    "index": "routes_fixed"
  }
}

GET routes_fixed/_search
{
  "query": {
    "term": {
      "destination.raw": "192.168.1.0/25"
    }
  }
}

...

{
  ...
  "hits" : {
    ...
    "hits" : [
      {
        ...
        "_source" : {
          "name" : "Route 1",
          "destination" : "192.168.1.0/25",
          "nextHop" : "192.168.2.1"
        }
      }
    ]
  }
}
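
One thing to keep in mind is that a keyword field matches the exact string. If the same network might arrive with host bits set (say 192.168.1.77/25 instead of 192.168.1.0/25), the keyword lookup will only match the spelling that was indexed. Assuming the application controls the value before it is indexed and before it is queried, a small normalisation step keeps both sides consistent. This is a sketch using Python's standard ipaddress module; normalize_cidr is a hypothetical helper.

```python
import ipaddress


def normalize_cidr(cidr):
    """Return the canonical 'network address/prefix length' form of a CIDR string."""
    # strict=False masks out any host bits instead of raising ValueError.
    return str(ipaddress.ip_network(cidr, strict=False))


if __name__ == "__main__":
    print(normalize_cidr("192.168.1.77/25"))  # 192.168.1.0/25
    print(normalize_cidr("192.168.1.0/25"))   # 192.168.1.0/25
```

Applying the same function to the value being indexed and to the value in the term query means both sides always use one spelling.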

Multi-fields are incredibly useful when you have a piece of data that needs to be queried in multiple ways. Text data is a common use case: a piece of text can be stored in a regular text field with a keyword version as a multi-field, so the exact text can be queried or used in aggregations. Indexing the same text with multiple analyzers is another common requirement.
