Updating Olympic Labs data

I originally distributed the data for the Olympic Events lab exercises with a Bash script that used Elasticsearch’s bulk API to load data into a cluster. This worked for a lot of people, but it wasn’t a cross-platform solution.

Kibana now has a tool called the ‘Data Visualizer’ that can import data in CSV and NDJSON formats. The tool is still marked as ‘Experimental’, but I have used it a few times and it appears to work well. Using it is even a requirement of the latest Elastic Certified Engineer curriculum. I am now providing a single file that can be used with this tool to ingest the Olympic Events dataset, removing the need for my Bash script.

Here are the steps to use the upload tool in Kibana:

  1. Download the dataset from here.
  2. Extract the archive.
  3. In Kibana, click on the ‘Upload a file’ button.
    Upload file tool
  4. Select or drag the events.ndjson file onto the page.
  5. You’ll be shown a summary of the file. You can accept all the defaults by clicking ‘Import’.
    File summary
  6. Specify the index name as olympic-events. Create an index pattern if you like but it’s not required for the labs. Click ‘Import’ again.
    Specify index name
  7. A new index will be created and the file will be uploaded into it.
    Index created
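
Before uploading in step 4, it can be worth confirming that the extracted file really is newline-delimited JSON, with one JSON object per line. The following is a minimal sketch of such a check, not part of the original lab material; the function name `check_ndjson` is hypothetical, and it assumes the file is UTF-8 encoded.

```python
import json
from pathlib import Path


def check_ndjson(path):
    """Return (document_count, errors) for a newline-delimited JSON file.

    Hypothetical helper: each non-blank line must parse as one JSON object.
    """
    count = 0
    errors = []
    with Path(path).open(encoding="utf-8") as handle:
        for line_number, line in enumerate(handle, start=1):
            if not line.strip():
                continue  # blank lines are skipped, not counted
            try:
                document = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append(f"line {line_number}: {exc.msg}")
                continue
            if not isinstance(document, dict):
                errors.append(f"line {line_number}: expected a JSON object")
                continue
            count += 1
    return count, errors


# Example usage (assumes events.ndjson is in the current directory):
# total, problems = check_ndjson("events.ndjson")
# print(f"{total} documents, {len(problems)} problem lines")
```

If the error list is empty, the document count gives a number to compare against what Kibana reports after the import.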

The mappings created by the tool are much better than the dynamic mappings produced when loading the data through the bulk API with the Bash script, but they’re still not optimal. The tool also adds a meta field to the mapping to indicate that the index was created using it.

New index mapping meta field
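
To look at the meta field without opening Kibana, the mapping can be fetched from Elasticsearch and the `_meta` object pulled out of it. This is a rough sketch under a few assumptions: the response has the typeless shape returned by `GET <index>/_mapping` in Elasticsearch 7.x and later, the cluster is reachable without authentication, and the helper names `mapping_meta` and `fetch_mapping` are hypothetical.

```python
import json
import urllib.request


def mapping_meta(mapping_response, index_name):
    """Return the _meta object from a GET <index>/_mapping response, or {}.

    Assumes the typeless (7.x+) response shape:
    {index_name: {"mappings": {"_meta": {...}, "properties": {...}}}}
    """
    mappings = mapping_response.get(index_name, {}).get("mappings", {})
    return mappings.get("_meta", {})


def fetch_mapping(base_url, index_name):
    """Fetch the mapping for one index (assumes no authentication or TLS setup)."""
    with urllib.request.urlopen(f"{base_url}/{index_name}/_mapping") as response:
        return json.load(response)


# Example usage (assumes a local, unsecured cluster on the default port):
# response = fetch_mapping("http://localhost:9200", "olympic-events")
# print(mapping_meta(response, "olympic-events"))
```

An empty result would suggest the index was created some other way, for example by the earlier bulk-loading script.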

I have updated the first set of exercises to use the Data Visualizer and the new source file.
