Nunc Fluens

"Nunc fluens facit tempus, nunc stans facit aeternitatem." - Boethius, _The Consolation of Philosophy_
"The value of philosophy is, in fact, to be sought largely in its very uncertainty . . . it keeps alive our sense of wonder by showing familiar things in an unfamiliar aspect." - Bertrand Russell, _The Problems of Philosophy_

When writing data to Elasticsearch, the first time it sees a field that isn't already in the field mapping, it tries to detect the data type of the field based on the JSON type and the dynamic field mapping rules (see the Elasticsearch documentation), and it adds that field to the field mapping for the index. For example, if Elasticsearch gets "field2":9999, it will create a field mapping which maps "field2" to a long type:

...
  "field2" : {
    "type" : "long"
  },
...
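
You can watch this happen by indexing a test document and then fetching the mapping (a hedged sketch - the test-dynamic index name here is hypothetical, and es_util and $espod are explained later in this post):

oc exec -c elasticsearch $espod -- es_util --query=test-dynamic/com.redhat.viaq.common?pretty -XPOST -d '{"field2":9999}'
oc exec -c elasticsearch $espod -- es_util --query=test-dynamic/_mapping?pretty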

Also, in Elasticsearch, a field's mapping is immutable in the index - it can be created but not changed without reindexing. When using origin-aggregated-logging with MERGE_JSON_LOG=true, this can cause problems when different applications create fields with the same name but with incompatible data types. For example, suppose another application writes a log with "field2" having a different data type: "field2":{"field21":"string","field22":1000} ("field2" is a JSON hash).
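
You can reproduce the failure directly (a hedged example - the index and type names follow the pattern used later in this post):

oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.03.21/com.redhat.viaq.common?pretty -XPOST -d '{"field2":{"field21":"string","field22":1000}}'

Elasticsearch rejects the document with an error like this: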

{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse [field2]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse [field2]",
    "caused_by" : {
      "type" : "json_parse_exception",
      "reason" : "Current token (START_OBJECT) not numeric, can not use numeric value accessors\n at [Source: org.elasticsearch.common.bytes.BytesReference$MarkSupportingStreamInputWrapper@6340a27e; line: 1, column: 12]"
    }
  },
  "status" : 400
}

The error message is a bit verbose, but it means the field value {"field21":"string","field22":1000} is not a numeric value (because it is a Hash), so it cannot be stored in a field already mapped as long.

How to view the dynamic mappings

If you want to see which mappings have been added dynamically, use the get field mapping API. For example, with origin-aggregated-logging:

oc exec -c elasticsearch $espod -- es_util --query=_all/_mapping/*/field/field2?pretty
{
  ".kibana" : {
    "mappings" : { }
  },
  ".operations.2019.03.21" : {
    "mappings" : {
      "com.redhat.viaq.common" : {
        "field2" : {
          "full_name" : "field2",
          "mapping" : {
            "field2" : {
              "type" : "long"
            }
          }
        }
      }
    }
  },
  ".searchguard" : {
    "mappings" : { }
  }
}

Where $espod is the name of one of your Elasticsearch pods. You may have multiple definitions for a field - one for each index in which the field is defined. In origin-aggregated-logging, we create a new index for each day and for each namespace, so you may have many such definitions, and the definitions may differ if some other index has "field2" with a different type. With the _mapping command above, the fields are listed by index. You can use a tool like jq to pick apart the JSON returned.
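
For example, here is a minimal jq sketch (assuming jq is available where you run the command) that lists the mapped type of field2 in each index:

oc exec -c elasticsearch $espod -- es_util --query=_all/_mapping/*/field/field2 | \
  jq 'to_entries[] | {index: .key, type: .value.mappings[]?.field2.mapping.field2.type}'

Indices where field2 is not mapped (an empty "mappings") produce no output, so any conflicting definitions stand out.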

Can I force Elasticsearch to store everything as a string?

Not exactly. Using an index template with dynamic_templates, you might consider adding a dynamic mapping that forces every dynamically added field to be a string:

{
    "order": 20,
    "mappings": {
      "_default_": {
        "dynamic_templates": [
        {
          "force_all_to_string": {
            "match_mapping_type": "*",
            "mapping": {
              "type": "text",
              "fields": {
                "raw": {
                  "type":  "keyword",
                  "ignore_above": 256
                }
              }
            }
          }
        }
        ]
      }
    },
    "template": ".operations.*"
}
Save the JSON above as force_string_template.json, then load it into Elasticsearch:

cat force_string_template.json | oc exec -i -c elasticsearch $espod -- es_util --query=_template/force_all_to_string -X PUT -d@-
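
You can read the template back to confirm it was stored:

oc exec -c elasticsearch $espod -- es_util --query=_template/force_all_to_string?pretty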

And that seems to work for some types:

oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.03.23/com.redhat.viaq.common?pretty -XPOST -d '{"field1":"stringval"}'
{
  "_index" : ".operations.2019.03.23",
  "_type" : "com.redhat.viaq.common",
  "_id" : "AWmiLVg1uPjkpmElglqW",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "created" : true
}

Now I can add a numeric field1 value which will be converted to a string:

oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.03.23/com.redhat.viaq.common?pretty -XPOST -d '{"field1":1000}'
{
  "_index" : ".operations.2019.03.23",
  "_type" : "com.redhat.viaq.common",
  "_id" : "AWmiLdOnuPjkpmElglqY",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "created" : true
}
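
To confirm the numeric value really was coerced, check the mapping for field1 in that index - it should show text with the raw keyword subfield from the template:

oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.03.23/_mapping/com.redhat.viaq.common/field/field1?pretty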

However, if field1 is a Hash (or an Array of Hashes):

oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.03.23/com.redhat.viaq.common?pretty -XPOST -d '{"field1":{"field11":"value"}}'
{
  "error" : {
    "root_cause" : [
      {
        "type" : "mapper_parsing_exception",
        "reason" : "failed to parse [field1]"
      }
    ],
    "type" : "mapper_parsing_exception",
    "reason" : "failed to parse [field1]",
    "caused_by" : {
      "type" : "illegal_state_exception",
      "reason" : "Can't get text on a START_OBJECT at 1:11"
    }
  },
  "status" : 400
}

Elasticsearch cannot convert the Hash value to a string, so the write is rejected even with the force-to-string template in place.
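
One hedged workaround sketch: serialize the hash to a JSON string on the client side before sending, so that Elasticsearch only ever sees a string value (the escaping below is an illustration - origin-aggregated-logging does not do this for you):

oc exec -c elasticsearch $espod -- es_util --query=.operations.2019.03.23/com.redhat.viaq.common?pretty -XPOST -d '{"field1":"{\"field11\":\"value\"}"}'

This indexes successfully because the value is a plain JSON string, at the cost of losing the ability to query field11 as a structured field.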