Data Verification with Elasticsearch 6, Part 2: Registering Mappings

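The keyword type can store strings, just like text.
The major difference from text is that keyword is meant for strings that should be searched by exact match.
Data stored in a keyword field is not split into terms by an analyzer.
For example, values such as email addresses, website URLs, and tag names used for tag classification are usually ones you want to store and search without splitting.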
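
As a rough illustration of the point above, here is a minimal sketch of how keyword fields and an exact-match query could be written. The field names (email, site_url, tags, description) are hypothetical and do not appear in this article; the term query is the standard way to compare a value as-is without analysis.

```python
import json

# Hypothetical field names for illustration; not part of this article's indices.
mapping_properties = {
    "email": {"type": "keyword"},      # stored and matched as one whole string
    "site_url": {"type": "keyword"},
    "tags": {"type": "keyword"},
    "description": {"type": "text"},   # analyzed (split into terms) for full-text search
}


def exact_match_query(field, value):
    """Build a term query body, which looks up the value without analyzing it."""
    return {"query": {"term": {field: value}}}


body = exact_match_query("email", "user@example.com")
print(json.dumps(body, indent=2))
```

A term query against a text field often does not behave as expected, because the indexed terms have already been split by the analyzer, which is why keyword is the safer choice for this kind of data.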

Numeric data types

long, short, integer, float

Date data type

date

boolean type

Only true or false can be specified.
Up to Elasticsearch 5.x, a boolean field accepted many values such as 0/1, on/off, yes/no, and true/false, but from Elasticsearch 6.x only true/false is accepted.
The strings "true"/"false" can also be used.

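If existing data still uses the older boolean spellings, one option is to normalize the values on the client before indexing into Elasticsearch 6. The following is only a sketch under that assumption; to_strict_boolean is a hypothetical helper, not something Elasticsearch provides, and it covers only the values listed above.

```python
# Values mentioned above as accepted by 5.x boolean fields.
_TRUE_VALUES = {"1", "on", "yes", "true"}
_FALSE_VALUES = {"0", "off", "no", "false"}


def to_strict_boolean(value):
    """Convert a legacy boolean-like value into a real True/False.

    Raises ValueError for anything outside the known spellings so that
    unexpected data is noticed instead of being silently coerced.
    """
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE_VALUES:
        return True
    if text in _FALSE_VALUES:
        return False
    raise ValueError(f"not a recognised boolean value: {value!r}")


print(to_strict_boolean("on"), to_strict_boolean(0))
```

Rejecting unknown values is a deliberate choice in this sketch: it is usually better to find bad input during migration than after it has been indexed.
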
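Defining the mapping

In Elasticsearch, the definition of the data structure and data types within a document is called a mapping.

A mapping can be defined by writing a "mappings" clause in the request body.

Inside the "mappings" clause, you specify the document type name, and under it a "properties" clause that defines each field (field name and data type).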
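
To make the nesting concrete, here is a small sketch that assembles a request body with a "mappings" clause in the Elasticsearch 6 layout (type name, then "properties"). The helper build_index_body is hypothetical, and the two fields are only a subset chosen for illustration.

```python
import json


def build_index_body(type_name, properties, analysis=None):
    """Assemble an index-creation body: mappings -> type name -> properties."""
    body = {"mappings": {type_name: {"properties": properties}}}
    if analysis is not None:
        body["settings"] = {"analysis": analysis}
    return body


body = build_index_body(
    "type",
    {
        "name": {"type": "text", "analyzer": "my_kuromoji_analyzer"},
        "closed": {"type": "boolean"},
    },
    analysis={
        "analyzer": {
            "my_kuromoji_analyzer": {
                "type": "custom",
                "tokenizer": "kuromoji_tokenizer",
            }
        }
    },
)
print(json.dumps(body, indent=2))
```

A body like this can be sent when the index is created, so that settings and mappings are registered in one request. This article instead creates the index first and registers the mapping in a second request.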

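Note that once a mapping has been defined, an existing field definition (for example, its data type) cannot be changed.
To change it, you have to delete the whole index and recreate it (and, of course, migrate the data).

For that reason, mappings need to be designed with some care.
Adding new fields, however, is possible.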
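
The two cases above can be sketched as request bodies. This is a minimal sketch under assumptions: the field name opened_on and the index name restaurant_v2 are hypothetical, and using the _reindex API to copy documents into a newly created index is one possible way to handle the data migration, not something this article performs.

```python
import json


def add_field_body(field_name, field_type):
    """Body for PUT /<index>/_mapping/<type> that adds one new field."""
    return {"properties": {field_name: {"type": field_type}}}


def reindex_body(source_index, dest_index):
    """Body for POST /_reindex, copying documents into another index."""
    return {"source": {"index": source_index}, "dest": {"index": dest_index}}


# Adding a field works on an existing index (hypothetical field name).
print(json.dumps(add_field_body("opened_on", "date")))

# Changing an existing field needs a new index with the corrected mapping,
# followed by a copy of the data (hypothetical destination index name).
print(json.dumps(reindex_body("restaurant", "restaurant_v2")))
```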

This time, let's create the mappings from files.

First, create the following file (kuromoji_setting.json).


{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_kuromoji_analyzer": {
          "type": "custom",
          "tokenizer": "kuromoji_tokenizer"
        }
      }
    }
  }
}

Since we are working with Japanese text, we use the file above to create the indices.

First, create the restaurant index.


$ curl -H "Content-Type: application/json" -X PUT 'http://localhost:9200/restaurant?pretty' -d @kuromoji_setting.json
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "restaurant"
}
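
Once the index exists, the custom analyzer can be checked with the _analyze API. The sketch below only builds the path and body for such a call; build_analyze_call is a hypothetical helper and the sample text is made up for illustration. Actually sending it assumes a running node with the analysis-kuromoji plugin installed, since kuromoji_tokenizer comes from that plugin.

```python
import json


def build_analyze_call(index, analyzer, text):
    """Return the path and JSON body for a POST to the index's _analyze API."""
    path = f"/{index}/_analyze"
    body = {"analyzer": analyzer, "text": text}
    return path, body


path, body = build_analyze_call(
    "restaurant",
    "my_kuromoji_analyzer",
    "東京都のレストラン",  # sample text, made up for this sketch
)
print(path)
print(json.dumps(body, ensure_ascii=False))
```

The response lists the tokens produced by the analyzer, which is a quick way to confirm that Japanese text is split as intended before any documents are indexed.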

Next, create the rating index as well.


$ curl -H "Content-Type: application/json" -X PUT 'http://localhost:9200/rating?pretty' -d @kuromoji_setting.json
{
  "acknowledged" : true,
  "shards_acknowledged" : true,
  "index" : "rating"
}

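Next, I prepared the following two files for creating the mappings: mapping_restaurant.json and mapping_rating.json, shown in that order.

We will create the mappings with these.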


{
  "properties": {
    "name": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "name_alphabet": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "name_kana": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "address": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "description": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "purpose": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "category": {
      "type": "text",
      "analyzer": "whitespace"
    },
    "photo_count": {
      "type": "long"
    },
    "menu_count": {
      "type": "long"
    },
    "access_count": {
      "type": "long"
    },
    "closed": {
      "type": "boolean"
    },
    "location": {
      "type": "geo_point",
      "store": "true"
    }
  }
}


{
  "properties": {
    "rating_id": {
      "type": "long"
    },
    "restaurant_id": {
      "type": "long"
    },
    "total": {
      "type": "long"
    },
    "food": {
      "type": "long"
    },
    "service": {
      "type": "long"
    },
    "atmosphere": {
      "type": "long"
    },
    "cost_performance": {
      "type": "long"
    },
    "title": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "body": {
      "type": "text",
      "analyzer": "my_kuromoji_analyzer"
    },
    "purpose": {
      "type": "long"
    },
    "created_on": {
      "type": "date"
    }
  }
}

Now let's register these mappings.

First, "restaurant".


$ curl -H "Content-Type: application/json" -X PUT 'http://localhost:9200/restaurant/_mapping/type?pretty' -d @mapping_restaurant.json
{
  "acknowledged" : true
}

The mapping was registered under the index name "restaurant" with the type name "type".

I glossed over it, but Elasticsearch 6 requires the "Content-Type" header to be set, so it is added to the request.
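
For reference, the same request could be built from Python's standard library roughly as follows. This is a sketch, not a definitive client: build_put_mapping_request is a hypothetical helper, the mapping is a one-field stand-in for the real file, and the request is only constructed here, not sent. With curl, -d on its own sends application/x-www-form-urlencoded, which is why the header has to be set explicitly.

```python
import json
import urllib.request


def build_put_mapping_request(index, type_name, mapping,
                              base_url="http://localhost:9200"):
    """Build (but do not send) a PUT mapping request with an explicit Content-Type."""
    return urllib.request.Request(
        f"{base_url}/{index}/_mapping/{type_name}?pretty",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# One-field stand-in for the contents of mapping_restaurant.json.
mapping = {"properties": {"closed": {"type": "boolean"}}}
request = build_put_mapping_request("restaurant", "type", mapping)
print(request.get_method(), request.full_url)
print(request.get_header("Content-type"))

# Sending it needs a running Elasticsearch node:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```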

Let's check the state of this index.


$ curl -H "Content-Type: application/json" -X GET 'localhost:9200/restaurant?pretty'
{
  "restaurant" : {
    "aliases" : { },
    "mappings" : {
      "type" : {
        "properties" : {
          "access_count" : {
            "type" : "long"
          },
          "address" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "category" : {
            "type" : "text",
            "analyzer" : "whitespace"
          },
          "closed" : {
            "type" : "boolean"
          },
          "description" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "location" : {
            "type" : "geo_point",
            "store" : true
          },
          "menu_count" : {
            "type" : "long"
          },
          "name" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "name_alphabet" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "name_kana" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "photo_count" : {
            "type" : "long"
          },
          "purpose" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "5",
        "provided_name" : "restaurant",
        "creation_date" : "1534134035460",
        "analysis" : {
          "analyzer" : {
            "my_kuromoji_analyzer" : {
              "type" : "custom",
              "tokenizer" : "kuromoji_tokenizer"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "PMxbO-DkRb2ae9pOKNdRkQ",
        "version" : {
          "created" : "6030299"
        }
      }
    }
  }
}

It appears to have been registered successfully.

Now register the mapping for "rating" in the same way.


$ curl -H "Content-Type: application/json" -X PUT 'http://localhost:9200/rating/_mapping/type?pretty' -d @mapping_rating.json
{
  "acknowledged" : true
}

It is registered under the index name "rating" with the type name "type".

Check the state of the index in the same way.


$ curl -H "Content-Type: application/json" -X GET 'localhost:9200/rating?pretty'
{
  "rating" : {
    "aliases" : { },
    "mappings" : {
      "type" : {
        "properties" : {
          "atmosphere" : {
            "type" : "long"
          },
          "body" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "cost_performance" : {
            "type" : "long"
          },
          "created_on" : {
            "type" : "date"
          },
          "food" : {
            "type" : "long"
          },
          "purpose" : {
            "type" : "long"
          },
          "rating_id" : {
            "type" : "long"
          },
          "restaurant_id" : {
            "type" : "long"
          },
          "service" : {
            "type" : "long"
          },
          "title" : {
            "type" : "text",
            "analyzer" : "my_kuromoji_analyzer"
          },
          "total" : {
            "type" : "long"
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "number_of_shards" : "5",
        "provided_name" : "rating",
        "creation_date" : "1534134083619",
        "analysis" : {
          "analyzer" : {
            "my_kuromoji_analyzer" : {
              "type" : "custom",
              "tokenizer" : "kuromoji_tokenizer"
            }
          }
        },
        "number_of_replicas" : "1",
        "uuid" : "0LM_HIjpTVKFMgoeBnFCnA",
        "version" : {
          "created" : "6030299"
        }
      }
    }
  }
}

With that, the mappings have been registered and verified.

The overall state of the indices is as follows.


$ curl 'http://localhost:9200/_cat/indices?v'
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   rating     0LM_HIjpTVKFMgoeBnFCnA   5   1          0            0      1.2kb          1.2kb
yellow open   restaurant PMxbO-DkRb2ae9pOKNdRkQ   5   1          0            0      1.2kb          1.2kb
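
If you want to check this output from a script, the table can be split into rows. The sketch below parses the text shown above; parse_cat_indices is a hypothetical helper, and it assumes that no column is empty, since it splits on whitespace.

```python
CAT_OUTPUT = """\
health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   rating     0LM_HIjpTVKFMgoeBnFCnA   5   1          0            0      1.2kb          1.2kb
yellow open   restaurant PMxbO-DkRb2ae9pOKNdRkQ   5   1          0            0      1.2kb          1.2kb
"""


def parse_cat_indices(text):
    """Turn the whitespace-aligned _cat/indices table into a list of dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


rows = parse_cat_indices(CAT_OUTPUT)
for row in rows:
    print(row["index"], row["health"], row["docs.count"])
```

The yellow health here is most likely because each index has one replica configured (rep is 1) and, on a single-node setup, replica shards have nowhere to be allocated. For anything beyond a quick check, the _cat APIs also accept format=json, which avoids text parsing altogether.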

And the indexing stats for each index are as follows.


$ curl 'http://localhost:9200/restaurant/_stats/indexing?pretty'
{
  "_shards" : {
    "total" : 10,
    "successful" : 5,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "indexing" : {
        "index_total" : 0,
        "index_time_in_millis" : 0,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 0,
        "delete_time_in_millis" : 0,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      }
    },
    "total" : {
      "indexing" : {
        "index_total" : 0,
        "index_time_in_millis" : 0,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 0,
        "delete_time_in_millis" : 0,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      }
    }
  },
  "indices" : {
    "restaurant" : {
      "primaries" : {
        "indexing" : {
          "index_total" : 0,
          "index_time_in_millis" : 0,
          "index_current" : 0,
          "index_failed" : 0,
          "delete_total" : 0,
          "delete_time_in_millis" : 0,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        }
      },
      "total" : {
        "indexing" : {
          "index_total" : 0,
          "index_time_in_millis" : 0,
          "index_current" : 0,
          "index_failed" : 0,
          "delete_total" : 0,
          "delete_time_in_millis" : 0,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        }
      }
    }
  }
}


$ curl 'http://localhost:9200/rating/_stats/indexing?pretty'
{
  "_shards" : {
    "total" : 10,
    "successful" : 5,
    "failed" : 0
  },
  "_all" : {
    "primaries" : {
      "indexing" : {
        "index_total" : 0,
        "index_time_in_millis" : 0,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 0,
        "delete_time_in_millis" : 0,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      }
    },
    "total" : {
      "indexing" : {
        "index_total" : 0,
        "index_time_in_millis" : 0,
        "index_current" : 0,
        "index_failed" : 0,
        "delete_total" : 0,
        "delete_time_in_millis" : 0,
        "delete_current" : 0,
        "noop_update_total" : 0,
        "is_throttled" : false,
        "throttle_time_in_millis" : 0
      }
    }
  },
  "indices" : {
    "rating" : {
      "primaries" : {
        "indexing" : {
          "index_total" : 0,
          "index_time_in_millis" : 0,
          "index_current" : 0,
          "index_failed" : 0,
          "delete_total" : 0,
          "delete_time_in_millis" : 0,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        }
      },
      "total" : {
        "indexing" : {
          "index_total" : 0,
          "index_time_in_millis" : 0,
          "index_current" : 0,
          "index_failed" : 0,
          "delete_total" : 0,
          "delete_time_in_millis" : 0,
          "delete_current" : 0,
          "noop_update_total" : 0,
          "is_throttled" : false,
          "throttle_time_in_millis" : 0
        }
      }
    }
  }
}
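
The stats responses above are long, so here is a small sketch that pulls out the few numbers worth watching. The JSON is a trimmed copy of the response shown above, and summarize is a hypothetical helper. Reading the gap between total and successful shards as unallocated replica copies is my assumption, based on this being a single-node setup.

```python
import json

# Trimmed copy of the _stats/indexing response shown above.
STATS_JSON = """
{
  "_shards": {"total": 10, "successful": 5, "failed": 0},
  "_all": {
    "primaries": {
      "indexing": {"index_total": 0, "index_failed": 0}
    }
  }
}
"""


def summarize(stats):
    """Extract shard coverage and primary indexing counters from a stats response."""
    shards = stats["_shards"]
    indexing = stats["_all"]["primaries"]["indexing"]
    return {
        # Shard copies that neither succeeded nor failed (assumed: unallocated replicas).
        "shards_without_result": shards["total"] - shards["successful"] - shards["failed"],
        "indexed_docs": indexing["index_total"],
        "failed_docs": indexing["index_failed"],
    }


summary = summarize(json.loads(STATS_JSON))
print(summary)
```

All indexing counters are still zero at this point, which matches the fact that only the indices and mappings have been created and no documents have been indexed yet.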