JasonTrue · March 11, 2025 03:37 · Sep 6, 2021 · Apr 2, 2019 · Sep 27, 2018 · Sep 27, 2018
diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -21,7 +21,7 @@ When you change the search_data hash structure, you'll need to reindex that mode
 
 # Searching
 
-In all recent versions of Postgres, you need to explicitly specify the fields you'll search.
+In all recent versions of Elasticsearch, you need to explicitly specify the fields you'll search.
 
 Your search should look something like this:
 

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -159,7 +159,7 @@ And may require using an explicit search body.
 
 However one solution to avoid this complexity would be to use the exact matching above, and index the field as lowercase, and maybe to pre-filter strings that look like email addresses in queries to lower-case.
 
-## with case insensitive
+## Case sensitivity
 By default searches are case insensitive. To override that for everything, you can alter the searchkick call `searchkick case_sensitive: [:field :list]`, or use exact matching:
 
     UserCourse.search(query,

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -21,18 +21,67 @@ When you change the search_data hash structure, you'll need to reindex that mode
 
 # Searching
 
+In all recent versions of Postgres, you need to explicitly specify the fields you'll search.
+
 Your search should look something like this:
 
 ```ModelName.search(query, fields: ['stringified_id', 'name', 'description', ...])```
 
-In all recent versions of Postgres, you need to explicitly specify the fields you'll search.
 
 # Common Indexing challenges, common solutions
 
-## We want to be able to search by ID (in full-text queries).
+## We want to be able to search by ID (in full-text queries)
 
-When you conduct a search with ElasticSearch, you specify which fields you want to query.
+By default, an integer field can only be searched as an integer, but if you coerce the field to be a string it's searchable with full text search.
+
+    def search_data
+      {
+        id: id,
+        stringified_id: id.to_s,
+      }
+    end
 
+Your search should look something like this:
+
+```ModelName.search(query, fields: ['stringified_id', 'name', 'description', ...])```
+
+It's worth noting that because you can index non-string types (including arrays of non-string types), it sometimes comes in handy to do more of your searching/filtering in Elasticsearch than in Postgres. You can combine a full text query with filters on specific fields.
+
+    def search_data
+        {
+          blog_id: blog_id,
+          author: user.name,
+          author_id: user.id,
+          publish_year: publish_at.year,
+          publish_month: publish_at.month,
+          publish_day: publish_at.day,
+          publish_at: publish_at,
+          created_at: created_at,
+          updated_at: updated_at,
+          tags: tag_list,
+          story: story,
+          title: title,
+          approved: approved
+        }
+      end
+
+Then a flexible, type-aware search that still does full text search on some fields, like title and story: 
+
+    search_params = { approved: true, publish_at: { lte: 'now/m' } }
+    search_params = search_params.merge(blog_id: @blog.id) if @blog.present?
+    search_params = search_params.merge(publish_year: @year) if @year.present?
+    search_params = search_params.merge(publish_month: @month) if @month.present?
+    search_params = search_params.merge(publish_day: @day) if @day.present?
+    search_params = search_params.merge(tags: {all: @tags}) if @tags.present?
+
+    if @query.present?
+      @posts = Post.search(@query, fields: [:title, :story], where: search_params, page: params[:page])
+    else
+      @posts = Post.search(fields: [:title, :story], where: search_params, page: params[:page], per_page: 20,
+                                 order: [{publish_at: :desc}])
+    end
+    logger.info({ query: @query, params: search_params })      
+      
 
 ## We want to eager load associations so that it's not so expensive to update the index.
 

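The filter-building step in the revision above can be pulled into a small plain-Ruby helper. This is only a sketch: `build_search_params` and its keyword arguments are hypothetical names, not part of searchkick, and it uses `nil?`/`empty?` rather than Rails' `present?` so that it runs without Rails loaded.

```ruby
# Hypothetical helper (not a searchkick API): builds the hash that the
# example above passes as `where:`. Optional filters are added only when
# a value was actually supplied.
def build_search_params(blog_id: nil, year: nil, month: nil, day: nil, tags: nil)
  params = { approved: true, publish_at: { lte: "now/m" } }
  params[:blog_id] = blog_id unless blog_id.nil?
  params[:publish_year] = year unless year.nil?
  params[:publish_month] = month unless month.nil?
  params[:publish_day] = day unless day.nil?
  # `all:` requires every listed tag to be present on the document.
  params[:tags] = { all: tags } unless tags.nil? || tags.empty?
  params
end
```

The returned hash would then be handed to the search call as `where: build_search_params(...)`, alongside the `fields:` list used for the full text part of the query.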
diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -19,7 +19,7 @@ In practice, you'll need to customize what gets indexed. This is done by definin
 
 When you change the search_data hash structure, you'll need to reindex that model. You can do that in the rails console by typing `Model.reindex` but you can also use the rake task `searchkick:reindex:all`, or index just one specific model.
 
-#Searching
+# Searching
 
 Your search should look something like this:
 

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -107,6 +107,8 @@ This will require something like:
     searchkick merge_mappings: true, mappings: {...}
 
 And may require using an explicit search body.
+
+However one solution to avoid this complexity would be to use the exact matching above, and index the field as lowercase, and maybe to pre-filter strings that look like email addresses in queries to lower-case.
 
 ## with case insensitive
 By default searches are case insensitive. To override that for everything, you can alter the searchkick call `searchkick case_sensitive: [:field :list]`, or use exact matching:
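
The lower-casing idea added in this revision (pre-filtering query strings that look like email addresses) could be sketched in plain Ruby as below. The `EMAIL_LIKE` pattern and the `normalize_query` name are assumptions made for illustration; the pattern is deliberately loose and is not a full email validator.

```ruby
# Hypothetical pre-filter (not part of searchkick): lower-cases only the
# whitespace-separated tokens that look like email addresses, leaving
# the rest of the query untouched.
EMAIL_LIKE = /\A[^@\s]+@[^@\s]+\.[^@\s]+\z/

def normalize_query(query)
  query.split.map { |token| token.match?(EMAIL_LIKE) ? token.downcase : token }.join(" ")
end
```

This only helps if the indexed field is lower-cased as well, for example by calling `downcase` on the value inside `search_data`.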

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -108,14 +108,14 @@ This will require something like:
 
 And may require using an explicit search body.
 
-# with case insensitive
+## with case insensitive
 By default searches are case insensitive. To override that for everything, you can alter the searchkick call `searchkick case_sensitive: [:field :list]`, or use exact matching:
 
     UserCourse.search(query,
       fields: [{my_field: :exact}, :other_field]
     )
       
-# Japanese-aware indexing
+## Japanese-aware indexing
 While there's reasonable support out of the box for Japanese search, you can get additional features with the elasticsearch analysis-kuromoji plugin.
 
 searchkick language: "japanese"
@@ -124,14 +124,14 @@ If you go down this route, and want to support multiple analyzers, you need to u
 
 See https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html for some possible options, and the searchkick docs for how to do custom mappings and custom/advanced search.
 
-# Any notice of combinations above
+## Any notice of combinations above
 Generally combinations are supported by choosing the right field to query. Most of the parameters that normally take a symbol can be replaced with a hash from that symbol to various options. (
 https://github.com/ankane/searchkick will have better examples than I can provide).
 
-# Not compatible with each other
+## Not compatible with each other
 In principle, you can create several fields that have their own analyzers and behaviors. When you build up the Search call, you can combine options. I'm not aware of specific incompatibilities but relevancy weighting may appear better or worse depending on the user's expectations. So for example, if you have a dilemma about how to search something, you could potentially use very dissimilar pseudo_fields with different search rules, and just include all of them, with potentially different boosting rules, in your search call.
 
-# Some parameters update frequently or require a lot of CPU time to reindex
+## Some parameters update frequently or require a lot of CPU time to reindex
 In conjunction with a scheduled background job, you can call ModelName.reindex(:custom_reindexer) and have a method like that returns only the fields that need special treatment.
 
     def custom_reindexer

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -125,11 +125,11 @@ If you go down this route, and want to support multiple analyzers, you need to u
 See https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html for some possible options, and the searchkick docs for how to do custom mappings and custom/advanced search.
 
 # Any notice of combinations above
-Generally combinations 
-
+Generally combinations are supported by choosing the right field to query. Most of the parameters that normally take a symbol can be replaced with a hash from that symbol to various options. (
+https://github.com/ankane/searchkick will have better examples than I can provide).
 
 # Not compatible with each other
-In principle, you can create several fields that have their own analyzers. When you build up the Search call, you can combine options. I'm not aware of specific incompatibilities but relevancy weighting may appear better or worse depending on the user's expectations.
+In principle, you can create several fields that have their own analyzers and behaviors. When you build up the Search call, you can combine options. I'm not aware of specific incompatibilities but relevancy weighting may appear better or worse depending on the user's expectations. So for example, if you have a dilemma about how to search something, you could potentially use very dissimilar pseudo_fields with different search rules, and just include all of them, with potentially different boosting rules, in your search call.
 
 # Some parameters update frequently or require a lot of CPU time to reindex
 In conjunction with a scheduled background job, you can call ModelName.reindex(:custom_reindexer) and have a method like that returns only the fields that need special treatment.

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -109,8 +109,12 @@ This will require something like:
 And may require using an explicit search body.
 
 # with case insensitive
-By default searches are case insensitive. To override that, you can alter the Settings 
+By default searches are case insensitive. To override that for everything, you can alter the searchkick call `searchkick case_sensitive: [:field :list]`, or use exact matching:
 
+    UserCourse.search(query,
+      fields: [{my_field: :exact}, :other_field]
+    )
+      
 # Japanese-aware indexing
 While there's reasonable support out of the box for Japanese search, you can get additional features with the elasticsearch analysis-kuromoji plugin.
 

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -19,11 +19,20 @@ In practice, you'll need to customize what gets indexed. This is done by definin
 
 When you change the search_data hash structure, you'll need to reindex that model. You can do that in the rails console by typing `Model.reindex` but you can also use the rake task `searchkick:reindex:all`, or index just one specific model.
 
+#Searching
+
+Your search should look something like this:
+
+```ModelName.search(query, fields: ['stringified_id', 'name', 'description', ...])```
+
+In all recent versions of Postgres, you need to explicitly specify the fields you'll search.
+
 # Common Indexing challenges, common solutions
 
 ## We want to be able to search by ID (in full-text queries).
 
-When you conduct a search with ElasticSearch, you specify which fields you want to query. (T
+When you conduct a search with ElasticSearch, you specify which fields you want to query.
+
 
 ## We want to eager load associations so that it's not so expensive to update the index.
 

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -53,9 +53,13 @@ UserCourse.search(params[:query], {
       fields: ["name^5", "id"],
       misspellings: {below: 5}
 
+Alternatively, `MyModel.search query, body_options: {min_score: 1}` tunes out a lot of noise.
 
 ## Match "Reallyenglish, Co., Ltd." organization with "really"
-Depending on the goals, this can be solved a few different ways.
+
+Make sure the index includes this configuration for the field you want:
+
+    searchkick word_start: [:name]
 
 Match word start on a specific search:
 
@@ -64,11 +68,13 @@ Match word start on a specific search:
       match: :word_start
     )
 
-Index a specific field to always support word_start earches:
+## Match "Test Reallyenglish Program" program with "english" (In the middle of name)
 
-    searchkick word_start: [:name]
+Make sure the index includes this configuration for the field you want:
 
-## Match "Test Reallyenglish Program" program with "english" (In the middle of name)
+    searchkick word_middle: [:name]
+
+Then for the search:
 
     UserCourse.search(query,
       fields: ['stringified_id', 'name', 'description', ...],
@@ -83,8 +89,18 @@ Don't match "新潟大学" organization with "新人" (Disabling ambiguity)
       match: :word_middle
     )
 
+This is a case-sensitive search, however, and probably not exactly what you want. More likely you'll want a tokenizer that treats an email address as a single word, which is a little more complicated. The article below covers this, but it requires a custom mapping and a reconfigured analyzer.
+
+https://medium.com/linagora-engineering/searching-email-address-in-elasticsearch-3b09a11e3c2b
+
+This will require something like:
+
+    searchkick merge_mappings: true, mappings: {...}
+
+And may require using an explicit search body.
+
 # with case insensitive
-By default searches are case insensitive. To override that, you can 
+By default searches are case insensitive. To override that, you can alter the Settings 
 
 # Japanese-aware indexing
 While there's reasonable support out of the box for Japanese search, you can get additional features with the elasticsearch analysis-kuromoji plugin.
@@ -96,12 +112,11 @@ If you go down this route, and want to support multiple analyzers, you need to u
 See https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html for some possible options, and the searchkick docs for how to do custom mappings and custom/advanced search.
 
 # Any notice of combinations above
+Generally combinations 
 
-# Not compatible with each other
-In principle, you can create several indexes and 
 
-Benchmark ILIKE vs partial match with elasticsearch using more than 1 million records of users table
-Others
+# Not compatible with each other
+In principle, you can create several fields that have their own analyzers. When you build up the Search call, you can combine options. I'm not aware of specific incompatibilities but relevancy weighting may appear better or worse depending on the user's expectations.
 
 # Some parameters update frequently or require a lot of CPU time to reindex
 In conjunction with a scheduled background job, you can call ModelName.reindex(:custom_reindexer) and have a method like that returns only the fields that need special treatment.

diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -44,18 +44,72 @@ However, this scope is used only for batch import. When an individual entity is
     !deleted
   end
 ```
+## Avoid short query strings (single or two character searches) returning lots of results.
+
+By default, misspelling-gentle search is turned on in searchkick. So the two ways to reduce unwanted search results are to turn off or adjust the misspelling-friendly feature, or to query with a relevancy score filter.
+
+For example:
+
+    UserCourse.search(params[:query],
+      fields: ["name^5", "id"],
+      misspellings: {below: 5}
+    )
+
+
+## Match "Reallyenglish, Co., Ltd." organization with "really"
+Depending on the goals, this can be solved a few different ways.
+
+Match word start on a specific search:
+
+    UserCourse.search(query,
+      fields: ['stringified_id', 'name', 'description', ...],
+      match: :word_start
+    )
+
+Index a specific field to always support word_start earches:
+
+    searchkick word_start: [:name]
+
+## Match "Test Reallyenglish Program" program with "english" (In the middle of name)
+
+    UserCourse.search(query,
+      fields: ['stringified_id', 'name', 'description', ...],
+      match: :word_middle
+    )
 
-# Match "Reallyenglish, Co., Ltd." organization with "really"
-This can be solved by prefix 
-Match "Test Reallyenglish Program" program with "english" (In the middle of name)
 Don't match "新潟大学" organization with "新人" (Disabling ambiguity)
-Exact match with User Email
-with case insensitive
-If you can finish above in short time, next issues are
 
-Any notice of combinations above
-Not compatible with each other
+## Exact match with User Email
+    UserCourse.search(query,
+      fields: [{email: :exact}, :name],
+      match: :word_middle
+    )
+
+# with case insensitive
+By default searches are case insensitive. To override that, you can 
+
+# Japanese-aware indexing
+While there's reasonable support out of the box for Japanese search, you can get additional features with the elasticsearch analysis-kuromoji plugin.
+
+searchkick language: "japanese"
+
+If you go down this route, and want to support multiple analyzers, you need to use the searchkick mappings feature and multiple fields. It's not terribly hard, but it's more involved than a quick FAQ can handle.
+
+See https://www.elastic.co/guide/en/elasticsearch/guide/current/mixed-lang-fields.html for some possible options, and the searchkick docs for how to do custom mappings and custom/advanced search.
+
+# Any notice of combinations above
+
+# Not compatible with each other
+In principle, you can create several indexes and 
+
 Benchmark ILIKE vs partial match with elasticsearch using more than 1 million records of users table
 Others
 
+# Some parameters update frequently or require a lot of CPU time to reindex
+In conjunction with a scheduled background job, you can call `ModelName.reindex(:custom_reindexer)` and define a method like the following, which returns only the fields that need special treatment.
+
+    def custom_reindexer
+      {
+        just_the_field_that_matters: calculation_method
+      }
+    end
+
 
diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -5,15 +5,57 @@ https://github.com/ankane/searchkick
 
 By default, simply adding the call 'searchkick' to a model will do an unclever indexing of all fields (but not has_many or belongs_to attributes).
 
-In practice, you'll need to customize what gets indexed. This is done by defining a method called `search_data`
-
-  def search_data
-    {
-      name: name,
-      department_name: department.name,
-      on_sale: sale_price.present?
-    }
+In practice, you'll need to customize what gets indexed. This is done by defining a method on your model called `search_data`
+
+    def search_data
+      {
+        id: id,
+        stringified_id: id.to_s,
+        tags: tags.join(" "),
+        user: user.full_name,
+        pass_rate: calculate_pass_rate
+      }
+    end
+
+When you change the search_data hash structure, you'll need to reindex that model. You can do that in the rails console by typing `Model.reindex` but you can also use the rake task `searchkick:reindex:all`, or index just one specific model.
+
+# Common Indexing challenges, common solutions
+
+## We want to be able to search by ID (in full-text queries).
+
+When you conduct a search with ElasticSearch, you specify which fields you want to query. (T
+
+## We want to eager load associations so that it's not so expensive to update the index.
+
+Define a scope named `search_import`, and invoke the appropriate #joins or #includes.
+
+`scope :search_import, -> { includes(study_tracking: :study_tracking_details) }`
+
+## We have soft-deleted records and want to exclude them from indexing
+
+Similar to the above solution, use the `search_import` scope to filter them out:
+
+`scope :search_import, -> { where(deleted: false) }`
+
+However, this scope is used only for batch import. When an individual entity is saved, it is updated separately, so you'll also want to implement:
+
+```
+  def should_index?
+    !deleted
   end
+```
+
+# Match "Reallyenglish, Co., Ltd." organization with "really"
+This can be solved by prefix 
+Match "Test Reallyenglish Program" program with "english" (In the middle of name)
+Don't match "新潟大学" organization with "新人" (Disabling ambiguity)
+Exact match with User Email
+with case insensitive
+If you can finish above in short time, next issues are
+
+Any notice of combinations above
+Not compatible with each other
+Benchmark ILIKE vs partial match with elasticsearch using more than 1 million records of users table
+Others
+
 
-For example, 
-When you change the struct
diff --git a/searchkick_and_elasticsearch_guidance.md b/searchkick_and_elasticsearch_guidance.md
@@ -0,0 +1,19 @@
+# Resources:
+https://github.com/ankane/searchkick
+
+# Indexing
+
+By default, simply adding the call 'searchkick' to a model will do an unclever indexing of all fields (but not has_many or belongs_to attributes).
+
+In practice, you'll need to customize what gets indexed. This is done by defining a method called `search_data`
+
+  def search_data
+    {
+      name: name,
+      department_name: department.name,
+      on_sale: sale_price.present?
+    }
+  end
+
+For example, 
+When you change the struct