rlugojr · March 28, 2017 11:54
diff --git a/grunt-hugo-lunrjs b/grunt-hugo-lunrjs
 > How to implement a custom search for [Hugo](http://gohugo.io) usig [Gruntjs](http://gruntjs.com) and [Lunrjs](http://lunrjs.com).

 ## Requirements

 Install the following tools:

 * [Nodejs](https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager)
 * [Gruntjs](http://gruntjs.com/getting-started)

 ## Setup

 ### Project organization

 Here is my Hugo based website project structure

 ```
  MyWebsite/
   |- site/ <= Hugo project root folder
         |- content/
         |- layout/
         |- static/
            |- js/
               |- lunr/ <= Where we generate the lunr json index file
               |- vendor/
                  |- lunrjs.min.js <= lunrjs library
            |- ...
         |- config.yaml
         |- ...
   |- Gruntfile.js <= Where the magic happens
   |- package.json <= Dependencies declaration required to build the index
   |- ...
 ```

 ### Install the Nodejs dependencies

 From the project root folder launch `npm install --save-dev grunt string toml`

 * [string](http://stringjs.com/) <= do almost all the work
 * [toml](https://github.com/BinaryMuse/toml-node)
    - Used to parse the Frontmatter, mine is in TOML... obviously
    - Otherwise you can install [yamljs](https://www.npmjs.com/package/yamljs)

 ## Time to work

 ### The principle

 We will work both at buildtime and runtime. With Gruntjs (buildtime), we'll generate a JSON index file and with a small js script (runtime) initilize and use lunrjs.

 ### Build the Lunr index file

 Lunrjs allows you to define fields to describe your pages (documents in lunrjs terms) that will be used to search and hopefully find stuff. The index file is basically a JSON file corresponding to an array of all the documents (pages) composing the website.

 Here are the fields I chose (and the boost level to increase the page ranking) to describe my pages:

 * `title` <=> page title
    - boost 10
 * `tags` <=> page tags
    - boost 5
 * `content` <=> page content
 * `ref` <=> page URI

 #### Workflow

 1. Recursively walk through all files of the `content` folder
 2. Two possibilities
    1. Markdown file
        1. Parse the Frontmatter to extract the `title` and the `tags`
        2. Parse and clean the content
    2. HTML file
        1. Parse and clean the content
        2. Use the file name as `title`
 3. Use the path file as `ref` (link toward the page)

 #### Show me the code!

 Here is the `Gruntfile.js` file:

 ```js

 var toml = require("toml");
 var S = require("string");

 var CONTENT_PATH_PREFIX = "site/content";

 module.exports = function(grunt) {

    grunt.registerTask("lunr-index", function() {

        grunt.log.writeln("Build pages index");

        var indexPages = function() {
            var pagesIndex = [];
            grunt.file.recurse(CONTENT_PATH_PREFIX, function(abspath, rootdir, subdir, filename) {
                grunt.verbose.writeln("Parse file:",abspath);
                pagesIndex.push(processFile(abspath, filename));
            });

            return pagesIndex;
        };

        var processFile = function(abspath, filename) {
            var pageIndex;

            if (S(filename).endsWith(".html")) {
                pageIndex = processHTMLFile(abspath, filename);
            } else {
                pageIndex = processMDFile(abspath, filename);
            }

            return pageIndex;
        };

        var processHTMLFile = function(abspath, filename) {
            var content = grunt.file.read(abspath);
            var pageName = S(filename).chompRight(".html").s;
            var apiHref = S(abspath)
                .chompLeft(CONTENT_PATH_PREFIX)
                .chompRight(filename).s +
                "index.html#" +
                pageName + "/";

            return {
                title: pageName,
                href: apiHref,
                content: S(content).trim().stripTags().stripPunctuation().s
            };
        };

        var processMDFile = function(abspath, filename) {
            var content = grunt.file.read(abspath);
            var pageIndex;
            // First separate the Front Matter from the content and parse it
            content = content.split("+++");
            var frontMatter;
            try {
                frontMatter = toml.parse(content[1].trim());
            } catch (e) {
                conzole.failed(e.message);
            }

            var href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(".md").s;
            // href for index.md files stops at the folder name
            if (filename === "index.md") {
                href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(filename).s;
            }

            // Build Lunr index for this page
            pageIndex = {
                title: frontMatter.title,
                tags: frontMatter.tags,
                href: href,
                content: S(content[2]).trim().stripTags().stripPunctuation().s
            };

            return pageIndex;
        };

        grunt.file.write("site/static/js/lunr/PagesIndex.json", JSON.stringify(indexPages()));
        grunt.log.ok("Index built");
    });
 };

 ```

 Launch the task: `grunt lunr-index`

 ### Use the index

 TBD
	> How to implement a custom search for [Hugo](http://gohugo.io) usig [Gruntjs](http://gruntjs.com) and [Lunrjs](http://lunrjs.com).

	## Requirements

	Install the following tools:

	* [Nodejs](https://github.com/joyent/node/wiki/Installing-Node.js-via-package-manager)
	* [Gruntjs](http://gruntjs.com/getting-started)

	## Setup

	### Project organization

	Here is my Hugo based website project structure

	```
	MyWebsite/
	\|- site/ <= Hugo project root folder
	\|- content/
	\|- layout/
	\|- static/
	\|- js/
	\|- lunr/ <= Where we generate the lunr json index file
	\|- vendor/
	\|- lunrjs.min.js <= lunrjs library
	\|- ...
	\|- config.yaml
	\|- ...
	\|- Gruntfile.js <= Where the magic happens
	\|- package.json <= Dependencies declaration required to build the index
	\|- ...
	```

	### Install the Nodejs dependencies

	From the project root folder launch `npm install --save-dev grunt string toml`

	* [string](http://stringjs.com/) <= do almost all the work
	* [toml](https://github.com/BinaryMuse/toml-node)
	- Used to parse the Frontmatter, mine is in TOML... obviously
	- Otherwise you can install [yamljs](https://www.npmjs.com/package/yamljs)

	## Time to work

	### The principle

	We will work both at buildtime and runtime. With Gruntjs (buildtime), we'll generate a JSON index file and with a small js script (runtime) initilize and use lunrjs.

	### Build the Lunr index file

	Lunrjs allows you to define fields to describe your pages (documents in lunrjs terms) that will be used to search and hopefully find stuff. The index file is basically a JSON file corresponding to an array of all the documents (pages) composing the website.

	Here are the fields I chose (and the boost level to increase the page ranking) to describe my pages:

	* `title` <=> page title
	- boost 10
	* `tags` <=> page tags
	- boost 5
	* `content` <=> page content
	* `ref` <=> page URI

	#### Workflow

	1. Recursively walk through all files of the `content` folder
	2. Two possibilities
	1. Markdown file
	1. Parse the Frontmatter to extract the `title` and the `tags`
	2. Parse and clean the content
	2. HTML file
	1. Parse and clean the content
	2. Use the file name as `title`
	3. Use the path file as `ref` (link toward the page)

	#### Show me the code!

	Here is the `Gruntfile.js` file:

	```js

	var toml = require("toml");
	var S = require("string");

	var CONTENT_PATH_PREFIX = "site/content";

	module.exports = function(grunt) {

	grunt.registerTask("lunr-index", function() {

	grunt.log.writeln("Build pages index");

	var indexPages = function() {
	var pagesIndex = [];
	grunt.file.recurse(CONTENT_PATH_PREFIX, function(abspath, rootdir, subdir, filename) {
	grunt.verbose.writeln("Parse file:",abspath);
	pagesIndex.push(processFile(abspath, filename));
	});

	return pagesIndex;
	};

	var processFile = function(abspath, filename) {
	var pageIndex;

	if (S(filename).endsWith(".html")) {
	pageIndex = processHTMLFile(abspath, filename);
	} else {
	pageIndex = processMDFile(abspath, filename);
	}

	return pageIndex;
	};

	var processHTMLFile = function(abspath, filename) {
	var content = grunt.file.read(abspath);
	var pageName = S(filename).chompRight(".html").s;
	var apiHref = S(abspath)
	.chompLeft(CONTENT_PATH_PREFIX)
	.chompRight(filename).s +
	"index.html#" +
	pageName + "/";

	return {
	title: pageName,
	href: apiHref,
	content: S(content).trim().stripTags().stripPunctuation().s
	};
	};

	var processMDFile = function(abspath, filename) {
	var content = grunt.file.read(abspath);
	var pageIndex;
	// First separate the Front Matter from the content and parse it
	content = content.split("+++");
	var frontMatter;
	try {
	frontMatter = toml.parse(content[1].trim());
	} catch (e) {
	conzole.failed(e.message);
	}

	var href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(".md").s;
	// href for index.md files stops at the folder name
	if (filename === "index.md") {
	href = S(abspath).chompLeft(CONTENT_PATH_PREFIX).chompRight(filename).s;
	}

	// Build Lunr index for this page
	pageIndex = {
	title: frontMatter.title,
	tags: frontMatter.tags,
	href: href,
	content: S(content[2]).trim().stripTags().stripPunctuation().s
	};

	return pageIndex;
	};

	grunt.file.write("site/static/js/lunr/PagesIndex.json", JSON.stringify(indexPages()));
	grunt.log.ok("Index built");
	});
	};

	```

	Launch the task: `grunt lunr-index`

	### Use the index

	TBD