Skip to content

Instantly share code, notes, and snippets.

@ismail0234
Forked from xhrix/one-million-posts.md
Created August 3, 2022 11:16
Show Gist options
  • Select an option

  • Save ismail0234/d5e5fc9da0851ea4d18e290d579dc86f to your computer and use it in GitHub Desktop.

Select an option

Save ismail0234/d5e5fc9da0851ea4d18e290d579dc86f to your computer and use it in GitHub Desktop.

Revisions

  1. @xhrix xhrix renamed this gist Apr 8, 2017. 1 changed file with 0 additions and 0 deletions.
    File renamed without changes.
  2. @xhrix xhrix revised this gist Apr 8, 2017. No changes.
  3. @xhrix xhrix created this gist Apr 8, 2017.
    539 changes: 539 additions & 0 deletions one million posts
    Original file line number Diff line number Diff line change
    @@ -0,0 +1,539 @@
    # One million posts

    Let's create one million posts to see the performance of a Wordpress site.

    ## Creation of a post

    The purpose of this document is to find out what is the underlying data that gets modified in a wordpress database when a post with an image are created.


    ## What I did

    1. I unziped the Worpress.
    2. I created a brand new database.
    3. I installed Wordpress.

    At that point, I created a database backup `pristine_wp.sql`.

    4. I logged in.
    5. I uploaded an image (`i-am-a-splash-image.png`) to media, and filled it's "alt text".
    5. I created a tag for the post `debugtag`.
    6. I created a post with it's "title", "content", "excerpt", an image in its content, a featured image, a category (uncategorized) and a tag.
    7. I published the post.
    8. I logged out.

    At that point, I created another database backup: `first_post_wp.sql`.


    ## System Specs

    The system I used to test this was:

    - OS: MacOS Sierra 10.12.4
    - PHP Version: 7.1.1
    - Worpress Version: 4.7.3
    - Database type: MySQL

    The date when this operation was performed was `April 7th, 2017`.

    ## Things that were modified between `pristine_wp.sql` and `first_post_wp.sql`

    ### wp_post_meta

    A total of 5 entries in the `wp_postmeta` were created.

    The first one is an `edit lock`. It might not be important.

    ```javascript
    {
    "meta_id": 3,
    "post_id": 4,
    "meta_key": "_edit_lock",
    "meta_value": "1491615951:1" // It seems to be the unix epocth of the lock : userId.
    }
    ```

    ---

    The second one relates a file (the image that I uploaded) to the post of type `attachment` that was created to hold the image.

    ```javascript
    {
    "meta_id": 4,
    "post_id": 5, // ID of the post of type `attachment` that holds the image.
    "meta_key": "_wp_attached_file", // Indicates that this post has an attached file.
    "meta_value": "2017/04/i-am-a-splash-image.png" // Path to the file.
    }
    ```

    ---

    The third one is all the metadata of the image I uploaded. That metadata contains information relating it's condition as an attachment.

    ```javascript
    {
    "meta_id": 5,
    "post_id": 5,
    "meta_key": "_wp_attachment_metadata",
    "meta_value": `a:5:{s:5:"width";i:1717;s:6:"height";i:1040;s:4:"file";s:31:"2017/04/i-am-a-splash-image.png";s:5:"sizes";a:5:{s:9:"thumbnail";a:4:{s:4:"file";s:31:"i-am-a-splash-image-150x150.png";s:5:"width";i:150;s:6:"height";i:150;s:9:"mime-type";s:9:"image/png";}s:6:"medium";a:4:{s:4:"file";s:31:"i-am-a-splash-image-300x182.png";s:5:"width";i:300;s:6:"height";i:182;s:9:"mime-type";s:9:"image/png";}s:12:"medium_large";a:4:{s:4:"file";s:31:"i-am-a-splash-image-768x465.png";s:5:"width";i:768;s:6:"height";i:465;s:9:"mime-type";s:9:"image/png";}s:5:"large";a:4:{s:4:"file";s:32:"i-am-a-splash-image-1024x620.png";s:5:"width";i:1024;s:6:"height";i:620;s:9:"mime-type";s:9:"image/png";}s:32:"twentyseventeen-thumbnail-avatar";a:4:{s:4:"file";s:31:"i-am-a-splash-image-100x100.png";s:5:"width";i:100;s:6:"height";i:100;s:9:"mime-type";s:9:"image/png";}}s:10:"image_meta";a:12:{s:8:"aperture";s:1:"0";s:6:"credit";s:0:"";s:6:"camera";s:0:"";s:7:"caption";s:0:"";s:17:"created_timestamp";s:1:"0";s:9:"copyright";s:0:"";s:12:"focal_length";s:1:"0";s:3:"iso";s:1:"0";s:13:"shutter_speed";s:1:"0";s:5:"title";s:0:"";s:11:"orientation";s:1:"0";s:8:"keywords";a:0:{}}}`
    }
    ```

    Let's unserialize the `meta_value` of the previous bunch of data:

    ```javascript
    {
    "width": 1717,
    "height": 1040,
    "file": "2017/04/i-am-a-splash-image.png",
    "sizes": {
    "thumbnail": {
    "file": "i-am-a-splash-image-150x150.png",
    "width": 150,
    "height": 150,
    "mime-type": "image/png"
    },
    "medium": {
    "file": "i-am-a-splash-image-300x182.png",
    "width": 300,
    "height": 182,
    "mime-type": "image/png"
    },
    "medium_large": {
    "file": "i-am-a-splash-image-768x465.png",
    "width": 768,
    "height": 465,
    "mime-type": "image/png"
    },
    "large": {
    "file": "i-am-a-splash-image-1024x620.png",
    "width": 1024,
    "height": 620,
    "mime-type": "image/png"
    },
    "twentyseventeen-thumbnail-avatar": {
    "file": "i-am-a-splash-image-100x100.png",
    "width": 100,
    "height": 100,
    "mime-type": "image/png"
    }
    },
    "image_meta": {
    "aperture": "0",
    "credit": "",
    "camera": "",
    "caption": "",
    "created_timestamp": "0",
    "copyright": "",
    "focal_length": "0",
    "iso": "0",
    "shutter_speed": "0",
    "title": "",
    "orientation": "0",
    "keywords": []
    }
    }
    ```

    ---

    The fourth entry relates the post of the image to it's "alt text".

    ```javascript
    {
    "meta_id": 6,
    "post_id": 5, // ID of the post of type `attachment` that holds the image.
    "meta_key": "_wp_attachment_image_alt", // Indicates that this metadata holds the alt text of the image.
    "meta_value": "I am an alt text." // Alt text of the image.
    }
    ```

    ---

    The fifth entry relates the **entry of the very post** to it's thumbnail id.

    ```javascript
    {
    "meta_id": 7,
    "post_id": 4, // ID of the very post.
    "meta_key": "_thumbnail_id", // Indicates that this metadata relates the very post to it's thumbnail.
    "meta_value": "5" // ID of the post of the image.
    }
    ```

    ### wp_posts

    A total of 4 entries in the `wp_posts` table were created.

    The first is an `Auto Draft` with `post_type` `post`. It might not be imporant.

    ---

    The second one is the important one, **the entry of the very post**.

    ```javascript
    {
    ID: 4,
    post_author: 1,
    post_date: '2017-04-08 01:47:04',
    post_date_gmt: '2017-04-08 01:47:04',
    post_content: `I am just a simple content.

    [caption id="attachment_5" align="alignnone" width="300"]<img class="size-medium wp-image-5" src="http://localhost/razon/wp-content/uploads/2017/04/i-am-a-splash-image-300x182.png" alt="I am an alt text." width="300" height="182" /> I am an image caption.[/caption]`,
    post_title: 'I am a debug title',
    post_excerpt: 'I am a simple excerpt.',
    post_status: 'publish',
    comment_status: 'open',
    ping_status: 'open',
    post_password: '',
    post_name: 'i-am-a-debug-title',
    to_ping: '',
    pinged: '',
    post_modified: '2017-04-08 01:47:04',
    post_modified_gmt: '2017-04-08 01:47:04',
    post_content_filtered: '',
    post_parent: 0,
    guid: 'http://localhost/razon/?p=4',
    menu_order: 0,
    post_type: 'post',
    post_mime_type: '',
    comment_count: 0,
    }
    ```

    ---

    The thirth one is the one for the **image**.

    ```javascript
    {
    ID: 5,
    post_author: 1,
    post_date: "2017-04-08 01:45:00",
    post_date_gmt: "2017-04-08 01:45:00",
    post_content: "I am a description.",
    post_title: "i-am-a-splash-image",
    post_excerpt: "I am an image caption.",
    post_status: "inherit",
    comment_status: "open",
    ping_status: "closed",
    post_password: "",
    post_name: "i-am-a-splash-image",
    to_ping: "",
    pinged: "",
    post_modified: "2017-04-08 01:45:28",
    post_modified_gmt: "2017-04-08 01:45:28",
    post_content_filtered: "",
    post_parent: 4,
    guid: "http://localhost/razon/wp-content/uploads/2017/04/i-am-a-splash-image.png",
    menu_order: 0,
    post_type: "attachment",
    post_mime_type: "image/png",
    comment_count: 0
    }
    ```

    ---

    The fourth one is a `Revition` with `post_type` `revition`. It might not be imporant.

    ### wp_terms_relationships

    The first entry in `wp_terms_relationships` relates the very post to the term "uncategorized" under the taxonomy "category".

    ```javascript
    {
    object_id: 4, // This is the ID of the post.
    term_taxonomy_id: 1, // This is the ID of the default term, "uncategorized" under the taxonomy "category"
    term_order: 0,
    }
    ```

    ---

    The second entry in `wp_terms_relationships` relates the very post to the term "debugtag" under the taxonomy "post_tag".

    ```javascript
    {
    object_id: 4, // This is the ID of the post.
    term_taxonomy_id: 2, // This is for the term I created, "debugtag" under the taxonomy "post_tag".
    term_order: 0,
    }
    ```

    ### wp_term_taxonomy

    When I created a tag with name `debugtag`, this entry was created in `wp_term_taxonomy`.

    ```javascript
    {
    term_taxonomy_id: 2,
    term_id: 2,
    taxonomy: 'post_tag',
    description: '',
    parent: 0,
    count: 1,
    }
    ```

    > That seems to be a built-in **type of taxonomy** which all the tags of a post have.


    ### wp_terms

    When I created a tag with name `debugtag`, this entry was created in `wp_terms`.

    ```javascript
    {
    term_id: 2,
    name: 'debugtag',
    slug: 'debugtag',
    term_group: 0,
    }
    ```

    > That seems to be the actual tag I created.


    ### wp_usersmeta

    Entries with `meta_key` 'wp_dashboard_quick_press_last_post_id', 'closedpostboxes_post', 'metaboxhidden_post', 'wp_user-settings', 'wp_user-settings-time' were created. They seem to be not important.

    Also, the entry with `meta_key` 'session_tokens' was deleted. It might be due to the fact I logged out when I created the post.


    ## Files created when I uploaded the image

    The images were created in this directory `wp-content/uploads/2017/04`.

    The image files were:

    - `i-am-a-splash-image-100x100.png`
    - `i-am-a-splash-image-1024x620.png`
    - `i-am-a-splash-image-150x150.png`
    - `i-am-a-splash-image-300x182.png`
    - `i-am-a-splash-image-768x465.png`
    - `i-am-a-splash-image.png`


    ## Creating one million posts

    To create one million posts, I can start from the `first_post_wp.sql` database, that contains an uploaded image, and a post. I will no longer need to upload images or create more post types of type `attachment`, or terms, or taxonomies.

    The thing I want to create is:

    - A post of post type `post`.
    - That post must have title, content, excerpt.
    - The content of the post must have an attachment (the same as the debug post).
    - That post must have a tag. I will use the `debugtag` i previously created.
    - That post must have a category. I will use the `uncategorized` default category of Wordpress.


    So, the tables I will have to insert data into are in this order:

    ### Insert into `wp_posts`

    ```javascript
    {
    // ID: 4, // This must be ommited so MySQL can assign an ID automatically.
    post_author: 1, // Keep it constant to assign it to my usrer ID.
    post_date: '2017-04-08 01:47:04', // Keep it constant. We don't want to mess witih dates.
    post_date_gmt: '2017-04-08 01:47:04', // Keep it constant. We don't want to mess witih dates.
    // In the content add the ${$new_post_id} to differentiate it.
    post_content: `I am just a simple content x${$new_post_id}.

    [caption id="attachment_5" align="alignnone" width="300"]<img class="size-medium wp-image-5" src="http://localhost/razon/wp-content/uploads/2017/04/i-am-a-splash-image-300x182.png" alt="I am an alt text." width="300" height="182" /> I am an image caption.[/caption]`,
    // In the title of the post add the ${$new_post_id} to differentiate it.
    post_title: `I am a debug title x${$new_post_id}`,
    // In the excerpt add the ${$new_post_id} to diffetentiate it.
    post_excerpt: `I am a simple excerpt x${$new_post_id}.`,
    post_status: 'publish', // Keep it constant. We want all our posts to be published.
    comment_status: 'open', // Keep it.
    ping_status: 'open', // Keep it.
    post_password: '', // Keep it.
    // The post name MUST be unique. To ensure this, we add the `x-${$new_post_id}` at the end of the value.
    post_name: `i-am-a-debug-title-x-${$new_post_id}`,
    to_ping: '', // Keep it.
    pinged: '', // Keep it.
    post_modified: '2017-04-08 01:47:04', // Keep it.
    post_modified_gmt: '2017-04-08 01:47:04', // Keep it.
    post_content_filtered: '', // Keep it.
    post_parent: 0, // Keep it.
    // The guid MUST be unique, and also, the 'p' param must be the ID of the post. To achieve this, we use our friend ${$new_post_id}.
    guid: `http://localhost/razon/?p=${$new_post_id}`,
    menu_order: 0, // Keep it.
    post_type: 'post', // Keep it.
    post_mime_type: '', // Keep it.
    comment_count: 0, // Keep it.
    }
    ```

    ---

    ### Insert into `wp_post_meta`

    To relate the post to its thumbnail:

    ```javascript
    {
    // "meta_id": 7, // This must be ommited so MySQL can assign an ID automatically.
    "post_id": $new_post_id, // ID of the recently created post.
    "meta_key": "_thumbnail_id", // Keep this constant. Indicates that this metadata relates the very post to it's thumbnail.
    "meta_value": "5" // Keep this constant. ID of the post of the image, the only one I have uploaded.
    }
    ```

    ---

    ### Insert into `wp_terms_relationships`

    Relate the new post to the term `uncategorized` under the taxonomy `category`.

    ```javascript
    {
    object_id: $new_post_id, // ID of the new post.
    term_taxonomy_id: 1, // ID of the default term, "uncategorized" under the taxonomy "category"
    term_order: 0,
    }
    ```

    ---

    Relate the new post to the term `debugtag` under the taxonomy `post_tag`.

    ```javascript
    {
    object_id: $new_post_id, // ID of the new post.
    term_taxonomy_id: 2, // This is for the term I created, "debugtag" under the taxonomy "post_tag".
    term_order: 0,
    }
    ```

    ## The MySQL script

    ```sql
    drop procedure if exists load_foo_test_data;

    delimiter #
    create procedure load_foo_test_data()
    begin

    declare v_max int unsigned default 1; -- Alter this number up to one million, be careful not to overload your database, it might die.
    declare v_counter int unsigned default 0;

    start transaction;
    while v_counter < v_max do

    -- CREATE POST ----------------------------------------------------------------------------
    -- ----------------------------------------------------------------------------------------
    INSERT INTO `wp_posts` (
    -- `ID`, // This must be ommited so MySQL can assign an ID automatically.
    `post_author`,
    `post_date`,
    `post_date_gmt`,
    `post_content`,
    `post_title`,
    `post_excerpt`,
    `post_status`,
    `comment_status`,
    `ping_status`,
    `post_password`,
    `post_name`,
    `to_ping`,
    `pinged`,
    `post_modified`,
    `post_modified_gmt`,
    `post_content_filtered`,
    `post_parent`,
    `guid`,
    `menu_order`,
    `post_type`,
    `post_mime_type`,
    `comment_count`
    ) VALUES (
    1, -- post_author // Keep it constant to assign it to my usrer ID.
    '2017-04-08 01:47:04', -- post_date // Keep it constant. We don't want to mess witih dates.
    '2017-04-08 01:47:04', -- post_date_gmt // Keep it constant. We don't want to mess witih dates.
    '', -- post_content // Updated later on.
    '', -- post_title // Updated later on.
    '', -- post_excerpt // Updated later on.
    'publish', -- post_status // Keep it constant. We want all our posts to be published.
    'open', -- comment_status // Keep it.
    'open', -- ping_status // Keep it.
    '', -- post_password // Keep it.
    '', -- post_name // Updated later on.
    '', -- to_ping // Keep it.
    '', -- pinged // Keep it.
    '2017-04-08 01:47:04', -- post_modified // Keep it.
    '2017-04-08 01:47:04', -- post_modified_gmt // Keep it.
    '', -- post_content_filtered // Keep it.
    0, -- post_parent // Keep it.
    '', -- guid // Updated later on.
    0, -- menu_order // Keep it.
    'post', -- post_type // Keep it.
    '', -- post_mime_type // Keep it.
    1 -- comment_count // Keep it.
    );
    set @future_id = LAST_INSERT_ID();
    UPDATE `wp_posts` SET
    -- In the content add the ${$new_post_id} to differentiate it.
    `post_content` = CONCAT('I am just a simple content x-', @future_id, '.

    [caption id="attachment_5" align="alignnone" width="300"]<img class="size-medium wp-image-5" src="http://localhost/wp-content/uploads/2017/04/i-am-a-splash-image-300x182.png" alt="I am an alt text." width="300" height="182" /> I am an image caption.[/caption]'),
    -- In the title of the post add the ${$new_post_id} to differentiate it.
    `post_title` = CONCAT('I am a debug title x-', @future_id),
    -- In the excerpt add the ${$new_post_id} to diffetentiate it.
    `post_excerpt` = CONCAT('I am a simple excerpt x-', @future_id),
    -- The post name MUST be unique. To ensure this, we add the `x-${$new_post_id}` at the end of the value.
    `post_name` = CONCAT('i-am-a-debug-title-x-', @future_id),
    -- The guid MUST be unique, and also, the 'p' param must be the ID of the post. To achieve this, we use our friend ${$new_post_id}.
    `guid` = CONCAT('http://localhost/?p=', @future_id)
    WHERE `ID` = @future_id
    ;


    -- UPDATE WP_POST_META ----------------------------------------------------------------------
    -- ------------------------------------------------------------------------------------------
    -- These two guys were added when I created the post programatically with PHP, but not when I created the post using the normal UI tools of the Worpress admin. So, I won't use them.
    INSERT INTO `wp_postmeta` (
    -- `meta_id`, // This must be ommited so MySQL can assign an ID automatically.
    `post_id`,
    `meta_key`,
    `meta_value`
    ) VALUES (
    @future_id, -- post_id // ID of the recently created post.
    '_thumbnail_id', -- meta_key // Keep this constant. Indicates that this metadata relates the very post to it's thumbnail.
    '5' -- meta_value // Keep this constant. ID of the post of the image, the only one I have uploaded.
    );
    -- INSERT INTO `wp_postmeta` (`post_id`, `meta_key`, `meta_value`) VALUES (@future_id, '_pingme', '1');
    -- INSERT INTO `wp_postmeta` (`post_id`, `meta_key`, `meta_value`) VALUES (@future_id, '_encloseme', '1');


    -- UPDATE WP_TERM_RELATIONSHIPS -------------------------------------------------------------
    -- ------------------------------------------------------------------------------------------

    -- Relate the new post to the term `uncategorized` under the taxonomy `category`.
    -- object_id: $new_post_id, // ID of the new post.
    -- term_taxonomy_id: 1, // ID of the default term, "uncategorized" under the taxonomy "category"
    -- term_order: 0,
    INSERT INTO `wp_term_relationships` (`object_id`, `term_taxonomy_id`, `term_order`) VALUES (@future_id, 1, 0);

    -- Relate the new post to the term `debugtag` under the taxonomy `post_tag`.
    -- object_id: $new_post_id, // ID of the new post.
    -- term_taxonomy_id: 2, // This is for the term I created, "debugtag" under the taxonomy "post_tag".
    -- term_order: 0,
    INSERT INTO `wp_term_relationships` (`object_id`, `term_taxonomy_id`, `term_order`) VALUES (@future_id, 2, 0);


    -- UPDATE COUNTER ---------------------------------------------------------------------------
    set v_counter = v_counter + 1;
    end while;
    commit;
    end #
    delimiter ;

    call load_foo_test_data();
    ```


    ----------