Created
October 24, 2013 09:40
Revisions
chengmu revised this gist
Oct 25, 2013 . 1 changed file with 3 additions and 6 deletions.
@@ -36,16 +36,12 @@ GPU involved in compositing contents of web page
###How Browser Works
+ __repaint__ : expensive, needs to recalculate pixel information such as colors;
+ __redraw__ : cheap
##Compositing in WebKit/Blink
@@ -259,7 +255,8 @@ repaint the gap
##Tool
[Frame Viewer](http://www.chromium.org/developers/how-tos/trace-event-profiling-tool/frame-viewer)
##Reference
@@ -280,6 +277,6 @@ repaint the gap
+ [Painting in Chromium](http://www.youtube.com/watch?v=A5-aXfSt-RA)
    + how painting works, Chromium and Skia side of painting
+ [How Browser Works](http://www.html5rocks.com/en/tutorials/internals/howbrowserswork)
> Written with [StackEdit](https://stackedit.io/).
chengmu revised this gist
Oct 24, 2013 . 1 changed file with 115 additions and 93 deletions.
@@ -4,18 +4,15 @@ _summarized from Chromium documents and other resources_
##Background
###Stacking Contexts
__A positioned element (relative, absolute, fixed) with a z-index groups the layers of its subtree into an isolated layer__
>A stacking context 'flattens' the element's subtree, so nothing outside of the subtree can paint between elements of the subtree.
>In other words: the rest of the DOM tree can treat the `Stacking context` as an __atomic conceptual layer__ for painting
#####Standards
[CSS2.1 9.9 Layer Presentation](http://www.w3.org/TR/CSS2/visuren.html#layers)
[CSS2.1 Appendix E. Elaborate description of Stacking Contexts](http://www.w3.org/TR/CSS2/zindex.html)
@@ -30,144 +27,175 @@ The GPU process exists primarily for security reasons.
Currently Chrome uses a single GPU process per browser instance, serving requests from all the renderer processes and any plugin processes. The GPU process, while single threaded, can multiplex between multiple command buffers, each one of which is associated with its own rendering context.
>implementation of hardware-accelerated compositing in Chrome
Traditionally: browsers depend on the CPU to render web page content
GPU involved in compositing contents of web page
###How Browser Works
[How Browser Works](http://www.html5rocks.com/en/tutorials/internals/howbrowserswork)
+ __repaint__ : expensive, needs to recalculate pixel information such as colors;
+ __redraw__ : cheap
##Compositing in WebKit/Blink
### How WebKit Renders Pages
RenderObjects are stored in a parallel tree structure, called the Render Tree. A RenderObject knows how to present (paint) the contents of the Node on a display surface.
It does so by issuing the necessary draw calls to a GraphicsContext. A GraphicsContext is ultimately responsible for writing the pixels into a bitmap that gets displayed to the screen. In Chrome, the GraphicsContext wraps Skia, our 2D drawing library, and most GraphicsContext calls become calls to an SkCanvas or SkPlatformCanvas (see this document for more on how Chrome uses Skia). RenderLayers exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc. Notice that there isn't a one-to-one correspondence between RenderObjects and RenderLayers. A particular RenderObject is associated either with the RenderLayer that was created for it, if there is one, or with the RenderLayer of the first ancestor that has one. RenderLayers form a tree hierarchy as well. The root node is the RenderLayer corresponding to the root element in the page and the descendants of every node are layers visually contained within the parent layer. The children of each RenderLayer are kept into two sorted lists both sorted in ascending order, the negZOrderList containing child layers with negative z-indices (and hence layers that go below the current layer) and the posZOrderList contain child layers with positive z-indices (layers that go above the current layer). + __The DOM tree__, which is our fundamental retained model + __The RenderObject tree__, which has a 1:1 mapping to the DOM tree’s visible nodes. + RenderObjects know how to paint their corresponding DOM nodes. + __The RenderLayer tree__, made up of RenderLayers that map to a RenderObject on the RenderObject tree. The mapping is many-to-one, as each RenderObject is either associated with its own RenderLayer or the RenderLayer of its first ancestor that has one. The RenderLayer tree preserves z-ordering amongst layers.  
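The many-to-one mapping described above can be illustrated with a small sketch. This is not WebKit code: the `RenderObject` class, its `layer` field, and the `enclosing_layer` helper are hypothetical names chosen for illustration, and only the lookup rule (own layer, else the first ancestor that has one) comes from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RenderObject:
    """Hypothetical stand-in for a RenderObject (not the real WebKit class)."""
    name: str
    parent: Optional["RenderObject"] = None
    # Name of the RenderLayer created for this object, or None if it has none.
    layer: Optional[str] = None


def enclosing_layer(obj: RenderObject) -> Optional[str]:
    """Return the object's own layer, or the layer of its first ancestor that has one."""
    node: Optional[RenderObject] = obj
    while node is not None:
        if node.layer is not None:
            return node.layer
        node = node.parent
    return None


# A tiny render tree: only the root and the panel own a layer.
root = RenderObject("html", layer="root-layer")
body = RenderObject("body", parent=root)
panel = RenderObject("div.panel", parent=body, layer="panel-layer")
text = RenderObject("span", parent=panel)

print(enclosing_layer(body))  # root-layer
print(enclosing_layer(text))  # panel-layer
```

Several RenderObjects resolve to the same layer here, which is why the RenderLayer tree is sparser than the RenderObject tree.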
####Two Render Paths: Software or Hardware
####Software implementation
In the software path, the page is rendered by sequentially painting all the RenderLayers, from back to front. The RenderLayer hierarchy is traversed recursively starting from the root and the bulk of the work is done in RenderLayer::paintLayer() which performs the following basic steps (the list of steps is simplified here for clarity):
1. Determines whether the layer intersects the damage rect for an early out.
2. Recursively paints the layers below this one by calling paintLayer() for the layers in the negZOrderList.
3. Asks RenderObjects associated with this RenderLayer to paint themselves. This is done by recursing down the RenderObject tree starting with the RenderObject which created the layer. Traversal stops whenever a RenderObject associated with a different RenderLayer is found.
4. Recursively paints the layers above this one by calling paintLayer() for the layers in the posZOrderList.

In this mode RenderObjects paint themselves into the destination bitmap by issuing draw calls into __a single shared GraphicsContext (implemented in Chrome via Skia)__.
###Hardware Implementation: Compositor & GPU
####Compositor
+ Some (but not all) of the RenderLayers get their own backing surface, into which they paint instead of drawing directly into the common bitmap for the page
+ Layers with their own backing surfaces are called compositing layers

__Example:__ the compositor is responsible for applying the necessary transformations (as specified by the layer's CSS transform properties) to each compositing layer’s bitmap before compositing it. Further, since painting of the layers is decoupled from compositing, invalidating one of these layers only results in repainting the contents of that layer alone and recompositing.
__In contrast, with the software path, invalidating any layer requires repainting all layers (at least the overlapping portions of them) below and above it which unnecessarily taxes the CPU.__
__What is Compositing?__
>(in the context of rendering websites), __The use of multiple backing stores to cache and group chunks of the render tree__

__Benefits__
+ Avoids unnecessary repainting
    + if components have their own backing stores, nothing needs repainting while they animate
+ Makes some features more efficient or practical
    + scrolling, 3D CSS, opacity, filters, WebGL, hardware video decoding

__Tasks of Compositing__
1. __Determine__ how to group contents into backing stores
2. __Paint__ the contents of each composited layer
3. __Draw__ the composited layers to make a final image

__New Tree!__
With the introduction of compositing, we add an additional conceptual tree: `the GraphicsLayer tree`. Each RenderLayer either has its own GraphicsLayer (if it is a compositing layer) or uses the GraphicsLayer of its first ancestor that has one. This is similar to RenderObject’s relationship with RenderLayers. Each GraphicsLayer has a `GraphicsContext` for the associated RenderLayers to draw into.
__Code__ related to the compositor lives inside WebCore, behind the USE(ACCELERATED_COMPOSITING) guards.
####Here Comes the GPU!!
With the addition of the accelerated compositor, in order to eliminate costly memory transfers, the final rendering of the browser's tab area is handled directly by the GPU. (Code for it lives behind the ACCELERATED_COMPOSITING compile-time flag.)
The Compositor library is essentially using the GPU to composite rectangular areas of the page (i.e. all those compositing layers) into a single bitmap, which is the final page image.
#####Benefits of GPU
+ Eliminating unnecessary (and very slow) copies of large data, especially copies from video memory to system memory.
+ In most cases, the GPU can achieve far better efficiency than the CPU (both in terms of speed and power draw) in drawing and compositing operations that involve large numbers of pixels as the hardware is designed specifically for these types of workloads.
+ Utilizing the GPU for these operations also provides parallelism between the CPU and GPU, which can operate at the same time to create an efficient graphics pipeline.
####When Will This Happen?
+ when the --forced-compositing-mode flag is turned on
    + by default in Chrome on Android and ChromeOS
+ Safari on the Mac (and most likely iOS) follows the hardware accelerated path and makes heavy use of Apple's proprietary CoreAnimation API.
+ at least one of the page’s RenderLayers requires hardware acceleration
####Candidates for Optimizations
In the current WebKit implementation, the following conditions are some of those that cause a RenderLayer to get its own compositing layer (see the CompositingReasons enum in `RenderLayer.h` for a longer list):
+ __Opacity, transforms, filters, reflections__
_Significantly easier to apply to the composited layer when drawing_
    + Layer has 3D or perspective transform CSS properties
    + Layer uses a CSS animation for its opacity or uses an animated webkit transform
    + Layer uses accelerated CSS filters
    + Layer with a composited descendant has information that needs to be in the composited layer tree, such as a clip or reflection
+ __Scrolling, fixed-position__
_Cases where compositing a subtree of content greatly reduces the number of costly repaints_
+ __Content that is rendered separately__
_Compositing on the GPU can remove the need for read-back of pixels. For example, WebGL, hardware-decoded video, some plugins_
    + Layer is used by a `<video>` element using accelerated video decoding
    + Layer is used by a `<canvas>` element with a 3D context or accelerated 2D context
    + Layer is used for a composited plugin
+ __Ideally shouldn't, but it DOES__
    + Layer has a sibling with a lower z-index which has a compositing layer (in other words the layer is rendered on top of a composited layer)
    + Composited descendant may need a composited parent, to correctly propagate transform, preserve-3d, or clipping information in the composited tree.
##Debug mode in Chrome
###Flags
[chrome://flags](chrome://flags)
+ `--force-compositing-mode`
Pages that don't "require" compositing will still use it
+ `--show-composited-layer-borders`
Visualize borders (and tiles) on composited layers.
+ `--show-paint-rects`
Visualize what layers required repainting
+ `--show-property-changed-rects`
Visualize what layers required redrawing without repainting
###Test
[Poster Circle](http://www.webkit.org/blog-files/3d-transforms/poster-circle.html)
Animations disable overlap testing and conservatively composite - try adding a stacking context that does not overlap anything - it still gets composited!
[MapsGL](https://maps.google.com/?vector=1)
HTML controls and popups easily overlaid on top of WebGL content.
[Android apps page](http://www.android.com/apps/)
See composited layers come and go while transition animations are playing. Notice clipping elements and 3d elements usually become layers.
###Summary
Now we know roughly how to draw a page using the compositor: the page is divided up into layers, layers are rasterized into textures, textures are uploaded to the GPU, and the compositor tells the GPU to put all the textures together into the final screen image
@@ -179,20 +207,18 @@ Now we know roughly how to draw a page using the compositor: the page is divided
+ __rasterization__: in our terms, the phase of rendering where the bitmaps backing up RenderLayers are filled. This can occur immediately as GraphicsContext calls are made by the RenderObjects, or it can occur later if we’re using SkPicture record for painting and SkPicture playback for rasterization.
+ __compositing__: in our terms, the phase of rendering that combines RenderLayer’s textures into a final screen image
+ __drawing__: in our terms, the phase of rendering that actually puts pixels onto the screen (i.e. puts the final screen image onto the screen).

Using the --show-composited-layer-borders flag will display borders around layers, and uses colors to display information about the layers, or tiles within layers:
+ Green - The border around the outside of a composited layer.
+ Dark Blue - The border around the outside of a "render surface". Surfaces are textures used as intermediate targets while drawing the frame.
+ Purple - The border around the outside of a surface's reflection.
+ Cyan - The border around a tile within a tiled composited layer. Large composited layers are broken up into tiles to avoid using large textures.
+ Red - The border around a composited layer, or a tile within one, for which the texture is not valid or present. Red can indicate a compositor bug, where the texture is lost, but typically indicates the compositor has reached its memory limits, and the red layers/tiles were unable to fit within those limits.
##More Stuff
###Optimize rendering
__60FPS__
@@ -229,23 +255,19 @@ repaint the gap
+ prepainting: paint tiles ahead of time
###Threaded Compositor
##Reference
+ [GPU Accelerated Compositing in Chrome](http://www.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome)
    + How Chromium's compositor works, from `cc::Layer` to `GPU Process`
    + with detailed description and glossary
+ [Compositing in Blink and WebKit](http://www.youtube.com/watch?v=Lpk1dYdo62o) _[(Slides)](https://docs.google.com/presentation/d/1dDE5u76ZBIKmsqkWi2apx3BqV8HOcNf4xxBdyNywZR8/edit#slide=id.gc00886d7_2431)_ _[(Q&A)](https://www.google.com/moderator/#15/e=2015a4&t=2015a4.89)_
    + summary and focus on Task 1
    + Connecting `WebCore::RenderLayer` to `cc::Layer`
    + render layer tree to a composited layer
chengmu revised this gist
Oct 24, 2013 . 1 changed file with 76 additions and 81 deletions.
@@ -1,38 +1,65 @@
#GPU & Compositing in Blink and Chrome
_summarized from Chromium documents and other resources_
##Background
###Stacking Contexts and Paint Order
####Stacking Context
__A positioned element (relative, absolute, fixed) with a z-index groups the layers of its subtree into an isolated layer__
>A stacking context 'flattens' the element's subtree, so nothing outside of the subtree can paint between elements of the subtree.
>In other words: the rest of the DOM tree can treat the `Stacking context` as an __atomic conceptual layer__ for painting
####Standards
[CSS2.1 9.9 Layer Presentation](http://www.w3.org/TR/CSS2/visuren.html#layers)
[CSS2.1 Appendix E. Elaborate description of Stacking Contexts](http://www.w3.org/TR/CSS2/zindex.html)
###The GPU Process
__For the renderer process: to issue commands to the GPU.__
The GPU process exists primarily for security reasons.
>Restricted by its sandbox, the Renderer process (where WebKit and the compositor live) cannot directly issue calls to the 3D APIs provided by the OS (we use Direct3D on Windows, OpenGL everywhere else). For that reason we use a separate process to do the rendering.

The GPU process is specifically designed to provide access to the system's 3D APIs from within the Renderer sandbox or the even more restrictive Native Client "jail".
Currently Chrome uses a single GPU process per browser instance, serving requests from all the renderer processes and any plugin processes. The GPU process, while single threaded, can multiplex between multiple command buffers, each one of which is associated with its own rendering context.
##Browser basic render procedure
+ layout/repaint: needs to recalculate pixel information such as colors;
+ redraw:
###Example
If components have their own backing stores, nothing needs repainting while they animate
###FIRST: understand the basic building blocks of how WebKit renders pages
RenderObjects are stored in a parallel tree structure, called the Render Tree. A RenderObject knows how to present (paint) the contents of the Node on a display surface. It does so by issuing the necessary draw calls to a GraphicsContext. A GraphicsContext is ultimately responsible for writing the pixels into a bitmap that gets displayed to the screen. In Chrome, the GraphicsContext wraps Skia, our 2D drawing library, and most GraphicsContext calls become calls to an SkCanvas or SkPlatformCanvas (see this document for more on how Chrome uses Skia).
RenderLayers exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc.
Notice that there isn't a one-to-one correspondence between RenderObjects and RenderLayers. A particular RenderObject is associated either with the RenderLayer that was created for it, if there is one, or with the RenderLayer of the first ancestor that has one.
RenderLayers form a tree hierarchy as well. The root node is the RenderLayer corresponding to the root element in the page and the descendants of every node are layers visually contained within the parent layer. The children of each RenderLayer are kept in two sorted lists, both sorted in ascending order: the negZOrderList, containing child layers with negative z-indices (and hence layers that go below the current layer), and the posZOrderList, containing child layers with positive z-indices (layers that go above the current layer).
>In summary, there are conceptually three parallel tree structures in place that serve slightly different purposes for rendering:

+ __The DOM tree__, which is our fundamental retained model
+ __The RenderObject tree__, which has a 1:1 mapping to the DOM tree’s visible nodes. RenderObjects know how to paint their corresponding DOM nodes.
+ __The RenderLayer tree__, made up of RenderLayers that map to a RenderObject on the RenderObject tree. The mapping is many-to-one, as each RenderObject is either associated with its own RenderLayer or the RenderLayer of its first ancestor that has one. The RenderLayer tree preserves z-ordering amongst layers.
###Why Compositing
+ Avoids unnecessary repainting
    + if components have their own backing stores, nothing needs repainting while they animate
@@ -80,49 +107,6 @@ Utilizing the GPU for these operations also provides parallelism between the CPU
####How it works?
###Render Path: Software implementation
WebKit fundamentally renders a web page by traversing the RenderLayer hierarchy starting from the root layer.
@@ -155,14 +139,14 @@ this could be quite wasteful in terms of memory (vram especially).
In the current WebKit implementation, the following conditions are some of those that cause a RenderLayer to get its own compositing layer (see the CompositingReasons enum in RenderLayer.h for a longer list):
+ Layer has 3D or perspective transform CSS properties
+ Layer is used by a `<video>` element using accelerated video decoding
+ Layer is used by a `<canvas>` element with a 3D context or accelerated 2D context
+ Layer is used for a composited plugin
+ Layer uses a CSS animation for its opacity or uses an animated webkit transform
+ Layer uses accelerated CSS filters
+ Layer with a composited descendant has information that needs to be in the composited layer tree, such as a clip or reflection
+ Layer has a sibling with a lower z-index which has a compositing layer (in other words the layer is rendered on top of a composited layer)

Significantly, this means that pages with composited RenderLayers will always render via the compositor. Other pages may or may not, depending on the status of the `--forced-compositing-mode` flag.
@@ -174,16 +158,6 @@ With the addition of the accelerated compositor, in order to eliminate costly me
The Compositor library is essentially using the GPU to composite rectangular areas of the page (i.e. all those compositing layers) into a single bitmap, which is the final page image.
When a page renders via the compositor, all of its pixels are drawn directly onto the window via the GPU process. The compositor maintains a hierarchy of GraphicsLayers which is constructed by traversing the RenderLayer tree and updated as the page changes. With the exception of WebGL and video layers, the contents of each of the GraphicsLayers are first drawn into a system memory bitmap (just as was the case in the software path): each RenderLayer asks all of its RenderObjects to paint themselves into the GraphicsLayer’s GraphicsContext, which is backed by a bitmap in shared system memory.
This bitmap is then passed to the GPU process (using the resource transfer machinery explained above in the GPU Process section), and then the GPU process uploads the bitmap to the GPU as a texture. The compositor keeps track of which GraphicsLayers have changed since the last time they were drawn and only updates the textures as needed.
@@ -198,6 +172,27 @@ The bulk of the Chromium implementation for the compositor lives in WebCore's pl
###Summary
Now we know roughly how to draw a page using the compositor: the page is divided up into layers, layers are rasterized into textures, textures are uploaded to the GPU, and the compositor tells the GPU to put all the textures together into the final screen image
####Glossary
+ __bitmap__: a buffer of pixel values in memory (main memory or the GPU’s video RAM)
+ __texture__: a bitmap meant to be applied to a 3D model on the GPU
+ __painting__: in our terms, the phase of rendering where RenderObjects make calls into the GraphicsContext API to make a visual representation of themselves
+ __rasterization__: in our terms, the phase of rendering where the bitmaps backing up RenderLayers are filled. This can occur immediately as GraphicsContext calls are made by the RenderObjects, or it can occur later if we’re using SkPicture record for painting and SkPicture playback for rasterization.
+ __compositing__: in our terms, the phase of rendering that combines RenderLayer’s textures into a final screen image
+ __drawing__: in our terms, the phase of rendering that actually puts pixels onto the screen (i.e. puts the final screen image onto the screen).
####Debug flags: `--show-composited-layer-borders`, ‘threaded compositing’, ‘threaded animation’
[chrome://flags](chrome://flags)
Using the --show-composited-layer-borders flag will display borders around layers, and uses colors to display information about the layers, or tiles within layers:
Green - The border around the outside of a composited layer.
Dark Blue - The border around the outside of a "render surface". Surfaces are textures used as intermediate targets while drawing the frame.
Purple - The border around the outside of a surface's reflection.
Cyan - The border around a tile within a tiled composited layer. Large composited layers are broken up into tiles to avoid using large textures.
Red - The border around a composited layer, or a tile within one, for which the texture is not valid or present. Red can indicate a compositor bug, where the texture is lost, but typically indicates the compositor has reached its memory limits, and the red layers/tiles were unable to fit within those limits.
###Optimize rendering
__60FPS__
chengmu revised this gist
Oct 24, 2013 . 1 changed file with 160 additions and 0 deletions.
@@ -84,11 +84,171 @@ Utilizing the GPU for these operations also provides parallelism between the CPU
RenderObjects are stored in a parallel tree structure, called the Render Tree. A RenderObject knows how to present (paint) the contents of the Node on a display surface. It does so by issuing the necessary draw calls to a GraphicsContext. A GraphicsContext is ultimately responsible for writing the pixels into a bitmap that gets displayed to the screen. In Chrome, the GraphicsContext wraps Skia, our 2D drawing library, and most GraphicsContext calls become calls to an SkCanvas or SkPlatformCanvas (see this document for more on how Chrome uses Skia).
RenderLayers exist so that the elements of the page are composited in the correct order to properly display overlapping content, semi-transparent elements, etc.
Notice that there isn't a one-to-one correspondence between RenderObjects and RenderLayers. A particular RenderObject is associated either with the RenderLayer that was created for it, if there is one, or with the RenderLayer of the first ancestor that has one.
RenderLayers form a tree hierarchy as well. The root node is the RenderLayer corresponding to the root element in the page and the descendants of every node are layers visually contained within the parent layer. The children of each RenderLayer are kept in two sorted lists, both sorted in ascending order: the negZOrderList, containing child layers with negative z-indices (and hence layers that go below the current layer), and the posZOrderList, containing child layers with positive z-indices (layers that go above the current layer).
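A minimal sketch of how the two sorted child lists could yield a back-to-front order. The `RenderLayer` class below is a hypothetical simplification (a name, a z-index and children only), not the WebKit class. Treating a z-index of 0 as part of the positive list is an assumption of this sketch, since the text only mentions negative and positive z-indices.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class RenderLayer:
    """Hypothetical simplified layer: a name, a z-index and child layers."""
    name: str
    z_index: int = 0
    children: List["RenderLayer"] = field(default_factory=list)

    def neg_z_order_list(self) -> List["RenderLayer"]:
        # Children that go below this layer, sorted ascending by z-index.
        return sorted((c for c in self.children if c.z_index < 0),
                      key=lambda c: c.z_index)

    def pos_z_order_list(self) -> List["RenderLayer"]:
        # Children that go above this layer, sorted ascending by z-index.
        # Assumption: z-index 0 is grouped with the positive list here.
        return sorted((c for c in self.children if c.z_index >= 0),
                      key=lambda c: c.z_index)


def paint_order(layer: RenderLayer) -> List[str]:
    """Return layer names in back-to-front order."""
    order: List[str] = []
    for child in layer.neg_z_order_list():
        order.extend(paint_order(child))
    order.append(layer.name)
    for child in layer.pos_z_order_list():
        order.extend(paint_order(child))
    return order


tooltip = RenderLayer("tooltip", z_index=5)
content = RenderLayer("content", z_index=1, children=[tooltip])
menu = RenderLayer("menu", z_index=2)
shadow = RenderLayer("shadow", z_index=-1)
root = RenderLayer("root", children=[menu, shadow, content])

print(paint_order(root))
# ['shadow', 'root', 'content', 'tooltip', 'menu']
```

Note that `tooltip` paints before `menu` even though its z-index is larger: it is ordered only within its parent `content`, which is the "atomic layer" behaviour of a stacking context.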
>In summary, there are conceptually three parallel tree structures in place that serve slightly different purposes for rendering:

The DOM tree, which is our fundamental retained model
The RenderObject tree, which has a 1:1 mapping to the DOM tree’s visible nodes. RenderObjects know how to paint their corresponding DOM nodes.
The RenderLayer tree, made up of RenderLayers that map to a RenderObject on the RenderObject tree. The mapping is many-to-one, as each RenderObject is either associated with its own RenderLayer or the RenderLayer of its first ancestor that has one. The RenderLayer tree preserves z-ordering amongst layers.
####Glossary
+ __bitmap__: a buffer of pixel values in memory (main memory or the GPU’s video RAM)
+ __texture__: a bitmap meant to be applied to a 3D model on the GPU

painting: in our terms, the phase of rendering where RenderObjects make calls into the GraphicsContext API to make a visual representation of themselves
rasterization: in our terms, the phase of rendering where the bitmaps backing up RenderLayers are filled. This can occur immediately as GraphicsContext calls are made by the RenderObjects, or it can occur later if we’re using SkPicture record for painting and SkPicture playback for rasterization.
compositing: in our terms, the phase of rendering that combines RenderLayer’s textures into a final screen image
drawing: in our terms, the phase of rendering that actually puts pixels onto the screen (i.e. puts the final screen image onto the screen).
####Debug flags: `--show-composited-layer-borders`, ‘threaded compositing’, ‘threaded animation’
[about:flags](about:flags)
Using the --show-composited-layer-borders flag will display borders around layers, and uses colors to display information about the layers, or tiles within layers:
Green - The border around the outside of a composited layer.
Dark Blue - The border around the outside of a "render surface". Surfaces are textures used as intermediate targets while drawing the frame.
Purple - The border around the outside of a surface's reflection.
Cyan - The border around a tile within a tiled composited layer. Large composited layers are broken up into tiles to avoid using large textures.
Red - The border around a composited layer, or a tile within one, for which the texture is not valid or present. Red can indicate a compositor bug, where the texture is lost, but typically indicates the compositor has reached its memory limits, and the red layers/tiles were unable to fit within those limits.
###Render Path: Software implementation
WebKit fundamentally renders a web page by traversing the RenderLayer hierarchy starting from the root layer.
Recall that the WebKit codebase contains two distinct code paths for rendering the contents of a page, the software path and hardware accelerated path.
###Render Path: hardware implementation
As the name suggests, the hardware accelerated path is there to make use of GPU acceleration for compositing some of the RenderLayer contents. Code for it lives behind the ACCELERATED_COMPOSITING compile-time flag.
####When the hardware accelerated path will be used
Chrome currently uses the hardware accelerated path when at least one of the page’s RenderLayers requires hardware acceleration, or when the --forced-compositing-mode flag is turned on. This flag is currently on by default in Chrome on Android and ChromeOS. Eventually this will also be the case for Chrome on other platforms. Safari on the Mac (and most likely iOS) follows the hardware accelerated path and makes heavy use of Apple's proprietary CoreAnimation API.
####What is the Compositor
In the hardware accelerated path, some (but not all) of the RenderLayers get their own backing surface (layers with their own backing surfaces are called compositing layers) into which they paint instead of drawing directly into the common bitmap for the page.
We still start with the RenderLayer tree and end up with a single bitmap, but this two-phase approach allows the compositor to perform additional work on a per-compositing-layer basis. For instance, the compositor is responsible for applying the necessary transformations (as specified by the layer's CSS transform properties) to each compositing layer’s bitmap before compositing it. Further, since painting of the layers is decoupled from compositing, invalidating one of these layers only results in repainting the contents of that layer alone and recompositing. __In contrast, with the software path, invalidating any layer requires repainting all layers (at least the overlapping portions of them) below and above it which unnecessarily taxes the CPU.__ Recall that in the software path there was a single GraphicsContext for the entire page. With accelerated compositing, we need a GraphicsContext for each compositing layer so that each layer can draw into a separate bitmap. Recall further that we conceptually already have a set of parallel tree structures, each more sparse than the last and responsible for a subtree of the previous: the DOM tree, the RenderObject tree, and the RenderLayer tree. With the introduction of compositing, we add an additional conceptual tree: the GraphicsLayer tree. Each RenderLayer either has its own GraphicsLayer (if it is a compositing layer) or uses the GraphicsLayer of its first ancestor that has one. This is similar to RenderObject’s relationship with RenderLayers. Each GraphicsLayer has a GraphicsContext for the associated RenderLayers to draw into. this could be quite wasteful in terms of memory (vram especially). 
In the current WebKit implementation, the following conditions are some of those that cause a RenderLayer to get its own compositing layer (see the CompositingReasons enum in RenderLayer.h for a longer list):
Layer has 3D or perspective transform CSS properties
Layer is used by `<video>` element using accelerated video decoding
Layer is used by a `<canvas>` element with a 3D context or accelerated 2D context
Layer is used for a composited plugin
Layer uses a CSS animation for its opacity or uses an animated webkit transform
Layer uses accelerated CSS filters
Layer with a composited descendant has information that needs to be in the composited layer tree, such as a clip or reflection
Layer has a sibling with a lower z-index which has a compositing layer (in other words the layer is rendered on top of a composited layer)

Significantly, this means that pages with composited RenderLayers will always render via the compositor. Other pages may or may not, depending on the status of the `--forced-compositing-mode` flag.
#####Code
Code related to the compositor lives inside WebCore, behind the USE(ACCELERATED_COMPOSITING) guards.
####The GPU comes into play:
With the addition of the accelerated compositor, in order to eliminate costly memory transfers, the final rendering of the browser's tab area is handled directly by the GPU.
The Compositor library is essentially using the GPU to composite rectangular areas of the page (i.e. all those compositing layers) into a single bitmap, which is the final page image.
##The GPU Process
__For the renderer process: to issue commands to the GPU.__
The GPU process exists primarily for security reasons.
>Restricted by its sandbox, the Renderer process (where WebKit and the compositor live) cannot directly issue calls to the 3D APIs provided by the OS (we use Direct3D on Windows, OpenGL everywhere else). For that reason we use a separate process to do the rendering. We call this process the GPU Process.
The GPU process is specifically designed to provide access to the system's 3D APIs from within the Renderer sandbox or the even more restrictive Native Client "jail".

Currently Chrome uses a single GPU process per browser instance, serving requests from all the renderer processes and any plugin processes. The GPU process, while single threaded, can multiplex between multiple command buffers, each one of which is associated with its own rendering context.

When a page renders via the compositor, all of its pixels are drawn directly onto the window via the GPU process. The compositor maintains a hierarchy of GraphicsLayers which is constructed by traversing the RenderLayer tree and updated as the page changes.

With the exception of WebGL and video layers, the contents of each of the GraphicsLayers are first drawn into a system memory bitmap (just as was the case in the software path): each RenderLayer asks all of its RenderObjects to paint themselves into the GraphicsLayer’s GraphicsContext, which is backed by a bitmap in shared system memory. This bitmap is then passed to the GPU process (using the resource transfer machinery explained above in the GPU Process section), and then the GPU process uploads the bitmap to the GPU as a texture. The compositor keeps track of which GraphicsLayers have changed since the last time they were drawn and only updates the textures as needed.

Once all the textures are uploaded to the GPU, rendering the contents of a page is simply a matter of doing a depth first traversal of the GraphicsLayer hierarchy and issuing a GL command to draw a texture quad for each layer with the previously-uploaded texture. A texture quad is simply a 4-gon (i.e. rectangle) on the screen that’s filled with the given texture (in our case, the relevant GraphicsLayer’s contents).

#####Code
The bulk of the Chromium implementation for the compositor lives in WebCore's platform/graphics/chromium directory.
The compositing logic is mostly in LayerRendererChromium.cpp and the implementations of the various composited layer types are in {Content|Video|Image}LayerChromium.cpp files.

###Summary
Now we know roughly how to draw a page using the compositor: the page is divided up into layers, layers are rasterized into textures, textures are uploaded to the GPU, and the compositor tells the GPU to put all the textures together into the final screen image.

###Optimize rendering
__60FPS__: the compositor has to manage all of this 60 times a second so that animation, scrolling, and other page interactions are smooth.

####Damage
WebKit keeps track of what parts of the screen need to be updated. The result is a damage rectangle whose coordinates indicate the part of the page that needs to be repainted. When painting, WebKit traverses the RenderLayer tree and only repaints the parts of each RenderLayer that intersect with the damage rect, skipping the layer entirely if it doesn’t overlap with the damage rect. This prevents us from having to repaint the entire page every time any part of it changes, an obvious performance win.

####Tiling
#####texture-size problem
+ __software__: the page is a bitmap, so a subregion is easy to change
+ __hardware__: the bitmap becomes a texture, which can't be changed partially

#####Solution:
Only paint and upload what’s currently visible, so the texture never needs to be larger than the viewport. Each layer is split up into tiles (currently of a fixed size, 256x256 pixels). We determine which parts of the layer are needed on the GPU and only paint + upload those tiles.

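
The tile selection described above can be illustrated with a small sketch. It assumes a layer made of fixed 256x256 tiles and a viewport given as `(x, y, width, height)` in layer coordinates; the function name `visible_tiles` and its parameters are hypothetical and are not taken from Chromium.

```python
TILE_SIZE = 256  # fixed tile edge in pixels, as stated in the notes above


def visible_tiles(layer_w, layer_h, viewport, tile_size=TILE_SIZE):
    """Return the (column, row) indices of tiles that intersect the viewport.

    `viewport` is (x, y, width, height) in layer coordinates.
    """
    vx, vy, vw, vh = viewport
    # Clamp the viewport to the layer bounds.
    left, top = max(vx, 0), max(vy, 0)
    right, bottom = min(vx + vw, layer_w), min(vy + vh, layer_h)
    if left >= right or top >= bottom:
        return []  # viewport does not overlap the layer at all
    first_col, last_col = left // tile_size, (right - 1) // tile_size
    first_row, last_row = top // tile_size, (bottom - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


# A 1024x4096 layer seen through a 1024x768 viewport, before and after
# scrolling down by 100 pixels.
before = set(visible_tiles(1024, 4096, (0, 0, 1024, 768)))
after = set(visible_tiles(1024, 4096, (0, 100, 1024, 768)))
newly_needed = sorted(after - before)
```

In this example only the tiles in `newly_needed` would have to be painted and uploaded after the scroll; the tiles already present keep their textures.
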
Texture streaming: keeps the GPU from stalling during a long upload.

####Case Study: Scroll
#####Software:
repaint the gap
#####Hardware (highlights the compositor's power)
+ very small scroll amount: the textures are already on the GPU, so the page is just redrawn; no painting needed
+ large scroll amount: only the newly exposed tiles need to be rasterized and uploaded; no other tiles are affected
+ prepainting: paint tiles ahead of time

###Threaded Compositor
####Software: runs on the main thread with everything else
####Hardware:

##Reference
+ [GPU Accelerated Compositing in Chrome](http://www.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome)
    + How Chromium's compositor works, from `cc::Layer` to `GPU Process`
    + with detailed description and a glossary
+ [Compositing in Blink and WebKit](http://www.youtube.com/watch?v=Lpk1dYdo62o)
    + summary and focus on Task 1

chengmu created this gist
Oct 24, 2013
@@ -0,0 +1,108 @@
##Compositing in Blink and Chrome

###base: layout and redraw
+ layout/repaint: need to recalculate the pixel info, like colors;
+ redraw: if components have their own backing stores, then nothing needs repainting during animation

###CSS basic: Stacking Contexts and Paint Order
+ Normal Flow
    + children are laid out according to inline-level, block-level, float, and other formatting
+ Relative Positioned Elements
    + laid out in normal flow, then offset with respect to their normal position
+ Absolute Positioned Elements
    + positioned with respect to containing block; not part of normal flow
+ Fixed-Position Elements
    + positioned with respect to viewport or other container; not part of normal flow
+ Z-index: allows control over how elements are ordered

####Stacking Context
>Positioned elements (relative, absolute, fixed) with a z-index group their subtree's layers into an isolated layer

>A stacking context 'flattens' the element's subtree, so nothing outside of the subtree can paint between elements of the subtree.

>In other words: the rest of the DOM tree can treat the `stacking context` as an __atomic conceptual layer__ for painting

#####Standards
[CSS2.1 9.9 Layered Presentation](http://www.w3.org/TR/CSS2/visuren.html#layers)
[CSS2.1 Appendix E. Elaborate description of Stacking Contexts](http://www.w3.org/TR/CSS2/zindex.html)

###Why Compositing
+ Avoid unnecessary repainting
    + components have their own backing stores, so nothing needs repainting while they animate
+ Makes some features more efficient or practical
    + scrolling, 3D CSS, opacity, filters, WebGL, hardware video decoding

###What is Compositing
####Tasks of Compositing
1. __Determine__ how to group contents into backing stores
2. __Paint__ the contents of each composited layer
3. __Draw__ the composited layers to make a final image

(in the context of rendering websites) __The use of multiple backing stores to cache and group chunks of the render tree__

####How Compositing Works
RenderObject Tree ===> RenderLayer Tree
many to one

##GPU In Chrome
>implementation of hardware-accelerated compositing in Chrome

Traditionally: browsers depend on the CPU to render web page content
Now: the GPU is involved in compositing the contents of a web page

####Benefits
Eliminating unnecessary (and very slow) copies of large data, especially copies from video memory to system memory.

In most cases, the GPU can achieve far better efficiency than the CPU (both in terms of speed and power draw) in drawing and compositing operations that involve large numbers of pixels as the hardware is designed specifically for these types of workloads.

Utilizing the GPU for these operations also provides parallelism between the CPU and GPU, which can operate at the same time to create an efficient graphics pipeline.

####Candidates for Optimizations
+ `<video>` element which is using hardware decoder
+ WebGL `canvas`
+ compositing of page layers

####How it works?
#####FIRST: understand the basic building blocks of how WebKit renders pages
RenderObjects are stored in a parallel tree structure, called the Render Tree. A RenderObject knows how to present (paint) the contents of the Node on a display surface. It does so by issuing the necessary draw calls to a GraphicsContext. A GraphicsContext is ultimately responsible for writing the pixels into a bitmap that gets displayed to the screen. In Chrome, the GraphicsContext wraps Skia, our 2D drawing library, and most GraphicsContext calls become calls to an SkCanvas or SkPlatformCanvas (see this document for more on how Chrome uses Skia).

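
The paint path described above (a RenderObject issues draw calls to a GraphicsContext, which writes pixels into a bitmap) can be modelled with a toy sketch. Everything here is an assumption made for illustration: the real GraphicsContext wraps Skia, and the `fill_rect` method, the integer "colors", and the list-of-lists bitmap are hypothetical simplifications.

```python
class GraphicsContext:
    """Toy context that writes pixels into an in-memory bitmap."""

    def __init__(self, width, height):
        self.width = width
        self.height = height
        self.bitmap = [[0] * width for _ in range(height)]

    def fill_rect(self, x, y, w, h, color):
        # Clip the rectangle to the bitmap, then write each covered pixel.
        for row in range(max(y, 0), min(y + h, self.height)):
            for col in range(max(x, 0), min(x + w, self.width)):
                self.bitmap[row][col] = color


class RenderObject:
    """Toy render-tree node that knows how to paint itself."""

    def __init__(self, x, y, w, h, color, children=()):
        self.rect = (x, y, w, h)
        self.color = color
        self.children = list(children)

    def paint(self, ctx):
        # Paint this object first, then its children on top of it.
        ctx.fill_rect(*self.rect, self.color)
        for child in self.children:
            child.paint(ctx)


# A 4x4 page background (color 1) with a 2x2 box (color 2) on top of it.
box = RenderObject(1, 1, 2, 2, color=2)
page = RenderObject(0, 0, 4, 4, color=1, children=[box])
ctx = GraphicsContext(4, 4)
page.paint(ctx)
```

After `page.paint(ctx)`, `ctx.bitmap` holds the final pixels: the child's color where the box covers the page and the background color elsewhere.
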
##Reference
+ [GPU Accelerated Compositing in Chrome](http://www.chromium.org/developers/design-documents/gpu-accelerated-compositing-in-chrome)
    + How Chromium's compositor works, from `cc::Layer` to `GPU Process`
+ [Compositing in Blink and WebKit](http://www.youtube.com/watch?v=Lpk1dYdo62o)
    + summary and focus on Task 1
    + Connecting `WebCore::RenderLayer` to `cc::Layer`
    + render layer tree to a composited layer
+ [Rendering in WebKit](http://www.youtube.com/watch?v=RVnARGhhs9w)
    + WebCore guts, including `WebCore::RenderLayer`
    + How rendering works, HTML to DOM tree to Render Tree
+ [Painting in Chromium](http://www.youtube.com/watch?v=A5-aXfSt-RA)
    + how painting works, Chromium and Skia side of painting

> Written with [StackEdit](https://stackedit.io/).