Skia Gardener Documentation

Contents

What does a Skia Gardener do?


A Skia Gardener keeps an eye on the tree, DEPS rolls, Gold tool, the Perf tool, and triages Chrome bugs.

Below is a brief summary of what the gardener does for each task:

Skia tree

Triage

You should triage Chromium, Skia, and OSS-fuzz bugs that show up under “Untriaged Bugs” on the status page. The Android Gardener will triage the untriaged Android Bugs. For a more detailed view of bugs see Skia Bugs Central.

To access the oss-fuzz bugs, see go/skia-fuzz.

Blamer

If you have Go installed, a command-line tool is available to search through git history and do text searches on the full patch text and the commit message. To install blamer run:

go get go.skia.org/infra/blamer/go/blamer

Then run blamer from within a Skia checkout. For example, to search if the string “SkDevice” has appeared in the last 10 commits:

$ $GOPATH/bin/blamer --match SkDevice --num 10

commit ea70c4bb22394c8dcc29a369d3422a2b8f3b3e80
Author: robertphillips <robertphillips@google.com>
Date:   Wed Jul 20 08:54:31 2016 -0700

    Remove SkDevice::accessRenderTarget virtual
    GOLD_TRYBOT_URL= https://gold.skia.org/search?issue=2167723002

    Review-Url: https://codereview.chromium.org/2167723002

Autorollers

  • Ensure that all AutoRollers listed on the status page are successfully landing.

Gold and Perf

  • Pay attention for new Perf and Gold alerts (by clicking on the bell at the top right of the status page).
  • The gardener’s duty here is to make sure that when developers introduce new images or new perf regressions, that they are aware of what happened, and they use these tools to take appropriate action.

Basic workflow for triaging Perf issues

  1. Click on the regressions link on the Skia Status page to navigate to Perf’s regressions page. screenshot of status skia emphasizing perf regressions

  2. The regressions page lists all currently untriaged regressing CLs. In this example, there is only one, an autoroll.

screenshot of perf list of regressions

  1. The columns to the right of each CL record its performance impact. A ∅ indicates no change, while a ? in the high or low column indicates a potential improvement or regression.

screenshot of perf with the ? symbols

  1. Clicking a ? opens a window that displays the statistical analysis of the performance change, and tools for investigating the change further:

    • Gauge the noise: Use the provided Least Squares Error and step size to determine if the change is a true regression or just statistical noise.

    • Investigate the CL: Click the linked CL ID (in this example 79169) to review the exact commit and see if it’s the root cause or, in the case of a roll, to dig into the upstream changes.

screenshot of perf regression window

  1. The “View on Dashboard” link is one of the most important tools for triaging performance regressions. Perf will automatically filter which benchmarks are affected by a CL; “View on Dashboard” graphs the performance of these benchmarks at each merged CL, marking the regressing CL with a red line (such as in this diagram). Each line is the test run on different devices. The vertical refers to some metric in performance (e.g. seconds) and the horizontal represents time (i.e. CLs). You may click the right hand check boxes to view or hide a line.

    When looking at this graph, keep the following in mind:

    • Expand the window range: Considering the trend helps determine if the regression is a true change or just a transient spike.
    • Look for broad impact: A true regression usually impacts multiple benchmarks.

the main skia perf graph showcasing the test under scrutiny across multiple devices.

  1. Once confident that a CL is truly regressing or not, mark the corresponding check ✓ (for “not an issue”) or x (for “this is an issue”) and fill out the justification text box. Selecting x will also take you to Buganizer to file a bug for the regression.

Screenshot of the Perf regression window with options to mark as not an issue or file a bug

Perf Tips

Responding to performance regressions is similar to responding to Gold regressions. If you see a performance regression from a change, you usually want to notify the person who made the patch.

Although statistical analysis triggers the Perf alerts, the underlying data can be very noisy. Often, a regression is simply noise or, on Android devices, thermal throttling.

Although the performance on some devices can be flaky, a regression will rarely regress only a single device, so pay attention to all the different devices.

For notifying someone of a regression:

  • If it’s a GPU driver issue -> notify the current GPU gardener
  • If it’s a patch from a Skia team member -> notify them
  • If it’s a roll from another Google team -> investigate further

Sometimes perf will find a regression that is very tiny; you can ignore these but use prudence.

Remember to zoom out and look at the overall trend of Perf, to see whether or not the issue is transient.

Documentation

  • Improve/update this documentation page for future gardeners, especially the Tips section.

In general, gardeners should have a strong bias towards actions that keep the tree green and then open; if a simple revert can fix the problem, the gardener should revert first and ask questions later.

Preparing for your rotation


Useful bookmarks

Chat rooms

  • Flutter Engine Sherriff room to watch for Flutter issues that are caused by Skia bugs or need assistance from our team.

View current and upcoming rotations


The list of Skia Gardeners is specified here. The gardeners widget on the status page also displays the current gardeners.

How to swap rotation shifts


If you need to swap shifts with someone (because you are out sick or on vacation), please get approval from the person you want to swap with and directly make the swap via the rotations page.

Tips for Skia Gardeners


When to file bugs

Pay close attention to the “Failures” view in the status page. Look at all existing BreakingTheBuildbots bugs. If the list is kept up to date then it should accurately represent everything that is causing failures. If it does not, then please file/update bugs accordingly.

How to close or re-open the tree

  1. Go to tree-status.skia.org.
  2. Change the status.
  • To close the tree, include the word “closed” in the status.
  • To open the tree, include the word “open” in the status.
  • To caution the tree, include the word “caution” in the status.

How to submit when the tree is closed

  • Submit manually using the “git cl land” with the –bypass-hooks flag.
  • Add “No-Tree-Checks: true” to your CL description and use the CQ as usual.

How to revert a CL

See the revert documentation here.

What to do if DEPS roll fails to land

A common cause of DEPS roll failures are layout tests. Find the offending Skia CL by examining the commit hash range in the DEPS roll and revert (or talk to the commit author if they are available). If you do revert then keep an eye on the next DEPS roll to make sure it succeeds.

If a Skia CL changes layout tests, but the new images look good, the tests need to be rebaselined. See Rebaseline Layout Tests.

Rebaseline Layout Tests (i.e., add suppressions)

  • First create a Chromium bug:

    • goto crbug.com
    • Make sure you’re logged in with your Chromium credentials
    • Click “New Issue”
    • Summary: “Skia image rebaseline”
    • Description:
      • DEPS roll #,
      • Helpful message about what went wrong (e.g., “Changes to how lighting is scaled in Skia r#### changed the following images:”)
      • Layout tests affected
      • You should copy the list of affected from stdio of the failing bot
    • Status: Assigned
    • Owner: yourself
    • cc: bsalomon@, robertphillips@ & developer responsible for changes
    • Labels: OS-All & Cr-Blink-LayoutTests
    • If it is filter related, cc senorblanco@
  • (Dispreferred but faster) Edit skia/skia_test_expectations.txt

    • Add # comment about what has changed (I usually paraphrase the crbug text)
    • Add line(s) like the following after the comment:
      • crbug.com/<bug#youjustcreated> foo/bar/test-name.html [ ImageOnlyFailure ]
    • Note: this change is usually done in the DEPS roll patch itself
  • (Preferred but slower) Make a separate Blink patch by editing LayoutTests/TestExpectations

    • Add # comment about what has changed (I usually paraphrase the crbug text)
    • Add line(s) like the following after the comment:
      • crbug.com/<bug#youjustcreated> foo/bar/test-name.html [ Skip ] # needs rebaseline
    • Commit the patch you created and wait until it lands and rolls into Chrome
  • Retry the DEPS roll (for the 1st/dispreferred option this usually means just retrying the layout bots)

  • Make a Blink patch by editing LayoutTests/TestExpectations

    • Add # comment about what has changed
    • Add line(s) like the following after the comment:
      • crbug.com/<bug#youjustcreated> foo/bar/test-name.html [ Skip ] # needs rebaseline
        • (if you took the second option above you can just edit the existing line(s))
  • If you took the first/dispreferred option above:

    • Wait for the Blink patch to roll into Chrome
    • Create a Chrome patch that removes your suppressions from skia/skia_test_expectations.txt