Hacking on BuildStream
Some tips and guidelines for developers hacking on BuildStream
Major feature additions should be proposed on the mailing list before being considered for inclusion. We strongly recommend proposing in advance of commencing work.
New features must be well documented and tested either in our main test suite if possible, or otherwise in the integration tests.
It is expected that the individual submitting the work take ownership of their feature within BuildStream for a reasonable timeframe of at least one release cycle after their work has landed on the master branch. This is to say that the submitter is expected to address and fix any side effects and bugs which may have fallen through the cracks in the review process, giving us a reasonable timeframe for identifying these.
Branches must be submitted as merge requests in GitLab and should usually be associated with an issue report on GitLab.
Commits in the branch which address specific issues must specify the issue number in the commit message.
Merge requests that are not yet ready for review must be prefixed with the
WIP: identifier. A merge request is not ready for review until the
submitter expects that the patch is ready to actually land.
Submitted branches must not contain a history of the work done in the feature branch. Please use git’s interactive rebase feature in order to compose a clean patch series suitable for submission.
We prefer that test case and documentation changes be submitted in separate commits from the code changes which they test.
Ideally every commit in the history of master passes its test cases. This makes bisections easier to perform, but is not always practical with more complex branches.
Commit messages must be formatted with a brief summary line, optionally followed by an empty line and then a free form detailed description of the change.
The summary line must start with what changed, followed by a colon and a very brief description of the change.
If there is an associated issue, it must be mentioned somewhere in the commit message.
element.py: Added the frobnicator so that foos are properly frobbed.

The new frobnicator frobnicates foos all the way throughout
the element. Elements that are not properly frobnicated raise
an error to inform the user of invalid frobnication rules.

This fixes issue #123
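The commit message rules above can be checked mechanically. The following is a minimal sketch only, not a tool that BuildStream ships: the function name check_commit_message and the regular expression are assumptions made for illustration.

```python
import re

# Assumed pattern for the summary line: "<what changed>: <brief description>"
SUMMARY_RE = re.compile(r"^\S[^:]*: \S.*$")


def check_commit_message(message):
    """Return a list of problems found in a commit message (empty if none)."""
    problems = []
    lines = message.splitlines()

    if not lines or not SUMMARY_RE.match(lines[0]):
        problems.append("summary line must look like 'what changed: brief description'")

    # If a detailed description follows, it must be separated by an empty line
    if len(lines) > 1 and lines[1].strip():
        problems.append("summary line must be followed by an empty line")

    return problems


example = """element.py: Added the frobnicator so that foos are properly frobbed.

The new frobnicator frobnicates foos all the way throughout
the element.

This fixes issue #123
"""

print(check_commit_message(example))
print(check_commit_message("Fixed some stuff"))
```

The sketch does not check for an issue number, since not every commit has an associated issue.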
Coding style details for BuildStream
Python coding style for BuildStream is pep8, which is documented here: https://www.python.org/dev/peps/pep-0008/
We have a couple of minor exceptions to this standard; for instance, we don’t want to compromise code readability by being overly restrictive on line length.
The pep8 linter will run automatically when running the test suite.
Module imports inside BuildStream are done with relative . notation.
Good:
from .context import Context
Bad:
from buildstream.context import Context
The exception to the above rule is when authoring plugins: plugins do not reside in the same namespace, so they must address buildstream in the imports.
An element plugin will derive from Element by importing:
from buildstream import Element
When importing utilities specifically, don’t import function names from there; instead, import the module itself:
from . import utils
This makes things clear when reading code that said functions are not defined in the same file but come from utils.py for example.
Policy for private symbols
Private symbols are expressed via a leading _ single underscore, or in some special circumstances with a leading __ double underscore.
Before understanding the naming policy, it is first important to understand that in BuildStream, there are two levels of privateness which need to be considered.
These are treated subtly differently and thus need to be understood:
A symbol is considered to be API private if it is not exposed in the public API.
Even if a symbol does not have any leading underscore, it may still be API private if the containing class or module is named with a leading underscore.
A symbol is considered to be local private if it is not intended for access outside of the defining scope.
If a symbol has a leading underscore, it might not be local private if it is declared on a publicly visible class, but needs to be accessed internally by other modules in the BuildStream core.
For better readability and consistency, we try to keep private symbols below public symbols. In the case of public modules where we may have a mix of API private and local private symbols, API private symbols should come before local private symbols.
Any private symbol must start with a single leading underscore for two reasons:
- So that it does not bleed into documentation and public API.
- So that it is clear to developers which symbols are not used outside of the declaring scope
Remember that with python, the modules (python files) are also symbols within their containing package. As such, modules which are entirely private to BuildStream are named accordingly, with a leading underscore.
Cases for double underscores
The double underscore in python has a special function. When declaring a symbol in class scope which has a leading double underscore, it can only be accessed within the class scope using the same name. Outside of class scope, it can only be accessed with a cheat.
We use the double underscore in cases where the type of privateness can be ambiguous.
For private modules and classes
We never need to disambiguate with a double underscore.
For private symbols declared in a public scope
In the case that we declare a private method on a public object, it becomes ambiguous whether:
- The symbol is local private, and only used within the given scope
- The symbol is API private, and will be used internally by BuildStream from other parts of the codebase.
In this case, we use a single underscore for API private methods which are not local private, and we use a double underscore for local private methods declared in public scope.
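A short sketch of how this policy looks in practice. The class and method names are hypothetical and only illustrate the naming rules; the "cheat" mentioned earlier is python's name mangling, which makes a double underscore symbol reachable from outside the class only under a _ClassName prefix.

```python
class Element:
    """Stand-in for a public class (hypothetical, for illustration only)."""

    def __init__(self):
        self.__cache = {}  # local private: only used inside this class

    # API private: may be called from other core modules,
    # but is not part of the public API.
    def _update_state(self):
        return self.__compute("state")

    # Local private: name mangling hides it outside of the class scope.
    def __compute(self, key):
        return self.__cache.setdefault(key, key.upper())


element = Element()
print(element._update_state())          # reachable, by convention core only
print(hasattr(element, "__compute"))    # the plain name is not visible
print(element._Element__compute("x"))   # the "cheat": the mangled name
```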
Documenting private symbols
Any symbol which is API private (regardless of whether it is also local private) should have some documentation for developers to better understand the codebase.
Contrary to many other python projects, we do not use docstrings to document private symbols. Instead, we document API private symbols in code comments placed above the symbol (or beside the symbol in some cases, such as variable declarations in a class where a shorter comment is more desirable).
Other than this detail, follow the same guidelines for documenting symbols as described below.
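As an illustration of the comment based style, here is a minimal sketch. The class, its method and the exact comment layout are hypothetical; only the principle (comments above the symbol, no docstring) comes from the policy above.

```python
class Frobnicator:
    """Public class, documented with a regular docstring."""

    def __init__(self):
        self._rules = []   # API private list of frobnication rules

    # _add_rule()
    #
    # Registers a frobnication rule. This is API private, so it is
    # documented in a comment above the symbol and has no docstring.
    #
    # Args:
    #    rule (str): The rule to register
    #
    # Returns:
    #    (int): The number of rules registered so far
    #
    def _add_rule(self, rule):
        self._rules.append(rule)
        return len(self._rules)


print(Frobnicator._add_rule.__doc__)
```

Since the private method carries no docstring, its __doc__ attribute is empty and nothing about it can leak into generated documentation.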
BuildStream starts out as a documented project from day one and uses sphinx to document itself.
Documentation formatting policy
The BuildStream documentation style is as follows:
- Titles and headings require two leading empty lines above them. Only the first word should be capitalized.
  - If there is an .. _internal_link anchor, there should be two empty lines above the anchor, followed by one leading empty line.
- Within a section, paragraphs should be separated by one empty line.
- Notes are defined using .. note:: blocks, followed by an empty line and then indented (3 spaces) text.
- Code blocks are defined using .. code:: LANGUAGE blocks, followed by an empty line and then indented (3 spaces) text. Note that the default language is python.
- Cross references should be of the form
  - To cross reference arbitrary locations with, for example, the anchor _anchor_name, you must give the link an explicit title: :ref:`Link text <anchor_name>`. Note that the “_” prefix is not required.
For further information, please see the Sphinx Documentation.
The documentation build is not integrated into the setup.py, and it is difficult (or impossible) to do so, so there is a little bit of setup you need to take care of first.
Before you can build the BuildStream documentation yourself, you need to first install sphinx along with some additional plugins and dependencies, using pip or some other mechanism:
# Install sphinx
pip3 install --user sphinx

# Install some sphinx extensions
pip3 install --user sphinx-click
pip3 install --user sphinx_rtd_theme

# Additional optional dependencies required
pip3 install --user arpy
Furthermore, the documentation build requires that BuildStream itself be installed.
To build the documentation, just run the following:
make -C doc
This will give you a
doc/build/html directory with the html docs which
you can view in your browser locally to test.
Unfortunately it is quite difficult to integrate the man pages build into the setup.py. As such, whenever the frontend command line interface changes, the static man pages should be regenerated and committed with that.
To do this, first ensure you have click_man installed, possibly with:
pip install --user click_man
Then, in the toplevel directory of buildstream, run the following:
python3 setup.py --command-packages=click_man.commands man_pages
And commit the result, ensuring that you have added anything in the man/ subdirectory, which will be automatically included in the buildstream distribution.
We use the sphinx.ext.napoleon extension for the purpose of having slightly nicer docstrings than the default sphinx docstrings.
A docstring for a method, class or function should have the following format:
"""Brief description of entity

Args:
   argument1 (type): Description of arg
   argument2 (type): Description of arg

Returns:
   (type): Description of returned thing of the specified type

Raises:
   (SomeError): When some error occurs
   (SomeOtherError): When some other error occurs

A detailed description can go here if one is needed, only
after the above part documents the calling conventions.
"""
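To make the template concrete, here is a small sketch of a function documented in this format. The function frobnicate and its behaviour are made up for illustration and do not exist in BuildStream.

```python
def frobnicate(foos, strict=False):
    """Frobnicate a list of foos

    Args:
       foos (list): The foos to frobnicate
       strict (bool): Whether to raise an error on invalid foos

    Returns:
       (list): The frobnicated foos, in the original order

    Raises:
       (ValueError): When strict is set and a foo is not a string

    Invalid foos are silently skipped unless strict is set.
    """
    result = []
    for foo in foos:
        if not isinstance(foo, str):
            if strict:
                raise ValueError("Cannot frobnicate {!r}".format(foo))
            continue
        result.append(foo.upper())
    return result


print(frobnicate(["foo", 42, "bar"]))
```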
BuildStream uses pytest for regression tests and testing out the behavior of newly added components.
The elaborate documentation for pytest can be found here: http://doc.pytest.org/en/latest/contents.html
Don’t get lost in the docs if you don’t need to; follow existing examples instead.
To run the tests, just run ./setup.py test at the toplevel.
When debugging a test, it can be desirable to see the stdout and stderr generated by a test. To do this, use the --addopts option to feed arguments to pytest as such:
./setup.py test --addopts -s
You can always abort on the first failure by running:
./setup.py test --addopts -x
If you want to run a specific test or a group of tests, you can specify a prefix to match. E.g. if you want to run all of the frontend tests you can do:
./setup.py test --addopts '-k tests/frontend/'
We also have a set of slow integration tests that are disabled by default - you will notice most of them marked with SKIP in the pytest output. To run them, you can use:
./setup.py test --addopts '--integration'
By default, buildstream also runs pylint on all files. Should you want to run just pylint (these checks are a lot faster), you can do so with:
./setup.py test --addopts '-m pylint'
Alternatively, any IDE plugin that uses pylint should automatically detect the .pylintrc in the project’s root directory.
Tests are found in the tests subdirectory, inside of which there is a separate directory for each domain of tests. All tests are collected as:
If the new test is not appropriate for the existing test domains, then simply create a new directory for it under the tests subdirectory.
Various tests may include data files to test on; there are examples of this in the existing tests. When adding data for a test, create a subdirectory beside your test in which to store data.
When creating a test that needs data, use the datafiles extension to decorate your test case (again, examples exist in the existing tests for this). Documentation on the datafiles extension can be found here: https://pypi.python.org/pypi/pytest-datafiles
Tests that run a sandbox should be decorated with:
and use the integration cli helper.
Measuring BuildStream performance
BuildStream has a utility to measure performance which is available from a separate repository at https://gitlab.com/BuildStream/benchmarks. This tool allows you to run a fixed set of workloads with multiple versions of BuildStream. From this you can see whether one version performs better or worse than another, which is useful when looking for regressions and when testing potential optimizations.
For full documentation on how to use the benchmarking tool see the README in the ‘benchmarks’ repository.
When looking for ways to speed up the code you should make use of a profiling tool.
Python provides cProfile which gives you a list of all functions called during execution and how much time was spent in each function. Here is an example of running bst --help under cProfile:
python3 -m cProfile -o bst.cprofile -- $(which bst) --help
You can then analyze the results interactively using the ‘pstats’ module:
python3 -m pstats ./bst.cprofile
For more detailed documentation of cProfile and ‘pstats’, see: https://docs.python.org/3/library/profile.html.
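Besides the interactive browser, a profile file can be analyzed from a script. The sketch below is only an illustration: the workload function and the file name are stand-ins, and the profile is generated in-process so that the example is self-contained. A file written by python3 -m cProfile -o can be loaded in the same way.

```python
import cProfile
import io
import os
import pstats
import tempfile


def workload():
    # Stand-in for the code being profiled (hypothetical)
    return sum(i * i for i in range(10000))


# Write a profile to a file, in the same format that
# "python3 -m cProfile -o FILE" produces
profile_path = os.path.join(tempfile.mkdtemp(), "example.cprofile")
profiler = cProfile.Profile()
profiler.runcall(workload)
profiler.dump_stats(profile_path)

# Load the file and report the most expensive calls by cumulative time
report = io.StringIO()
stats = pstats.Stats(profile_path, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time tends to surface the high level entry points first, which is usually a good starting point before drilling down.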
For a richer visualisation of the callstack you can try Pyflame. Once you have followed the instructions in Pyflame’s README to install the tool, you can profile bst commands as in the following example:
pyflame --output bst.flame --trace bst --help
You may see an Unexpected ptrace(2) exception: error. Note that the bst operation will continue running in the background in this case, so you will need to wait for it to complete or kill it. Once this is done, rerunning the above command appears to fix the issue.
Once you have output from pyflame, you can use the flamegraph.pl script from the Flamegraph project to generate an .svg image:
./flamegraph.pl bst.flame > bst-flamegraph.svg
The generated SVG file can then be viewed in your preferred web browser.
Profiling specific parts of BuildStream with BST_PROFILE
BuildStream can also turn on cProfile for specific parts of execution using BST_PROFILE.
BST_PROFILE can be set to a section name, or ‘all’ for all sections. There is a list of topics in buildstream/_profile.py. For example, running:
BST_PROFILE=load-pipeline bst build bootstrap-system-x86.bst
will produce a profile in the current directory for the time taken to call most of initialized, for each element. These profile files are in the same cProfile format as those mentioned in the previous section, and can be analysed with pstats or pyflame.
Profiling the artifact cache receiver
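To give an idea of how an environment variable can switch on profiling for a single section, here is a rough sketch. It is not the code in buildstream/_profile.py: the helper profile_topic, the ":" separator for multiple topics and the output file naming are all assumptions made for this example.

```python
import cProfile
import contextlib
import os
import tempfile


@contextlib.contextmanager
def profile_topic(topic, directory):
    """Profile the enclosed block if topic is selected in BST_PROFILE."""
    # Assumption: multiple topics are separated by ":" in this sketch
    selected = os.environ.get("BST_PROFILE", "").split(":")
    if topic not in selected and "all" not in selected:
        yield None
        return

    profiler = cProfile.Profile()
    profiler.enable()
    try:
        yield profiler
    finally:
        profiler.disable()
        # Hypothetical file naming, chosen for this sketch only
        profiler.dump_stats(os.path.join(directory, topic + ".cprofile"))


output_dir = tempfile.mkdtemp()
os.environ["BST_PROFILE"] = "load-pipeline"

with profile_topic("load-pipeline", output_dir):
    sorted(range(1000), reverse=True)   # stand-in for real work

print(os.listdir(output_dir))
```

Sections whose topic is not selected run without any profiler attached, so the overhead stays limited to the part being investigated.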
Since the artifact cache receiver is not normally run directly, it’s necessary to alter the ForceCommand part of sshd_config to enable profiling. See the main documentation in doc/source/artifacts.rst for general information on setting up the artifact cache. It’s also useful to change directory to a logging directory before starting bst-artifact-receive with profiling on.
This is an example of a ForceCommand section of sshd_config used to obtain profiles:
Match user artifacts
  ForceCommand BST_PROFILE=artifact-receive cd /tmp && bst-artifact-receive --pull-url https://example.com/ /home/artifacts/artifacts
The MANIFEST.in and setup.py
When adding a dependency to BuildStream, it’s important to update the setup.py accordingly.
When adding data files which need to be discovered at runtime by BuildStream, update setup.py accordingly.
When adding data files for the purpose of docs or tests, or anything that is not covered by setup.py, update the MANIFEST.in accordingly.
At any time, running the following command to create a source distribution should result in creating a tarball which contains everything we want it to include: