The seL4 Bitfield Generator
The seL4 kernel code and the libsel4 library use the so-called bitfield generator to automatically produce code and formal proofs for tagged unions and structs of bitfields from a higher-level specification.
The code for this tool is available in the seL4 repository in the tools directory. This page contains the manual of the bitfield generator. The manual sources are also available for the tools directory.
Bitfield Generator Manual
Introduction
The tool bitfield_gen.py
in the seL4/tools/
directory generates bitfield
accessor code in C with formal specifications and proofs in Isabelle/HOL.
It takes specifications of bitfield layouts in memory such as the following
block VMFault {
field address 32
field FSR 5
padding 7
field instructionFault 1
padding 15
field seL4_FaultType 4
}
and generates a C struct with constructor and getter/setter functions with call-by-value and call-by-pointer semantics. It also, separately, generates a corresponding data type definition in Isabelle/HOL with corresponding functions in HOL for that data type, as well as proofs that the generated C code correctly implements these functions. This includes absence of undefined behaviour in C.
The purpose of the tool is to provide bit-level access to structures in memory with precise, deterministic and reliable layout semantics, with proofs of correctness, and without the need for manual masking and shifting operations. The C standard leaves C bitfields too underspecified to be fit for that purpose.
In addition to basic blocks of bitfields such as above, the tool supports tagged unions and various layout options for these blocks.
Bit Field Generator by Example
This section walks through the features of the bitfield specification languages by example. The full syntax is described in the next section for reference.
Blocks
The basic concepts of the specification languages are blocks which contain a
list of fields. The example used in the introduction declares a block with
name VMFault
, which contains four fields and two padding entries:
block VMFault {
field address 32
field FSR 5
padding 7
field instructionFault 1
padding 15
field seL4_FaultType 4
}
This will generate a type VMFault_t
in C. Assuming that the word size of the
underlying architecture is 32 bits, this VMFault_t
type will have a size of
two machine words. The sizes of the content and padding fields in a block must
add up to a multiple of the machine word size.
The bits in the block are laid out in memory in the order they are mentioned in
the specification. That is, the left-most 32 bits (i.e. bits 63 to 32) of
VMFault
will correspond to the address
field, the next 5 bits to FSR
, the
next 7 bits are inaccessible padding which is set to 0 on creation of the
bitfield, then one bit for instructionFault
, 15 bits of padding, and finally 4
bits (bits 0 to 3 of the entire type) for seL4_FaultType
.
The following diagram corresponds to the block specification above.
│------------------------ address ----------------------------│
┌─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┐
│ │ ..
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
63 32
instruction fault seL4_FaultType
│-- FSR --│ │-| │-------|
┌─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┬─┐
.. │ │ │ │ │ │
└─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┴─┘
31 27 19 3 0
For creating new values of the type VMFault
, the generator will provide a
function VMFault
that takes an argument of the word size of the machine for
the value of each field – for the example above, this is uint32_t
VMFault_t VMFault(uint32_t address, uint32_t FSR, uint32_t instructionFault,
uint32_t seL4_FaultType);
Values that do not fit in the bit size of the target field will be cut off
accordingly, i.e., as if using val & mask(size)
.
To read the value of each field, the generator will provide the following functions:
uint32_t VMFault_get_address(VMFault_t vm_fault);
uint32_t VMFault_get_FSR(VMFault_t vm_fault);
uint32_t VMFault_get_instructionFault(VMFault_t vm_fault);
uint32_t VMFault_get_seL4_FaultType(VMFault_t vm_fault);
For setting one of the fields in an existing value of type VMFault_t
:
VMFault_t VMFault_set_address(VMFault_t vm_fault, uint32_t address);
VMFault_t VMFault_set_FSR(VMFault_t vm_fault, uint32_t FSR);
VMFault_t VMFault_set_instructionFault(VMFault_t vm_fault, uint32_t instructionFault);
VMFault_t VMFault_set_seL4_FaultType(VMFault_t vm_fault, uint32_t seL4_FaultType);
All of the above (constructor, getter, setter) are pure functions without side
effects or dependency on global state. They do contain assert
statements that
are in effect in debug configurations. The generator declares all of them as
static inline CONST
to enable corresponding compiler optimisations.
Often we expect such bitfields to be accessed via pointers, e.g. as part of a buffer or a larger struct in memory or for instance a page table. Therefore the generator also produces getter/setter functions with pointer access:
uint32_t VMFault_ptr_get_address(VMFault_t *vm_fault);
uint32_t VMFault_ptr_get_FSR(VMFault_t *vm_fault);
uint32_t VMFault_ptr_get_instructionFault(VMFault_t *vm_fault);
uint32_t VMFault_ptr_get_seL4_FaultType(VMFault_t *vm_fault);
These are declared PURE
, but not CONST
, since they depend on global state.
The pointer set functions are:
void VMFault_set_address(VMFault_t *vm_fault, uint32_t address);
void VMFault_set_FSR(VMFault_t *vm_fault, uint32_t FSR);
void VMFault_set_instructionFault(VMFault_t *vm_fault, uint32_t instructionFault);
void VMFault_set_seL4_FaultType(VMFault_t *vm_fault, uint32_t seL4_FaultType);
These are neither PURE
nor CONST
, but still declared as static inline
.
Note that the generator has a command line option
--prune
to skip functions that are not used, and the seL4 build system makes use of that option. That means, not necessarily all of the combinations above will appear in the generated files for seL4, only the ones that are actually used in the rest of the code.
Field High
The usual access patterns for fields is that if we provide e.g. a value of 5 to
a setter function, we expect a corresponding 8-bit field to be set to the bit
pattern 0000 0101
.
In general, on a 32-bit architecture, we are providing a 32-bit value to store in an n-bit field. One way of looking at this is that we know the top 32-n bits of the value are always 0.
Sometimes we instead know that the bottom n bits of the value we provide are
- This can happen for instance when we want to store an address that is always
aligned to a page boundary, as it would in a page table entry. This means we are
interested in the high bits of the address. The
field_high
tag allows us to store only the significant high bits of such addresses.
For instance the base pointer stored in a page table cap in seL4 for 32-bit RISC-V is such a case:
block page_table_cap {
field capPTMappedASID 9
field_high capPTBasePtr 20
padding 3
padding 7
field capPTIsMapped 1
field_high capPTMappedAddress 20
field capType 4
}
The generated access methods of the field_high capPTBasePtr
have the same type
as the ones for field
, but only the top 20 bits of the provided value will be
stored in the setter methods as opposed to the bottom 20 bits for normal fields.
The getter methods will automatically correctly return a value that has the top 20 bits set to the stored value and the bottom 12 bits set to 0. That is, getter and setter remain inverses of each other as long as the values provided to the setter have the bottom 12 bits set to 0.
On most 64-bit architectures, the top bits of pointers are also determined,
because not all 64 bits are available (e.g. only 48 of them on x64). The
generator assumes field_high
values to be used for pointers and to need to store
only the significant bits of the pointer, along with potential sign extension.
See the Architecture Parameters section below for details.
For instance, if the platform is declared to have base 64 and 48 significant bits, and we have a pointer that we know is aligned to 12 bits, we need to store only 48-12 = 36 bits to recreate the correct original pointer value. That means if we declare
block store_more {
field some_info 16
field_high ptr 36
field other_info 12
}
we can store a 12-aligned pointer with sm = store_more_set_ptr(sm, p)
and
extract it again with p = store_more_get_ptr(sm)
without losing information.
This means, we can store 28 bits of additional information in the space that
would be taken up by a single pointer. For this it does not matter where in the
store_more
block the ptr
field is declared, just that it is a field_high
that stores 36 bits (on a 64 bit base platform with 48 significant bits).
Visible Field Order
Sometimes it is useful for the constructor to take its arguments in a particular order that is different from the field layout order in memory. It is possible to specify such an order, enumerating the constructor field order in parentheses after the block name:
block page_table_cap(capPTBasePtr, capPTMappedASID) {
field capPTMappedASID 9
field_high capPTBasePtr 20
padding 3
}
All fields must be enumerated, including the tag field in tagged unions described in the next section.
Only the constructor is affected, the memory layout as well as getters and setters remain the same.
Tagged Unions
Multiple blocks that share a type field can be aggregated as a type in C into a tagged union. For example to express an algebraic data type such as the following
datatype cap = NullCap | ObjectCap (ref : "20 word")
in C, one might specify the following:
block null_cap {
padding 28
cap_type 4
}
block object_cap {
field ref 20
padding 8
cap_type 4
}
tagged_union cap cap_type {
tag null_cap 0
tag object_cap 1
}
Note that both of the block declarations for
null_cap
andobject_cap
contain a field with the namecap_type
at the same bit position with the same size, and that all blocks have the same size. We will later relax the size of the tag field, but the tool does require all blocks in a tagged union to be of the same size and the tag field to always be at the same position.
The fragment tagged_union cap cap_type
means that we want the generator to
produce a C type cap_t
that is a tagged union, and that we want to use
cap_type
as the tag field that distinguishes which variant of the union we are
operating on.
Listing tag null_cap 0
means that we want the block null_cap
to be part of
the tagged union and that the value 0
in field cap_type
will indicate that
the rest of the fields are null_cap
fields. Similarly tag object_cap 1
means that a value of 1
in the tag field makes the fields of the block
object_cap
available.
This mapping must be unique, i.e., it is an error for a block to be mentioned in more than one tagged union.
If a block is mentioned in a tagged union, its memory layout stays the same as
before, but the code that is generated for it changes types to represent the
tagged union instead. All getters and setters for the block will be prefixed
with the name of the union, and instead of the block type, they will take and
return the union type instead (e.g. cap_t
instead of object_cap_t
or
null_cap_t
). The tag field will be omitted from the constructor, because it is
determined by the constructor name, and no setter methods for the tag field will
be generated. Instead there will be a generic getter function for the tag field
which does not need to know which variant of the type it is presented with. It
returns an enum whose names correspond to the block names and whose values
correspond to the associated tag values.
For instance for the specification above, we will get:
// variant tags:
enum cap_tag {
cap_null_cap = 0,
cap_object_cap = 1
};
typedef enum cap_tag cap_tag_t;
// discrimator:
cap_tag_t cap_get_cap_type(cap_t cap);
// constructors:
cap_t cap_null_cap_new(void);
cap_t cap_object_cap_new(uint64_t ref);
// getters and setters for object_cap variant:
uint64_t cap_object_cap_get_ref(cap_t cap);
cap_t cap_object_cap_set_ref(cap_t cap, uint64_t ref);
uint64_t cap_object_cap_ptr_get_ref(cap_t *cap);
void cap_object_cap_ptr_set_ref(cap_t *cap, uint64_t ref);
The getter names are constructed as <union>_<block>_get_<field>
. Setters and
pointer versions have names with set
, ptr_get
, and ptr_set
respectively.
All functions for the tagged union cap
will start with cap_
.
It is a runtime error to use a getter or setter function on the wrong block variant. That means it is the caller’s responsibility to know or check which variant a value of
cap_t
represents before using e.g.cap_object_cap_get_ref()
on it. The tool generatesassert
statements in the getters and setters. Common idioms are to check the tag in aswitch
orif
statement and work on the contents accordingly.
Advanced Tag Fields
Tag Slices
Occasionally the tag value for a tagged union cannot be encoded in a single field of a block, for instance when the hardware prescribes using e.g. bits 0 and 5..6 for the type of a structure and bits 1..4 for something else.
An example would be the following
block null {
padding 25
field typ_h 2
padding 4
field typ_l 1
}
block content {
padding 25
field typ_h 2
field some 4
field typ_l 1
}
tagged_union example ex_type(typ_h, typ_l) {
null (0,0)
content (2,0)
}
Instead of a single tag field, the example
union has two tag fields. We also
say that the tag ex_type
has two slices typ_h
and typ_l
. The same
restrictions as for normal tags apply: they must fit the tag values and all tag
slice fields must be mentioned in all blocks of the union.
The fragment ex_type(typ_h, typ_l)
means that we want the discriminator
function to be called example_get_ex_type
and that the tag is made up of the
two fields (typ_h, typ_l)
. Any positive number of fields is allowed. The
generated enumeration will still be called example_tag
as it would be for a
single tag, and it will still contain a name for each alternative bound to a
value unique for that variant. (The values consist of the concatenation of the
bit pattern of the tag slice values).
Similarly to a single tag field, the tool will not generate getters and setters for any of the tag fields mentioned in the union specification.
The values in the null (0,0)
and content (2,0)
are positional, i.e., the
example assigns the bit pattern 0b10
in field typ_h
and 0
in field typ_l
to mean the content
block and 0
in both to mean the null
block.
Masks
The restriction that all tag fields have to have the same size can be relaxed slightly. One can declare multiple possible tag sizes together with a bit mask each that determines which of these sizes applies.
Say for instance that we have more than 16 different cap
types, and that some
of the cap
alternatives only have 4 bits of storage left to store the cap tag,
but others have further bits available, e.g. 8. We can declare a bit mask that
determines if the tag size is 4 or 8 and use the size 4 tags in the blocks that
have fewer bits left.
The following example achieves this:
block null_cap {
padding 24
cap_type 8
}
block object_cap {
field ref 28
cap_type 4
}
tagged_union cap cap_type {
mask 4 0x0e
mask 8 0x0e
tag null_cap 15
tag object_cap 1
}
The block null_cap
has an 8-bit tag field and the block object_cap
has a
4-bit tag field of the same name. Note that both still must start at the same
bit positions.
The declarations mask 4 0x0e
and mask 8 0x0e
are read from the smallest to
the largest mask size. mask 4 0x0e
declares that the default tag size is 4
bits wide, but if the bits 0b1110
(0x0e) are set, i.e. if tag_value & 0x0e ==
0x0e
, then the next larger mask will be used. This means values 0 to 13 are
using 4 bit tags, 14 and 15 are using 8 bit tags, and all 8 bit tag values must
have at least the bits 1, 2, and 3 set.
Since there is no larger mask size after 8 in the example, the mask value there
is not important. However, the generator expects the mask of each larger level
to have at least all bits set that the masks of all smaller sizes have set, so
we chose the same value 0x0e
in the example. If we had another level, e.g. 16
bits wide afterwards, we might have chosen 0b11110
(0x1e) for the mask value
at 8 bit.
The sub mask inclusion means that the generator can check mask sizes
consecutively. I.e. if tag_value & 0xe != 0xe
, we always know it is a 4 bit
tag, no matter how many other tag sizes are declared.
The tag mask value declared for each alternative in the tagged union must fit
the mask for the size of the tag field in the block. Remember that with mask 4
0x0e
the 0x0e
bit pattern means that size 4 does not apply, and instead
the next larger size applies. So in the example, because object_cap
has a
4-bit tag field, object_cap
must have a tag value where the bits 0x0e
are
not all set (e.g. 1 is a legal value). Conversely, null_cap
has an 8-bit tag,
and so at least the bits 0x0e
must all be set in the tag value (e.g. 14 or 15
works, as does 255, but not 1).
Masks and tag slices are mutually exclusive, i.e. only one of these can be used for any particular tagged union.
Raw Value Access
The C representation of blocks, whether in a tagged union or not, is a struct
containing an array of the architecture word type (uint32
or uint64
) of the
length computed from the number and width of fields in the block. Given that the
memory layout is fully specified it is possible to operate on these raw values
directly without breaking the semantics of the generated functions.
This is usually neither recommended nor necessary, and for code undergoing formal verification strongly discouraged, because it introduces additional verification overhead.
However, one special case is occasionally convenient: when initialising an area of memory with 0, and reinterpreting that memory as a bitfield structure, we are guaranteed to get a value where all fields are 0. Similarly, returning an all-0 struct from a function instead of calling the constructor with only 0 parameters may be acceptable, although one should check the generated assembly if the compiler was not able to optimise the constructor away – if it does then the constructor form is clearer and easier to use in proofs.
The tool will not access padding fields, but is not guaranteed to preserve them when copying. Writing/reading padding fields is not guaranteed to be stable.
Architecture Parameters and Canonical Pointers
The tool supports architectures with 32 and 64 bit word sizes. The command
base 32
selects a 32 bit native word size. The declaration covers all subsequent blocks and unions in the specification. It is possible to give multiple declarations, each covering the subsequent blocks and unions, but it rarely makes sense to pick any other value than the word size of the architecture.
Many 64 bit architectures have the concept of canonical pointers where not all 64 bits of the available space are usable for pointers, but only for example 48 bits. In this setting a pointer value would be canonical if and only if all top bits are set when bit 47 (counting from 0) is set. This is equivalent to sign extension, i.e. treating bit 47 as the sign bit and extending that sign bit up to bit 63.
A potential use for this is for kernel pointers to always have bit 47 set (and all bits above) and for user pointers to always have bit 47 unset (and all bits above).
The bitfield generator allows specifying this canonical bit and whether the top bits for pointers should be sign extended or not like this:
base 64(48,1)
This means: word size is 64, pointers are 48 bits [0..47]
, and bit 48 and
above should be set to 1 if bit 47 is set.
This only affects field_high
fields which are assumed to be pointers.
base 64(48,0)
means word size is 64, pointers are at most 48 bits, and no sign extension. This
is currently equivalent to just base 64
.
It may make sense to use the base
command multiple times for the sign
extension case. For instance, if the intention is to usually store pointers on
x64, one would declare base 64(48,1)
, but if there is a specific block that
wants to not store a pointer, but some other value as field_high
, one could
declare base 64
just before and base 64(48,1)
again just after that block.
Syntax Reference
Lexical Structure
The generator distinguishes identifiers, integer literals, comments, white space, keywords, and operators. Keywords and operators are shown inline in the grammar in the next section. The rest are defined below.
A multi-character ident must start with a letter or underscore, followed by
letters, underscores or numbers. A one-character ident must consist of one
letter. Letters are Latin letters [a-z]
and [A-Z]
, digits are [0-9]
.
IDENT ::= [A-Za-z_][a-zA-Z0-9_]+ | [A-Za-z]
In addition to the regular expression above, identifiers that consist only of
underscores are invalid. Valid examples are a
, a_20
, _a_20_b
. Invalid
examples are 0
, _
, 0a
or 0_a
.
Integer literals can be decimal, octal, hex, or binary, followed by an optional
l
or L
specifier.
INTLIT ::= ( [1-9][0-9]*
| 0[oO]?[0-7]+
| 0[xX][0-9a-fA-F]+
| 0[bB][01]+
| 0 ) [lL]?'
Valid examples are 15
, 017
, 0o17
, 0x0F
, 0xF
, 0XF
, 0b1111
, 0
,
0l
, and 15L
. Invalid examples are 10A0
, 0xFG0
, 0b20
, 090
.
Comments start with --
or with #
and go until the end of the line. There
are no block comments.
-- this is a valid comment
### as is this
Whitespace are spaces and tab characters. They delimit other tokens. Consecutive whitespace is ignored.
Syntactic Structure
spec :== ( base | block | tagged_union )*
base ::= "base" ("32"|"64") canonical_spec?
canonical_spec ::= "(" INTLIT "," ("0"|"1") ")"
block ::= "block" IDENT visible_order_spec_opt? "{" fields "}"
visible_order_spec_opt ::= "(" visible_order_spec? ")"
visible_order_spec ::= IDENT ("," IDENT)*
fields ::= ( "field" IDENT INTLIT | "field_high" IDENT INTLIT | "padding" INTLIT )*
tagged_union ::= "tagged_union" IDENT IDENT tag_slices? "{" masks tags "}"
tag_slices ::= "(" IDENT ("," IDENT)* ")"
masks ::= ( "mask" INTLIT INTLIT )*
tags ::= ( "tag" IDENT tag_value )*
tag_value ::= INTLIT | "(" INTLIT ( "," INTLIT )* ")"
Forward references to names that are declared later in the file are allowed.
Restrictions
The following restrictions are not represented by the grammar in the previous section, but will be checked by the generator.
- the canonical bit must be smaller than the
base
size - field sizes must not be larger than the
base
size - the field sizes in a block must add up to multiples of the
base
size - all identifiers mentioned as tags in a union must be blocks
- all blocks in a tagged union must have the same size
- all blocks in a tagged union must mention the tag field(s)
- all blocks in a tagged union must have the tag field(s) at the same bit position
- all blocks in a tagged union must have tag fields of one of the sizes declared in the union, or if no mask size is declared, the tag fields must all be of the same size.
- each block can be used in at most one tagged union
- visible_order_spec must enumerate all fields if used
- tag values must fit into the size of the tag fields
- tag values must not be larger than an
int
of the underlying platform, because they will be used in anenum
in C.
Correctness
Since the tool generates both the implementation and the specification, the question arises what kind of correctness property we gain. Even if the generated proofs are checked for correctness by Isabelle/HOL, they could still be expressing properties that are not useful or specify behaviour that we do not want the code to have. What we do know after Isabelle/HOL has checked the proofs is that the code implements the generated specification and that the generated C code is free from undefined behaviour.
To gain confidence that the generated specification is meaningful and itself has the right properties it needs human inspection or it needs to be used in further proofs. In the seL4 verification both are the case. In particular, the generated specification together with the generated implementation proofs are the interface that the C verification uses when it encounters bitfield functions. If the generated specifications are trivial, not useful, wrong, or do not express the behaviour of the generated C (e.g. if the generated correctness proof was trivial), these proofs could not succeed, because the specification would then not provide the needed semantics of the code (e.g. that loading a value after storing it does not lose information). If the specification provides strong enough properties to make the rest of the C verification succeed, we know that the specification is strong enough for our needs, even though it might still be missing additional facts or properties that the generated C code might have. This is sufficient for strong functional correctness.
Limitations
The tool does not currently support proofs for all combinations of pointer access functions for variants, but it does support all those that are used in seL4. If proof support for new combinations is needed, the tool could be extended relatively easily.
Command Line
Usage
The tool can be invoked from the command line as follows
bitfield_gen.py [options] <input.bf> <outfile>
where <input.bf>
is a bitfield specification file as described in this manual.
If no file is given, the tool reads from stdin
, and if no output file is given,
the tool prints to stdout
.
By default, the output is generated C code. See options below for how to additionally produce Isabelle/HOL theory files for specifications and proofs.
Options
-h, --help
:
Show help message and exit.
--environment={seL4,libseL4}
:
Whether to produce code for the seL4
or libseL4
environment. This
influences which header files are included, what statement to use for
assert
, what type names uint32
etc have, and how a function is declared
inline. The values are determined by the INCLUDES
, ASSERTS
, INLINE
,
and TYPES
dictionaries in the generator source.
--hol_defs
:
Produce Isabelle/HOL theory files with specifications and definitions.
--hol_proofs
:
Produce Isabelle/HOL theory files with proofs, assuming specification
and definition files exist. Requires the --umm_types
option.
--sorry_lemmas
:
When producing proofs, only state lemmas, but do not prove them. This
requires explicitly setting quick_and_dirty
in Isabelle and
is useful for reducing processing time during other proof development.
--prune=<file>
:
Append <file>
to the list of files with pruning information, which contain
the names of bitfield functions that are used in the client code base. The
option can be used multiple times. If this option is used, the generator
will only produce the functions that are listed in one of these files. This
is useful to reduce code size, processing time, and especially proof
processing time drastically. It does make it harder to discover which
functions are generated by the tool, so if that is important, consider
leaving this option off until proofs are needed.
--toplevel=<type>
:
Append Isabelle/HOL <type>
name to the list of top-level heap types. The
proofs will generate appropriate frame conditions for all of these top-level
heap types. The option can be used multiple times.
--umm_types=<file>
:
Path to the file umm_types.txt
which contains dependency information
between the top-level heap types (e.g. which structs contain which other
structs or type fields). The file can be generated using
spec/cspec/mk_umm_types.py
in the l4v repository. (UMM = unified memory
model).
--cspec-dir=<dir>
:
Location of the cspec
directory containing the theory file
KernelState_C
. This is specific to the seL4 proof setup.
--thy-output-path=<path>
:
Where to put the generated theory files.
--skip_modifies
:
Do not emit “modifies” proofs. These proofs state and show which global
state is modified by each generated function. They are weaker versions
of the full specifications and may not be needed if the full specifications
are always used instead.
--showclasses
:
Print the parsed classes and fields.
--debug
:
Print debug information during code and proof generation.
--from_file=<FROM_FILE>
:
Original source file before preprocessing. Use this to produce a
Generated from FROM_FILE
comment in the output.
seL4 build
In the seL4 build system, .bf
files are usually first passed through the C
preprocessor to process #include
and #ifdef
statements, as well as perform
macro expansion, which are currently not part of the bitfield specification
language.
The source files containing C preprocessor macros and includes are called .bf
.
There are the following main files in seL4:
include/structures_32.bf
andinclude/structures_64.bf
contain generic kernel objects and capabilities for 32 and 64 bit systemsinclude/arch/**/object/structures.bf
contain architecture dependent object and capability definitions.- some
plat/**
directories containhardware.bf
files for hardware structures. libseL4
contains some bitfield definitions that are shared with the kernel inlibsel4/sel4_arch_include/*/sel4/sel4_arch/types.bf
,libsel4/arch_include/*/sel4/arch/shared_types.bf
, andlibsel4/mode_include/*32*/sel4/shared_types.bf
.
The generated C code contains a comment which .bf
include hierarchy the
definitions are coming from.
The generated code can be found in a build directory under the generated
directory, where e.g. structures_gen.h
will have been generated from
structures.bf.pbf
, which in turn is the preprocessed source of
structures.bf
.
Note that the seL4 build uses the --prune
option, which means that the
generator will only produce functions that are mentioned in the rest of the seL4
sources. To generate a function that is not yet mentioned, simply use a
supported function name pattern in the source (e.g. <block>_get_<field>
) and
the generator will supply it. To discover which functions can be generated,
either use the name template descriptions in this manual or consider temporarily
removing the --prune
option in the cmake
build to look at the generated
sources. This should only be done when no proofs are required, because it will
drastically increase proof processing time.