capDL Language Specification
This document defines capDL revision 1.0, a language for capability distributions.
Background and Purpose
The purpose of capDL is describing snapshots of a system running on the seL4 microkernel. In particular, it can be used to describe which entities have access to which seL4 capabilities.
The main objectives are to use these descriptions for specifying systems, for security analysis of systems, as input for automated bootstrapping of systems, and for debugging applications running on seL4. A capDL specification can be complete, i.e. give a full picture of all capabilities in the system, or the system can be underspecified, showing only the capabilities accessible to a particular component. It is possible to describe system states in capDL that are not reachable via actual system executions. It is also possible to leave out details in a system specification that for instance are not important for a security analysis (e.g. because the relevant authority is already represented otherwise), but that would be necessary for an actual implementation of the system.
Syntax
This section gives a complete definition of the capDL syntax in BNF.
Lexical Tokens
The lexical tokens of capDL are numbers (number
), identifiers
(identifier
), and verbatim symbols and operators typeset between
quotes ”
in the syntax section below.
Numbers are in decimal, hexadecimal (with prefix 0x), or octal (with
prefix 0) form. Identifiers start with a letter [a-zA-Z]
, followed by
a sequence of letters [a-zA-Z]
, numbers [0-9]
, the underscore
character _
, or a @
character.
The language is case sensitive.
Comments can occur between any two tokens and can be nested. A
potentially multiline comment starts with the two characters /*
and
ends with */
. A one-line comment starts with –
. All characters after
–
up to the end of the same line are ignored.
Whitespace (a sequence of space, newline and tab characters) can occur between any two tokens and is ignored.
Syntax
This section describes the concrete syntax of capDL using the tokens
defined above. The main syntactic entity of a file is module
defined
in section Modules.
Names and Ranges
name ::= identifier
num ::= '[' number ']'
range ::= number '..' number | '..' number | number '..' | number
ranges ::= '[' ']' | '[' range (',' range)* ']'
name_ref ::= name ranges?
qname ::= name_ref ('/' name_ref)*
Object Declarations
obj_decls ::= 'objects' '{' obj_decl* '}'
obj_decl ::= qname '=' object
object ::= object_type object_params? opt_obj_decls?
object_type ::= 'ep' | 'notification' | 'tcb' | 'cnode' | 'ut' | 'irq' |
'asid_pool' | 'pt' | 'pd' | 'frame' | 'io_ports' |
'io_device' | 'io_pt' | 'vcpu'
object_params ::= '(' (object_param (',' object_param)*)? ')'
object_param ::= obj_bit_size | obj_vm_size | io_pt_level | obj_ports_size |
init_arguments | tcb_dom | obj_paddr | domain_id |
pci_device
obj_bit_size ::= number 'bits'
obj_vm_size ::= number ('k' | 'M')
io_pt_level ::= 'level' ':' number
obj_ports_size ::= number 'k' 'ports'
init_arguments ::= 'init' ':' '[' (number (',' number)*)? ']'
tcb_dom ::= 'dom' ':' number
obj_paddr ::= 'paddr' ':' number
domain_id ::= 'domainID' ':' number
pci_device ::= number ':' number '.' number
opt_obj_decls ::= '{' ((obj_decl | name_ref) ','?)* '}'
Capability Declarations
cap_decls ::= 'caps' '{' (cap_name_decl | cap_decl)* '}'
cap_name_decl ::= name '=' '(' name_ref ',' slot ')'
slot ::= number | symbolic_slot
symbolic_slot ::= 'cspace' | 'vspace' | 'reply_slot' | 'caller_slot' |
'ipc_buffer_slot'
cap_decl ::= name_ref '{' (cap_mapping_or_ref ';'?)* '}'
cap_mapping_or_ref ::= (slot ':')? cap_name? (cap_mapping | cap_name_ref)
cap_name ::= name_ref '='
cap_mapping ::= name_ref cap_params? parent?
cap_params ::= '(' cap_param (',' cap_param)* ')'
cap_param ::= right+
| 'masked' ':' right+
| 'guard' ':' number
| 'guard_size' ':' number
| 'badge' ':' number
| 'ports' ':' ranges
| 'reply'
| 'master_reply'
| 'asid' ':' asid
| 'cached'
| 'uncached'
right ::= 'R' | 'W' | 'G' | 'X'
asid ::= '(' number ',' number ')'
parent ::= '- child_of' slot_ref
slot_ref ::= '(' name_ref ',' slot ')' | name_ref
cap_name_ref ::= '<' name_ref '>' cap_params? parent?
IRQ Declarations
irq_decls ::='irq_maps' '{' (irq_mapping ';'?)* '}'
irq_mapping ::= (number ':')? name_ref
CDT Declarations
cdt_decls ::='cdt' '{' cdt_decl* '}'
cdt_decl ::= slot_ref '{' ((slot_ref | cdt_decl) ';'?)* '}'
Modules
arch ::= 'arch' ('ia32' | 'arm11')
module ::= arch (obj_decls | cap_decls | irq_decls | cdt_decls)+
Semantics
In this section we show the semi-formal semantics of capDL in the form of its underlying data model in Haskell and give a brief rationale for its contents. This data model can be made fully formal by mapping it directly into the Isabelle theorem prover via Data61’s existing Haskell-to-Isabelle translation tool. The data model can also be mapped into various output formats.
After describing the data model below, we outline a semi-formal definition of the semantics: mapping the capDL syntax into the capDL data model.
Data Model
Rationale
The purpose of capDL is describing a snapshot of the capability distribution in a system running on the seL4 microkernel. For this purpose, we need to know which objects exist in the system and which capabilities they have access to. The second purpose of capDL is to serve as sufficiently detailed input to automated code generation tools. Therefore we need to include all information about capability arguments that such implementations will need. Some of this information will not be relevant for security analysis. For instance, for a security analysis it may be necessary to know which frames of physical memory an entity in the system can access via virtual memory, but it is not necessary to know under which virtual address each of these physical addresses is visible to the process. For a concrete implementation of a bootstrapping component on the other hand, this latter information is crucial.
The Model
We refer to objects by name. This name could be of any type; for
convenience we either use a plain string or a string with an index. We
use the Maybe
type of Haskell to express this.
types ObjID = (String, Maybe Word)
With this the capability state of a system is fully described by a map
from ObjID
to Object
. Since not all objects and capabilities are
supported by seL4 on all machine architectures, we also store which
architecture the system is intended for.
data Model = Model {
arch :: Arch,
objects :: Map ObjID Object
}
data Arch = IA32 | ARM11
Objects are described by the following data type. We are mainly interested in what capabilities an object contains. We also store information that is relevant for creating an object such as its type and its size when the size is configurable. This means in contrast to the security model, where we only talk about abstract entities, we give more detailed information in capDL.
types CapMap = Map Word Cap
data Object = Endpoint
| Notification
| TCB { slots :: CapMap, initArguments :: [Word] }
| CNode { slots :: CapMap, sizeBits :: Word }
| Untyped { maybeSizeBits :: Maybe Word }
| ASIDPool { slots :: CapMap }
| PT { slots :: CapMap }
| PD { slots :: CapMap }
| Frame { vmSizeBits :: Word }
| IOPorts { size :: Word }
| IOPT { slots :: CapMap, level :: Word }
| IODevice { slots :: CapMap }
The first two objects, communication endpoints and notifications, do not
store any capabilities and have a fixed size. Thread Control Blocks
(TCB) contain a small number of capability registers, modelled by the
field slots
, a map from a machine word (the slot/register number) to
the capability that is stored in that slot. Slots may be empty. CNode
objects are the main capability storage object in seL4. They have
configurable size. Untyped objects are conceptual containers for
dynamically created objects. They cover a set of objects referred to by
their ObjID
. This set is called the covering set. In the
implementation untyped objects must have a size, but in capDL
specifications we often want to leave the size unspecified (as large as
necessary).
The next batch of objects in the data type above are architecture
dependent. Most of these data structures and devices do not store actual
capabilities in the kernel and hardware implementation. Instead they
store pointers or mappings to other resources in the system. Since these
resources are already represented as objects, we use capabilities to
model these mappings and to express the access, for instance, a device
may have to physical memory. In detail, the architecture dependent
objects are as follows. ASID pools store which page directories (PD) can
be activated by the machine. As mentioned, we represent this mapping as
a capability in the model. Analogously, we model the mappings in virtual
memory objects page directory (PD) and page table (PT). Physical memory
frames (Frame
) come in different sizes, depending on architecture, but
do not contain further capabilities. IOMMU page tables (IOPT
) come in
multiple levels and again store mappings to frames or further page
tables. Finally, hardware devices are represented by the object type
IODevice
. Again, we model the access hardware may have to resources in
the system as capabilities. Specifically, IODevices may have a
capability to a CNode for IRQ delivery, potentially multiple
capabilities to physical memory frames for memory mapped device
registers, and one capability for the root IOMMU space.
To conclude the description of the capDL data model, it remains to
define the data type of capabilities Cap
. Almost all capabilities in
seL4 refer to one specific object. Some may refer to a set of objects or
implicitly to all objects of a given type. Many of the capabilities
store explicit access rights. These are modelled as follows:
data Rights = Read | Write | Grant | GrantReply
types CapRights = Set Right
Again in contrast to the security model of seL4, we explicitly distinguish different types of capabilities in capDL and store slightly different kinds of additional information with each. As a side effect, we do not need an explicit representation of the create right. The create right is conferred by the possession of an untyped capability.
data Cap = NullCap
| UntypedCap { capObj :: ObjID }
| EndpointCap { capObj :: ObjID, capBadge :: Word,
capRights :: CapRights }
| NotificationCap { capObj :: ObjID, capBadge :: Word,
capRights :: CapRights }
| ReplyCap { capObj :: ObjID }
| MasterReplyCap { capObj :: ObjID }
| CNodeCap { capObj :: ObjID, capGuard :: Word,
capGuardSize :: Word }
| TCBCap { capObj :: ObjID }
| IRQControlCap
| IRQHandlerCap { capObj :: ObjID }
| FrameCap { capObj :: ObjID, capRights :: CapRights }
| PTCap { capObj :: ObjID }
| PDCap { capObj :: ObjID }
| ASIDControlCap
| ASIDPoolCap { capObj :: ObjID }
| IOPortsCap { capObjs :: Set Word }
| IOSpaceMasterCap
| IOSpaceCap { capObj :: ObjID }
| IOPTCap { capObj :: ObjID }
In detail, the capabilities are as follows. We go through the list of
capabilities and give a brief indication of the authority they convey.
The NullCap
is occasionally used to represent the absence of a
capability. An untyped capability points to an untyped object and
confers the right to issue create/retype and revoke/delete operations.
Endpoint and notification capabilities point to their respective object
types. They explicitly store which Read/Write/Grant access is conferred
to the communications channel and they may carry one machine word of
additional information, called the badge. CNode capabilities in addition
to their CNode object reference carry information that influences the
lookup of capabilities in the node they point to. This information
consists of a guard in the sense of guarded page tables, the size of
that guard, and finally the rights to the CNode object itself. A TCB cap
gives authority over one TCB object with given access rights. The
IRQControl capability gives authority to create specific IRQHandler
capabilities. Each device may have a special CNode associated with it
that may contain a notification which in turn is used to deliver
interrupts to user level. IRQHandler caps in capDL point to these
specific CNodes. It confers the authority to change which endpoint the
interrupt is delivered to. Frame capabilities confer authority to map
and unmap physical memory frames into and from a page directory or page
table with the specified access rights. PT caps give authority to
map/unmap page tables into/from page directories and PD caps give the
authority to map/unmap page directories into ASID pools and to install
them as virtual memory space of a process. The ASIDControlCap together
with an Untyped cap confers authority to create new ASID pools. Only a
limited, fixed number can be created in the system. ASIDPool caps give
the authority to map/unmap page directories into/from the ASID pool.
Mapping a frame into a page table for instance, needs the frame cap to
specify which frame to map with which rights and the PT cap to specify
in which page table. The IOPorts capability gives access to a set of
IOPorts on the ia32 architecture. The IOSpaceMasterCap confers the
authority to create IOSpace caps for specific devices. IOSpaceCaps point
to IODevices and confer the authority to set the root IOPageTable for
that device. IOPTCaps are the IOMMU analogue to PT and PD caps in normal
virtual memory.
Semantics
In this section, we define a mapping from the syntax of capDL to the data model of capDL. The mapping is defined by going over the relevant productions in the capDL grammar and describing the corresponding model.
Names and Ranges
name ::= identifier
Names are mapped to ObjID
s in the model. A name without index maps to
(name, Nothing)
. Qualified names will be defined in the object
declaration section.
num ::= '[' number ']'
This production maps to a single index and is used to either declare a
set of objects, in which case the number is the number of objects, or to
refer to a specific index. The term name[x] maps to the ObjID
(name, Just x)
.
range ::= number '..' number | '..' number | number '..' | number
ranges ::= '[' ']' | '[' range (',' range)* ']'
name_ref ::= name ranges?
Ranges stand for sets of ObjID
s. For a set declared as name[n]
, the
following mappings apply:
-
name[..b]
refers toname[0]
,name[1]
, ..,name[b]
-
name[a..]
refers toname[a]
,name[a+1]
, ..,name[n]
-
name[a..b]
refers toname[a]
,name[a+1]
, ..,name[b]
-
name[]
refers toname[0]
,name[1]
, ..,name[n]
A list of ranges name[r1,r2,..r3]
refers to the union of the ranges
name[r1]
, name[r2]
, .., name[r3]
.
Module
arch ::= 'arch' ('ia32' | 'arm11')
module ::= arch (obj_decls | cap_decls)+
A module maps to a full Model
in the data model. Its Arch
component
is determined by arch
, its mapping from ObjID
to Object
by the
object and capability declarations described below.
Object Declarations
obj_decls ::= 'objects' obj_decl*
An object declaration section defines a set of objects. Each object name of typed objects should be declared only once. The covering set of untyped objects may be declared multiple times. The total covering set is the union of all covering sets mentioned for that object.
obj_decl ::= qname '=' object
An object declaration has a potentially qualified name and dimension on
the left hand side and an object content declaration on the right hand
side. A qualified name name1/name2/name3[n]
implicitly declares the
untyped objects name1
and name2
. It also declares that name2
is
covered by name1
and name3
covered by name2
. Giving a dimension
[n]
means that n objects (name3, Just 0)
to (name3, Just (n-1))
are declared, each with the same content described on the right hand
side of the equation.
The right hand side is defined by the following syntax productions:
object ::= object_type object_params? obj_decls?
object_type ::= 'ep' | 'notification' | 'tcb' | 'cnode' | 'ut' |
'asid_pool' | 'pt' | 'pd' | 'frame' | 'io_ports' |
'io_device' | 'io_pt'
object_params ::= '(' object_param (',' object_param)* ')'
object_param ::= obj_bit_size | obj_vm_size | io_pt_level | obj_ports_size
obj_bit_size ::= number 'bits'
obj_vm_size ::= number ('k' | 'M')
io_pt_level ::= 'level' ':' ('1' | '2' | '3')
obj_ports_size ::= number 'k' 'ports'
obj_decls ::= '{' ((obj_decl | name_ref) ','?)* '}'
The object content is given by the type of the object mapping to its corresponding object type in the data model. Some of the object types in the model require a size parameter, given in bits (where n bits means a size of 2^n entries). Frames’ sizes are specified directly as are the levels of IOMMU page tables. Untyped objects can be followed by a nested object declaration. Writing
name1 = ut {
name2/name3 [n] = object
name2/name4 = object
...
}
is equivalent to writing
name1/name2/name3 [n] = object
name1/name2/name4 = object
...
Capability Declarations
The capability content of objects is declared in its own section, separately from the list of objects itself.
cap_decls = 'caps' (cap_name_decl | cap_decl)*
cap_name_decl ::= name '=' '(' name_ref ',' slot ')'
slot ::= number | symbolic_slot
symbolic_slot ::= 'cspace' | 'vspace'
Next to capability content declarations, there are name declarations for
capability slots. A capability slot is defined by the container object,
referred to by a name reference (mapping to a single ObjID
) and a slot
inside the container specified by its slot number. The symbolic slot
names cspace
, vspace
stand for slot numbers 0 and 1.
A capability declaration starts with a set of container objects, each of which is declared to contain the capability mappings specified on the right hand side. Container objects may occur more than once in a specification. Their mappings are the union of all mappings for that container object.
cap_decl ::= name_ref '{' (cap_mapping_or_ref ';'?)* '}'
A cap mapping is specified by its slot and its capability and potentially an inline name declaration for the specified slot in the specified container.
cap_mapping_or_ref ::= slot ':' cap_name? (cap_mapping | cap_name_ref)
cap_name ::= name '='
cap_mapping ::= name_ref cap_params?
cap_params ::= '(' cap_param (',' cap_param)* ')'
cap_param ::= right+
| 'masked' ':' right+
| 'guard' ':' number
| 'guard_size' ':' number
| 'irq'
| 'badge' ':' number
| 'reply'
| 'master_reply'
right ::= 'R' | 'W' | 'G'
cap_name_ref ::= '<' name_ref '>' cap_params?
The capability declaration is either a cap_mapping
or cap_name_ref
.
In the latter case the mapping is a copy of the capability in the slot
referred to by name
(possibly masked
with an access rights mask). In
the former case, a cap is defined by the object it points to, together
with optional parameters depending on the type of the object in the data
model. For instance, an endpoint cap is specified if the ObjID
given
in the cap declaration refers to an endpoint object. If no parameters
are mentioned, default parameters are substituted if possible. These
defaults are: empty set of rights for access rights, full set of rights
for rights masks, guard 0, guard size 0, and badge 0. The IRQControl cap
is specified using the reserved ObjID
irq_control
, similarly
ASIDControlCap is specified by asid_control
and IOSpaceMasterCap by
io_space_master
.