CDuce: documentation: Quick reference
Identifiers
- Type and pattern identifiers: words formed of Unicode letters and the underscore "_" character, starting with an uppercase letter.
- Value identifiers: words formed of Unicode letters and the underscore "_" character, starting with a lowercase letter or an underscore.
Scalars
- Large integers:
- Values: 0,1,2,3,...
- Types: intervals -*--10, 20--30, 50--*, ..., singletons 0,1,2,3,...
- Operators: +,-,/,*,div,mod, int_of
- Floats:
- Values: none built-in.
- Types: only Float.
- Operators: float_of : String -> Float
- Unicode characters:
- Values: quoted characters ('a', 'b', 'c', ...,'あ', 'い', ... , '私', ... , '⊆', ...), codepoint-defined characters ('\xh;' '\d;' where h and d are hexadecimal and decimal integers respectively), and backslash-escaped characters ('\t' tab, '\n' newline, '\r' return, '\\' backslash).
- Types: intervals 'a'--'z', '0'--'9', singletons 'a','b','c',...
- Operators: char_of_int : Int -> Char, int_of_char : Char -> Int
- Symbolic atoms:
- Values: `A, `B, `a, `b, `true, `false, ...
- Types: singletons `A, `B, ...
- Operators: make_atom : (String,String) -> Atom, split_atom : Atom -> (String,String)
- CDuce also supports XML Namespaces
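As a small sketch of the scalar types and operators listed above (the names Percent, Digit, Answer and the bound values are illustrative choices, not part of CDuce):

```cduce
(* Interval types over integers and over characters. *)
type Percent = 0--100
type Digit = '0'--'9'

(* Conversions between a character and its code point. *)
let a_code = int_of_char 'a'
let a_char = char_of_int 97

(* Parsing an integer from a string. *)
let n = int_of "42"

(* A type built as a union of singleton atom types. *)
type Answer = `yes | `no
let ans : Answer = `yes
```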
Operators, built-in functions
- Infix:
@ : concatenation of sequences
+,*,-,div,mod : Integer,Integer -> Integer
=, <<, <=, >>, >= : t,t -> Bool = `true | `false (any non functional type t)
||, && : Bool,Bool -> Bool
not: Bool -> Bool
- Prefix:
load_xml : Latin1 -> AnyXml,
load_html : Latin1 -> [ Any* ],
load_file : Latin1 -> Latin1,
load_file_utf8 : Latin1 -> String,
dump_to_file : Latin1 -> String -> [],
dump_to_file_utf8 : Latin1 -> String -> [],
print_xml : Any -> Latin1,
print_xml_utf8 : Any -> String,
print : Latin1 -> [],
print_utf8 : String -> [],
dump_xml : Any -> [],
dump_xml_utf8 : Any -> [],
int_of : String -> Int,
float_of : String -> Float,
string_of : Any -> Latin1,
char_of_int : Int -> Char,
make_atom : (String,String) -> Atom,
split_atom : Atom -> (String,String),
system : Latin1 -> { stdout = Latin1; stderr = Latin1; status = (`exited,Int) | (`stopped,Int) | (`signaled,Int) },
exit : 0--255 -> Empty,
getenv : Latin1 -> Latin1,
argv : [] -> [ String* ],
raise : Any -> Empty
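A minimal sketch combining a few of the operators and built-in functions above; the value names are hypothetical and only functions from the list are used:

```cduce
(* Comparison operators return a Bool, that is, one of the two boolean atoms. *)
let small = 3 << 5
let both = (1 << 2) && (2 <= 2)

(* string_of turns any value into a Latin1 string; @ concatenates sequences. *)
let msg = "total: " @ string_of (20 + 22) @ "\n"

(* print writes a Latin1 string and returns the empty sequence. *)
let _ = print msg
```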
Pairs
- Expressions: (e1,e2)
- Types and patterns: (t1,t2)
- Note: tuples are right-associative pairs; e.g.: (1,2,3)=(1,(2,3))
- When a capture variable appears on both sides of a pair pattern, the two captured values are paired together (e.g. match (1,2,3) with (x,(_,x)) -> x ==> (1,3)).
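The two notes above can be tried directly; this sketch only restates the examples from the text, with hypothetical value names:

```cduce
(* A triple is a right-nested pair. *)
let triple = (1,2,3)
let same = (triple = (1,(2,3)))

(* x occurs on both sides of the pair pattern, so both captures are paired. *)
let ends = match triple with (x,(_,x)) -> x
```

Here same evaluates to the true atom and ends to (1,3).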
Sequences
- Expressions: [ 1 2 3 ], which is syntactic sugar for (1,(2,(3,`nil)))
- A sub-sequence can be escaped by !: [ 1 2 ![ 3 4 ] 5 ] is then equal to [ 1 2 3 4 5 ] .
- Types and patterns: [ R ] where R is a regular expression built on types and patterns:
- A type or a pattern is a regexp by itself, matching a single element of the sequence
- Postfix repetition operators: *,+,? and the ungreedy variants (for patterns) *?, +? ,??
- Concatenation of regexps
- For patterns, sequence capture variable x::R
- It is possible to specify a tail, for expressions, types, and patterns; e.g.: [ x::Int*; q ]
- Map: map e with p1 -> e1 | ... | pn -> en. Each element of e must be matched.
- Transform: transform e with p1 -> e1 | ... | pn -> en. Unmatched elements are discarded; each branch returns a sequence and all the resulting sequences are concatenated together.
- Selection: select e from p1 in e1 ... pn in en where e'. SQL-like selection with the possibility of using CDuce patterns instead of variables. e1 ... en must be sequences and e' a boolean expression.
- Operators: concatenation e1 @ e2 = [ !e1 !e2 ], flattening flatten e = transform e with x -> x.
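The sequence constructs above, sketched on small literal sequences (all value names are illustrative):

```cduce
(* Escaping a sub-sequence with ! splices it in place. *)
let s = [ 1 2 ![ 3 4 ] 5 ]

(* map: every element must be matched by some branch. *)
let doubled = map s with x -> x * 2

(* transform: each branch returns a sequence; unmatched elements are dropped. *)
let twice = transform [ 1 `a 2 `b ] with (x & Int) -> [ x x ]

(* Sequence pattern with a capture variable and a tail. *)
let parts = match [ 1 2 `a `b ] with [ x::Int*; q ] -> (x, q)

(* SQL-like selection with a single from clause. *)
let large = select x from x in s where x >> 2

(* Concatenation and flattening. *)
let joined = [ 1 2 ] @ [ 3 ]
let flat = flatten [ [ 1 2 ] [ 3 ] ]
```

With these definitions, doubled is [ 2 4 6 8 10 ], twice is [ 1 1 2 2 ], large is [ 3 4 5 ], and joined and flat are both [ 1 2 3 ].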
Records
- Record literals: { l1 = e1; ...; ln = en }
- Types: { l1 = t1; ...; ln = tn } (closed, no more fields allowed), { l1 = t1; ...; ln = tn; .. } (open, any other field allowed). Optional fields: li =? ti instead of li = ti. Semi-colons are optional.
- Record concatenation: e1 + e2 (priority to the fields from the right argument)
- Field removal: e1 \ l (does nothing if the field l is not present)
- Field access: e1.l
- Labels are in fact Qualified Names (see XML Namespaces)
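A short sketch of the record operations above, using a hypothetical Person type with an optional field:

```cduce
(* Closed record type; the age field may be absent. *)
type Person = { name = String; age =? Int }

let ada : Person = { name = "Ada"; age = 36 }

(* Concatenation: the field from the right argument takes priority. *)
let older = ada + { age = 37 }

(* Field removal and field access. *)
let anon = older \ age
let who = ada.name
let years = older.age
```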
Strings
- Strings are actually sequences of characters.
- Expressions: "abc", [ 'abc' ], [ 'a' 'b' 'c' ].
- Operators: string_of, print, dump_to_file
- PCDATA means Char* inside regular expressions
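Because strings are sequences of characters, the sequence operators and patterns apply to them unchanged. A minimal sketch (strip_hello and the other names are hypothetical):

```cduce
(* The three notations denote the same value. *)
let same = ("abc" = [ 'a' 'b' 'c' ])

(* Sequence concatenation works on strings. *)
let greeting = "Hello, " @ [ 'world' ]

(* A sequence pattern over characters: drop a fixed prefix if present. *)
let strip_hello = fun strip_hello (String -> String)
    [ 'Hello, ' r::Char* ] -> r
  | s -> s

let rest = strip_hello greeting
```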
XML elements
- Expressions: <(tag) (attr)>content
- If the tag is an atom `X, it can be written X (without the (..)). Similarly, parentheses and curly braces may be omitted when attr is a record l1=e1;...;ln=en (semicolons can also be omitted in this case). E.g.: <a href="abc">[ 'abc' ].
- Types and patterns: same notations.
- XPath-like projection: e/t. For every XML tree in e, it returns the sequence of children of type t.
- Tree transformation: xtransform e with p1 -> e1 | ... | pn -> en. Applies to sequences of XML trees. Unmatched elements are left unchanged and the transformation is recursively applied to the sequence of children of the unmatched element; as for transform, each branch returns a sequence and all the resulting sequences are concatenated together.
- Operators: load_xml : Latin1 -> AnyXml; print_xml : Any -> Latin1; dump_xml : Any -> []
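A sketch of building, matching, and transforming XML elements with the notations above; the element names and value names are illustrative only:

```cduce
(* An element with tag a, one attribute, and a string as content. *)
let link = <a href="abc">[ 'abc' ]

(* The same notation used as a pattern: capture the href attribute. *)
let target = match link with <a href=h>_ -> h

(* xtransform: the unmatched p element is kept and its children are visited;
   the matched a element is replaced by the sequence returned by the branch. *)
let bold = xtransform [ <p>[ link ] ] with
  <a (attrs)>c -> [ <a (attrs)>[ <b>c ] ]
```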
Functions
- Expressions:
- General form: fun f (t1->s1;...;tn->sn) p1 -> e1 | ... | pm -> em (f is optional)
- Simple function: fun f (p : t) : s = e, equivalent to fun f (t -> s) p -> e
- Multiple arguments: fun f (p1 : t1, p2 : t2,...) : s = e, equivalent to fun f ((p1,p2,...):(t1,t2,...)) : s = e (note the blank spaces around colons to avoid ambiguity with namespaces)
- Curried function: fun f (p1 : t1) (p2 : t2) ... : s = e (can be combined with the multiple arguments syntax).
- Types: t -> s
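The function forms above, sketched side by side (all function names are hypothetical):

```cduce
(* Simple function. *)
let succ = fun succ (x : Int) : Int = x + 1

(* Multiple arguments: the function takes a single pair. *)
let add = fun add (x : Int, y : Int) : Int = x + y

(* Curried function: arguments are passed one at a time. *)
let add_c = fun add_c (x : Int) (y : Int) : Int = x + y

(* General form with an overloaded interface and several branches. *)
let flip = fun flip (Int -> Int; Bool -> Bool)
    (x & Int) -> 0 - x
  | `true -> `false
  | `false -> `true

let three = add (1, 2)
let also_three = add_c 1 2
let minus_four = flip 4
```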
Pattern matching, exceptions, ...
- Type restriction: (e : t) (forgets any more precise type for e; note the blank spaces around colons to avoid ambiguity with namespaces)
- Pattern matching: match e with p1 -> e1 | ... | pn -> en
- Local binding: let p = e1 in e2, equivalent to match e1 with p -> e2; let p : t = e1 in e2 equivalent to let p = (e1 : t) in e2
- If-then-else: if e1 then e2 else e3, equivalent to match e1 with `true -> e2 | `false -> e3
- Exceptions:
- Raise exception: raise e
- Handle exception: try e with p1 -> e1 | ... | pn -> en
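A brief sketch of matching, local binding, and exception handling. It assumes that int_of raises an exception on a malformed string; the function and value names are hypothetical:

```cduce
(* Pattern matching on integer singleton and interval types. *)
let describe = fun describe (Int -> String)
    0 -> "zero"
  | 1--* -> "positive"
  | _ -> "negative"

(* Local binding with a type annotation. *)
let four = let x : Int = 3 in x + 1

(* Handle any exception raised while parsing (assumed behaviour of int_of). *)
let parse_or_zero = fun parse_or_zero (s : String) : Int =
  try int_of s with _ -> 0

(* Any value can be raised; the handler selects it with a pattern. *)
let caught = try raise "boom" with (msg & String) -> msg
```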
More about types and patterns
- Boolean connectives: &,|,\ (| is first-match).
- Empty and universal types: Empty,Any or _.
- Recursive types and patterns: t where T1 = t1 and ... and Tn = tn.
- Capture variable: x.
- Default values: (x := c).
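The connectives, recursive types, and default values above can be sketched as follows; Tree, NonZero, and the function names are illustrative assumptions:

```cduce
(* Difference of types: every integer except 0. *)
type NonZero = Int \ 0

(* A recursive type declared at toplevel. *)
type Tree = <leaf>[ Int ] | <node>[ Tree Tree ]

(* A recursive function following the shape of the type. *)
let sum = fun sum (Tree -> Int)
    <leaf>[ n ] -> n
  | <node>[ l r ] -> sum l + sum r

(* First-match alternative with a default value for the capture variable. *)
let or_zero = fun or_zero (Any -> Int)
  ((x & Int) | (x := 0)) -> x

let t = <node>[ <leaf>[ 1 ] <node>[ <leaf>[ 2 ] <leaf>[ 3 ] ] ]
let total = sum t
```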
References
- Type: ref T.
- Construction: ref T e.
- Dereferencing: !e1.
- Assignment: e1 := e2.
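A minimal sketch of a mutable counter built from the three reference operations above (the name counter is hypothetical):

```cduce
(* Create a reference holding an Int. *)
let counter = ref Int 0

(* Read the current content, add one, and store the result back. *)
let _ = counter := (!counter) + 1
let _ = counter := (!counter) + 1

let current = !counter
```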
Toplevel statements
- Global expression to evaluate.
- Global let-binding.
- Global function declaration.
- Type declarations: type T = t.
- Global namespace: namespace p = "...", namespace "...".
- Source inclusion: include filename_string.
- Debug directives: debug directive argument, where directive is one of the following: accept, subtype, compile, sample, filter.
- Toplevel directives: #env, #quit, #reinit_ns.