Syntax Document
Source Code Structure
Nature source code uses .n as the file extension, for example main.n. Here's a simple program example:
import fmt // import module declaration, fmt module is a standard library string formatting module
fn main() { // fn is used for function declaration, main function serves as the program entry point
fmt.printf('hello %s', 'world') // call printf function in fmt module for formatted output
}
This program prints hello world to the console.
Only fn, import, var, const, and type statements can be declared directly at the file top level; they have global scope and can be referenced by other modules. Other statements such as if, for, and return can only appear within function scope.
import fmt
var global_v = 1
const PI = 3.14
type global_t = int
fn main() {
var localv = 2
type local_t = int
if true {
}
for true {
}
}
if true { // x not allowed to declare if statement in global scope
}
Variables
Automatic type inference:
var foo = 1 // v declare variable foo and assign value, foo type is automatically inferred as int
var foo = 1 // x duplicate variable declaration not allowed in the same scope
if (true) {
var foo = 2 // v duplicate declaration allowed in different scopes
}
Direct type declaration without type inference:
int foo = 1 // v
u8 foo2 = 1 // v literal automatic type inference
float bar = 2.2 // v
string car = 'hello world' // v strings use single quotes
string car2 = "hello world" // v double quotes supported
string car3 = `hello
world` // v backticks supported
foo = 2 // v variables allow reassignment
foo = 'hello world' // x foo is already defined as int type, cannot assign string
i8 f2 = 12 // v literals can automatically convert according to type
i16 f3 // x variable declaration must include assignment
Compound type variable definition:
string bar = '' // v use this method to declare an empty string
[int] baz = [] // v declare empty vec
var baz2 = vec_new<string>('default value', 10) // v method 2
{float} s = {} // v declare empty set
var s2 = set_new<float>() // v method 2
{string:float} m = {} // v declare empty map
var m2 = map_new<string,float>() // v method 2
(int,bool,bool) t = (1, true, false) // v define tuple, use as t[0], t[1], t[2]
var (t1, t2, t3) = (1, true, 'hello') // tuple destructuring declaration
var sum = fn(int a, int b):int { // closure declaration
return a + b
}
var baz = [] // x, cannot infer vec element type
bar = null // x cannot assign null to any compound type or simple type
How to assign null?
// method 1 nullable
string? s = null
s = 'i am string'
// method 2
nullable<string> s2 = null
s2 = 'hello world'
// method 3 union
type union_test = int|float|null
union_test u = null
// method 4 any
any s3 = null // any is a special union type that includes all types by default
Constants
const PI = 3.14 // v float type, constant definition cannot declare type, compiler will automatically infer
const STR = 'hello world' // v string type
const INDEX = 1 // v int type
const B = true // v bool type
const STR2 = STR + 'byebye' // v constant folding calculation
const AREA = PI * 2 * 2 // v calculation example, supports common operators
const RESULT = test() // x constant definition only supports literal integer/float/string/bool types
Constant names should use uppercase letters with underscores separating words.
Control Structures
if syntax
if condition {
// code executed when condition is true
} else {
// code executed when condition is false
}
The condition type must be bool.
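As a minimal sketch of this rule, assuming that nature performs no implicit integer-to-bool conversion (which is what the bool requirement implies), a numeric value has to be compared explicitly. The variable name count is only illustrative:

```nature
int count = 3
if count { // x count is int, not bool
}
if count > 0 { // v the comparison produces a bool
    println('count is positive')
}
```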
You can chain else if to check multiple conditions, for example:
int foo = 23
if foo > 100 {
print('foo > 100')
} else if foo > 20 {
print('foo > 20') // matches
} else {
print('else handle')
}
for syntax
The for statement executes a code block repeatedly. In nature it has three main forms: classic loop, conditional loop, and iteration loop.
Classic loop
Classic loop is used for executing a fixed number of iterations. Basic syntax:
var sum = 0
for int i = 1; i <= 100; i += 1 {
sum += i
}
println('1 +..+100 = ', sum)
❗️Note: Nature does not have ++ syntax; use i += 1 instead of i++. The expression after for does not need parentheses.
Conditional loop
A conditional loop runs as long as its condition holds, similar to a while statement in C. Basic syntax:
var sum = 0
var i = 0
for i <= 100 {
sum += i
i += 1
}
println('1 +..+100 = ', sum)
In this example, the loop continues until i is greater than 100. The final output is the same as the classic loop.
Iteration loop
An iteration loop traverses a collection type; vec, map, string, and chan are supported. Basic syntax:
Traversing vec:
var list = [1, 1, 2, 3, 5, 8, 13, 21]
for v in list {
println(v)
}
Traversing map:
var m = {1:10, 2:20, 3:30, 4:40}
for k in m {
println(k)
}
In this example, the loop traverses each key in the map and outputs it.
To traverse both keys and values, use two loop variables; for a vec, k is the index:
for k,v in m {
println(k, v)
}
Traversing channel:
// example: var ch = chan_new<int>()
// yield until ch.recv() receives message to wake up current coroutine
for v in ch {
println('recv v ->', v)
}
Break and Continue
The break keyword exits the current loop. The continue keyword skips the rest of the current iteration and jumps straight to the loop condition check.
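The following is a minimal sketch of both keywords inside a classic loop, using only syntax shown earlier in this document. The variable names and the limits are arbitrary choices for illustration:

```nature
var sum = 0
for int i = 1; i <= 10; i += 1 {
    if i % 2 == 0 {
        continue // skip even numbers and go to the next iteration
    }
    if i > 7 {
        break // leave the loop entirely once i exceeds 7
    }
    sum += i // reached only for i = 1, 3, 5, 7
}
println(sum)
```

Here continue filters out values without ending the loop, while break ends the loop as soon as the limit is passed.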
Functions
Functions are first-class citizens in nature and can be passed as values. Nature supports various function definition and usage methods.
Function Definition
Basic function definition syntax:
fn function_name(parameter_list):return_type {
// function body
}
A simple addition function:
fn sum(int a, int b):int {
return a + b
}
Anonymous Functions
Nature supports anonymous functions. They are closures, so they can capture variables from the enclosing scope.
fn main() {
int c = 3
var f = fn(int a, int b):int {
return a + b + c
}
var result = f(1, 2) // call anonymous function
}
Variadic Parameters
Functions support a variable number of parameters. A variadic parameter is declared with the fixed form ... followed by a vec type ([T]):
fn sum(...[int] numbers):int {
var result = 0
for v in numbers {
result += v
}
return result
}
println(sum(1, 2, 3, 4, 5))
Parameter Destructuring
A vec can be expanded into variadic arguments at the call site with the ... prefix:
fn printf(string fmt, ...[any] args) {
var str = sprintf(fmt, ...args)
print(str)
}
Multiple Return Values
Nature does not support multiple return values, but you can simulate them by returning a tuple and destructuring it. The tuple data structure is introduced in more detail later:
fn divide(int a, int b):(int, int) {
return (a / b, a % b)
}
// Use tuple destructuring to receive return values
var (quotient, remainder) = divide(10, 3)
Function Types
Use fn to declare function types; parameter names can usually be omitted in a function type.
type calc_fn = fn(int,int):int // declare fn type
fn apply(calc_fn f, int x, int y):int {
return f(x, y)
}
fn main() {
var result = apply(fn(int a, int b):int {
return a + b
}, 3, 4)
println(result)
}
Additional Notes
Functions in nature must explicitly declare parameter types and return types. If a function does not return a value, you can omit the return type declaration or use the void type:
fn print() { // v
}
fn print():void { // v
}
Nature always writes the type before the name (type prefix), including for return values. If the return_type after the parameter list looks like a type postfix to you, compare the following two forms:
// This is return type postfix
func sum(a int, b int) (c int) {
c = a + b
return c
}
// This is return type prefix
fn sum(int a, int b):(int c) {
c = a + b
return c
}
In other words, where the return value appears in a function definition has nothing to do with whether types are written as prefix or postfix.
Comments
Single line comments:
// This is a single line comment
var str = `hello world` // This is also a single line comment
Multi-line comments:
/*
This is a multi-line comment
can span multiple lines
*/
var str = `hello world`
Type System
Numeric Types
Type | Bytes | Description |
---|---|---|
int | - | Signed integer, consistent with platform CPU width (8 bytes on 64-bit platform) |
i8 | 1 | 8-bit signed integer |
i16 | 2 | 16-bit signed integer |
i32 | 4 | 32-bit signed integer |
i64 | 8 | 64-bit signed integer |
uint | - | Unsigned integer, consistent with platform CPU width |
u8 | 1 | 8-bit unsigned integer |
u16 | 2 | 16-bit unsigned integer |
u32 | 4 | 32-bit unsigned integer |
u64 | 8 | 64-bit unsigned integer |
float | - | Floating point number, consistent with platform CPU width (equivalent to f64 on 64-bit platform) |
f32 | 4 | Single precision floating point |
f64 | 8 | Double precision floating point |
bool | 1 | Boolean type, values are true/false |
Compound Types
Type Name | Storage Location | Syntax | Example | Description |
---|---|---|---|---|
string | heap | string | string str = 'hello' | String type, can use single quotes, double quotes, backticks to declare string type |
vec | heap | [T] | [int] list = [1, 2, 3] | Dynamic array |
map | heap | {T:T} | {int:string} m = {1:'a'} | Map can be iterated with for |
set | heap | {T} | {int} s = {1, 2, 3} | Set |
tup | heap | (T) | (int, bool) t = (1, true) | Tuple |
function | heap | fn(T):T | fn(int,int):int f = fn(a,b){...} | Function type |
channel | heap | chan<T> | var c = chan_new<T>() | Communication channel |
struct | stack | struct {T field} | struct{} | Struct |
array | stack | [T;n] | [int;3] a = [1,2,3] | Fixed length array |
Special Types
Type Name | Description | Example |
---|---|---|
self | Reference to self in struct methods, can only be used in fn extend | |
ptr | Safe pointer, cannot be null | ptr<person> p = new person() |
anyptr | Unsafe int pointer, equivalent to uintptr, commonly used for C language interaction and unsafe conversion | Any type except float can be converted to anyptr type via as |
rawptr | Unsafe nullable pointer, use & load addr syntax to get rawptr, use * indirect addr to dereference | rawptr<int> len_ptr = &len |
union | Union type, only supports declaration via type definition | type number = float\|int |
any | Special union type, union of all types | |
Type Operations
Type Definition
type my_int = int
type person_t = struct{
int age
float height
}
type node_t = struct{
int id
[node_t] children // v use dynamic array or pointer nested reference type
rawptr<node_t> next // v reference through raw pointer
ptr<node_t>? next // v pointer+nullable reference
node_t next // x circular reference
[node_t;2] foo // x circular reference
node2_t bar // x circular reference
}
type nullable<T> = T|null // custom union type + type parameter generics
type throwable = interface{} // interface definition, detailed introduction later
Type Conversion
Use the as keyword for explicit type conversion:
- Supports mutual conversion between integer/float types
- Supports mutual conversion between string and [u8] types
- Supports mutual conversion between anyptr and any type except float
- Custom types and primitive types can convert to each other when the underlying data structure is the same
as is used both for type conversion and for type assertion on union and any types.
int i = 42.5 as int // float to integer
[u8] bytes = "hello" as [u8] // string to vec
string str = bytes as string // vec<u8> to string
anyptr ptr = &i as anyptr // rawptr<int> to anyptr
type myint = int
myint foo = 12
int bar = foo as int // myint -> int
myint baz = bar as myint // int -> myint
Literals support implicit type conversion in most scenarios. If the specific type cannot be identified, it defaults to int or float type.
f32 a = 1.1 // automatic inference conversion
f64 b = 2.2 // automatic inference conversion
any c = 1.1 // cannot infer, defaults to float type
i8 d = 1
i16 e = 1
i32 f = 1
any g = 1 // cannot infer, defaults to int type
Type Extension
Nature supports method extension for built-in types and custom types.
Built-in Types
Supported built-in types for extension include: bool, string, int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, float, float32, float64, chan, vec, map, set, tuple.
Built-in Type extend
fn string.find_char(u8 char, int after):int {
int len = self.len()
for int k = after; k < len; k += 1 {
if self[k] == char {
return k
}
}
return -1
}
Custom Type extend
type square = struct {
int length
int width
}
fn square.area():int {
// Use self for data reference in type extension
return self.length * self.width
}
type box<T> = struct{
T length
T width
}
// Generic parameters need to be consistent with type declaration
fn box<T>.area():T{
return self.length * self.width
}
What type is self? self always points to a safe pointer in the heap. For data structures such as string/vec/map, which are stored in the heap by default, the type of self is the same as the original type. Heap allocation means the data is managed by the GC, so it is safe data.
For types such as struct/int/float, which are stored on the stack by default, self is always a safe pointer to the value, that is, ptr<T>. So how are type extension methods called?
// For string this is simple: string is stored in the heap, so the method can be called directly
var str = 'hello world'
var c = str.find_char('w'[0], 0) // find_char takes the u8 char to search for and the start index
// For scalar types and structs, use the new keyword to allocate the data in the heap, for example
var s = new square(length=5, width=10) // s type is ptr<square>
var a = s.area() // self is ptr<square>
But what if you have stack-allocated data and want to call a type extension method on it?
var s = square{}
// If the method won't be referenced across threads, we can work around the compiler with an unsafe pointer via as conversion
(&s as anyptr as ptr<square>).area()
// The built-in macro @ula (unsafe load addr) performs the same conversion
@ula(s).area()
// What if we want memory safety without calling new for heap allocation?
// Use the @sla (safe load addr) macro, which triggers simple escape analysis and allocates s in the heap by default.
@sla(s).area()
Writing @sla everywhere would be tedious, so the call can simply be written like this:
var s = square{}
s.area() // v equivalent to @sla(s).area()
When a method is called on a stack data type, the @sla macro is added automatically. This is a compromise rather than a smart solution; a more intelligent and safe automatic @sla based on escape analysis is planned for the future.
This explanation is somewhat lengthy, but the key point to remember is that self in a method is always a safe pointer to the heap (except when @ula is used).
Arithmetic Operators
Priority | Operator | Usage Example | Description |
---|---|---|---|
1 | () | (1 + 1) | Expression grouping |
2 | - | -12 | Negative number |
2 | ! | !true | Logical NOT |
2 | ~ | ~12 | Bitwise NOT |
2 | & | &q | Load data address |
2 | * | *p | Pointer dereference |
3 | / | 1 / 2 | Division |
3 | * | 1 * 2 | Multiplication |
3 | % | 5 % 2 | Remainder |
4 | + | 1 + 1 | Addition |
4 | - | 1 - 1 | Subtraction |
5 | << | 100 << 2 | Bitwise left shift |
5 | >> | 100 >> 2 | Bitwise right shift |
6 | > | 1 > 2 | Greater than |
6 | >= | 1 >= 2 | Greater than or equal |
6 | < | 1 < 2 | Less than |
6 | <= | 1 <= 2 | Less than or equal |
7 | == | 1 == 2 | Equal |
7 | != | 1 != 2 | Not equal |
8 | & | 1 & 2 | Bitwise AND |
9 | ^ | 1 ^ 2 | Bitwise XOR |
10 | \| | 1 \| 2 | Bitwise OR |
11 | && | true && true | Logical AND |
12 | \|\| | true \|\| true | Logical OR |
13 | = | a = 1 | Assignment operator |
13 | %= | a %= 1 | Equivalent to a = a % 1 |
13 | *= | a *= 1 | a = a * 1 |
13 | /= | a /= 1 | a = a / 1 |
13 | += | a += 1 | a = a + 1 |
13 | -= | a -= 1 | a = a - 1 |
13 | \|= | a \|= 1 | a = a \| 1 |
13 | &= | a &= 1 | a = a & 1 |
13 | ^= | a ^= 1 | a = a ^ 1 |
13 | <<= | a <<= 1 | a = a << 1 |
13 | >>= | a >>= 1 | a = a >> 1 |
The & operator returns an address of type rawptr<T>. rawptr is an unsafe raw pointer that allows null; it may dangle or point to invalid memory. Developers must ensure pointer safety themselves, so raw pointers should only be used when necessary, such as when interacting with memory-unsafe languages like C.
For nature development, the recommended approach is to use the new keyword for GC heap allocation, which yields safe pointers.
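To make the contrast concrete, here is a small sketch that places the two kinds of pointer side by side. It only combines forms that appear elsewhere in this document (new, ptr<T>, &, rawptr<T>, *); the type point_t and the variable names are hypothetical:

```nature
type point_t = struct {
    int x
    int y
}

fn main() {
    // new allocates on the GC heap and yields a safe pointer
    ptr<point_t> p = new point_t(x = 1, y = 2)
    p.x = 10 // automatic dereference

    // & yields an unsafe raw pointer to data on the stack
    int len = 3
    rawptr<int> len_ptr = &len
    println(*len_ptr) // explicit dereference; validity is the developer's responsibility
}
```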
Struct
Structs must be declared through the type keyword, which defines a new type. Anonymous structs are not supported.
Basic Syntax
// Struct declaration
type person_t = struct {
string name
int age
bool active
}
// Struct initialization, struct is initialized on stack by default.
var p = person_t{
name = 'Alice',
age = 25
}
// Using new can initialize on heap and get struct pointer
ptr<person_t> p2 = new person_t(name = 'Bob', age = 30)
p2.name = 'Tom' // automatic dereference
Default Values
Struct default values only support simple constants; closures, function calls, and other complex expressions are not supported as default values.
type sub_t = struct{
var name = 'alice' // supports automatic type inference when default value exists
}
type person_t = struct{
string name = 'unnamed'
bool active = true
sub_t sub1 // Note! no default value here
var sub2 = sub_t{} // includes default value
}
// Initialize with default values
// p.name is 'unnamed', p.active is true,
// p.sub1.name is '', p.sub2.name is 'alice'
var p = person_t{}
Nesting and Composition
type outer = struct {
int x
rect r = rect{}
}
type animal = struct {
string name
}
type dog = struct {
animal base
string breed
}
type dog = struct {
struct {
string name
} base // x don't use anonymous struct, although it can be declared it cannot be initialized, only used for memory structure conversion
string breed
}
Data Structures
string
String is a built-in data type in nature, and its data is stored on the heap. A string and a dynamic array [u8] have the same storage structure on the heap, so strings and [u8] can be freely converted to each other.
string s1 = 'hello world' // use single quotes to declare strings in non-special cases
string s2 = "hello world"
string s3 = `hello
world`
// use + sign for string concatenation
var s4 = s1 + ' one piece'
// string comparison
bool b1 = 'hello' == 'hello' // true
bool b2 = 'a' < 'b' // true
// get string length
int len = s1.len()
// access and modify through index
s1[0] = 72 // modify first character to 'H' (ASCII code 72)
// string and byte array conversion
[u8] bytes = s1 as [u8]
string s5 = bytes as string
// string traversal, traverse by char
for v in s1 {
// v type is u8, represents ASCII code value
println(v)
}
Since single quotes are also used to declare strings, there is no separate character literal. To get a single character value (u8), use one of the following methods:
// method 1 through constant definition
const A = 97
const B = 99
u8 c = A
// method 2 through array reference
u8 c2 = '@'[0]
More string operations are available in the standard library's strings module, imported with import strings:
import strings
fn main() {
var s = 'hello world world'
var index = s.find('wo')
var list = s.split(' ')
var list2 = ['nice', 'to', 'meet', 'you']
var s2 = strings.join(list2, '-')
var b = s.starts_with('hello')
// more methods can refer to standard library documentation
}
vec
vec is nature's built-in dynamic array type. It supports dynamic expansion and is stored on the heap. It is not safe for concurrent use; add locks explicitly where needed.
// Declaration and initialization
[int] list = [1, 2, 3] // Syntax 1: recommended using [T] declaration
vec<int> list2 = [1, 2, 3] // Syntax 2: using vec<T> declaration
// Create a vec of length 10, the first parameter is the default value of each element
var list3 = vec_new<int>(0, 10)
// Initialize a string vec, the default value is a string
var list7 = vec_new<string>("hello", 10)
// Similar to vec_new, the element type is inferred from the default value
var list5 = [0;10]
// Initialize with specified cap, len = 0
var list6 = vec_cap<int>(10)
// Automatically infer empty vec type
[int] list4 = []
list[0] = 10 // modify element
var first = list[0] // get element
list.push(4) // add element
var len = list.len() // get vec length
var cap = list.cap() // get vec capacity
// Slicing and merging
var slice0 = list[1..3] // get slice from index 1 to 3(exclusive)
var slice2 = list[..3] // get slice from 0 to 3(exclusive)
var slice3 = list[1..] // get slice from 1 to len(exclusive)
// Merge two vecs into a new vec
var new_list = list.concat([4, 5, 6])
// Append to list
list.append([4, 5, 6])
// Traversal
for value in list {
}
for index,value in list {
}
map
map is nature's built-in mapping table type, used to store key-value pairs. It is not safe for concurrent use; add locks explicitly where needed.
// Declaration and initialization
var m3 = {'a': 1, 'b': 2} // type inference
{string:int} m1 = {'a': 1, 'b': 2} // using {T:T} declaration
map<string,int> m2 = {'a': 1, 'b': 2} // using map<T,T> declaration
{string:int} empty = {} // empty map declaration method 1
var empty2 = map_new<string,int>() // empty map declaration method 2
// Insert/update element
m1['c'] = 3
// Get element value; if the key does not exist a panic is raised, which can be handled with catch
var v = m1['a']
var v2 = m1['a'] catch e {
// handle notfound
}
m1.del('b') // delete element
var exists = m1.contains('a') // check if key exists
var size = m1.len() // get element count
// Traversal
for k in m1 { // traverse keys only
println(k)
}
for k, v in m1 { // traverse both keys and values
println(k, v)
}
set
set is nature's built-in collection type, used to store unique elements. A set is stored on the heap and cannot contain duplicate elements. It is not safe for concurrent use; add locks explicitly where needed.
// Declaration and initialization
{int} s1 = {-32, -64, 13} // using {T} declaration
set<int> s3 = {1, 2, 3} // using set<T> declaration
// Empty set declaration
{int} empty = {}
var empty2 = set_new<int>()
// Basic operations
s1.add(111) // add element
s1.del(13) // delete element
var exists = s1.contains(13) // check if element exists
var found = {1, 2, 3}.contains(2) // supports method chaining
// Traversal
for v in s1 {
println(v)
}
tup
tuple is nature's built-in type for aggregating a group of values of different types in one structure. A tuple is stored on the heap; only a pointer to it is kept on the stack.
var tup = (1, 1.1, true) // v declare and assign, multiple elements separated by comma
var tup = (1) // x tuple must contain at least two elements, otherwise cannot distinguish between expression and tup
var foo = tup[0] // v index 0 represents the first element in tuple, and so on
// x element access in tuple does not allow expressions, only int literals are allowed
var foo = tup[1 + 1]
tup[0] = 2 // v modify value in tuple
tup supports destructuring assignment:
// v values can be used with var for automatic type inference to continuously create multiple variables
var (foo, bar, car) = (1, 2, true)
// x prohibit type declaration, only allow automatic type inference through var
(custom_type, int, bool) (foo, bar, car) = (1, 2, true)
// v nested form of creating multiple variables
var (foo, (bar, car)) = (1, (2, true))
// x when creating variables, left side does not allow expressions
var list = [1, 2, 3]
var (list[0], list[1]) = (2, 4)
// v modify variable foo,bar values, can perform quick variable value swapping
(foo, bar) = (bar, foo)
// v nested form of modifying variable values
(foo, (bar, car)) = (2, (4, false))
// x left value and right value type mismatch
(foo, bar, car) = (2, (4, false))
// v tuple assignment operation allows left value expressions like ident/ident[T]/ident.T
(list[0], list[2]) = (1, 2)
// x 1+1 belongs to right value expression
(1 + 1, 2 + 2) = (1, 2)
arr
arr is a fixed-length array, with the same data structure as an array in C.
[u8;3] array = [1, 2, 3] // declare an array with length 3, element type u8
array[0] = 12
array[1] = 24
var a = array[2] // read the last element, valid indexes are 0 to 2
arr is allocated on the stack, or inline within the containing data structure, by default, while vec is allocated on the heap. This is reflected in size calculation:
type t1 = struct {
[u8;12] array
}
var size = @sizeof(t1) // 12 * 1 = 12 bytes
type t2 = struct {
[u8] list
}
var size2 = @sizeof(t2) // vec is a pointer, size = 8 bytes
Unlike in C, an array passed as a parameter in nature is passed by value.
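A short sketch of what value passing means in practice, assuming the fixed-length array type [u8;3] can be used directly as a parameter type. The function name set_first is hypothetical:

```nature
fn set_first([u8;3] a) {
    a[0] = 99 // modifies only the callee's copy
}

fn main() {
    [u8;3] array = [1, 2, 3]
    set_first(array)
    println(array[0]) // the caller's array is unchanged, because the callee received a copy
}
```

To let a function modify the caller's data, a vec (heap allocated) or a pointer would be the natural choice instead.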
Error Handling
Basic Syntax
throw syntax usage example:
// x wrong usage: throw cannot be used in a fn without an errable (!) declaration, for example fn rem(...):void!
fn rem(int dividend, int divisor):int {
throw errorf('....')
}
// v correct usage: the function must be declared errable, using the `!` symbol to indicate that it may throw an error
fn rem(int dividend, int divisor):int! {
if divisor == 0 {
throw errorf('divisor cannot be zero')
}
return dividend % divisor
}
You can use catch syntax to catch errors:
var result = rem(10, 0) catch e { // e implements throwable interface, can call related methods
println(e.msg())
1 // the last expression in the catch body is its value, so 1 is assigned to result
}
You can also use try catch to catch errors from a block of several statements:
try {
var a = 1
var b = a + 1
rem(10, 0)
var c = a + b
} catch e { // e implements throwable interface, can call related methods
println('catch err:', e.msg())
}
If an error is not caught, it propagates up the function call stack until the coroutine scheduler catches it and exits the program.
fn bar():void! {
throw errorf('error in bar')
}
fn foo():void! {
bar()
}
fn main():void! {
foo()
}
Compiling and running this program prints the error with a stack trace:
coroutine 'main' uncaught error: 'error in bar' at nature-test/main.n:2:22
stack backtrace:
0: main.bar
at nature-test/main.n:2:22
1: main.foo
at nature-test/main.n:6:11
2: main.main
at nature-test/main.n:10:11
The main function, as the unified program entry point, is errable (!) by default; no additional declaration is needed.
Design Philosophy
Nature's error handling is based on a design philosophy similar to Rust's Result<T,E>, but its actual behavior and syntax are closer to the try + catch + throw pattern.
Comparison with Rust:
// int! is equivalent to Result<int,err> in rust
// int? is equivalent to Option<int> in rust
fn rem(int dividend, int divisor):int! {
if divisor == 0 {
// equivalent to return Err('xxx')
throw errorf('divisor cannot be zero')
}
// equivalent to return Ok(xxx)
return dividend % divisor
}
fn main() {
// equivalent to int result = rem(10, 0)?
int result = rem(10, 0)
/**
equivalent to
let result = match rem(10, 0) {
Err(e) => {
debug('has err', e)
0
}
Ok(r) => r
}
*/
int result = rem(10, 0) catch e {
println('has err', e.msg())
0
}
}
Since nature automatically unwraps errors and propagates them upward, in most cases you don't need to handle errors at every level; catch only where it is really needed.
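As an illustration of catching only where needed, the sketch below lets an error pass through an intermediate function and handles it once at the top of the call chain. It uses only the throw, errable (!) and catch forms shown above; the function names parse and load are hypothetical:

```nature
fn parse(int v):int! {
    if v < 0 {
        throw errorf('negative value')
    }
    return v * 2
}

// load does not catch anything; an error thrown by parse passes through it
fn load(int v):int! {
    var r = parse(v)
    return r + 1
}

fn main() {
    // the error is handled only here
    var n = load(-1) catch e {
        println('load failed:', e.msg())
        0 // fallback value assigned to n
    }
    println(n)
}
```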
throwable
throwable is a built-in interface; the expression after the throw keyword must implement it.
import fmt
type throwable = interface{
fn msg():string
}
type errort:throwable = struct{
string message
bool is_panic
}
fn errort.msg():string {
return self.message
}
fn errorf(string format, ...[any] args):ptr<errort> {
var msg = fmt.sprintf(format, ...args)
return new errort(message = msg)
}
The built-in function errorf returns ptr<errort>, and errort implements the throwable interface, so the result of errorf can be used with the throw keyword. Building on the throwable interface, error types can be defined more flexibly.
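For example, a custom error type might carry extra fields alongside the message. This is a sketch under the assumption that any type declared with :throwable and providing msg() can follow throw, mirroring how errort is defined above; http_error, fetch and status_code are hypothetical names:

```nature
// hypothetical error type that carries a status code
type http_error:throwable = struct{
    string message
    int status
}

fn http_error.msg():string {
    return self.message
}

fn fetch(int status_code):int! {
    if status_code != 200 {
        // thrown as ptr<http_error>, in the same way errorf returns ptr<errort>
        throw new http_error(message = 'request failed', status = status_code)
    }
    return status_code
}

fn main() {
    var code = fetch(500) catch e {
        println('fetch failed:', e.msg())
        0
    }
    println(code)
}
```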
panic
panic is a special kind of error. A panic does not propagate along the function call chain; it directly causes the program to crash and exit. The most common case is an index out of bounds:
var list = [1, 2, 3]
var a = list[4]
Compiling and running this produces the error:
coroutine 'main' panic: 'index out of vec [4] with length 3' at nature-test/main.n:3:18
A panic can also be caught with catch or try catch, but since it does not propagate along the call chain, it must be caught at the point where it occurs and cannot be caught in functions higher up the call chain:
// panic catching method is same as error
var a = list[4] catch e {
println(e.msg())
}
// use built-in panic function to manually throw panic
panic('failed')
Union Types
Union types allow a variable to hold a value of one of several possible types.
Basic syntax:
// use | operator to declare union of multiple types
type nullable<T> = T|null // nullable type
type number = int|float // numeric type
❗ Union types can only be declared in global type definitions, anonymous declaration is not supported
nullable
nullable is implemented on top of union types. Because it is so commonly used, it has shorthand syntax: the ? symbol quickly declares a nullable type:
int? foo = null // equivalent to nullable<int> foo = null
string? bar = "hello" // equivalent to nullable<string> bar = "hello"
Type Assertion
Type assertion uses the same as syntax as type conversion:
int? foo = 42
int val = foo as int // assert union type to specific type
Type Checking
The is syntax checks which type a union value currently holds:
int? foo = 42
bool is_int = foo is int // true
bool is_null = foo is null // false
// when is appears in a conditional and foo's concrete type can be inferred from the condition, an automatic type assertion is applied at compile time
if foo is int {
// auto as: foo = foo as int
println("foo is an integer", foo + 1)
}
// x cannot perform automatic type inference
if !(foo is null) {
var bar = foo + 1 // binary type inconsistency, left is 'union', right is 'i64'
}
Pattern Matching
The match syntax has special support for union types, allowing quick type matching with automatic assertion.
int? foo = null
int result = match foo {
is int -> foo // auto: foo = foo as int
is null -> -1 // auto: foo = foo as null
}
any
any is a special union type that can hold any type. A union with a smaller type range can be assigned to a union with a larger range that contains all of its types; any has the largest range:
any foo = 1
int? bar = null
foo = bar // v
bar = foo // x bar's range is smaller than foo
type nullable2 = int|bool|null|string
nullable2 bar2 = bar // v, nullable2 contains larger type range
Interface
Declaration method:
type measurable = interface{
fn area():int
fn perimeter():int
}
// generic parameter declaration
type measurable<T> = interface{
fn area():T
fn perimeter():T
}
Complete declaration example:
type measurable<T> = interface{
fn perimeter():T
fn area():T
}
// type implements interface
type rectangle: measurable<i64> = struct{
i64 width
i64 height
}
fn rectangle.area():i64 {
return self.width * self.height
}
fn rectangle.perimeter():i64 {
return 2 * (self.width + self.height)
}
An interface can be used as a function parameter type; any value whose type implements the interface can be passed. Example:
fn print_shape(measurable<i64> s) {
println(s.area(), s.perimeter())
}
fn main() {
// value passing
var r = rectangle{width=3, height=4}
print_shape(r)
// pointer reference passing
var r1 = new rectangle(width=15, height=18)
print_shape(r1)
}
If the argument is a ptr, nature automatically checks whether the type the ptr points to implements measurable. A type can implement multiple interfaces, separated by commas:
type measurable<T> = interface{
fn perimeter():T
fn area():T
}
type updatable = interface{
fn update(i64)
}
type rectangle: measurable<i64>,updatable = struct{
i64 width
i64 height
}
fn rectangle.area():i64 {
return self.width * self.height
}
fn rectangle.perimeter():i64 {
return 2 * (self.width + self.height)
}
fn rectangle.update(i64 i) {
self.width = i
self.height = i
}
You can use is to determine the concrete type behind an interface value:
fn use_com(combination c):int {
if c is square {
// auto as: square c = c as square, c is interface
c.unique()
return 1
}
if c is ptr<square> {
// auto as: ptr<square> c = c as ptr<square>
c.unique()
return 2
}
return 0
}
// also supports match is
fn use_com(combination c):int {
return match c {
is square -> 10
is ptr<square> -> 20
_ -> 0
}
}
Interface supports nullable:
fn use(testable? test) {
if (test is testable) { // test = test as testable
println('testable value is', test.nice())
} else {
println('test not testable')
}
}
Interfaces can also be used as generic constraints, introduced in the later generics section. Nature does not support duck typing; a type must explicitly declare that it implements an interface.
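The sketch below illustrates the absence of duck typing. It assumes, based on the rule above, that a type with a matching method but no interface declaration is rejected at compile time; speaker, cat, robot and talk are hypothetical names:

```nature
type speaker = interface{
    fn speak():string
}

// cat explicitly declares that it implements speaker
type cat:speaker = struct{
    string name
}

fn cat.speak():string {
    return self.name + ' says meow'
}

// robot has a method with the same signature but does not declare speaker
type robot = struct{
    string model
}

fn robot.speak():string {
    return self.model + ' says beep'
}

fn talk(speaker s) {
    println(s.speak())
}

fn main() {
    talk(cat{name = 'Tom'}) // v cat declares speaker
    talk(robot{model = 'R2'}) // x robot never declared speaker, a matching method is not enough
}
```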
Interface supports composition for quick declaration:
type measurable<T> = interface{
fn perimeter():T
fn area():T
}
type updatable = interface{
fn update(i64)
fn area():i64
}
type combination: measurable<i64>,updatable = interface{
fn to_str():string
}
// square needs to implement to_str + area + update + perimeter methods.
type square:combination = i64
Because of the composition relationship, if square implements the combination interface, it also implements the measurable and updatable interfaces by default, as shown in the following example:
type measurable<T> = interface{
fn perimeter():T
fn area():T
}
type combination: measurable<i64> = interface{
fn to_str():string
}
type square:combination = i64
fn square.area():i64 {
i64 length = *self as i64
return length * length
}
fn square.perimeter():i64 {
return (*self * 4) as i64
}
fn square.to_str():string {
return 'hello world'
}
fn use_mea(measurable<i64> m):int {
return m.perimeter()
}
fn main():void! {
var sp = new square(8)
println(use_mea(sp))
}
Pattern Matching
Nature provides pattern matching through match expressions, which implement complex conditional branch logic.
Basic Syntax
Basic syntax of match expression:
match subject {
pattern1 -> expression1
pattern2 -> expression2
...
_ -> default_expression // default branch
}
Value Matching
Can directly match literal values:
var a = 12
match a {
1 -> println('one')
12 -> println('twelve') // matches successfully
20 -> println('twenty')
_ -> println('other')
}
Supports string matching:
match 'hello world' {
'hello' -> println('greeting')
'hello world' -> println('full greeting') // matches successfully
_ -> println('other')
}
Expression Matching
A match can omit the subject. In that case each pattern is a boolean expression; only the first pattern that evaluates to true matches, and its body is executed:
match {
12 > 0 && 0 > 0 -> println('case 1')
(13|(1|2)) == 15 -> println('case 2') // matches successfully
(1|2) > 3 -> println('case 3')
_ -> println('default')
}
Automatic Assertion
For union types, is can be used for type matching. When the match subject is a variable, it is automatically type-asserted after a successful match:
any value = 2.33
var result = match value {
is int -> 0.0
is float -> value // auto as: var value = value as float
_ -> 0.0
}
The type of result is inferred automatically from the return type of the first branch.
Code Blocks and Return Values
Match branches can use code blocks; the last expression in a block serves as the return value of that block:
fn main() {
string result = match {
(12 > 13) -> {
var msg = 'case 1'
msg
}
(12 > 11) -> {
var msg = 'case 2'
msg // matches successfully, return value msg will be assigned to result
}
_ -> 'default'
}
println(result)
}
Generics
Type Parameters
The basic syntax uses <T> to declare type parameters, where T is the type parameter name. Multiple type parameters can be declared, separated by commas.
// single type parameter
type box<T> = struct {
T value
}
// multiple type parameters
type pair<T, U> = struct {
T first
U second
}
type result<T> = T|error // generic union type
type list<T> = [T] // generic vec type
Type parameters support nesting:
type wrapper<T> = struct {
box<T> inner // nested use of generic type
}
// using nested generics
var w = wrapper<int>{
inner = box<int>{value = 42}
}
Generic Functions
// generic function declaration
fn sum<T>(T a, T b):T {
return a + b
}
// generic function call
var result = sum<int>(1, 2) // explicitly specify type
var result2 = sum(1.1, 2.2) // automatic type inference
// defining a method on a generic type
type box<T> = struct {
T value
}
fn box<T>.get():T {
return self.value
}
// a method on a generic type cannot also declare its own generic parameters
fn box<T>.get<U>():U { // x methods do not support generic parameters for now
}
Generic Constraints
Nature's generic constraints are not yet complete. The compiler only validates that the type arguments passed to a generic satisfy the declared constraint; it does not validate that the body of the generic function uses its parameters in ways the constraint permits. This will be resolved in a future update.
Nature's generic constraints support three types. These constraint types cannot be combined; only one can be selected per type parameter:
// union constraint
type test_union = int|bool|float
fn test<T:test_union>(T param) {
println(param)
}
// union constraint can be abbreviated
fn test<T:int|bool|float>(T param) {
println(param)
}
// interface constraint
type test_interface = interface{
fn bar()
}
type test_interface2 = interface{
fn bar()
}
// parameter needs to implement methods contained in interface.
fn test<T:test_interface&test_interface2>(T param) {
println(param)
}
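A minimal sketch of how a union constraint is checked at the call site follows. The function name describe is hypothetical; the calls rely on the automatic type inference shown earlier for generic functions.

```nature
// describe is a hypothetical function constrained to int, bool or float
fn describe<T:int|bool|float>(T param) {
    println(param)
}

fn main() {
    describe(1)     // T is inferred as int
    describe(true)  // T is inferred as bool
    describe(3.14)  // T is inferred as float
    // describe('hello') // x string is not part of the int|bool|float constraint
}
```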
Usage Example
// define generic struct
type pair<T, U> = struct {
T first
U second
}
// define generic method
fn pair<T, U>.swap():(U, T) {
return (self.second, self.first)
}
fn main() {
// create generic instance
var p = pair<int, string>{
first = 42,
second = "hello"
}
// call generic method
var (s, i) = p.swap()
}
Coroutines
Coroutines are lightweight user-space threads; multiple coroutines can run on a single system thread.
Basic Usage
Using the go keyword:
var fut = go sum(1, 2) // create a shared coroutine
Using the @async macro, which can carry flag parameters that define coroutine behavior:
var fut = @async(sum(1, 2), co.SAME) // SAME means new coroutine shares processor with current coroutine
future
Creating a coroutine returns a future object. The coroutine is already running at this point, but it does not block the current coroutine. Use the await() method to block until the coroutine completes and get its return value:
import co // standard library co package contains some common coroutine functions
fn sum(int a, int b):int {
co.sleep(1000) // simulate time-consuming operation, sleep unit is ms
return a + b
}
fn main() {
var fut = go sum(1, 2)
var result = fut.await() // wait for coroutine execution completion and get result
println(result) // output: 3
}
Use co.sleep() to make the current coroutine yield and sleep for the specified number of milliseconds. Use co.yield() to give up the current coroutine's execution right directly and wait for the next scheduling.
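The two calls can be sketched together as follows. The worker function is hypothetical; co.sleep, co.yield, go and await are used as described in this section.

```nature
import co

// worker is a hypothetical task that cooperates with the scheduler
fn worker(int id):int {
    co.yield()    // give other coroutines a chance to run first
    co.sleep(100) // sleep for 100 ms without blocking the system thread
    return id * 2
}

fn main() {
    var a = go worker(1)
    var b = go worker(2)
    // both coroutines are already running; await blocks until each one finishes
    println(a.await() + b.await())
}
```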
mutex
mutex (mutual exclusion lock) is a concurrency control mechanism used to protect shared resources, ensuring that only one coroutine can access protected resources at the same time.
import co.mutex as m
// create mutex
var mu = m.mutex_t{}
// lock
mu.lock()
// critical section code
// ...
// unlock
mu.unlock()
Error Handling
Errors in coroutines can also be caught using the catch syntax:
fn div(int a, int b):int! {
if b == 0 {
throw errorf("division by zero")
}
return a / b
}
fn main() {
var fut = go div(10, 0)
var result = fut.await() catch e {
println("error:", e.msg())
0 // return default value
}
}
If errors in coroutines are not caught, the program will terminate.
Channel
channel is a communication mechanism provided by nature for inter-coroutine communication, used to safely pass data between different coroutines.
Basic Usage
// create unbuffered channel
var ch = chan_new<int>() // create channel for passing int type data
var ch_str = chan_new<string>() // create channel for passing string type data
// create buffered channel
var ch_buf = chan_new<int>(5) // create channel with buffer size 5
// send data
ch.send(42) // send data to channel
ch_str.send('hello')
// receive data
var value = ch.recv() // receive data from channel
var msg = ch_str.recv()
channel status:
ch.close() // close channel
bool closed = ch.is_closed() // check whether the channel is closed
var ok = ch.is_successful() // on a closed channel, check whether the most recent read or write succeeded
Sending data on a closed channel produces an error, which can be caught with catch. Data still held in the buffer of a closed channel can continue to be received; once the buffer is drained, a further recv throws an error.
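A minimal sketch of the closed-channel behavior described above, using only the channel methods listed in this section. The catch block that ends with a default value follows the form used in the coroutine error handling example; treat the exact flow as an assumption rather than a verified run.

```nature
fn main() {
    var ch = chan_new<int>(1) // buffered channel with room for one value
    ch.send(1)
    ch.close()

    var v = ch.recv() // the buffered value can still be received after close
    println('recv v ->', v)

    ch.send(2) catch e { // sending on a closed channel produces an error
        println('send failed:', e.msg())
    }

    var v2 = ch.recv() catch e { // the buffer is drained, so recv now throws
        println('recv failed:', e.msg())
        0 // default value
    }
}
```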
The channel recv operation can also be driven by a for iterator:
fn handle(chan<int> ch) {
for v in ch { // compiler automatically calls recv and yield until data is received
println('recv v ->', v)
}
}
With an unbuffered channel, an operation completes only once the peer coroutine handles it. When no peer is ready, the current coroutine yields and waits until the data has been sent to or received by the peer.
With a buffered channel, a send yields and waits only when the channel is full.
select Statement
The select statement monitors multiple channel operations simultaneously. Its syntax is similar to match, but it is only used for channel operations.
select {
ch1.on_recv() -> msg {
// handle data received from ch1
}
ch2.on_send(value) -> {
// handle after ch2 send success
}
_ -> {
// default branch, executed when all channels are not operable
}
}
When multiple cases are ready simultaneously, select randomly chooses one branch to execute. The default branch is optional; when there is no default branch and no case is ready, the current coroutine yields and waits for a case to become ready.
select automatically catches the closed error. Use ch.is_successful() to check whether the channel operation that woke the select succeeded.
Usage Examples
Simple producer-consumer pattern
var ch = chan_new<int>() // channel shared by producer and consumer
// producer
go (fn(chan<int> ch):void! {
ch.send(42)
})(ch)
// consumer
var value = ch.recv()
Using buffered channel to implement rate limiter
var limiter = chan_new<u8>(10) // allow maximum 10 concurrent tasks
for u8 i = 0; i < 100; i+=1 {
limiter.send(i) // acquire token
go (fn():void! {
// process task
limiter.recv() // release token
})()
}
Using select to implement timeout control
var ch = chan_new<string>()
var timeout = chan_new<bool>()
select {
ch.on_recv() -> msg {
println("received:", msg)
}
timeout.on_recv() -> {
println("operation timeout")
}
}
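The snippet above does not show what sends on timeout. One possible way to drive it, sketched under the assumption that a separate timer coroutine is acceptable, is to sleep with co.sleep and then send on the timeout channel. The 1000 ms delay is an arbitrary illustrative value.

```nature
import co

fn main() {
    var ch = chan_new<string>()
    var timeout = chan_new<bool>()

    // hypothetical timer coroutine: signals on the timeout channel after 1000 ms
    go (fn(chan<bool> t):void! {
        co.sleep(1000)
        t.send(true)
    })(timeout)

    // nothing sends on ch in this sketch, so the timeout branch is the one expected to become ready
    select {
        ch.on_recv() -> msg {
            println("received:", msg)
        }
        timeout.on_recv() -> {
            println("operation timeout")
        }
    }
}
```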
Built-in Macros
Nature uses the @ symbol for macro calls. The current version does not support custom macros, but it provides some necessary built-in macros:
var size = @sizeof(i8) // sizeof reads type stack memory usage
type double = f64
var hash = @reflect_hash(double) // read type hash value.
@async(delay_sum(1, 2), 0) // create coroutine
// use ula to avoid heap allocation for package struct
@ula(package).set_age(25)
var a = @default(T) // initialize default value, can be used for default value assignment in generics
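As a minimal sketch of @default inside a generic function, the hypothetical helper below returns either the supplied value or the default value of T. The name value_or_default and its parameters are assumptions made for illustration.

```nature
// value_or_default is a hypothetical helper
fn value_or_default<T>(T value, bool use_value):T {
    if use_value {
        return value
    }
    return @default(T) // default value of whatever type T is instantiated with
}

fn main() {
    var a = value_or_default<int>(42, true)
    var b = value_or_default<int>(42, false) // b holds the default value of int
    println(a, b)
}
```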
Function Tags
Function tags are a special function declaration syntax used to add metadata to functions or modify function behavior. Tags start with the # symbol and must be placed before the function declaration.
#linkid
The #linkid tag customizes a function's linker symbol name:
#linkid print_message
fn log(string message):void {
// function implementation
}
If log is defined in the main.n file, its symbol in the executable is main.log by default. If linkid is defined, the symbol in the executable is print_message.
linkid is most commonly used to declare functions from C libraries, much as a C header file would, for example:
#linkid sleep
fn sleep(int second)
When a function has no body, nature treats it as a template function that can be called directly in nature code. The linker locates the corresponding sleep symbol so that the call is resolved correctly.
#local
The #local tag marks function visibility, indicating that the function is only visible within the current module:
#local
fn internal_helper():void {
// function implementation
}
The compiler does not actually enforce any restriction for #local; it is a convention only.
Modules
Modules are the basic unit for organizing code in nature. Each .n file is an independent module.
main Module
Every nature program must contain a main function as the program entry point:
// main.n
import fmt
fn main() {
fmt.printf("Hello, World!")
}
The entry file named in the build command is treated as the main module, for example nature build main.n.
import
Use import keyword to import other modules or standard library:
// basic import, the file name user is used as the module ident by default
import "user.n"
// custom module name
import "user.n" as u
// import standard library
import fmt
fn main() {
var new_user = user.create_user("alice")
var another = u.create_user("bob")
fmt.printf('name is %s', another.name)
}
File-based module imports have strict path restrictions: only relative paths are supported, and ./ or ../ cannot be used. A file import can therefore only reach modules in the current directory or its subdirectories, not modules in parent directories.
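The import example above assumes a user.n file next to main.n. A minimal sketch of what it could contain follows; the user_t type and its name field are assumptions inferred from the calls user.create_user("alice") and another.name.

```nature
// user.n (hypothetical contents, placed in the same directory as main.n)
type user_t = struct{
    string name
}

// create_user builds a user_t from the given name
fn create_user(string n):user_t {
    return user_t{name = n}
}
```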
Package Management
The nature installation package includes the npkg program as its package manager; npkg works together with package.toml.
package.toml
Creating package.toml in the project root directory automatically enables package management. This file defines project information and dependencies:
# basic information
name = "myproject" # project name
version = "1.0.0" # version number
authors = ["Alice <a@example.com>"]
description = "project description"
license = "MIT"
type = "bin" # bin or lib
entry = "main" # library entry file (used when type = "lib")
# dependency packages, can be specified via git or local path
[dependencies]
rand = { type = "git", version = "v1.0.1", url = "jihulab.com/nature-lang/rand" }
local_pkg = { type = "local", version = "v1.0.0", path = "./local" }
Dependency Management
Run the npkg sync command in the directory containing package.toml to synchronize the project's dependencies. Packages are synced to the $HOME/.nature/package directory.
$HOME/.nature/package
├── caches
└── sources
    ├── jihulab.com.nature-lang.os@v1.0.1
    │   ├── main.n
    │   └── package.toml
    └── local@v1.0.0
        ├── main.linux_amd64.n
        ├── main.linux.n
        ├── main.n
        └── package.toml
Import Syntax
Nature uses file name as module ident:
import rand // import package main module (equivalent to import rand.main)
import rand.utils.seed // import specified module, i.e., rand/utils/seed.n file
import rand.utils.seed as s // custom module name
import searches for modules in the following order:
- The name field of the current project's package.toml, i.e., references to other modules of the current project
- Project dependencies (third-party packages defined in dependencies)
- Standard library
Cross-platform Support
Target platforms can be distinguished through file names. For example, with import syscall, modules are searched and imported in the following order:
syscall.{os}_{arch}.n
syscall.{os}.n
syscall.n
Currently supported platforms:
- os: linux, darwin
- arch: amd64, arm64, riscv64
Conflict Resolution
When imported package names conflict, use different key names in dependencies:
[dependencies]
rand_v1 = { type = "git", version = "v1.0", url = "jihulab.com/nature-lang/rand" }
rand_v2 = { type = "git", version = "v2.0", url = "jihulab.com/nature-lang/rand" }
Then import using different names:
import rand_v1
import rand_v2
Because modules are file-based, the editor does not detect circular imports. In practice, keep a clear code hierarchy and avoid circular imports.
Interacting with C
In addition to nature's built-in libc and libuv libraries, other static libraries can be referenced in package.toml; the compiler automatically links the static library that matches the target architecture. In nature code, declare the corresponding function templates with #linkid tags and call them; the linker resolves the symbols automatically.
This is not limited to C: nature can interact with any language that can produce static libraries. However, nature compiles statically against musl libc, so static libraries must also be built fully statically against musl libc. Using musl-gcc to compile them is recommended.
Since nature can customize linker and link parameters, static libraries can also be referenced through link parameters, such as:
nature build --ld '/usr/bin/ld' --ldflags '-nostdlib -static -lm -luv' main.n
Nature integrates musl libc and the macOS C library by default; their functions can be called directly through import libc:
import libc
fn main() {
i32 r = libc.rand()
}
Static Libraries and Template Function Declaration
Define the static libraries to link in the [links] section of package.toml:
[links]
libz = {
linux_amd64 = 'libs/libz_linux_amd64.a',
darwin_amd64 = 'libs/libz_darwin_amd64.a',
linux_arm64 = 'libs/libz_linux_arm64.a',
darwin_arm64 = 'libs/libz_darwin_arm64.a'
}
Use #linkid tags and function templates to declare the id and parameters of the C functions to be called:
#linkid gzopen
fn gzopen(anyptr fname, anyptr mode):anyptr
#linkid sleep
fn sleep(int second)
Call example:
// zlib.n
#linkid gzopen
fn gzopen(anyptr fname, anyptr mode):anyptr
// main.n
import zlib
import libc
fn main() {
var output = "output.gz"
var gzfile = zlib.gzopen(output.to_cstr(), "wb".to_cstr())
if gzfile == null {
throw errorf("failed to open gzip file")
}
// ...
}
Type Mapping
Type mapping relationship between nature and C language:
| nature type | C type | Description |
|---|---|---|
| anyptr | uintptr_t | Universal pointer type |
| rawptr<T> | T* | Typed pointer |
| i8/u8 | int8_t/uint8_t | 8-bit integer |
| i16/u16 | int16_t/uint16_t | 16-bit integer |
| i32/u32 | int32_t/uint32_t | 32-bit integer |
| i64/u64 | int64_t/uint64_t | 64-bit integer |
| i32 | int | |
| int | size_t | Platform-dependent integer, equivalent to int64_t on a 64-bit system |
| f32 | float | 32-bit floating point |
| f64 | double | 64-bit floating point |
| [T;n] | T[n] | Fixed-length array, n is a compile-time constant |
| struct | struct | nature struct uses the same alignment and ABI handling as C struct |
Get C language strings and pointers:
import libc
var str = "hello"
libc.cstr ptr = str.to_cstr() // get string address
string str2 = ptr.to_string() // cstr convert to nature string
// get rawptr type
rawptr<tm_t> time_ptr = &time_info
// get anyptr type
// any nature type (except floating point) can be converted to anyptr type
anyptr c_ptr = time_info as anyptr
Notes
- C is memory unsafe, so pay special attention to memory-related issues. Using rawptr and anyptr can easily bypass nature's safety checks.
- Nature is based on cooperative scheduling. Calling blocking C functions such as sleep, read or write blocks the nature scheduler, which prevents other coroutines from running and prevents GC from being performed.
Formatting
A fmt tool for nature has not been developed yet, so a few simple writing conventions are needed.
var bar = '' // statements do not need a trailing ;
var global_v = 12 // except for constants, all idents (including file names) should use lowercase with underscores
const GLOBAL_V = 12 // constants should use uppercase with underscores
if true {
    var foo = 1 // use 4 spaces for indentation
}
call_test(
    1,
    2, // multi-line arguments need a trailing , on the last line
)
var v = [
    1,
    2,
    3, // same as above
]
var m = {
    "a": 1,
    "b": 2, // same as above
}
type person_t:io.reader,io.writer = struct{ // no space needed between struct and {
    var f = fn() {
    }
    int a = 1
    int b
    bool c
}
var s = person_t{ // no space needed between struct ident and {
    name = "john",
    age = 18, // same as above
}
// 1. The '{' of a function definition must be on the same line as the declaration
// 2. A space is needed between the return type and '{'
// 3. A space is needed between parameters
// 4. No space is needed before the return type
fn test(int arg1, int arg2):int {
}
// for loop format
for int i = 0; i < 12; i += 1 {
}
// match format
match a {
    v -> {
    }
    _ -> {
    }
}
Keywords
Type keywords:
- void, any, null, bool, ptr, rawptr, anyptr
- int, i64, i32, i16, i8
- uint, u64, u32, u16, u8
- float, f64, f32
- struct, interface
- vec, map, set, tup, chan
Declaration keywords:
- var - variable declaration
- const - constant definition
- type - type definition
- fn - function definition
- import - import module
- new - create instance
Control flow keywords:
- if, else, else if
- for, in, break, continue
- return
- match, select
- try, catch, throw
Other keywords:
- go - concurrency primitive
- as - type conversion
- is - type judgment
- true, false - bool values
- null - null value
Reserved keywords:
impl, let, pub, package, static, macro, alias