Syntax Document

Source Code Structure

Nature source code uses .n as the file extension, for example main.n. Here's a simple program example:

import fmt // import module declaration, fmt module is a standard library string formatting module
 
fn main() { // fn is used for function declaration, main function serves as the program entry point
	fmt.printf('hello %s', 'world') // call printf function in fmt module for formatted output
}

This program prints hello world to the console.

Only fn, import, var, const, and type statements can be declared at the top level of a file. They have global scope and can be referenced by other modules. Other statements such as if, for, and return can only appear inside a function.

import fmt
 
var global_v = 1
 
const PI = 3.14
 
type global_t = int
 
fn main() {
	var localv = 2
	type local_t = int
	
	if true {
	}
 
	for true {
	}
}
 
if true { // x if statements cannot be declared in global scope
}

Variables

Automatic type inference:

var foo = 1 // v declare variable foo and assign value, foo type is automatically inferred as int  
  
var foo = 1 // x duplicate variable declaration not allowed in the same scope  
  
if (true) {  
    var foo = 2 // v duplicate declaration allowed in different scopes  
}

Direct type declaration without type inference:

int foo = 1 // v
u8 foo2 = 1 // v literal automatic type inference
 
float bar = 2.2 // v  
 
string car = 'hello world' // v strings use single quotes  
string car2 = "hello world" // v double quotes supported
string car3 = `hello
world` // v backticks supported
  
foo = 2 // v variables allow reassignment  
 
foo = 'hello world' // x foo is already defined as int type, cannot assign string  
  
i8 f2 = 12 // v literals can automatically convert according to type  
i16 f3 // x variable declaration must include assignment

Compound type variable definition:

string bar = '' // v use this method to declare an empty string
 
[int] baz = [] // v declare empty vec
var baz2 = vec_new<string>('default value', 10) // method 2
 
{float} s = {} // v declare empty set
var s2 = set_new<float>() // v method 2
 
{string:float} m = {} // v declare empty map
var m2 = map_new<string,float>() // v method 2
 
(int,bool,bool) t = (1, true, false) // v define tuple, use as t[0], t[1], t[2]
 
var (t1, t2, t3) = (1, true, 'hello') // tuple destructuring declaration
 
var sum = fn(int a, int b):int { // closure declaration
	return a + b
}
 
 
var baz = [] // x, cannot infer vec element type
 
bar = null // x cannot assign null to any compound type or simple type

How to assign null?

// method 1 nullable
string? s = null
s = 'i am string'
 
// method 2
nullable<string> s2 = null 
s2 = 'hello world'
 
// method 3 union
type union_test = int|float|null
union_test u = null
 
// method 4 any
any s3 = null // any is a special union type that includes all types by default

Constants

const PI = 3.14 // v float type, constant definition cannot declare type, compiler will automatically infer
const STR = 'hello world' // v string type
const INDEX = 1 // v int type
const B = true // v bool type
 
const STR2 = STR + 'byebye' // v constant folding calculation
const AREA = PI * 2 * 2 // v calculation example, supports common operators
 
const RESULT = test() // x constant definition only supports literal integer/float/string/bool types

By convention, constant names use uppercase letters separated by underscores.

Control Structures

if syntax

if condition {
    // code executed when condition is true
} else {
    // code executed when condition is false
}

The condition type must be bool.

You can use else if to check multiple conditions, for example:

int foo = 23
 
if foo > 100 {
    print('foo > 100')
} else if foo > 20 {
    print('foo > 20')  // matches
} else {
    print('else handle')
}

for syntax

The for statement executes a block of code repeatedly. Nature's for statement has three main forms: the classic loop, the conditional loop, and the iteration loop.

Classic loop

Classic loop is used for executing a fixed number of iterations. Basic syntax:

var sum = 0
for int i = 1; i <= 100; i += 1 {
    sum += i
}
println('1 +..+100 = ', sum)

❗️Note: Nature does not have ++ syntax, please use i+=1 instead of i++. The expression after for does not need parentheses.

Conditional loop

The conditional loop repeats as long as a condition holds, similar to the while statement in C. Basic syntax:

var sum = 0
var i = 0
for i <= 100 {
    sum += i
    i += 1
}
println('1 +..+100 = ', sum)

In this example, the loop will continue until i is greater than 100. The final output is the same as the classic loop.

Iteration loop

The iteration loop traverses collection types; it supports vec, map, string, and chan. Basic syntax:

Traversing vec:

var list = [1, 1, 2, 3, 5, 8, 13, 21]
for v in list {
    println(v)
}

Traversing map:

var m = {1:10, 2:20, 3:30, 4:40}
for k in m {
    println(k)
}

In this example, the loop traverses each key in the map and outputs them.

Traversing both keys and values, for vec k is the index:

for k,v in m {
    println(k, v)
}

Traversing channel:

// example: var ch = chan_new<int>()
// yield until ch.recv() receives message to wake up current coroutine
for v in ch {
    println('recv v ->', v)  
}

Break and Continue

The break keyword exits the current loop. The continue keyword skips the rest of the current iteration and jumps straight to the loop's next condition check.
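
A minimal sketch of both keywords inside an iteration loop, using only syntax introduced earlier in this document; the variable names are illustrative:

```nature
fn main() {
    var list = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    var sum = 0

    for v in list {
        if v % 2 == 0 {
            continue // skip even numbers and move on to the next element
        }
        if v > 7 {
            break // leave the loop entirely at the first odd value above 7
        }
        sum += v
    }

    println(sum) // sum of 1, 3, 5, 7
}
```

Here continue discards the even elements, while break stops the traversal at 9, so the remaining elements are never visited.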

Functions

Functions are first-class citizens in nature and can be passed as values. Nature supports various function definition and usage methods.

Function Definition

Basic function definition syntax:

fn function_name(parameter_list):return_type {
    // function body
}

A simple addition function:

fn sum(int a, int b):int {
    return a + b
}

Anonymous Functions

Nature supports anonymous functions. An anonymous function is a closure, so it can capture variables from the enclosing scope.

fn main() {
	int c = 3
	
	var f = fn(int a, int b):int {
	    return a + b + c
	}
	
	var result = f(1, 2) // call anonymous function
}

Variadic Parameters

Functions support a variable number of parameters. Prefix a vec type with ... (the fixed form is ...[T]) to declare a variadic parameter:

fn sum(...[int] numbers):int {
    var result = 0
    for v in numbers {
        result += v
    }
    return result
}
 
println(sum(1, 2, 3, 4, 5))

Parameter Destructuring

A vec can be destructured into a variadic parameter with ... when calling a function:

fn printf(string fmt, ...[any] args) {
    var str = sprintf(fmt, ...args)
    print(str)
}

Multiple Return Values

Nature does not support multiple return values, but you can simulate them with tuples and tuple destructuring. The tuple data structure is introduced in more detail later:

fn divide(int a, int b):(int, int) {
    return (a / b, a % b)
}
 
// Use tuple destructuring to receive return values
var (quotient, remainder) = divide(10, 3)

Function Types

Use fn to declare function types, function types can usually omit parameter names.

type calc_fn = fn(int,int):int  // declare fn type
  
fn apply(calc_fn f, int x, int y):int {  
    return f(x, y)  
}  
  
fn main() {  
    var result = apply(fn(int a, int b):int {  
        return a + b  
    }, 3, 4)  
  
    println(result)  
}

Additional Notes

Functions in nature must explicitly declare parameter types and return types. If a function does not need to return a value, you can omit the return type declaration, or use void type to declare that the function has no return value:

fn print() { // v
}
 
fn print():void { // v
 
}

Nature always uses type prefixes, including for return types. If the return type placed after the parameter list looks like a type postfix to you, compare the two forms:

// This is return type postfix
func sum(a int, b int) (c int) {  
    c = a + b  
    return c  
}
 
// This is return type prefix
fn sum(int a, int b):(int c) {
	c = a + b
	return c
}

In other words, the position of the return value in a function definition has nothing to do with whether types are written as a prefix or a postfix!

Comments

Single line comments:

// This is a single line comment
var str = `hello world` // This is also a single line comment

Multi-line comments:

/*
This is a multi-line comment
can span multiple lines
*/
var str = `hello world`

Type System

Numeric Types

Type   Bytes  Description
int    -      Signed integer, consistent with platform CPU width (8 bytes on 64-bit platform)
i8     1      8-bit signed integer
i16    2      16-bit signed integer
i32    4      32-bit signed integer
i64    8      64-bit signed integer
uint   -      Unsigned integer, consistent with platform CPU width
u8     1      8-bit unsigned integer
u16    2      16-bit unsigned integer
u32    4      32-bit unsigned integer
u64    8      64-bit unsigned integer
float  -      Floating point number, consistent with platform CPU width (equivalent to f64 on 64-bit platform)
f32    4      Single precision floating point
f64    8      Double precision floating point
bool   1      Boolean type, values are true/false

Compound Types

Type Name  Storage Location  Syntax            Example                           Description
string     heap              string            string str = 'hello'              String type, can use single quotes, double quotes, backticks to declare string type
vec        heap              [T]               [int] list = [1, 2, 3]            Dynamic array
map        heap              {T:T}             {int:string} m = {1:'a'}          Map can be iterated with for
set        heap              {T}               {int} s = {1, 2, 3}               Set
tup        heap              (T)               (int, bool) t = (1, true)         Tuple
function   heap              fn(T):T           fn(int,int):int f = fn(a,b){...}  Function type
channel    heap              chan<T>           var c = chan_new<T>()             Communication channel
struct     stack             struct {T field}  struct{}                          Struct
array      stack             [T;n]             [int;3] a = [1,2,3]               Fixed length array

Special Types

Type Name  Description                                                                                                Example
self       Reference to self in struct methods, can only be used in fn extend
ptr        Safe pointer, cannot be null                                                                               ptr<person> p = new person()
anyptr     Unsafe int pointer, equivalent to uintptr, commonly used for C language interaction and unsafe conversion  Any type except float can be converted to anyptr type via as
rawptr     Unsafe nullable pointer, use & load addr syntax to get rawptr, use * indirect addr to dereference          rawptr<int> len_ptr = &len
union      Union type, only supports declaration via type definition                                                  type number = float|int
any        Special union type, union of all types

Type Operations

Type Definition

type my_int = int
 
type person_t = struct{
	int age
	float height
}
 
type node_t = struct{
	int id
	[node_t] children // v use dynamic array or pointer nested reference type
	rawptr<node_t> next // v reference through raw pointer
	ptr<node_t>? next // v pointer+nullable reference
	node_t next // x circular reference
	[node_t;2] foo // x circular reference
	node2_t bar // x circular reference
}
 
type nullable<T> = T|null // custom union type + type parameter generics
 
type throwable = interface{} // interface definition, detailed introduction later

Type Conversion

Use as keyword for explicit type conversion:

  • Supports conversion between integer and float types
  • Supports conversion between string and [u8] types
  • Supports conversion between anyptr and any type except float
  • Custom types and primitive types can be converted to each other when their underlying data structure is the same
  • as is also used for type assertion on union and any types

int i = 42.5 as int  // float to integer
 
[u8] bytes = "hello" as [u8]  // string to vec
 
string str = bytes as string // vec<u8> to string
 
anyptr ptr = &i as anyptr  // rawptr<int> to anyptr
 
type myint = int
myint foo = 12
 
int bar = foo as int // myint -> int
myint baz = bar as myint // int -> myint

Literals support implicit type conversion in most scenarios. If the specific type cannot be identified, it defaults to int or float type.

f32 a = 1.1 // automatic inference conversion
f64 b = 2.2 // automatic inference conversion
any c = 1.1 // cannot infer, defaults to float type
 
i8 d = 1
i16 e = 1
i32 f = 1
any g = 1 // cannot infer, defaults to int type

Type Extension

Nature supports method extension for built-in types and custom types.

Built-in Types

The built-in types that support extension are: bool, string, int, i8, i16, i32, i64, uint, u8, u16, u32, u64, float, f32, f64, chan, vec, map, set, and tup.

Built-in Type extend

fn string.find_char(u8 char, int after):int {
    int len = self.len()
    for int k = after; k < len; k += 1 {
        if self[k] == char {
            return k
        }
    }
    return -1
}

Custom Type extend

type square = struct {
    int length
    int width
}
 
fn square.area():int {
	// Use self for data reference in type extension
    return self.length * self.width
}
 
type box<T> = struct{
	T length
	T width
}
 
// Generic parameters need to be consistent with type declaration
fn box<T>.area():T{
	return self.length * self.width
}

What type is self? self always refers to heap data through a safe pointer. For data structures such as string, vec, and map, which are stored on the heap by default, the type of self is the same as the original type. Heap allocation means the data is managed by the GC, so it is safe.

For types such as struct, int, and float, which are stored on the stack by default, self is always a safe pointer to the value, that is, ptr<T>. So how are type extension methods called?

// For string this is simple, string is stored in the heap, so it can be called directly
var str = 'hello world'
var c = str.find_char('w'[0], 0) // arguments: the u8 char to find and the start index
 
// But for scalar types and structs, you need to use the new keyword to allocate data in the heap, like
var s = new square(length=5, width=10) // s type is ptr<square>
var a = s.area() // self is ptr<square>

But what if you have a stack-allocated data type and want to call a type extension function?

var s = square{}
 
// If the method won't be referenced across threads, we can bypass the compiler's check with an unsafe pointer via as conversion
(&s as anyptr as ptr<square>).area()
 
// The built-in macro @ula (unsafe load addr) performs the same type conversion
@ula(s).area()
 
// What if we want memory safety without calling new for heap allocation?
// The @sla (safe load addr) macro triggers a simple escape analysis and allocates s on the heap by default.
@sla(s).area()

But writing @sla everywhere would be tedious, so in the end we can simply do this:

var s = square{}
 
s.area() // v eq @sla(s).area()

Yes: when a method is called on stack-allocated data, the @sla macro is added automatically. This is a compromise rather than a smart solution; a more intelligent and safer automatic @sla based on escape analysis is planned for the future.

This section is somewhat lengthy, but the key point to remember is that the self keyword in methods is always a safe pointer to the heap (except with @ula, of course).

Arithmetic Operators

Priority  Keyword  Usage Example  Description
1         ()       (1 + 1)        Expression grouping
2         -        -12            Negative number
2         !        !true          Logical NOT
2         ~        ~12            Bitwise NOT
2         &        &q             Load data address
2         *        *p             *ptr_var dereference
3         /        1 / 2          Division
3         *        1 * 2          Multiplication
3         %        5 % 2          Remainder
4         +        1 + 1          Addition
4         -        1 - 1          Subtraction
5         <<       100 << 2       Bitwise left shift
5         >>       100 >> 2       Bitwise right shift
6         >        1 > 2          Greater than
6         >=       1 >= 2         Greater than or equal
6         <        1 < 2          Less than
6         <=       1 <= 2         Less than or equal
7         ==       1 == 2         Equal
7         !=       1 != 2         Not equal
8         &        1 & 2          Bitwise AND
9         ^        1 ^ 2          Bitwise XOR
10        |        1 | 2          Bitwise OR
11        &&       true && true   Logical AND
12        ||       true || true   Logical OR
13        =        a = 1          Assignment operator
13        %=       a %= 1         Equivalent to a = a % 1
13        *=       a *= 1         a = a * 1
13        /=       a /= 1         a = a / 1
13        +=       a += 1         a = a + 1
13        -=       a -= 1         a = a - 1
13        |=       a |= 1         a = a | 1
13        &=       a &= 1         a = a & 1
13        ^=       a ^= 1         a = a ^ 1
13        <<=      a <<= 1        a = a << 1
13        >>=      a >>= 1        a = a >> 1

& takes the address of a value and yields rawptr<T>. rawptr is an unsafe raw pointer that may be null; it may also dangle or point to invalid memory. Developers must ensure pointer safety themselves, so raw pointers should be used only when necessary, such as when interacting with memory-unsafe languages like C.

In Nature, the recommended way to obtain a safe pointer is to allocate on the GC heap with the new keyword.

Struct

Structs must be declared through the type keyword, which means defining a new type. Anonymous structs are not supported.

Basic Syntax

// Struct declaration
type person_t = struct {
    string name
    int age
    bool active
}
 
// Struct initialization, struct is initialized on stack by default.
var p = person_t{
    name = 'Alice',
    age = 25
}
 
//  Using new can initialize on heap and get struct pointer
ptr<person_t> p2 = new person_t(name = 'Bob', age = 30)
p2.name = 'Tom'  // automatic dereference

Default Values

Struct default values support only simple constants; closures, function calls, and other complex expressions are not supported.

type sub_t = struct{
	var name = 'alice' // supports automatic type inference when default value exists
}
 
type person_t = struct{
    string name = 'unnamed'
    bool active = true
 
    sub_t sub1         // Note! no default value here
    var sub2 = sub_t{} // includes default value
}
 
// Initialize with default values
// p.name is 'unnamed', p.active is true, 
// p.sub1.name is '', p.sub2.name is 'alice'
var p = person_t{}

Nesting and Composition

type outer = struct {
    int x
    rect r = rect{} 
}
 
type animal = struct {
    string name
}
 
type dog = struct {
    animal base
    string breed
}
 
 
type dog = struct {
    struct {
        string name
    } base // x avoid anonymous structs: they can be declared but not initialized, and are only useful for memory structure conversion
    string breed
}

Data Structures

string

string is a built-in data type in Nature, and its data is stored on the heap. A string and a dynamic array [u8] share the same storage structure on the heap, so the two can be freely converted to each other.

string s1 = 'hello world' // use single quotes to declare strings in non-special cases
string s2 = "hello world"
string s3 = `hello 
world`
 
// use + sign for string concatenation
var s4 = s1 + ' one piece'
 
// string comparison
bool b1 = 'hello' == 'hello'  // true
bool b2 = 'a' < 'b'          // true
 
// get string length
int len = s1.len()
 
// access and modify through index
s1[0] = 72  // modify first character to 'H' (ASCII code 72)
 
// string and byte array conversion
[u8] bytes = s1 as [u8]
string s5 = bytes as string
 
// string traversal, traverse by char
for v in s1 {
    // v type is u8, represents ASCII code value
    println(v)
}

Since single quotes are used to declare strings, they cannot be used for character literals. To obtain a single character value (u8), you can use the following methods:

// method 1 through constant definition
const A = 97 // ASCII code of 'a'
const B = 98 // ASCII code of 'b'
 
u8 c = A
 
// method 2 through array reference
u8 c2 = '@'[0]

More string operations are available in the standard library's strings module (import strings):

import strings
 
fn main() {
	var s = 'hello world world'  
	var index = s.find('wo')
	
	var list = s.split(' ')
	
	var list2 = ['nice', 'to', 'meet', 'you']  
	var s2 = strings.join(list2, '-')
	
	var b = s.starts_with('hello')
	
	// more methods can refer to standard library documentation
}

vec

vec is Nature's built-in dynamic array type. It grows dynamically and is stored on the heap. It is not safe for concurrent use; callers must add locks themselves.

// Declaration and initialization
[int] list = [1, 2, 3]        // Syntax 1: recommended using [T] declaration
vec<int> list2 = [1, 2, 3]    // Syntax 2: using vec<T> declaration
 
// Create a vec of length 10; the first parameter is the default value of each element
var list3 = vec_new<int>(0, 10)
 
// Initialize a string vec of length 10; each element defaults to "hello"
var list7 = vec_new<string>("hello", 10) 
 
// Similar to vec_new, automatically infer based on default parameter type
var list5 = [0;10]
 
// Initialize with specified cap, len = 0
var list6 = vec_cap<int>(10)
 
// Automatically infer empty vec type
[int] list4 = []
 
list[0] = 10                  // modify element
var first = list[0]           // get element
list.push(4)                  // add element
var len = list.len()          // get vec length
var cap = list.cap()          // get vec capacity
 
// Slicing and merging
var slice0 = list[1..3] // get slice from index 1 to 3(exclusive)
var slice2 = list[..3] // get slice from 0 to 3(exclusive)
var slice3 = list[1..] // get slice from 1 to len(exclusive)
 
// Merge two vecs into a new vec
var new_list = list.concat([4, 5, 6]) 
 
 // Append to list
list.append([4, 5, 6])
 
 
// Traversal
for value in list {
}
 
for index,value in list {
}

map

map is Nature's built-in mapping table type, used to store key-value pairs. It is not safe for concurrent use; callers must add locks themselves.

// Declaration and initialization
var m3 = {'a': 1, 'b': 2}              // type inference
{string:int} m1 = {'a': 1, 'b': 2}     // using {T:T} declaration
map<string,int> m2 = {'a': 1, 'b': 2}  // using map<T,T> declaration
{string:int} empty = {}                // empty map declaration method 1
var empty2 = map_new<string,int>()     // empty map declaration method 2
 
// Insert/update element
m1['c'] = 3                   
 
// Get an element's value. If the key does not exist, a panic is raised, which can be caught with catch
var v = m1['a']                
var v2 = m1['a'] catch e {
	// handle notfound
}
 
m1.del('b')                    // delete element
 
var exists = m1.contains('a')  // check if key exists
var size = m1.len()            // get element count
 
// Traversal
for k in m1 {                  // traverse keys only
    println(k)
}
 
for k, v in m1 {              // traverse both keys and values
    println(k, v)
}

set

set is Nature's built-in collection type, used to store unique elements. A set is stored on the heap, and duplicate elements are not allowed. It is not safe for concurrent use; callers must add locks themselves.

// Declaration and initialization
{int} s1 = {-32, -64, 13}     // using {T} declaration
set<int> s3 = {1, 2, 3}       // using set<T> declaration
 
// Empty set declaration
{int} empty = {}              
var empty2 = set_new<int>()
 
// Basic operations
s1.add(111)                   // add element
s1.del(13)                    // delete element
var exists = s1.contains(13)  // check if element exists
 
var found = {1, 2, 3}.contains(2)  // supports method chaining
 
// Traversal
for v in s1 {
    println(v)
}

tup

tuple is a built-in type in Nature that aggregates a group of values of different types in one structure. A tuple is stored on the heap; only a pointer to it is kept on the stack.

var tup = (1, 1.1, true) // v declare and assign, multiple elements separated by comma
 
var tup = (1) // x tuple must contain at least two elements, otherwise cannot distinguish between expression and tup
 
var foo = tup[0] // v index 0 represents the first element in tuple, and so on
 
// x element access in tuple does not allow expressions, only int literals are allowed
var foo = tup[1 + 1]
 
tup[0] = 2 // v modify value in tuple

tup supports destructuring assignment:

 
// v values can be used with var for automatic type inference to continuously create multiple variables
var (foo, bar, car) = (1, 2, true)
 
// x prohibit type declaration, only allow automatic type inference through var
(custom_type, int, bool) (foo, bar, car) = (1, 2, true)
 
// v nested form of creating multiple variables
var (foo, (bar, car)) = (1, (2, true)) 
 
// x when creating variables, left side does not allow expressions
var list = [1, 2, 3]
var (list[0], list[1]) = (2, 4)
 
// v modify variable foo,bar values, can perform quick variable value swapping
(foo, bar) = (bar, foo)
 
// v nested form of modifying variable values
(foo, (bar, car)) = (2, (4, false)) 
 
// x left value and right value type mismatch
(foo, bar, car) = (2, (4, false)) 
 
 // v tuple assignment operation allows left value expressions like ident/ident[T]/ident.T
(list[0], list[2]) = (1, 2)
 
// x 1+1 belongs to right value expression
(1 + 1, 2 + 2) = (1, 2)

arr

arr is a fixed-length array, the same data structure as an array in C.

[u8;3] array = [1, 2, 3] // declare an array with length 3, element type u8
array[0] = 12
array[1] = 24
var a = array[2] // valid indexes for a length-3 array are 0 to 2

arr is allocated by default on stack or in data structures, vec is allocated on heap. This can be reflected in size calculation:

type t1 = struct {
    [u8;12] array
}
 
var size = @sizeof(t1) // 12 * 1 = 12byte
 
type t2 = struct {
    [u8] list
}
 
var size = @sizeof(t2) // vec is pointer size = 8byte

Unlike in C, an array passed as a function parameter in Nature is passed by value.
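
A small sketch of what value passing implies, assuming the rule above holds as stated; set_first is a hypothetical helper written for this illustration, not a library function:

```nature
// Hypothetical helper: receives its own copy of the array
fn set_first([u8;3] a) {
    a[0] = 100 // changes only the local copy
}

fn main() {
    [u8;3] array = [1, 2, 3]
    set_first(array)

    // Under value passing, the caller's array is not affected by the call,
    // so array[0] is expected to keep its original value here.
    println(array[0])
}
```

If the callee needs to modify the caller's data, a vec (which lives on the heap) is the more natural choice.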

Error Handling

Basic Syntax

throw syntax usage example:

// x wrong usage: throw cannot be used in a fn without an errable (!) declaration, e.g. fn rem(...):void!
fn rem(int dividend, int divisor):int {
	throw errorf('....')
}
 
// v correct usage: the function declares that it is errable with the `!` symbol, indicating that it may throw an error
fn rem(int dividend, int divisor):int! {
	if divisor == 0 {
		throw errorf('divisor cannot be zero')
	}
	return dividend % divisor
}

You can use catch syntax to catch errors:

var result = rem(10, 0) catch e { // e implements throwable interface, can call related methods
    println(e.msg())
    1 // the last expression in the catch body is its value, so 1 is assigned to result
}

You can also use try catch to catch multi-line expression errors:

try {
    var a = 1
    var b = a + 1
    rem(10, 0)
    var c = a + b
} catch e { // e implements throwable interface, can call related methods
    println('catch err:', e.msg())
}

If an error is not caught, it propagates up the function call stack until it reaches the coroutine scheduler, which exits the program.

fn bar():void! {
    throw errorf('error in bar')
}
 
fn foo():void! {
    bar()
}
 
fn main():void! {
    foo()
}

Compiling and running this program produces an error stack trace:

coroutine 'main' uncaught error: 'error in bar' at nature-test/main.n:2:22
stack backtrace:
0:	main.bar
		at nature-test/main.n:2:22
1:	main.foo
		at nature-test/main.n:6:11
2:	main.main
		at nature-test/main.n:10:11

The main function, as the program's unified entry point, is errable (!) by default, so no additional declaration is needed.

Design Philosophy

Nature's error handling follows a design philosophy similar to Rust's Result<T,E>, but its actual behavior and syntax are closer to the try + catch + throw pattern.

Comparing with Rust:

 
// int! is equivalent to Result<int,err> in rust
// int? is equivalent to Option<int> in rust
fn rem(int dividend, int divisor):int! {
	if divisor == 0 {
		// equivalent to return Err('xxx')
		throw errorf('divisor cannot be zero')
	}
	
	// equivalent to return Ok(xxx)
	return dividend % divisor
}
 
fn main() {
	// equivalent to int result = rem(10, 0)?
	int result = rem(10, 0)
	
	/**
	equivalent to
	let result = match rem(10, 0) {
		Err(e) => {
			debug('has err', e)
			0
		}
		Ok(r) => r
	}
	*/
	int result = rem(10, 0) catch e {
		println('has err', e.msg())
		0
	}
}

Since Nature automatically unwraps errors and propagates them upward, in most cases you do not need to handle errors at every level; catch them only where it is really needed.

throwable

throwable is a built-in interface. The expression after the throw keyword must implement this interface.

import fmt
 
type throwable = interface{  
    fn msg():string  
}  
  
type errort:throwable = struct{  
    string message  
    bool is_panic  
}  
  
fn errort.msg():string {  
    return self.message  
}  
  
fn errorf(string format, ...[any] args):ptr<errort> {  
    var msg = fmt.sprintf(format, ...args)  
    return new errort(message = msg)  
}

The built-in function errorf returns ptr<errort>, and errort implements the throwable interface, so the result of errorf can be used with the throw keyword. Building on the throwable interface, you can define error types of your own.
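
As a sketch of that flexibility, the following defines a custom error type. The names not_found_t and lookup are hypothetical, and the example assumes that a pointer to any type declaring throwable can be thrown in the same way as the ptr<errort> returned by errorf:

```nature
// Hypothetical custom error type that declares the throwable interface
type not_found_t:throwable = struct{
    string key
}

// msg() is the only method required by throwable
fn not_found_t.msg():string {
    return 'key not found: ' + self.key
}

// Hypothetical function that throws the custom error
fn lookup(string key):int! {
    if key == 'answer' {
        return 42
    }
    throw new not_found_t(key = key)
}

fn main() {
    var v = lookup('foo') catch e {
        println(e.msg())
        0 // fallback value assigned to v
    }
}
```

Callers handle the custom type through the same catch syntax, because e only needs to satisfy throwable.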

panic

panic is a special kind of error. A panic does not propagate along the function call chain; it crashes the program directly. The most common case is an index out of bounds:

var list = [1, 2, 3]
var a = list[4] 

Compiling and running this produces the error:

coroutine 'main' panic: 'index out of vec [4] with length 3' at nature-test/main.n:3:18

A panic can also be caught with catch or try catch, but since it does not propagate along the call chain, it must be caught immediately and cannot be caught in a caller further up the chain:

// panic catching method is same as error
var a = list[4] catch e {
    println(e.msg())
}
 
// use built-in panic function to manually throw panic
panic('failed')

Union Types

Union types allow a variable to hold one of multiple possible types, nature provides a flexible union type system.

Basic syntax:

// use | operator to declare union of multiple types
type nullable<T> = T|null  // nullable type
type number = int|float    // numeric type

❗ Union types can only be declared in global type definitions, anonymous declaration is not supported

nullable

nullable is built on union types. Because it is so commonly used, it has dedicated syntax: the ? symbol declares a nullable type concisely:

int? foo = null        // equivalent to nullable<int> foo = null
string? bar = "hello"  // equivalent to nullable<string> bar = "hello"

Type Assertion

Type assertion uses the same as syntax as type conversion:

int? foo = 42
int val = foo as int  // assert union type to specific type

Type Checking

is syntax is used to check the current stored type of union type:

int? foo = 42
bool is_int = foo is int    // true
bool is_null = foo is null  // false
 
// when is is used in a condition and the compiler can deduce foo's concrete type, an automatic type assertion is applied at compile time
 
if foo is int {
	// auto as: foo = foo as int
    println("foo is an integer", foo + 1)
}
 
// x cannot perform automatic type inference
if !(foo is null) {  
	var bar = foo + 1 // binary type inconsistency, left is 'union', right is 'i64'
}

Pattern Matching

match syntax has special optimization for union type, can quickly perform type matching and automatic assertion.

int? foo = null
int result = match foo {
    is int -> foo        // auto: foo = foo as int
    is null -> -1        // auto: foo = foo as null
}

any

any is a special union type that can hold a value of any type. A union type with a narrower range can be assigned to a union type with a wider range that contains it; any has the widest range of all:

any foo = 1
int? bar = null
foo = bar // v
bar = foo // x bar's range is smaller than foo
 
type nullable2 = int|bool|null|string  
nullable2 bar2 = bar // v, nullable2 contains larger type range

Interface

Declaration method:

type measurable = interface{  
    fn area():int  
    fn perimeter():int
}
 
// generic parameter declaration
type measurable<T> = interface{  
    fn area():T  
    fn perimeter():T  
}

Complete declaration example:

type measurable<T> = interface{  
    fn perimeter():T  
    fn area():T  
}  
 
// type implements interface
type rectangle: measurable<i64> = struct{  
    i64 width  
    i64 height  
}  
 
fn rectangle.area():i64 {  
    return self.width * self.height  
}
 
fn rectangle.perimeter():i64 {  
    return 2 * (self.width + self.height)  
}

An interface can be used as a function parameter type; any value whose type implements the interface can be passed as the argument:

fn print_shape(measurable<i64> s) {  
    println(s.area(), s.perimeter())  
}  
 
fn main() {
	// value passing
	var r = rectangle{width=3, height=4}  
	print_shape(r)  
 
	// pointer reference passing
	var r1 = new rectangle(width=15, height=18)  
	print_shape(r1)
}

If the argument is a ptr, the compiler automatically checks whether the type it points to implements measurable. A type can implement multiple interfaces, separated by commas:

type measurable<T> = interface{  
    fn perimeter():T  
    fn area():T  
}  
  
type updatable = interface{  
    fn update(i64)  
}  
  
type rectangle: measurable<i64>,updatable = struct{  
    i64 width  
    i64 height  
}  
fn rectangle.area():i64 {  
    return self.width * self.height  
}  
fn rectangle.perimeter():i64 {  
    return 2 * (self.width + self.height)  
}  
fn rectangle.update(i64 i) {  
   self.width = i  
   self.height = i  
}

Use is to check the concrete type behind an interface value:

fn use_com(combination c):int {  
    if c is square {  
        // auto as: square c = c as square, c is interface  
        c.unique()  
        return 1  
    }  
    if c is ptr<square> {  
        // auto as: ptr<square> c = c as ptr<square>  
        c.unique()  
        return 2  
    }  
    return 0  
}
 
// also supports match is
fn use_com(combination c):int {  
    return match c {  
        is square -> 10  
        is ptr<square> -> 20  
        _ -> 0  
    }  
}

Interface supports nullable:

fn use(testable? test) {  
    if (test is testable) { // test = test as testable  
        println('testable value is', test.nice())  
    } else {  
        println('test not testable')  
    }  
}

Interfaces can also be used as generic constraints, which are covered in the Generic Constraints section. Nature does not support duck typing: a type must explicitly declare the interfaces it implements.

Interface supports composition for quick declaration:

type measurable<T> = interface{  
    fn perimeter():T  
    fn area():T  
}  
  
type updatable = interface{  
    fn update(i64)  
    fn area():i64  
}  
  
type combination: measurable<i64>,updatable = interface{  
    fn to_str():string  
}
 
// square needs to implement to_str + area + update + perimeter methods.
type square:combination = i64

Because of composition, a type that implements combination also implements every interface that combination is composed of. The following example passes a square where a measurable<i64> is expected:

type measurable<T> = interface{  
    fn perimeter():T  
    fn area():T  
}  
type combination: measurable<i64> = interface{  
    fn to_str():string  
}  
  
type square:combination = i64  
  
fn square.area():i64 {  
    i64 length = *self as i64  
    return length * length  
}  
fn square.perimeter():i64 {  
    return (*self * 4) as i64  
}  
fn square.to_str():string {  
    return 'hello world'  
}
 
fn use_mea(measurable<i64> m):int {
    return m.perimeter()  
}  
  
fn main():void! {  
    var sp = new square(8)  
    println(use_mea(sp))  
}

Pattern Matching

Nature provides powerful pattern matching functionality through match expressions to implement complex conditional branch logic.

Basic Syntax

Basic syntax of match expression:

match subject {
    pattern1 -> expression1
    pattern2 -> expression2
    ...
    _ -> default_expression  // default branch
}

Value Matching

Literal values can be matched directly:

var a = 12
match a {
    1 -> println('one')
    12 -> println('twelve')  // matches successfully
    20 -> println('twenty')
    _ -> println('other')
}

String matching is also supported:

match 'hello world' {
    'hello' -> println('greeting')
    'hello world' -> println('full greeting')  // matches successfully
    _ -> println('other')
}

Expression Matching

The subject can be omitted. In that case each pattern is a boolean expression, and only the first branch whose expression evaluates to true is executed:

match {
    12 > 0 && 0 > 0 -> println('case 1')
    (13|(1|2)) == 15 -> println('case 2')  // matches successfully
    (1|2) > 3 -> println('case 3')
    _ -> println('default')
}
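
Because only the first true branch runs, a subject-less match can replace an if/else-if chain as long as the branches are ordered from most to least specific. A minimal sketch, assuming a subject-less match can be returned directly like the other match expressions in this document; the function name grade and the thresholds are hypothetical:

```nature
// hypothetical example: branches are tested top to bottom
fn grade(int score):string {
    return match {
        score >= 90 -> 'excellent'
        score >= 60 -> 'pass' // only reached when score < 90
        _ -> 'fail'
    }
}

fn main() {
    println(grade(75))
}
```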

Automatic Assertion

For union types, is can be used for type matching. When the match subject is a variable, a successful match automatically asserts the variable to the matched type inside that branch:

any value = 2.33
var result = match value {
    is int -> 0.0
    is float -> value  // auto as: var value = value as float
    _ -> 0.0
}

The type of result is inferred automatically from the return type of the first branch.

Code Blocks and Return Values

Match branches can use code blocks. The last expression in a block serves as the value of that block:

fn main() {  
    string result = match {  
        (12 > 13) -> {  
            var msg = 'case 1'  
            msg  
        }  
        (12 > 11) -> {  
            var msg = 'case 2'  
            msg  // matches successfully, return value msg will be assigned to result 
        }  
        _ -> 'default'  
    }  
    println(result)  
}

Generics

Type Parameters

Basic syntax uses <T> to declare type parameters, where T is the type parameter name. Multiple type parameters can be declared, separated by commas.

// single type parameter
type box<T> = struct {
    T value
}
 
// multiple type parameters
type pair<T, U> = struct {
    T first
    U second
}
 
type result<T> = T|error    // generic union type
 
type list<T> = [T]         // generic vec type

Type parameters support nesting:

type wrapper<T> = struct {
    box<T> inner    // nested use of generic type
}
 
// using nested generics
var w = wrapper<int>{
    inner = box<int>{value = 42}
}

Generic Functions

// generic function declaration
fn sum<T>(T a, T b):T {
    return a + b
}
 
// generic function call
var result = sum<int>(1, 2)      // explicitly specify type
var result2 = sum(1.1, 2.2)      // automatic type inference
 
 
// methods on a generic type repeat its type parameters
type box<T> = struct {
    T value
}
 
fn box<T>.get():T {
    return self.value
}
 
// type parameters on the receiver and generic parameters on the fn itself cannot be used together
 
fn box<T>.get<U>():U { // x methods do not support their own generic parameters yet
}

Generic Constraints

Nature's generic constraints are not yet complete. The compiler only validates that the type arguments passed to a generic satisfy the declared constraint; it does not yet validate that the generic function body uses its parameters only in ways the constraint allows. This will be addressed in a future update.

Nature's generic constraints support three types. They cannot be combined, so only one constraint type can be used per type parameter:

// union constraint
type test_union = int|bool|float  
  
fn test<T:test_union>(T param) {  
    println(param)  
}
 
// union constraint can be abbreviated
fn test<T:int|bool|float>(T param) {  
    println(param)  
}
 
// interface constraint
type test_interface = interface{
	fn bar()
}  
type test_interface2 = interface{
	fn bar()
}  
  
// the argument type must implement the methods declared in both interfaces
fn test<T:test_interface&test_interface2>(T param) {  
    println(param)  
}
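
The check performed at the call site can be illustrated with a union constraint. This is a sketch based only on the rules stated above; the names printable and show are hypothetical, and the rejected call is left commented out so the snippet stays compilable:

```nature
type printable = int|bool|float

// hypothetical generic function constrained to the printable union
fn show<T:printable>(T param) {
    println(param)
}

fn main() {
    show(12)          // v int is a member of the constraint
    show(true)        // v bool is a member of the constraint
    show<float>(2.5)  // v type argument given explicitly
    // show('hello')  // x string is not in the union, so the call should be rejected
}
```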

Usage Example

// define generic struct
type pair<T, U> = struct {
    T first
    U second
}
 
// define generic method
fn pair<T, U>.swap():(U, T) {
    return (self.second, self.first)
}
 
fn main() {
    // create generic instance
    var p = pair<int, string>{
        first = 42,
        second = "hello"
    }
    
    // call generic method
    var (s, i) = p.swap()
}

Coroutines

Coroutines are user-space lightweight threads that can run multiple coroutines on a single system thread.

Basic Usage

Using go keyword:

var fut = go sum(1, 2)  // create a shared coroutine

Using the @async macro, which accepts flag parameters that define coroutine behavior:

var fut = @async(sum(1, 2), co.SAME)  // SAME means new coroutine shares processor with current coroutine

future

Creating a coroutine returns a future object. The coroutine is already running at this point, but it does not block the current coroutine. Use the await() method to block until the coroutine completes and obtain its return value:

import co // standard library co package contains some common coroutine functions
 
fn sum(int a, int b):int {
    co.sleep(1000)  // simulate time-consuming operation, sleep unit is ms
    return a + b
}
 
fn main() {
    var fut = go sum(1, 2)
    var result = fut.await()  // wait for coroutine execution completion and get result
    println(result)  // output: 3
}

co.sleep() makes the current coroutine yield and sleep for the specified number of milliseconds. co.yield() yields the current coroutine's execution right immediately and waits for the next scheduling.

mutex

mutex (mutual exclusion lock) is a concurrency control mechanism used to protect shared resources, ensuring that only one coroutine can access protected resources at the same time.

import co.mutex as m
 
// create mutex
var mu = m.mutex_t{}
 
// lock
mu.lock()
 
// critical section code
// ...
 
// unlock
mu.unlock()
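
A minimal sketch of protecting a shared counter with the mutex. It assumes that a struct value may initialize a global variable and that coroutines created with go can run in parallel; the names counter and add_one are hypothetical:

```nature
import co.mutex as m

var mu = m.mutex_t{} // assumption: struct values can initialize globals
var counter = 0      // shared state guarded by mu

// hypothetical worker: increments the counter inside the critical section
fn add_one():int {
    mu.lock()
    counter += 1
    int current = counter
    mu.unlock()
    return current
}

fn main() {
    var f1 = go add_one()
    var f2 = go add_one()
    var a = f1.await() // wait for both coroutines before reading counter
    var b = f2.await()
    println(a, b, counter)
}
```

Copying counter into a local variable before unlocking keeps the returned value consistent with the increment performed by the same coroutine.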

Error Handling

Errors in coroutines can also be caught using catch syntax:

fn div(int a, int b):int! {
    if b == 0 {
        throw errorf("division by zero")
    }
    return a / b
}
 
fn main() {
    var fut = go div(10, 0)
    var result = fut.await() catch e {
        println("error:", e.msg())
        0  // return default value
    }
}

If errors in coroutines are not caught, the program will terminate.

Channel

channel is a communication mechanism provided by nature for inter-coroutine communication, used to safely pass data between different coroutines.

Basic Usage

// create unbuffered channel
var ch = chan_new<int>()      // create channel for passing int type data
var ch_str = chan_new<string>() // create channel for passing string type data
 
// create buffered channel
var ch_buf = chan_new<int>(5)  // create channel with buffer size 5
 
// send data
ch.send(42)        // send data to channel
ch_str.send('hello') 
 
// receive data
var value = ch.recv()     // receive data from channel
var msg = ch_str.recv()

channel status:

ch.close()                    // close channel
bool closed = ch.is_closed()  // check if channel is closed
var ok = ch.is_successful()   // in closed state, check whether the most recent read or write operation succeeded

Sending to a closed channel produces an error, which can be caught with catch. Data still remaining in a closed channel's buffer can continue to be received; once the buffer is drained, a further recv throws an error.

The recv operation can also be driven by a for loop over the channel:

fn handle(chan<int> ch) {  
    for v in ch { // compiler automatically calls recv and yield until data is received
        println('recv v ->', v)  
    }  
}

With an unbuffered channel, a send or recv only completes once the peer coroutine has handled it. When there is no handler on the peer side, the current coroutine yields and waits until the data has been successfully sent to or received by the peer.

With a buffered channel, yielding and waiting only occurs when the channel is full.

select Statement

The select statement monitors multiple channel operations at once. Its syntax is similar to match, but it can only be used for channel operations.

select {
    ch1.on_recv() -> msg {
        // handle data received from ch1
    }
    ch2.on_send(value) -> {
        // handle after ch2 send success
    }
    _ -> {
        // default branch, executed when all channels are not operable
    }
}

When multiple cases are ready at the same time, select picks one of them at random. The default branch is optional; when there is no default branch and no case is ready, the current coroutine yields and waits until a case becomes ready.

select automatically catches the closed error; use ch.is_successful() to check whether the channel operation that woke the coroutine succeeded.

Usage Examples

Simple producer-consumer pattern

var ch = chan_new<int>()
 
// producer
go (fn(chan<int> ch):void! {
    ch.send(42)
})(ch)
 
// consumer
var value = ch.recv()

Using buffered channel to implement rate limiter

var limiter = chan_new<u8>(10)  // allow maximum 10 concurrent tasks
for u8 i = 0; i < 100; i+=1 {
    limiter.send(i)             // acquire token
    go (fn():void! {
        // process task
        limiter.recv()          // release token
    })()
}

Using select to implement timeout control

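import co
 
var ch = chan_new<string>()
var timeout = chan_new<bool>()
 
// timer coroutine: sends on timeout after 1000 ms (duration chosen for illustration)
go (fn(chan<bool> t):void! {
    co.sleep(1000)
    t.send(true)
})(timeout)
 
select {
    ch.on_recv() -> msg {
        println("received:", msg)
    }
    timeout.on_recv() -> {
        println("operation timeout")
    }
}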

Built-in Macros

Nature uses the @ symbol for macro calls. The current version does not support custom macros, but provides some essential built-in macros:

var size = @sizeof(i8) // sizeof reads type stack memory usage
 
type double = f64
var hash = @reflect_hash(double) // read type hash value.
 
@async(delay_sum(1, 2), 0) //  create coroutine
 
// use ula to avoid heap allocation for package struct  
@ula(package).set_age(25) 
 
var a = @default(T) // initialize default value, can be used for default value assignment in generics
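
As an illustration of @default in generic code, the sketch below builds a generic struct whose field starts at the default value of its type parameter. It assumes @default(T) can be used as a field initializer inside a generic function; the function name empty_box is hypothetical:

```nature
type box<T> = struct {
    T value
}

// hypothetical constructor: fills value with the default value of T
fn empty_box<T>():box<T> {
    return box<T>{value = @default(T)}
}

fn main() {
    // T cannot be inferred from arguments here, so it is given explicitly
    var b = empty_box<int>()
    println(b.value)
}
```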

Function Tags

Function tags are a special function declaration syntax used to add metadata to functions or modify function behavior. Tags start with # symbol and must be placed before function declaration.

#linkid

#linkid tag is used to customize function's linker symbol name:

#linkid print_message
fn log(string message):void {
    // function implementation
}

If log is defined in main.n file, then by default log's symbol in executable file is main.log. If linkid is defined, then the symbol in executable file is print_message.

linkid is most commonly used to declare functions from C header files, for example:

#linkid sleep  
fn sleep(int second)

When a function has no body, nature treats it as a template function that can be called directly from nature code. The linker resolves the corresponding sleep symbol so that calls are directed to the right location.

#local

#local tag is used to mark function visibility, indicating that the function is only visible within current module:

#local
fn internal_helper():void {
    // function implementation
}

The compiler does not actually enforce any restriction for #local; it is only a convention.

Modules

Modules are the basic unit for organizing code in nature. Each .n file is an independent module.

main Module

Every nature program must contain a main function as the program entry point:

// main.n
import fmt
 
fn main() {
    fmt.printf("Hello, World!")
}

The entry file specified in the build command is treated as the main module, for example nature build main.n.

import

Use import keyword to import other modules or standard library:

// basic import, the module ident defaults to the file name (user)
import "user.n"
 
// custom module alias
import "user.n" as u
 
// import standard library
import fmt
 
fn main() {
    var new_user = user.create_user("alice")
    
    var another = u.create_user("bob")
    
    fmt.printf('name is %s', another.name)
}

File-based imports have strict path restrictions: only relative paths are supported, and ./ and ../ cannot be used. An import can therefore only reference modules in the current directory or its subdirectories, never modules in parent directories.

Package Management

The nature installation package includes the npkg program for package management; npkg works together with package.toml.

package.toml

Creating a package.toml in the project root directory automatically enables package management. This file defines the project information and dependencies:

# basic information
name = "myproject"        # project name
version = "1.0.0"        # version number
authors = ["Alice <a@example.com>"]
description = "project description"
license = "MIT"
type = "bin"             # bin or lib
entry = "main"           # library entry file (used when type = "lib")
 
# dependency packages, can be specified via git or local path
[dependencies]
rand = { type = "git", version = "v1.0.1", url = "jihulab.com/nature-lang/rand" }
local_pkg = { type = "local", version = "v1.0.0", path = "./local" }

Dependency Management

Run npkg sync in the directory containing package.toml to synchronize the project's dependencies. Packages are synced to the $HOME/.nature/package directory.

$HOME/.nature/package
├── caches
└── sources
    ├── jihulab.com.nature-lang.os@v1.0.1
    │   ├── main.n
    │   └── package.toml
    └── local@v1.0.0
        ├── main.linux_amd64.n
        ├── main.linux.n
        ├── main.n
        └── package.toml
 

Import Syntax

Nature uses file name as module ident:

import rand                    // import package main module (equivalent to import rand.main)
import rand.utils.seed         // import specified module, i.e., rand/utils/seed.n file
import rand.utils.seed as s    // custom module name

import searches for modules in the following order:

  • Current project's name field in package.toml, i.e., referencing other modules of current project
  • Project dependencies (third-party packages defined in dependencies)
  • Standard library

Cross-platform Support

Platform-specific code is selected through file names. For example, import syscall searches for modules in the following order:

  1. syscall.{os}_{arch}.n
  2. syscall.{os}.n
  3. syscall.n

Currently supported platforms:

  • os: linux, darwin
  • arch: amd64, arm64, riscv64

Conflict Resolution

When imported package names conflict, use different key names in dependencies:

[dependencies]
rand_v1 = { type = "git", version = "v1.0", url = "jihulab.com/nature-lang/rand" }
rand_v2 = { type = "git", version = "v2.0", url = "jihulab.com/nature-lang/rand" }

Then import using different names:

import rand_v1
import rand_v2

Because modules are file-based, the editor does not detect circular imports. In practice, it is still recommended to keep a clear code hierarchy and avoid circular imports.

Interacting with C

Besides nature's built-in libc/libuv libraries, other static library files can be referenced in package.toml, and the compiler automatically links the static library for the target architecture. In nature code, declare the corresponding function templates with #linkid tags and call them; the linker performs the linking automatically.

This is not limited to C: nature can interact with any programming language that can produce static libraries. However, nature uses musl libc for static compilation, so static libraries must also be compiled fully statically against musl libc. Using musl-gcc to compile static libraries is recommended.

Because nature allows a custom linker and link parameters, static libraries can also be referenced through link parameters, for example:

nature build --ld '/usr/bin/ld' --ldflags '-nostdlib -static -lm -luv' main.n

Nature integrates musl libc and the macOS C library by default, so their functions can be called directly through import libc:

import libc
 
fn main() {
	i32 r = libc.rand()
}

Static Libraries and Template Function Declaration

Define the static libraries to link in the [links] section of package.toml:

[links]
libz = { 
    linux_amd64 = 'libs/libz_linux_amd64.a',
    darwin_amd64 = 'libs/libz_darwin_amd64.a',
    linux_arm64 = 'libs/libz_linux_arm64.a', 
    darwin_arm64 = 'libs/libz_darwin_arm64.a'
}

Use #linkid tags and function templates to declare the linker id and parameters of the C functions to call:

#linkid gzopen
fn gzopen(anyptr fname, anyptr mode):anyptr
 
#linkid sleep
fn sleep(int second)

Call example:

// zlib.n
#linkid gzopen
fn gzopen(anyptr fname, anyptr mode):anyptr
 
// main.n
import zlib
import libc
 
fn main() {
    var output = "output.gz"
    var gzfile = zlib.gzopen(output.to_cstr(), "wb".to_cstr())
    if gzfile == null {
        throw errorf("failed to open gzip file")
    }
    // ...
}

Type Mapping

Type mapping relationship between nature and C language:

nature type | C type            | Description
anyptr      | uintptr           | Universal pointer type
rawptr<T>   | T*                | Typed pointer
i8/u8       | int8_t/uint8_t    | 8-bit integer
i16/u16     | int16_t/uint16_t  | 16-bit integer
i32/u32     | int32_t/uint32_t  | 32-bit integer
i64/u64     | int64_t/uint64_t  | 64-bit integer
i32         | int               |
int         | size_t            | Platform-dependent integer, equivalent to int64_t on 64-bit system
f32         | float             | 32-bit floating point
f64         | double            | 64-bit floating point
[T;n]       | T[n]              | Fixed length array, N is compile-time constant
struct      | struct            | nature struct uses same alignment and ABI handling as C struct

Get C language strings and pointers:

import libc
 
var str = "hello"
libc.cstr ptr = str.to_cstr()  // get string address
string str2 = ptr.to_string() // cstr convert to nature string
 
// get rawptr type
rawptr<tm_t> time_ptr = &time_info
 
// get anyptr type
// any nature type (except floating point) can be converted to anyptr type
anyptr c_ptr = time_info as anyptr 
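
The type mapping and to_cstr() can be combined to call a C function that takes a string. The sketch below declares the libc function strlen as a template function, mapping its char* parameter to anyptr and its size_t result to int as in the table above. The nature-side name c_strlen is hypothetical, and the sketch assumes the strlen symbol is available from the C library that nature links by default:

```nature
import libc

// template function: no body, resolved by the linker to the C symbol strlen
#linkid strlen
fn c_strlen(anyptr s):int

fn main() {
    var s = 'hello'
    // to_cstr() yields a C string that is passed as anyptr
    int n = c_strlen(s.to_cstr())
    println(n)
}
```

Giving the nature function a name that differs from the link id shows that only the #linkid value has to match the C symbol.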

Notes

  • C is memory unsafe, so memory-related issues need special attention. Using rawptr and anyptr can easily bypass nature's safety checks.
  • Nature uses cooperative scheduling. Calling blocking C functions such as sleep, read, or write blocks the nature scheduler, which prevents other coroutines from running and prevents GC from proceeding.

Formatting

The nature fmt tool has not been developed yet, so the following simple writing conventions apply.

var bar = '' // statements do not need a trailing ;
 
var global_v = 12 // except constant definitions, all other idents recommend lowercase underscore separation (including file names)
 
const GLOBAL_V = 12 // constants recommend uppercase underscore separation
 
if true {
    var foo = 1 // use 4 spaces for indentation
}
 
call_test(
    1,
    2, // multi-line parameters, need to add , on last line
)
 
var v = [
    1,
    2,
    3, // same as above
]
 
var m = {
    "a": 1,
    "b": 2, // same as above
}
 
type person_t:io.reader,io.writer = struct{ // no space needed between struct and {
	var f = fn() {
	
	}
	int a = 1
	int b
	bool c
}
 
var s = person_t{ // no space needed between struct ident and {
    name = "john",
    age = 18, // same as above
}
 
// 1. Function definition '{' and function declaration need to be on same line
// 2. Space needed between return parameter and ')'
// 3. Space needed between each parameter
// 4. No space needed for return value
fn test(int arg1, int arg2):int {
 
}
 
// for loop format
for int i = 0; i < 12; i += 1 {
 
}
 
// match format
match a {
	v -> {
	
	}
	_ -> {
	
	}
}

Keywords

Type keywords:

  • void, any, null, bool, ptr, rawptr, anyptr
  • int, i64, i32, i16, i8
  • uint, u64, u32, u16, u8
  • float, f64, f32
  • struct, interface
  • vec, map, set, tup, chan

Declaration keywords:

  • var - variable declaration
  • const - constant definition
  • type - type definition
  • fn - function definition
  • import - import module
  • new - create instance

Control flow keywords:

  • if, else, else if
  • for, in, break, continue
  • return
  • match, select
  • try, catch, throw

Other keywords:

  • go - concurrency primitive
  • as - type conversion
  • is - type judgment
  • true, false - bool values
  • null - null value

Reserved keywords:

  • impl, let, pub, package, static, macro, alias