Tree data structure: proof of concept in Go

| updated

Proof of concept Go implementation for the data structure described in Tree data structure for small data sets

Representation

The tree is represented as a node, which can have other nodes as children:

type Node struct {
    name     string
    value    string
    children []Node
}

Leaf nodes don’t have any children:

func (n *Node) isLeaf() bool  {
    return len(n.children) == 0
}

A node represents a list iff none of its children has a name:

func (n *Node) isList() bool {
    for _, child := range n.children {
        if child.hasName() {
            return false
        }
    }
    return true
}

func (n *Node) hasName() bool {
    return n.name != ""
}

Test suite

To verify that this allows us to represent the data structures we’re interested in, we can use the following test suite to check the YAML representation of a variety of trees:

func TestTreeAsString(t *testing.T) {
    cases := []struct {
        name string
        tree Node
        want string
    }{
        // test cases described below
    }
    for _, tc := range cases {
        t.Run(tc.name, func(t *testing.T) {
            got := tc.tree.String()
            if got != tc.want {
                t.Errorf("\nwant: %q\ngot:  %q", tc.want, got)
            }
        })
    }
}

Conversion to YAML

The string representation is generated as follows:

func (n *Node) String() string {
    var b bytes.Buffer
    if n.hasName() {
        fmt.Fprintf(&b, "%s: ", n.name)
    }
    if n.isLeaf() {
        fmt.Fprint(&b, n.value)
    } else {
        open, close := '{', '}'
        if n.isList() {
            open, close = '[', ']'
        }
        fmt.Fprintf(&b, "%c ", open)
        for i, ch := range n.children {
            if i > 0 {
                fmt.Fprint(&b, ", ")
            }
            fmt.Fprint(&b, ch.String())
        }
        fmt.Fprintf(&b, " %c", close)
    }
    return b.String()
}

Test cases

Test cases used in TestTreeAsString, with string representations in the comments

Plain values

// false
Node{name: "", value: "false"}
// true
Node{name: "", value: "true"}

Key-value pairs

// A: aardvark
Node{name: "A", value: "aardvark"}
// B: baboon
Node{name: "B", value: "baboon"}

Simple list

// Bools: [ false, true, false ]
leafF = Node{name: "", value: "false"}
leafT = Node{name: "", value: "true"}
Node{"Bools", "", []Node{leafF, leafT, leafF}}

Simple dict

// Words: { A: aardvark, B: baboon }
leafA = Node{name: "A", value: "aardvark"}
leafB = Node{name: "B", value: "baboon"}
Node{"Words", "", []Node{leafA, leafB}}

List of lists

// Lists: [ [ A, B ], [ aardvark, baboon ] ]
lettA = Node{name: "", value: "A"}
lettB = Node{name: "", value: "B"}
anAar = Node{name: "", value: "aardvark"}
anBab = Node{name: "", value: "baboon"}
Node{"Lists", "", []Node{
    Node{"", "", []Node{lettA, lettB}},
    Node{"", "", []Node{anAar, anBab}},
}}

Dict of lists

// Lists: { Letters: [ A, B ], Animals: [ aardvark, baboon ] }
Node{"Lists", "", []Node{
    Node{"Letters", "", []Node{lettA, lettB}},
    Node{"Animals", "", []Node{anAar, anBab}},
}}

List of dicts

// Dicts: [ { A: aardvark, B: baboon }, { 0: no, 1: yes } ]
leafA = Node{name: "A", value: "aardvark"}
leafB = Node{name: "B", value: "baboon"}
leafN = Node{name: "0", value: "no"}
leafY = Node{name: "1", value: "yes"}
Node{"Dicts", "", []Node{
    Node{"", "", []Node{leafA, leafB}},
    Node{"", "", []Node{leafN, leafY}},
}}

Dict of dicts

// Dicts: { Words: { A: aardvark, B: baboon }, Codes: { 0: no, 1: yes } }
Node{"Dicts", "", []Node{
    Node{"Words", "", []Node{leafA, leafB}},
    Node{"Codes", "", []Node{leafN, leafY}},
}}

Appendix

Here is the source code as a single file: tree_test.go

Run the tests with:

go test -v tree_test.go

Update

I’ve published an experimental Go library using the above data structure for representing a parsed YAML document: javiljoen/miniyaml

It also incorporates the lexer described in YAML lexer: proof of concept in Go.