Nhan Nguyen

Understanding slices in Golang

Oct 25, 2022 golang programming
image

Slice is one of the most used data structures in Golang. Understanding how slices work will make you a better Go programmer and prevent undesirable mistakes because slices in Go have some unusual properties compared to arrays in other languages.


Arrays

As slices are built on top of arrays, to understand slices, we first need to know some basics about arrays in Go.

An array is a collection of elements of the same type and an array’s size is fixed, we can not resize an array once defined. For example, [2]int represents an array of two integers, [2]int and [3]int are different, incompatible types.

To access an element of an array arr at index n we can use the expression arr[n]

var arr [2]int
arr[0] = 9
fmt.Println(arr[0]) // 9

Arrays do not need to be initialized explicitly; the zero value of an array will have all its elements in the zero values of their type. For example, var [2]int will define an array with 2 elements having value 0 (zero value of type int).

Go’s Arrays are values. That means that when we assign or pass around the array value we actually make a copy of its contents. Changes made to the copied value won’t affect the original array.

var arr1 [2]int
arr1[0] = 1

var arr2 = arr1
arr2[0] = 9

fmt.Println(arr1[0], arr2[0]) // 1 9

Arrays can be declared literally like below.

arr1 := [2]int{1, 2}
// or we can have the compiler count the array elements for us
arr2 := [...]int{1, 2}

Slices

Arrays are used in some cases, but we won’t see they are used as often as slices. Just like an array, a slice is a sequence of elements of the same type and has a variable length. A slice definition is just like an array but without a size. We can declare a slice in multiple ways:

var s []int // slice of integers

Declare a slice literally

s := []int{1, 2, 3}

Create a slice by slicing another array, slice

var arr [5]int
var s1 = arr[1:] 

var s2 := []int{1, 2, 3, 4, 5, 6}
var s3 = s2[2:4]

Note:

  • len(s) returns the length of a slice which is the number of elements in a slice
  • cap(s) returns the capacity of a slice which is the length of the underlying array that slice s is built on top
  • Slicing operator _s[i:j]_, where 0 < i < j < cap(s), creates a new slice that refers to elements i through j-1 of the sequence s, which may be an array variable, a pointer to an array, or another slice. If i is omitted, it’s 0, and if j is omitted, it’s len(s)

Declare a slice with make

s1 := make([]int, 5) // creating a slice with length = 5, capacity = 5
s2 := make([]int, 5, 10) // creating a slice with length = 5, capacity = 10

Slice internals

Internally, a slice data structure contains 3 pieces of information: the length, capacity, and the pointer to the underlying array as this struct:

type StringSlice struct {
    ptr *string // points to the first element of the underlying array
    len, cap int
}

So, here is what happens when we create a new slice with make

s := make([]string, 5, 10)

// 1. create an empty string array arr of size 10
// var arr [10]string

// 2. create a slice data structure with len = 5, cap = 10, and
// ptr point to the first element of arr
/*
type StringSlice struct {
    ptr: &arr[0],
    len: 5,
    cap: 10
}
*/

Multiple slices can share the same underlying array and may even refer to overlapping parts of that array. For example, q2 and summer share the same underlying array months and refer to the overlapping parts of it.

months := [12]string{
    "", 
    "January", 
    "February", 
    "March", 
    "April", 
    "May", 
    "June", 
    "July", 
    "August", 
    "September", 
    "October", 
    "November", 
    "December",
}
q2 := months[4:7] // "April", "May", "June"
summer := moths[6:9] // "June", "July", "August"

Nil vs empty slices, how to check a collection is empty correctly

Declare a slice with the syntax below will create a nil slice

var s []string // nil slice

A nil slice doesn’t link to any array underline. You can think of what the internal of this slice will look like below

sInternal := StringSlice {
    ptr: nil,
    len: 0, 
    cap: 0,
}

The next declarations will create empty slices

s2 := []string{}
// or
s2 := make([]string, 0)
// or
s2 := make([]string, 0, 10)

An empty slice also has zero length like a nil slice, but instead of linking to nothing, it links to an array underline.

s2Internal := StringSlice {
    ptr: #links_to_an_array,
    len: 0, 
    cap: 0, // or 10
}

**To check if a collection is empty, use **len(s) == 0** instead of ****s == nil**

var s1 []string
s2 := []string{}

fmt.Println(len(s1) == 0) // true
fmt.Println(len(s2) == 0) // true

fmt.Println(s1 == nil) // true
fmt.Println(s2 == nil) // false

As a special case, it also will copy bytes from a string to a slice of bytes.

The copy function

The copy function signature

func copy(dst, src []Type) int

The copy function is a built-in function that copies elements from a source slice into a destination slice. The source and destination may overlap. Copy returns the number of elements copied, which will be the minimum of len(src) and len(dst).

s1 := []int{1, 2, 3, 4, 5, 6}
s2 := []int{9, 9, 9}
n := copy(s1, s2)

fmt.Println(n) // 3
fmt.Println(s1) // [9 9 9 4 5 6]

The append function

The append function signature

func append(slice []Type, elems ...Type) []Type

The append function is a built-in function that appends elements to the end of a slice. If it has sufficient capacity, the destination is resliced to accommodate the new elements. If it does not, a new underlying array will be allocated. Append returns the updated slice, maybe with a new reference to the new underlying array. So, it is therefore necessary to store the result of append, often in the variable holding the slice itself:

slice = append(slice, elem1, elem2) 
slice = append(slice, anotherSlice...) 

As a special case, it is legal to append a string to a byte slice, like this:

slice = append([]byte("hello "), "world"...)

Below is the implementation of the append function that applied for integer slices. This is for demonstration purposes, in the built-in function, the compiler will take care about the element type of slices.

func Append(slice []int, elements ...int) []int {
    n := len(slice)
    total := len(slice) + len(elements)
    if total > cap(slice) {
        // Reallocate. Grow to 1.5 times the new size, so we can still grow.
        newSize := total*3/2 + 1
        newSlice := make([]int, total, newSize)
        copy(newSlice, slice)
        slice = newSlice
    }
    slice = slice[:total]
    copy(slice[n:], elements)
    return slice
}

Slices are not “pure” reference type

Remember the struct that represents the internal data structure of slices above?

type StringSlice struct {
    ptr *string // points to the first element of the underlying array
    len, cap int
}

This is called a slice header, contains information of a slice and a link to its underlying array.

Slices are not a “pure” reference type and when assigning or passing a slice, we actually made a copy of its header. Consider the example below

s1 := make([]string, 1, 2)
s1[0] = "mango"

// chang to s2 will also be visible to s1
s2 := s1
s2[0] = "durian"
fmt.Println(s1) // [durian]
fmt.Println(s2) // [durian]

// chang to s2 will NOT be visible to s1
// but they still reference to the same array
s2 = append(s2, "orange")
fmt.Println(len(s1), len(s2), cap(s1), cap(s2)) // 1 2 2 2
fmt.Println(s1) // [durian]
fmt.Println(s1[:2]) // [durian orange]
fmt.Println(s2) // [durian orange]

// append a new element to s2, the append function make a new array has 
// doubled length and link to s2, s1 still links to the old array
s2 = append(s2, "banana")
fmt.Println(len(s1), len(s2), cap(s1), cap(s2)) // 1 3 2 4
// fmt.Println(s1[:3]) // will panic: slice bounds out of range [:3] with capacity 2
fmt.Println(s2) // [durian orange banana]

When assign s1 to s2, we made a copy of the s1 header to s2. Both headers have a link to the same underlying array. Change to s2[0] will also be visible to s1[0].

But when we append a new element to s2, this changes the length of s2 but does not change the length of s1, capacities still the same as both slices are still linking to the old array.

Now we have len(s2) == cap(s2), append a new element to s2 will cause the compiler to allocate a new larger array for it, s2 now have length of 3 and capacity of 4. s1 still references to the old array, length and capacity still remain unchanged.

A note when using make and append functions together

Consider below example

s := make([]int, 5)
s = append(s, 0, 1, 2, 3, 4)
fmt.Println(s) // [0 0 0 0 0 0 1 2 3 4]

Oop! You may expect the output would be [0 1 2 3 4] right?

Actually, make creates a slice with 5 elements at zero-value of int, and append will add new elements after that 5.

The below code would work

s := make([]int, 5)
for i, _ := range s {
    s[i] = i
}
fmt.Println(s) // [0 1 2 3 4]

A possible “gotcha”

Re-slicing a slice doesn’t make a copy of the underlying array. The full array will be kept in memory until it is no longer referenced. Occasionally this can cause the program to hold all the data in memory when only a small piece of it is needed.

func getData(limit int)[]interface{} {
    data = /*a very large data*/
    return data[:limit] 
    // the large array is still get referenced underline
}

To fix this problem we can copy the interesting data to a new slice before returning it:

func getData()[]interface{} {
    data = /*a very large data*/
    result := make([]interface{}, limit)
  copy(result, data)

    return result 
    // the large underlying array is released by the garbage collector
}

Reference