Data Science In Go: A Cheat Sheet


Introduction

Go is the future for doing data science. In this cheat sheet, we look at two libraries that will allow you to do just that.
Note on panic and error behaviour:
1. Most tensor operations return an error.
2. gonum has a well-defined policy on when errors are returned and when panics happen.

What To Use

I only ever want a float64 matrix or vector
use gonum/mat or gorgonia/tensor
I want to focus on doing statistical/scientific work
use gonum/mat
I want to focus on doing machine learning work
use gonum/mat or gorgonia/tensor
I want to focus on doing deep learning work
use gorgonia/tensor
I want multidimensional arrays
use gorgonia/tensor, or []mat.Matrix
I want to work with different data types
use gorgonia/tensor
I want to wrangle data like in Pandas or R - with data frames
use kniren/gota

Default Values

Numpy
a = np.zeros((2, 3))
gonum/mat
a := mat.NewDense(2, 3, nil)
tensor
a := ts.New(ts.Of(ts.Float32), ts.WithShape(2, 3))
A Range...
Numpy
a = np.arange(0, 9).reshape(3, 3)
gonum
a := mat.NewDense(3, 3, floats.Span(make([]float64, 9), 0, 8))
tensor
a := ts.New(ts.WithBacking(ts.Range(ts.Int, 0, 9)), ts.WithShape(3, 3))
Identity Matrices
Numpy
a = np.eye(3, 3)
gonum/mat
a := mat.NewDiagonal(3, []float64{1, 1, 1})
tensor
a := ts.I(3, 3, 0)

Elemen­twise Arithmetic Operations

Addition
Numpy
c = a + b
c = np.add(a, b)
a += b              # in-place
np.add(a, b, out=c) # reuse array
gonum/mat
c.Add(a, b)
a.Add(a, b) // in-place
tensor
var c *ts.Dense; c, err = a.Add(b)
var c ts.Tensor; c, err = ts.Add(a, b)
a.Add(b, ts.UseUnsafe())      // in-place
a.Add(b, ts.WithReuse(c))     // reuse tensor
ts.Add(a, b, ts.UseUnsafe())  // in-place
ts.Add(a, b, ts.WithReuse(c)) // reuse
Note: These operations all return a result and an error, omitted here for brevity. It is a good habit to check for errors.
Subtraction
Numpy
c = a - b
c = np.subtract(a, b)
gonum/mat
c.Sub(a, b)
tensor
c, err := a.Sub(b)
c, err = ts.Sub(a, b)
Multiplication
Numpy
c = a * b
c = np.multiply(a, b)
gonum/mat
c.MulElem(a, b)
tensor
c, err := a.Mul(b)
c, err := ts.Mul(a, b)
Division
Numpy
c = a / b
c = np.divide(a, b)
gonum/mat
c.DivElem(a, b)
tensor
c, err := a.Div(b)
c, err := ts.Div(a, b)
Note: When dividing by 0 with non-float types, an error is returned, and the offending positions will be 0 in the result.
Note: All arithmetic operations follow the patterns shown under Addition.

Note on Shapes
In all of these functions, a and b have to be of the same shape. In Numpy, operations with dissimilar shapes will throw an exception. With gonum/mat it would panic. With tensor, the mismatch is returned as an error.

Aggreg­ation

Sum
Numpy
s = a.sum()
s = np.sum(a)
gonum/mat
var s float64 = mat.Sum(a)
tensor
var s *ts.Dense = a.Sum()
var s ts.Tensor = ts.Sum(a)
Note: The result, which is a scalar value in this case, can be retrieved by calling s.ScalarValue()
Sum Along An Axis
Numpy
s = a.sum(axis=0)
s = np.sum(a, axis=0)
gonum/mat
Write a loop, with manual aid from mat.Col and mat.Row
Note: There is no performance loss in writing a loop. In fact, there is arguably a cognitive gain in being aware of what one is doing.
tensor
var s *ts.Dense = a.Sum(0)
var s ts.Tensor = ts.Sum(a, 0)
Argmax/Argmin
Numpy
am = a.argmax()
am = np.argmax(a)
gonum
Write a loop, using mat.Col and mat.Row
tensor
var am *ts.Dense; am, err = a.Argmax(ts.AllAxes)
var am ts.Tensor; am, err = ts.Argmax(a, ts.AllAxes)
Argmax/Argmin Along An Axis
Numpy
am = a.argmax(axis=0)
am = np.argmax(a, axis=0)
gonum
Write a loop, using mat.Col and mat.Row
tensor
var am *ts.Dense; am, err = a.Argmax(0)
var am ts.Tensor; am, err = ts.Argmax(a, 0)

Data Structure Creation

Numpy
a = np.array([1, 2, 3])
gonum/mat
a := mat.NewDense(1, 3, []float64{1, 2, 3})
tensor
a := ts.New(ts.WithBacking([]int{1, 2, 3}))
Creating a float64 matrix
Numpy
a = np.array([[0, 1, 2], [3, 4, 5]], dtype='float64')
gonum/mat
a := mat.NewDense(2, 3, []float64{0, 1, 2, 3, 4, 5})
tensor
a := ts.New(ts.WithBacking([]float64{0, 1, 2, 3, 4, 5}), ts.WithShape(2, 3))
Creating a float32 3-D array
Numpy
a = np.array([[[0, 1, 2], [3, 4, 5]], [[100, 101, 102], [103, 104, 105]]], dtype='float32')
tensor
a := ts.New(ts.WithShape(2, 2, 3), ts.WithBacking([]float32{0, 1, 2, 3, 4, 5, 100, 101, 102, 103, 104, 105}))
Note: The tensor package is imported as ts.
Additionally, gonum/mat offers many different data structures, each useful for a particular subset of computations. The examples in this document mainly assume a dense matrix.

gonum Types

mat.Matrix
Abstract data type representing any float64 matrix
*mat.Dense
Data type representing a dense float64 matrix

tensor Types

tensor.Tensor
An abstract data type representing any kind of tensor. Package functions work on these types.
*tensor.Dense
A representation of a densely packed multidimensional array. Methods return *tensor.Dense instead of tensor.Tensor
*tensor.CS
A representation of compressed sparse row/column matrices.
*tensor.MA
Coming soon - a representation of masked multidimensional arrays. Methods return *tensor.MA instead of tensor.Tensor
tensor.DenseTensor
Utility type that represents densely packed multidimensional arrays
tensor.MaskedTensor
Utility type that represents densely packed multidimensional arrays that are masked by a slice of bool
tensor.Sparse
Utility type that represents any sparsely packed multi-dim arrays (for now: only *CS)
 

Metadata

Shape
Numpy
a.shape
gonum
a.Dims()
tensor
a.Shape()
Strides
Numpy
a.strides
tensor
a.Strides()
Dims
Numpy
a.ndim
tensor
a.Dims()

Tensor Manipulation

Zero-op Transpose
Numpy
aT = a.T
gonum/mat
aT := a.T()
tensor
a.T()
Transpose With Data Movement
Numpy
aT = np.transpose(a)
gonum/mat
b := a.T(); aT := mat.DenseCopyOf(b)
tensor
aT, err := ts.Transpose(a)
or
a.T(); err := a.Transpose()
Reshape
Numpy
b = a.reshape(2, 3)
gonum/mat
b := mat.NewDense(2, 3, a.RawMatrix().Data)
tensor
err := a.Reshape(2, 3)

Linear Algebra

Inner Product of Vectors
Numpy
c = np.inner(a, b)
gonum
var c float64 = mat.Dot(a, b)
tensor
var c interface{} = ts.Inner(a, b)
or
var c interface{} = a.Inner(b)
Note: The tensor package comes with specialized execution engines for float64 and float32, which return a float64 or float32 directly rather than an interface{}.
Matrix-Vector Multiplication
Numpy
mv = np.dot(m, v)
or
mv = np.matmul(m, v)
or
mv = m @ v
or
mv = m.dot(v)
gonum
mv.Mul(m, v)
tensor
var mv ts.Tensor; mv, _ = ts.MatVecMul(m, v)
or
var mv *ts.Dense; mv, _ = m.MatVecMul(v)
Matrix-Matrix Multip­lic­ation
Numpy
mm = np.dot(m1, m2)
or
mm = np.matmul(m1, m2)
or
mm = m1 @ m2
or
mm = m1.dot(m2)
gonum
mm.Mul(m1, m2)
tensor
var mm ts.Tensor; mm, _ = ts.MatMul(m1, m2)
or
var mm *ts.Dense; mm, _ = m1.MatMul(m2)
Magic
Numpy
c = np.dot(a, b)
c = a.dot(b)
tensor
var c ts.Tensor; c, _ = ts.Dot(a, b)
var c *ts.Dense; c, _ = a.Dot(b)
Note: The Dot function and method in package tensor works similarly to dot in Numpy - depending on the number of dimensions of the inputs, different functions will be called. You should treat it as a "magic" function that does products of two multi-dimensional arrays.
gonum has a whole suite of linear-algebra functions and structures that are too many to enumerate here. You should check it out too.

Combin­ations

Concatenation
Numpy
c = np.concatenate((a, b), axis=0)
gonum/mat
c.Stack(a,b)
tensor
var c ts.Tensor; c, err = ts.Concat(0, a, b)
var c *ts.Dense; c, err = a.Concat(0, b)
Vstack
Numpy
c = np.vstack((a, b))
gonum/mat
c.Stack(a,b)
tensor
var c *ts.Dense; c, err = a.Vstack(0, b)
Hstack
Numpy
c = np.hstack((a, b))
gonum/mat
c.Augment(a,b)
tensor
var c *ts.Dense; c, err = a.Hstack(0, b)
Stack onto a New Axis
Numpy
c = np.stack((a, b))
gonum/mat
var stacked []mat.Matrix; stacked = append(stacked, a, b)
tensor
var c ts.Tensor; c, _ = ts.Stack(0, a, b)
var c *ts.Dense; c, _ = a.Stack(0, b)
Note: Unlike in Numpy, Stack in tensor is a little stricter about the axis: it has to be specified.
Repeats
Numpy
c = np.repeat(a, 2)         # returns a flat array
c = np.repeat(a, 2, axis=0) # repeats along axis 0
c = np.repeat(a, 2, axis=1) # repeats along axis 1
gonum
Unsupported for now
tensor
var c ts.Tensor; c, _ = ts.Repeat(a, ts.AllAxes, 2) // returns a flat array
c, _ = ts.Repeat(a, 0, 2) // repeats along axis 0
c, _ = ts.Repeat(a, 1, 2) // repeats along axis 1

Data Access

Value At (Assuming Matrices)
Numpy
val = a[0, 0]
gonum/mat
var val float64 = a.At(0, 0)
tensor
var val interface{}; val, _ = a.At(0, 0)
Slice Row or Column (Assuming Matrices)
Numpy
row = a[0]
col = a[:, 0]
gonum/mat
var row mat.Vector = a.RowView(0)
var col mat.Vector = a.ColView(0)
tensor
var row ts.View = a.Slice(s(0))
var col ts.View = a.Slice(nil, s(0))
Advanced Slicing (Assuming 9x9 Matrices)
Numpy
b = a[1:4, 3:6]
gonum/mat
var b mat.Matrix = a.Slice(1, 4, 3, 6)
tensor
var b ts.View = a.Slice(rs(1, 4), rs(3, 6))
Advanced Slicing With Steps
Numpy
b = a[1:4:1, 3:6:2]
gonum/mat
Unsupported
tensor
var b ts.View = a.Slice(rs(1, 4, 1), rs(3, 6, 2))
Getting Underlying Data
Numpy
b = a.ravel()
gonum/mat
var b []float64 = a.RawMatrix().Data
tensor
var b interface{} = a.Data()
Setting One Value (Assuming Matrices)
Numpy
a[r, c] = 100
gonum/mat
a.Set(r, c, 100)
tensor
a.SetAt(100, r, c)
Setting Row/Col (Assuming 3x3 Matrix)
Numpy
a[r] = [1, 2, 3]
a[:, c] = [1, 2, 3]
gonum/mat
a.SetRow(r, []float64{1, 2, 3})
a.SetCol(c, []float64{1, 2, 3})
tensor
No simple method - requires Iterators and multiple lines of code.
Note: In the tensor examples, the a.Slice method takes a list of tensor.Slice values, an interface defined in the tensor package. s and rs in the examples simply represent types that implement the tensor.Slice interface. A nil is treated as a : in Python. There are no default tensor.Slice types provided; it is up to the user to define their own.
