Welcome to my field notes!
Field notes are notes I leave myself as I go through my day-to-day work. The hope is that other people will also find these notes useful. Note that these notes are unfiltered and unverified.
Core Julia Language
General observations
- I love the named slurping and splatting, an improvement on the R syntax.
- It looks like math! Very little cognitive load when translating some work.
- However, the whole multiple dispatch thing looks amazing when I watch videos, but makes my brain hurt when I try to implement it. I suffer from the curse of having seen OOP in Python and R6 in R. S3 is quite similar but is so much simpler because it’s single dispatch.
- Symbols and strings are separate - amaze! Metaprogramming has been something that is quite easy in R and I suspect quite easy in Julia too with the macros.
- Do-block syntax - interesting, the function comes first and then the iterable in Julia’s `map()`, as opposed to R’s `purrr::map()`.
- There’s a lot of reference to language design here and I feel like I’m also learning about it as I read through how to actually use Julia.
Reading through the Julia Language Manual
Variable Scope
- Global and local scope
- `if` and `begin`/`end` blocks do not introduce new scopes
- Julia uses lexical scoping, which means that a function’s scope inherits from where the function was defined (like in R). You can refer to variables from the enclosing (parent) scope.
- Constants can be defined with `const`; this does not allow changing the value afterwards, which really helps the compiler.
- Each module introduces a new global scope that is a separate world from other modules (oh boy, this is where we can finally resolve the problem of too many conflicting names in R). `using` and `import` allow transportation of objects between those scopes. You can copy objects between modules, but you cannot insert or modify them in another module.
Functions
- Function composition and piping!
  - The composition operator `∘` (typed `\circ`) combines two functions into one
  - Piping uses the pipe (`|>`) operator
- Dot syntax for vectorizing is very interesting.
  - Vectorizing is not required for performance, but vectorized calls are still prettier
  - Any Julia function can be applied elementwise by adding a `.` after the function name, like `sin.(V)`, or before the operator in the case of binary operators, like `.+`
  - Amazing yet again!
  - This is just syntactic sugar for the `broadcast` function
  - Nested broadcasts are fused together into a single loop.
  - You can write into a pre-allocated array using the dotted assignment `.=`
  - To avoid having too many dots you can use the `@.` macro to add dots to every single function call in that line.
- Anonymous functions use an arrow like in JS: `(x) -> x + 2`
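A minimal sketch of the function features above, using only Base. The names `inc`, `double`, `v`, and `out` are made up for illustration.

```julia
inc(x) = x + 1
double(x) = 2x

# Composition: (double ∘ inc)(3) calls inc first, then double
composed = double ∘ inc
composed(3)              # 8

# Piping passes the value on the left to the function on the right
3 |> inc |> double       # 8

# Dot syntax broadcasts any function or operator over a collection
v = [1, 2, 3]
inc.(v) .* 2             # [4, 6, 8]

# .= writes into an existing array instead of allocating a new one
out = zeros(Int, 3)
out .= double.(v) .+ 1   # out is now [3, 5, 7]

# @. adds the dots to every call and operator on the line
@. out = double(v) + 1   # same result as the line above

# Anonymous function with arrow syntax
map(x -> x + 2, v)       # [3, 4, 5]
```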
Control Flow
Expressions
- Compound expressions use `begin` and `end`; there are no braces, so this is needed. You can also put several expressions on one line using parens and semicolons: `(x = 1; y = 2; x + y)`
Conditionals
- `if`, `elseif`, `else`, `end` - nothing else to add, simple enough
- If blocks do not have their own variable scope (same as R)
- They can also return a value like in R
- The condition must evaluate to a `Bool`; numbers and strings are not accepted as truthy or falsy
- The ternary operator can be used: `condition ? iftrue : iffalse` (the spaces around `?` and `:` are required)
- `&&` and `||` short circuit the expression like in R, so you can use them to define some backup values or as a substitute for the ternary operator
- `&` and `|` evaluate both arguments
function fact(n::Int)
    n >= 0 || error("n must be non-negative")
    n == 0 && return 1
    n * fact(n-1)
end
Iteration
- `while` loops look pretty standard, `break` breaks out of `while`.
- `for` loops look pretty standard, except for:
  - you can loop over every combination of several ranges using `for i = 1:2, j = 3:4`, which avoids writing nested for loops (note that this is the cross product of the ranges, not a pairwise walk like `purrr::map2`)
  - you can loop over several collections in lockstep using `zip`, which is the closer analog of `purrr::map2`: `for (j, k) in zip([1, 2, 3], [4, 5, 6, 7])` stops at the end of the shortest collection
- `1:5` syntax can be used like in R for indices
- You can use `∈` (typed `\in`) or `in` instead of `=` in the loop header
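A small sketch of the loop forms above. The collection names are made up; the point is the difference between the multi-range header (every combination) and `zip` (lockstep).

```julia
# Several ranges in one `for` header visit every combination,
# with the last range varying fastest
pairs_seen = Tuple{Int,Int}[]
for i = 1:2, j = 3:4
    push!(pairs_seen, (i, j))
end
# pairs_seen is [(1, 3), (1, 4), (2, 3), (2, 4)]

# zip walks the collections in lockstep and stops at the shortest one
sums = Int[]
for (j, k) in zip([1, 2, 3], [4, 5, 6, 7])
    push!(sums, j + k)
end
# sums is [5, 7, 9]

# `in` and `∈` are interchangeable with `=` in the loop header
squares = Int[]
for n ∈ 1:5
    push!(squares, n^2)
end
```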
Exception Handling
- There are a bunch of built-in exceptions.
- You can also define your own: `struct MyCustomException <: Exception end`
- Throw exceptions using `throw(DomainError(x, "argument is not part of domain"))`
- Defining a `showerror` method on that error type allows you to define how that error will be displayed
- `stop()` in R would be `error()`, which throws an `ErrorException`
- `try`-`catch`-`end` is implemented like so:
try
    sqrt("ten")
catch e # this e is the exception
    println("Needs to be numeric not a string")
end
- You can do an inline try-catch using `try condition catch e; expression end`
- `finally` can be added to ensure that things are finalized (close db or file connections and whatnot).
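To tie the exception pieces together, here is a hedged sketch with a custom exception, a `showerror` method, and `try`/`catch`/`finally`. `NegativeInputError`, `safe_sqrt`, and `log_lines` are hypothetical names invented for this example.

```julia
# Hypothetical exception type carrying the offending value
struct NegativeInputError <: Exception
    value::Float64
end

# Customize how the error is printed
Base.showerror(io::IO, e::NegativeInputError) =
    print(io, "NegativeInputError: got ", e.value, ", expected a non-negative number")

safe_sqrt(x) = x < 0 ? throw(NegativeInputError(x)) : sqrt(x)

log_lines = String[]
result = try
    safe_sqrt(-4.0)
catch e
    e isa NegativeInputError || rethrow()    # only handle our own error type
    push!(log_lines, sprint(showerror, e))
    NaN                                      # value of the whole try expression
finally
    push!(log_lines, "done")                 # always runs, error or not
end
```

The value of the `try` expression comes from the `try` or `catch` branch; the `finally` branch only runs for its side effects.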
Types
General type system
- Julia’s type system is ultimately dynamic but gains compiler advantages from type annotations.
- Type annotations serve 3 purposes:
  - take advantage of Julia’s powerful multiple-dispatch mechanism
  - improve human readability
  - catch programming errors
- Why can’t you inherit from a concrete type?
  - inheriting behavior (methods) is more important than inheriting structure
    - but I mean why not both?
    - there are some difficulties with inheriting structure (? - unanswered)
- Salient aspects:
  - No division between object and non-object values
  - No compile-time type; the only type a value has is its actual run-time type
  - Only values have types, not variables
Type declarations
- `::` operator: “is an instance of”
- `(1 + 2)::Int` is a type assertion
- `x::Int = 1` is a type restriction on that variable
- `function sinc(x)::Float64` is a type restriction on the result (the return value is converted to that type)
Abstract types
- Cannot be instantiated
- Declared with `abstract type «name» end`
- Default supertype is `Any`, which all objects are instances of
- Bottom type is `Union{}`: nothing is an instance of `Union{}` and all types are supertypes of it
- `<:` is the operator for “is a subtype of”
Primitive types
- You can declare these with a number of bits, but why?
- `primitive type «name» <: «supertype» «bits» end`
Composite Types
- composed of named fields (which can be of any type), called records or structs or objects in other languages
- Julia composite types cannot have methods in them, the methods live outside
- By default two constructors are created, which are just functions with the fields of the struct as arguments: one that takes arguments of exactly the field types, and one that takes `Any` type and attempts to do the conversion.
- Structs are immutable (like pretty much anything in R except environments), primarily for performance and secondarily for readability. They can be made mutable using the `mutable struct` keyword though.
- How do you decide whether a type can be immutable or not? Ask yourself if two objects that have exactly the same fields are identical. If they are, it’s usually an indicator that you want an immutable type.
- All these declared types are of type `DataType`
Type Unions
- You moosh together two or more types, and any value of one of those types is an instance of the union, like so: `IntOrString = Union{Int, AbstractString}`
- `Union{T, Nothing}` is essentially a nullable type because it can also hold the special value `nothing`.
Parametric types
- You can declare a parametric type like so:
struct Point{T}
    x::T
    y::T
end
- `Point` is also a type, containing all of its parameterized equivalents as subtypes.
- `Point{Float64}` is not a subtype of `Point{Real}`, so in order to define a method that allows dispatching on both, use this form: `function norm(p::Point{<:Real}) end`
- `Point{<:Real}` is really just a `UnionAll` over `Point{T}` for all the subtypes `T` of `Real`, which explains how it works. You can also do `Point{>:Int}` to get all the supertypes of `Int`.
- The default constructor infers the type parameter from its arguments, i.e. calling `Point(1.0, 2.0)` will generate a `Point{Float64}` automatically.
- You can declare types as subtypes of parametric abstract types just like any other type.
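A quick sketch to check the subtyping claims above at the REPL. `norm2d` is a made-up name, chosen so it does not clash with `LinearAlgebra.norm`.

```julia
struct Point{T}
    x::T
    y::T
end

# The default constructor infers T from the arguments
p = Point(1.0, 2.0)                 # Point{Float64}

# Type parameters are invariant...
Point{Float64} <: Point{Real}       # false
# ...but every Point{T} is a subtype of the matching UnionAll types
Point{Float64} <: Point             # true
Point{Float64} <: Point{<:Real}     # true
Point{Real} <: Point{>:Int}         # true, because Real is a supertype of Int

# A method that accepts a Point of any Real subtype
norm2d(p::Point{<:Real}) = sqrt(p.x^2 + p.y^2)
norm2d(Point(3, 4))                 # 5.0
```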
Tuple Types
- Function arguments are tuples, which are really like immutable structs that are parameterized by the type of each argument, but with some differences:
  - Tuple types can have any number of parameters
  - Tuple types are covariant, so `Tuple{Int}` is a subtype of `Tuple{Any}`.
  - Tuples do not have field names and are accessed by index.
- `(1, "foo", 2.5)` is a tuple
- Tuple types may end in `Vararg{T, N}`, which matches exactly `N` elements of type `T` (zero or more if `N` is omitted). There is a convenience `NTuple{N,T}` to represent a tuple type that has exactly `N` elements of type `T`.
Named Tuple types
- Is this the named list in R?
- It has a tuple of symbols and a tuple of field types: `NamedTuple{(:a, :b), Tuple{Int64, String}}`
- You can use `@NamedTuple` to provide a more convenient struct-like syntax:
@NamedTuple begin
    a::Int
    b::String
end
UnionAll Types
- `Array{<:Integer}` is effectively `Array{T} where T<:Integer`.
- You can define a short-form alias for a type using: `Vector{T} = Array{T, 1}`
Singleton Types
- Immutable composite types with no fields: `struct NoFields end`
Operations on Types
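- `isa(x, T)` checks whether the value `x` is an instance of the type `T`; `S <: T` checks whether the type `S` is a subtype of `T`
- `typeof()` returns the type of its argument
- `supertype()` returns the supertype of its argument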
Custom pretty-printing
- Overload the `show` function: `Base.show(io::IO, z::Polar)`
- You can also change the output based on the MIME type: `Base.show(io::IO, ::MIME"text/plain", z::Polar{T}) where {T}`
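A sketch of both `show` methods in action. The `Polar` struct here is my own guess at a minimal definition (radius and angle), so treat the field names as assumptions.

```julia
# Hypothetical polar-coordinate type with assumed fields r and θ
struct Polar{T<:Real}
    r::T
    θ::T
end

# Compact form, used by print, string(), and inside containers
Base.show(io::IO, z::Polar) = print(io, z.r, " * exp(", z.θ, "im)")

# Richer form, used when a value is displayed on its own (e.g. at the REPL)
Base.show(io::IO, ::MIME"text/plain", z::Polar{T}) where {T} =
    print(io, "Polar{", T, "} complex number:\n   ", z)

z = Polar(3.0, 4.0)
compact = string(z)                    # uses the two-argument show
rich = repr(MIME("text/plain"), z)     # uses the three-argument show
```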
Value types
- You can dispatch on an actual value by wrapping it in the `Val` type (e.g. `Val(true)`), but this is unlikely to be idiomatic.
Methods
To facilitate using many different implementations of the same concept smoothly, functions need not be defined all at once, but can rather be defined piecewise by providing specific behaviors for certain combinations of argument types and counts.
- A “function” is
  - an object that maps a tuple of arguments to a return value, or throws an exception if no appropriate value can be returned.
  - a conceptual operation that may be abstract
- A “method” is
  - a specific concrete implementation or behavior of a function.
  - a function definition usually creates just one method
- Even if the concrete implementations are quite different, well designed dispatch will appear very coherent from the outside.
- Multiple dispatch (like R’s S4 but more deliberately made lol)
  - The most specific method (lowest on the type tree) will be used.
  - Ambiguities in selecting the most specific method raise an error.
- Just define a function with type annotations and voila, it’s a method of that function (already that’s less typing than R)
- All conversion in Julia is explicit (also different from R), so arguments are never silently converted to make a method match
- Use `methods(f)` to figure out the methods attached to the generic function
- No type annotation means the argument is typed as `Any`, the apex type.
Parametric Methods
- You can also define methods like so: `same_type(x::T, y::T) where {T}`, which applies whenever both `x` and `y` are of the same type.
- You can also constrain those parameters by doing `where {T<:Real}`
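A short sketch of parametric methods. `same_type` follows the signature in the notes, with a catch-all fallback added; `add_same` is a made-up name for the constrained variant.

```julia
# Applies only when both arguments have the same concrete type
same_type(x::T, y::T) where {T} = true
# Fallback for everything else
same_type(x, y) = false

same_type(1, 2)        # true
same_type(1, 2.0)      # false

# Constrained parameter: both arguments must share one subtype of Real
add_same(x::T, y::T) where {T<:Real} = x + y
add_same(1, 2)         # 3
```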
Redefining methods
- You usually cannot use a new method definition immediately within the same expression that defined it (the caller is still running in an older “world age”); use `Base.invokelatest(f)` to get around this.
Design patterns
- Extracting the type parameter from a super-type - You can use this method to extract the type inside a parameterized type: `eltype(::Type{<:AbstractArray{T}}) where {T} = T`
- Building a similar type with a different type parameter - Use `Base.similar` to create a mutable array with the given element type and size. Use `Base.copyto!` to copy the data into it, so that you always get a copy.
- Iterated dispatch - You can dispatch first on the outer container then continue down the dispatch tree (similar to single dispatch).
- Trait-based dispatch (Holy trait) - This stuff is really getting over my head now haha.
map(f, a::AbstractArray, b::AbstractArray) = map(Base.IndexStyle(a, b), f, a, b)
# Base.IndexStyle(a, b) returns a trait that is going to return the best way
# to index that particular set of parameters, and then you can just fall back
# to the default implementation so you don't have to keep a tree.
map(::Base.IndexCartesian, f, a::AbstractArray, b::AbstractArray)
- Output-type computation - `Base.promote_type` decides the output type of the computation for the basic types.
- Separate convert and kernel logic - one way to reduce compile time is to isolate the conversion logic from the computation.
- Parametrically-constrained Varargs methods - share a type parameter between the object being operated on and the varargs, so you can constrain which calls the method applies to
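Since the trait pattern went over my head, here is a much smaller sketch of the same idea than the `IndexStyle` one. Everything in it (`SizeStyle`, `HasSize`, `NoSize`, `describe`) is invented for illustration and is not part of Base.

```julia
# 1. The trait: a tiny type hierarchy with no data
abstract type SizeStyle end
struct HasSize <: SizeStyle end
struct NoSize <: SizeStyle end

# 2. The trait function maps a type to a trait value
SizeStyle(::Type) = NoSize()                    # default for everything
SizeStyle(::Type{<:AbstractArray}) = HasSize()  # arrays opt in

# 3. The public entry point computes the trait and re-dispatches on it
describe(x) = describe(SizeStyle(typeof(x)), x)
describe(::HasSize, x) = "has $(length(x)) elements"
describe(::NoSize, x) = "size unknown"

describe([1, 2, 3])    # "has 3 elements"
describe(:a)           # "size unknown"
```

The design choice is that new types join a behavior by adding one `SizeStyle` method, without having to fit into a shared abstract supertype.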
Some function gotchas
- Note: keyword arguments do not participate in multiple dispatch
- Note: default arguments are implemented as extra methods, so they can be overridden if you define a more specific method afterwards.
Function-like objects
- You can add methods to a type so that its instances can be called like functions!
- This is very similar to R where functions are first class objects.
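A sketch of a function-like object, along the lines of the polynomial example in the Julia manual. `Polynomial` and its `coeffs` field are illustrative names.

```julia
# Coefficients are stored from lowest to highest degree
struct Polynomial{R}
    coeffs::Vector{R}
end

# Adding a method to the type makes every instance callable (Horner's rule)
function (p::Polynomial)(x)
    v = p.coeffs[end]
    for i in (length(p.coeffs) - 1):-1:1
        v = v * x + p.coeffs[i]
    end
    return v
end

p = Polynomial([1, 10, 100])
p(3)               # 1 + 10*3 + 100*3^2 = 931
map(p, [0, 1])     # callable objects work anywhere a function does
```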
Empty generic functions
- Use this to define a specific interface without providing any methods yet: `function emptyfunc end`
Method design and the avoidance of ambiguities
- Ambiguities can be hard to deal with; here are some alternatives to just defining a more specific method:
  - `Tuple` and `NTuple` arguments - in the empty case they are ambiguous as to type, so you can either define a method for the empty tuple or restrict your `NTuple` arguments to the case where there is at least one element
  - Orthogonalized design - nest the methods
  - Dispatch on one argument at a time (single dispatch)
  - Don’t define methods that dispatch on specific element types of abstract containers; instead define a generic method and construct a conceptual tree before specializing.
  - When recursing, avoid relying on default arguments because you can cause an infinite loop.
Constructors
- It’s just an automatically generated function that accompanies a struct.
- “Outer constructors” - You can create new methods for it just like any other function
  - Use the same value for several fields
  - Add default values
- “Inner constructors”
  - needed for 2 use cases:
    - enforcing invariants (validations)
      - ensures that no object (if immutable) will violate the invariant
      - you can’t enforce this with outer constructors because ultimately they must call an inner constructor
    - allowing construction of self-referential objects
      - for recursive applications, you can call `new` without all the fields
      - you can create a self-referential object like this
  - declared inside the block of a type declaration
  - special access to a local function called `new` that acts as the default constructor
struct OrderedPair
    x::Real
    y::Real
    OrderedPair(x,y) = x > y ? error("out of order") : new(x,y)
end
- the object above is now constrained so that `x <= y` (out-of-order pairs throw an error)
- this is not enforced if the struct is mutable, because fields can be changed after construction, so only immutable structs will have some sort of guarantee.
- Best practices for constructors:
  - as few inner constructors as possible
  - take all arguments explicitly and enforce essential error checking
  - provide conveniences as outer constructors
- Incomplete initialization
  - For recursion
- Parametric constructors
  - Use the `promote` function heavily so that you can rationalize with only 1 type.
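A hedged sketch combining the constructor ideas above: one inner constructor that enforces an invariant, plus outer constructors for convenience and promotion. `Vec2` and the finiteness check are made up for this example.

```julia
struct Vec2{T<:Real}
    x::T
    y::T
    # Inner constructor: every Vec2 passes through here, so the check cannot be bypassed
    function Vec2{T}(x, y) where {T<:Real}
        (isfinite(x) && isfinite(y)) || throw(ArgumentError("coordinates must be finite"))
        return new{T}(x, y)
    end
end

# Outer constructors are ordinary methods that end up calling the inner one
Vec2(x::T, y::T) where {T<:Real} = Vec2{T}(x, y)    # infer T from the arguments
Vec2(x::Real, y::Real) = Vec2(promote(x, y)...)     # mixed types: promote first
Vec2(x::Real) = Vec2(x, x)                          # same value for both fields

Vec2(1, 2.5)     # Vec2{Float64}(1.0, 2.5)
Vec2(2)          # Vec2{Int64}(2, 2) on a 64-bit machine
```

Because an inner constructor is defined, Julia does not generate the default constructors, which is why the `Vec2(x::T, y::T)` method has to be written out.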
Conversion and Promotion
- Two approaches to promotion:
  1. Automatic promotion for built-ins (Perl, Python)
     - `1 + 1.5` is automatically promoted to floating point
  2. No automatic promotion
     - quite inconvenient
- Julia falls into the no automatic promotion camp (#2), but implements promotion as a special application of polymorphic multiple dispatch. The rules can be edited by the user if they so choose, and user-defined types can participate.
Conversion
- Call `convert(T, x)` to convert `x` to type `T`; for many types this just calls the constructor on the object to be converted.
- `convert(::Type{T}, x)` is the function to which we add conversion methods.
- Julia does not automatically convert between strings and numbers.
- Conversion differs from construction in:
  - Mutable collections
  - Where it’s not really a conversion
  - Wrapper types - types that wrap other values.
  - Constructors that don’t return instances of their own type.
Defining new conversions
- It’s just a matter of defining a method for `convert`
- Only do this if implicit conversion is safe! Otherwise, rely on the constructor functions being explicitly called.
Promotion
- Standardization of types prior to an operation.
- Handled by the `promote()` function, but you don’t define the rules on `promote` itself; you define them on `promote_rule(::Type{T}, ::Type{S})`, which doesn’t take the actual values but the types. This is already symmetric so you don’t need to define the flipped pair. This feeds into a function called `promote_type` that you can then use to actually determine what type the values will end up being.
- Aiming to be as lossless as possible.
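A sketch of how conversion and promotion fit together, first with built-ins and then with a user-defined type. `Meters` is a hypothetical wrapper type, and whether implicit conversion into it is actually “safe” is an assumption made only for the example.

```julia
# Built-in behavior
promote(1, 2.5)                  # (1.0, 2.5)
promote_type(Int8, Float32)      # Float32
convert(Float64, 3)              # 3.0

# A user-defined type joining the same machinery
struct Meters
    value::Float64
end

# How to turn a plain number into Meters
Base.convert(::Type{Meters}, x::Real) = Meters(x)
# Which type wins when Meters meets any Real (one direction is enough)
Base.promote_rule(::Type{Meters}, ::Type{<:Real}) = Meters

# Define the operation once for the common type, then promote mixed calls into it
Base.:+(a::Meters, b::Meters) = Meters(a.value + b.value)
Base.:+(a::Meters, b::Real) = +(promote(a, b)...)

Meters(1.0) + 2                  # Meters(3.0)
```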
Interfaces
- Iteration interface
  - defining an `iterate` method will allow you to use
    - `for` loops
    - the `in` operator
    - `mean`, `std` etc
  - defining an `eltype` method tells Julia the element type, which allows things like specialized iterable code that’s faster
  - defining `length` allows us to preallocate and stuff like that
  - defining `firstindex` and `lastindex` (part of the indexing interface, together with `getindex`) allows us to specify the first and last valid indices so we can use the `begin` and `end` indices
- `AbstractArray` interface
  - `IndexStyle` is important to define for efficiency
  - This interface is extremely rich: declaring `struct SquaresVector <: AbstractArray{Int, 1}` and defining `size` and `getindex` for it is enough to make it iterable, indexable, and completely functional.
- Strided Arrays - a lot over my head
- Customizing broadcasting - also over my head but I can see how that can be super useful for building actual machine learning models.
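A sketch of both interfaces, closely following the `Squares` and `SquaresVector` examples from the Julia manual (bounds checking is left out to keep it short).

```julia
# Iteration interface: a lazy sequence of the first `count` squares
struct Squares
    count::Int
end
Base.iterate(S::Squares, state=1) = state > S.count ? nothing : (state * state, state + 1)
Base.eltype(::Type{Squares}) = Int
Base.length(S::Squares) = S.count

collect(Squares(4))      # [1, 4, 9, 16]
25 in Squares(10)        # true
sum(Squares(3))          # 14

# AbstractArray interface: size and getindex are the required methods
struct SquaresVector <: AbstractArray{Int, 1}
    count::Int
end
Base.size(S::SquaresVector) = (S.count,)
Base.IndexStyle(::Type{<:SquaresVector}) = IndexLinear()
Base.getindex(S::SquaresVector, i::Int) = i * i

sv = SquaresVector(4)
sv[end]                  # 16, `end` works because size is defined
sv .+ 1                  # broadcasting comes for free: [2, 5, 10, 17]
```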
Modules
- would be the whole package scope in an R package I suppose
- key properties are:
  - separate namespaces - avoids conflicts in function names
  - has `import` and `export` for defining what it needs and what it provides to other modules (by default, there is no private namespace)
  - Modules can be precompiled for faster loading and can contain code for runtime initialization.
- module code is typically organized into files and then read in using `include("file1.jl")`.
- although related, `include` just pastes the code in the file into wherever it is called, and modules can be composed around or within that any which way.
- `parentmodule` finds the module in which an object is contained
- You can reserve variable names by declaring `global x` so that it cannot be modified from outside the module.
- `using` loads the module and brings all of its exports into the current namespace, whereas `import` just brings the module name into scope, so you’d need to qualify everything in there in order to use it.
  - `using Module: name1, name2` only brings the specific names into scope, and the module name will not be in scope unless you also include it in the names list. You can’t add methods to a function brought in this way without qualifying it with the module name (as it’s “using”)
  - `import Module: name1, name2` brings in the specific names and also allows you to attach methods (usually done in other modules)
- You can use `as` to work around namespace conflicts, like `import CSV: read as rd` or `import BenchmarkTools as BT`. With `using`, `as` only works for individual names (`using CSV: read as rd`), not for the module itself.
- When modules export the same name:
  - Use qualified names, especially when they mean different things
  - use the `as` keyword to rename one or both with unique identifiers.
  - If they do share the same meaning then there is usually a unifying base package.
- `Base` and `Core` are available in every module unless you define a `baremodule`
- `import .Module` imports a module defined in the current scope, `import ..Module` imports a module defined in the parent module and so on. `..Module` is essentially a “sibling” module.
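A minimal sketch of the `using` / `import` / `as` behavior described above. The `Shapes` module and its contents are invented for illustration, and the code has to run at top level (for example pasted into the REPL), since modules cannot be defined inside functions. The `as` renaming assumes Julia 1.6 or later.

```julia
module Shapes

export area                      # only `area` is exported

struct Circle
    r::Float64
end

area(c::Circle) = pi * c.r^2
perimeter(c::Circle) = 2 * pi * c.r    # not exported, still reachable when qualified

end # module Shapes

using .Shapes                          # brings `Shapes` and the exported `area` into scope
import .Shapes: perimeter as circ      # import one unexported name under a new name

c = Shapes.Circle(1.0)                 # unexported names need qualification
area(c)                                # ≈ 3.14159
circ(c)                                # ≈ 6.28319
parentmodule(area)                     # Shapes
```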
Module initialization and pre-compilation
- Whenever something is `import`ed or `using`ed for the first time, the module is precompiled. You can trigger this manually using `Base.compilecache`.
- Put `__precompile__(false)` at the top of the module file to prevent a module from being precompiled.
- Don’t precompile anything that depends on external runtime state; instead set it up at runtime in the `__init__()` function, like handles to external C libraries and global constants that contain pointers returned by an external library.
- Dictionaries can be safe to precompile if the keys are generally standard types, not weird ones like `Function` or `DataType` or user-defined types without a defined hash method.
Documentation
- Docstrings basically, interpreted as Markdown handled by `Markdown.jl`.
- Standard for a docstring:
"""
bar(x[, y]) # function signature with indent, optional with [], kwargs with ;
Imperative format of title ending with a period.
Additional details in a second paragraph that explains more implem details. Be
concise and don't repeat yourself.
# Arguments
This part is only necessary if it's a complex function and those with kwargs.
See also: [`bar!`](@ref) # Hints to related functions
# Examples
Written as doctests whenever possible
\\```jldoctest
julia> code to run
output that matches the output exactly. Use [...] to truncate the output.
\\```
"""
- Use backticks for `code` and double backticks for LaTeX
- 92 characters wide max
- Function implementation details for custom types can go in an Implementation section, intended for developers rather than users.
- Long docstrings can be split off under an extended help header that can be shown explicitly by typing `??function` rather than `?function`
- Ideally only the generic function needs to be documented; no need to document the individual methods unless the behavior differs.
- The `@doc` macro lets you attach documentation to an expression programmatically (useful for generated code).
- Use `$($name)` to use string interpolation in docstrings
- You can define a method on `Docs.getdoc` for a type to customize the documentation shown for its instances.
- If you alias something, just document the original so that both can have the documentation.
- You can add a docstring to two things separated by a comma.
Metaprogramming
- Now this is going to be exciting!
- All this (including R) inherits from Lisp.
- Expressions have two parts (type `Expr`, e.g. `Expr(:call, :+, 1, 1)`)
  - head: indicating the type of expression - `ex.head`
  - args: the contents of the expression - `ex.args`
- Expressions can be nested and manipulated
- Symbols are represented as `:symbol`
- `Symbol("func", 10)` turns into `:func10`, so you can build symbols from strings.
- Quoting
  - creation of expression objects
  - `:(a + b * c + 1)` turns the quoted code into an expression
- Interpolation
  - manipulating the expression objects without `Expr`, using `$`
  - `ex = :($a + b)` turns into `:(1 + b)` because `a` is evaluated and `a = 1`.
  - You can also do splatting interpolation through `$(xs...)`. With `args = [:x, :y, :z]`, `:(f(1, $(args...)))` turns into `:(f(1, x, y, z))`
  - the `$` is kinda like `!!` in R and the splatted one is `!!!`
  - You can nest down levels of quoting through `$$$...`
  - `QuoteNode` avoids interpolating the `$`
- Evaluation
  - You can execute the expression using `eval`
  - `Module.eval` evaluates inside the global scope of the `Module`
- You can define functions that manipulate these expressions
- Macros
  - including generated code in the final body of a program
  - maps a tuple of arguments to a returned expression, which is compiled directly rather than evaluated at runtime
  - Think of the result of the macro being inlined into the code by the compiler and then that being compiled.
  - Inspect the expansion using `macroexpand`: `macroexpand(Main, :(@sayhello "Troy"))`
    - you need to include the module in which the macro will be evaluated.
  - Macros receive the `__source__` and `__module__` arguments, which contain the line number of the invocation and the information about the expansion module (like existing bindings).
  - Macros rename local variables to avoid name clashes with the calling scope (hygiene). You can use `esc(ex)` to escape this and ignore hygiene.
  - Macros are also functions and can therefore take advantage of multiple dispatch, but they dispatch based on the types of the AST, not the evaluated values of the expressions
macro sayhello()
    return :( println("Hello there!") )
end
- Boilerplate code can be generated programmatically using `eval` on a `quote ... end` with interpolation. The `@eval` macro is a shorthand for this eval/quote pattern.
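A sketch that walks through quoting, interpolation, evaluation, a small macro, and the `@eval` pattern. The macro `@twice` and the generated functions `double` and `triple` are made-up names, and the code is meant to run at top level (macros cannot be defined inside functions).

```julia
# Quoting builds an Expr; head and args expose its structure
ex = :(a + b * 2)
ex.head                      # :call
ex.args[1]                   # :+

# Interpolation splices values into a quoted expression
a = 1
ex2 = :($a + b)              # :(1 + b), a's value is spliced in, b stays a symbol
args = [:x, :y, :z]
ex3 = :(f(1, $(args...)))    # :(f(1, x, y, z))

# Evaluation happens in global scope
b = 41
eval(ex2)                    # 42

# A macro receives expressions and returns an expression;
# esc() lets the expression refer to the caller's variables
macro twice(ex)
    return quote
        $(esc(ex))
        $(esc(ex))
    end
end
counter = Ref(0)
@twice counter[] += 1        # counter[] is now 2

# The eval/quote pattern via @eval: generate boilerplate functions in a loop
for (name, factor) in ((:double, 2), (:triple, 3))
    @eval $name(x) = $factor * x
end
double(4), triple(4)         # (8, 12)
```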
Non-standard string literals
- You can define string literals like `r"^\s"`, which is a regex, or `b"..."`, which is a byte array literal.
- You can also take advantage of this by defining a macro named `{...}_str`, which allows you to write the string literal `{...}"..."`.
Generated functions
- You can define one using `@generated`; the function body sees the argument types and should return a quoted expression.
- I don’t get this at all haha
- There are also optionally generated functions.
- I feel like this is something where if we do it there’s something wrong.
Multi-dimensional arrays
- Arrays are first class but not really special compared to any other object: `AbstractArray`
- There’s no imperative to vectorize anything, it will be fast either way.
- All arguments to functions are passed by sharing (pointers).
- By convention, a function name ending with `!` means that it mutates its arguments.
- You can concatenate with `[1:2; 4:5]`; note that `[1:2, 4:5]` with a comma builds a vector of two ranges instead.
- List comprehensions: A way to construct arrays (like set construction in mathematics): `[ 0.25*x[i-1] + 0.5*x[i] + 0.25*x[i+1] for i=2:length(x)-1 ]`
- Generator expressions: without the brackets, these won’t be evaluated until you iterate on them.
- Indexing is done via square brackets: `A[I_1, I_2, ..., I_n]`
- `:` takes all indices for that particular dimension. `begin` and `end` are special words for the first and last index. The bracket syntax calls the `getindex` method for that type.
- Indexed assignment allows you to modify in place; it calls the `setindex!` method for that type.
- Supported index types
  - a scalar index, which is an integer or a `CartesianIndex` (an N-tuple of integers corresponding to the dims): `page[CartesianIndex(1, 1, 1)]`
  - an array of scalar indices: `page[1:4, 1:4, 1]`
  - an object that represents an array of scalar indices and can be converted using `to_indices`, such as `:`, or arrays of booleans like a conditional: `page[[true, true, false, false]]`
- Linear indexing is also possible when only one index `i` is given. This is done in column-major iteration order, as if the array were reshaped into a one-dimensional vector.
- You can omit trailing indices if those trailing dimensions have length one; you can also add extra trailing indices as long as they are `1`.
- Iteration
  - Use a `for a in A # do something; end` loop
  - Use a `for i in eachindex(A) # do something to A[i]; end` loop
    - `i` in this case will be an integer for linear indexing, otherwise a `CartesianIndex`.
- Dot syntax for vectorized operations
  - `sin.(x)`
  - `x .+ y`
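A sketch of the array features above on small made-up data (`x`, `A`), so the results are easy to check by hand.

```julia
# Comprehension: weighted moving average over the interior points
x = [1.0, 2.0, 4.0, 8.0, 16.0]
smoothed = [0.25*x[i-1] + 0.5*x[i] + 0.25*x[i+1] for i in 2:length(x)-1]   # [2.25, 4.5, 9.0]

# Generator: same syntax without brackets, evaluated lazily
total = sum(v^2 for v in 1:3)       # 14

# Concatenation versus a vector of ranges
joined = [1:2; 4:5]                 # [1, 2, 4, 5]
ranges = [1:2, 4:5]                 # 2-element vector holding two ranges

# Indexing a 2x3 matrix stored in column-major order
A = collect(reshape(1:6, 2, 3))     # [1 3 5; 2 4 6]
A[2, 3]                             # 6
A[5]                                # 5, linear index counts down the columns
A[CartesianIndex(1, 2)]             # 3
A[:, end]                           # [5, 6]
A[A .> 4]                           # [5, 6], boolean mask

# eachindex gives the most efficient index type for the array
doubled = similar(A)
for i in eachindex(A)
    doubled[i] = 2 * A[i]
end

# Indexed assignment modifies in place
A[1, :] .= 0                        # A is now [0 0 0; 2 4 6]
```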
Missing Values
- Yet another R feature that I really miss in Python, let’s see what the implementation is!
- `missing` in a math operation returns `missing`, but only because the core operations have handled these cases.
- Functions that don’t propagate missing values can be made to do so by wrapping them with `passmissing()` from `Missings.jl`.
- `missing == missing` returns `missing`, so use `ismissing(x)` to test for missings. But `isequal(missing, missing)` and `missing === missing` will return `true`
- `missing` is considered greater than any other value, so when sorted `missing`s will be at the end of the ascending order
- `true | missing` returns `true`
- `missing | true` returns `true`
- control flow does not allow missing values, nor do the short-circuiting ops.
- `skipmissing()` skips missing values (like the `na.rm = TRUE` argument)
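A quick sketch of the `missing` behavior listed above, using only Base (so no `passmissing`, which lives in a package). The vector `x` is made up.

```julia
x = [3, missing, 1]

1 + missing                  # missing, arithmetic propagates
missing == missing           # missing, not true
ismissing(x[2])              # true, the right way to test
isequal(missing, missing)    # true
missing === missing          # true

# Three-valued logic: the result is known only when the other side decides it
true | missing               # true
false | missing              # missing

sum(x)                       # missing
sum(skipmissing(x))          # 4, like na.rm = TRUE
sort(x)                      # [1, 3, missing], missing sorts last
```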
Networking and Streams
- Streams expose a `read` and a `write` method with the stream as the first argument.
- Files have `open`, returning an `IOStream` object that can be used. This can then be `close`d to flush to disk.
- TCP sockets are provided by the `Sockets` standard library.
Parallel Computing
- This is exciting! Native support for parallel images.
- Asynchronous tasks (coroutines)
  - Communication via `Channel`s
  - `wait` and `fetch` syntax.
  - Tasks are operations that can be interrupted and resumed at any time; is this pretty much like a `future` in R?
  - Syntax
    - `t = @task expr` to declare a task
    - `schedule(t)` schedules it for execution
    - `wait(t)` blocks until completion
    - the `@async` macro creates and schedules a task immediately
  - Channel communication
    - waitable first-in first-out queue.
    - Producers `put!(::Channel)` while consumers `take!(::Channel)`
    - `put!` blocks if the Channel is full, `take!` blocks if it’s empty.
    - `fetch` gets the value but doesn’t remove it.
    - closed channels can still be read from until they are empty.
- Multithreading
  - Set the `JULIA_NUM_THREADS` environment variable or use the command line argument `--threads` to start Julia with more than 1 thread.
  - Be careful to avoid data races by acquiring a lock around any data that you suspect will be part of a data race. You can also make primitive types atomic (thread-safe) by wrapping them in `Atomic`, like: `Threads.Atomic{Int}(0)`
  - Use the `Threads.@threads` macro on a for loop to execute it in parallel.
- Distributed computing
  - Running multiple Julia processes with separate memory spaces.
  - For this we have the `Distributed` standard library.
  - Primitives
    - remote references - can be used from any process to refer to an object stored on a particular process
      - `Future` - not rewritable.
      - `RemoteChannel` - rewritable, so good for multiple processes.
    - remote call - a request by one process to call a certain function on another process
      - returns a `Future` immediately, that you can `wait` and `fetch`.
  - Syntax
    - `remotecall(f, workerid, args...; kwargs...)` creates a future.
    - `remotecall_fetch` fetches the value immediately.
    - `@spawnat workerid expr` does the same thing but for expressions
      - you can `@spawnat :any` to let Julia pick whichever worker, or use the owners of whatever futures are already defined.
  - Considerations
    - Your code must be available on the worker. Use `@everywhere` to run a piece of code on all workers so that they all have the same environment. You can also use the `-L` julia argument to load a file on startup.
    - Global variables are transferred when they are in closures.
    - Minimize data movement as much as possible.
    - You can run parallel for loops by using `@distributed`.
      - `@distributed (+) for i = 1:2000000` will combine the results with a specific reduction (`+`).
      - if you are writing to an array inside the loop you need to use `SharedArrays`, otherwise each worker only modifies its own copy.
    - Use `pmap()` if you want to run an expensive function on each of multiple elements. Use `@distributed` if the calculation per iteration is rather tiny.
  - Setting up a cluster
    - local cluster using `-p`
    - remote cluster using `--machine-file`, definition is: `[count*][user@]host[:port] [bind_addr[:port]]`
    - `addprocs`, `rmprocs`, `workers` - you can manage processes
- GPU Computing
  - You can run Julia code natively on GPUs using the JuliaGPU.org packages.
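A hedged sketch touching the three CPU-side models above: tasks with a channel, a threaded loop with an atomic counter, and the `Distributed` primitives. To keep it runnable anywhere it adds no worker processes, so the distributed calls simply execute on process 1; the variable names are made up.

```julia
using Distributed

# --- Tasks and channels ---
ch = Channel{Int}(2)                 # buffered channel holding up to 2 items
producer = @async begin
    for i in 1:5
        put!(ch, i^2)                # blocks while the channel is full
    end
    close(ch)                        # consumers can still drain what is left
end
received = collect(ch)               # takes values until the channel is closed and empty
wait(producer)

# --- Multithreading ---
# Works with any thread count; start Julia with --threads 4 to see real parallelism
total = Threads.Atomic{Int}(0)
Threads.@threads for i in 1:1000
    Threads.atomic_add!(total, i)    # atomic update avoids a data race
end
total[]                              # 500500

# --- Distributed (no workers added, so everything runs on process 1) ---
fut = remotecall(+, 1, 2, 3)         # returns a Future immediately
fetch(fut)                           # 5
remotecall_fetch(*, 1, 2, 3)         # 6, call and fetch in one step
dsum = @distributed (+) for i in 1:100
    i                                # the value of each iteration is reduced with +
end
```

With real workers you would call `addprocs(n)` first and wrap shared definitions in `@everywhere`.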
Running External Programs
- Now this is going to be useful for the Singer spec to be used.
- As in shell, Perl, and Ruby, commands are written with backticks.
- This creates a `Cmd` object that can be connected to pipes, run, read, and written to. The command is not run immediately and its output is not captured by default.
- You can `open` a command to read from or write to an external command.
- The command is parsed just like a string literal
- Use `$` to interpolate values, like in string literals.
- You can use `pipeline` to construct a pipeline
- You can use `&` to run pipelines in parallel, and also put that in `pipeline`
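A small sketch of building and running commands. It assumes a Unix-like system with `echo` and `tr` on the PATH, so the part that actually runs anything is guarded by `Sys.isunix()`. The `exec` field of `Cmd` is an internal detail, shown here only to make the parsing visible.

```julia
name = "world"
files = ["a b.txt", "c.txt"]

# Backticks build a Cmd object; nothing runs yet
cmd = `echo hello $name`
cmd isa Cmd                      # true

# Interpolation is structural, not textual: each element becomes exactly one argument
listing = `ls $files`
listing.exec                     # ["ls", "a b.txt", "c.txt"], the space needs no quoting

# pipeline connects commands without going through a shell
upper = pipeline(cmd, `tr a-z A-Z`)

if Sys.isunix()
    output = read(cmd, String)       # capture stdout: "hello world\n"
    shouted = read(upper, String)    # "HELLO WORLD\n"
end
```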
Miscellaneous items
- Use `Sys.iswindows()`, `Sys.isapple()`, `Sys.isunix()`, `Sys.islinux()`, `Sys.isbsd()`, and `Sys.isfreebsd()` to manage operating system variation.
Performance Tips
- Avoid global variables
- Use `@time` to measure and watch out for memory allocation
- Make containers contain as concrete a type as possible.
- Type declarations are helpful in:
  - Fields of `struct`s - it’s also better to parameterize the types of fields in the types themselves.
- Avoid fields with abstract containers.
- Annotate values taken from untyped locations.
- Make Julia specialize on types as much as possible.
- If you have a lot of branching if statements, you will likely want to break that up into method definitions.
- Write type-stable functions, or annotate function return types
- Avoid changing the type of a variable, or annotate variable types
- Don’t abuse multiple dispatch by encoding values as types
- Access arrays in column-major order (how it’s stored in memory)
- Pre-allocate outputs
- Fuse vectorized functions
- Use views instead of copying slices, which also lets you modify the parent in place (`view` or the `@views` macro on the expression)
- Copying data is sometimes good if the original memory layout has already been jumbled up or shuffled
- For small fixed-size vectors, use `StaticArrays.jl`
- Avoid string interpolation for IO, as it builds a string first instead of writing directly onto the connection
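A few of the tips above as a sketch. All function names (`pos_unstable`, `pos_stable`, `normalize!`, `colsum`) are made up, and the example shows the patterns rather than measuring anything.

```julia
# Type-unstable: returns an Int for negative input and typeof(x) otherwise
pos_unstable(x) = x < 0 ? 0 : x
# Type-stable: zero(x) always has the same type as x
pos_stable(x) = x < 0 ? zero(x) : x

# Pre-allocated output plus a fused broadcast: one loop, no temporary arrays
function normalize!(out, x)
    out .= x ./ sum(x)
    return out
end

# Column-major traversal: the row index i varies fastest in the inner loop
function colsum(A)
    s = zero(eltype(A))
    for j in axes(A, 2), i in axes(A, 1)
        s += A[i, j]
    end
    return s
end

# A view refers to the parent's memory instead of copying the slice
M = [1 2; 3 4]
firstcol = @view M[:, 1]
firstcol .= 0                # M is now [0 2; 0 4]
```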
Workflow Tips
- Put the code in a temporary module (“Tmp.jl”) and then put the test code in another module (“Tst.jl”) so that you can keep `include`ing them as you develop your code.
- Use `Revise.jl`: make sure to load Revise before loading any code, so that any modules are going to be updated as they are edited, but there are limitations:
  - Changes to type definitions
  - Changes to vars and funcs that have the same name.
Style Guide
- 4 spaces per indentation level
- Write functions as soon as possible instead of scripts
- Avoid writing overly specific types
- Handle excess argument diversity in the caller
- Use `!` to mark functions that mutate their arguments.
- Avoid strange type `Union`s
- Avoid very very elaborate container types
- Avoid creating primitive types
- Provide default implementations for abstract types
- Function naming
  - Smoosh words together (e.g. `haskey`) and don’t abbreviate words
  - Use underscores when there are different concepts, e.g. `remotecall_fetch`
- Function arguments, in this order:
  - Functions first so you can use `do` syntax
  - I/O Stream
  - object being mutated
  - Type
  - object not being mutated
  - key
  - value
  - All others
  - Varargs
  - Kwargs
- Avoid errors rather than using try-catch to catch them
- Don’t parenthesize conditions (`if a == b` not `if (a == b)`)
- Don’t overuse splicing (`...`)
- Prefer instances to types
- Don’t use unnecessary static parameters
- Don’t overuse macros. If you use `eval` in a macro, it’s likely because you’re using it to replace a function
- Don’t expose unsafe operations at the interface level
- Don’t overload methods of base container types; define your own type based on a common abstract type
- Avoid type piracy if you aren’t collaborating with the owners of the package you imported
- Use `isa` or `<:` to test types, not `==`
- Use the least disruptive numeric types so as not to change the type of the input argument