Infrastructure & Tooling for Julialang Ecosystem
The Big Picture
Taken from Full Stack Deep Learning Spring 2021
The figure mainly shows the infrastructure and tools used in the Python ecosystem. I will follow the same structure to collect the infrastructure and tools in the Julia ecosystem.
Data
Sources
Filesystem
- locally mounted disk
- networked (e.g. NFS)
Database
- PostgreSQL
- SQLite
- InfluxDB: for time series
- QuestDB: for time series
Versioning
- DVC
For more information on data version control in Julia, please refer to this discussion thread
Exploration
- DataFrames.jl
Training/Evaluation
Computing
- desktop
- workstation
Resource Management
- Docker
Software Engineering
- Julia v1.6.5, LTS release
- VSCode
- Jupyter Notebook
- Pluto Notebook
- git
Frameworks & Distributed Training
Experiment Management
Hyperparameter Tuning
See the use case for Hyperopt.jl.
Deployment
CI/Testing
- GitHub Actions
Edge
- ONNX.jl
Appendix
Julia Packages
Useful Utilities
- Revise
- Test
- ReTest
- PkgTemplates
- BenchmarkTools
- Chain: piping
- LoggingExtras: Composable Loggers for the Julia Logging StdLib
- Memento: A flexible logging library for Julia
- JuliaFormatter
- DotEnv or ConfigEnv: loads environment variables from a .env file into ENV
- WandbMacros or WeightsAndBiasLogger or Wandb: logging to weights and biases (Wandb) dashboard.
Database
- LibPQ: LibPQ.jl is a Julia wrapper for the PostgreSQL libpq C library.
ML
- Flux: framework
- GeometricFlux: Geometric Deep Learning for Flux
- GraphNeuralNetworks
- FluxArchitectures: Complex neural network examples for Flux.jl.
- Transformer: Julia Implementation of Transformer models
- Metalhead: Computer vision models for Flux
- MLJ: ML framework
ML Preprocessing
- Augmentor: A fast image augmentation library in Julia for machine learning.
- MLDataUtils.jl
- MLUtils.jl
Multi-threading/Multi-processing/Distributed
- Dagger: A framework for out-of-core and parallel execution
- Polyester
Math/Statistics
- GeoStats: An extensible framework for high-performance geostatistics in Julia.
- QuadGK
- Random
- Distributions
- LinearAlgebra
- StatsBase
- Statistics
- DSP
Data table and manipulation
- DataFrames: In-memory tabular data in Julia
- Tables: An interface for tables in Julia
- FeatureTransforms: Transformations for performing feature engineering in machine learning applications
- TableTransforms: Transforms and pipelines with tabular data
- Impute: Imputation methods for missing data in Julia
- DataConvenience: Convenience functions missing in Julia
- Cleaner: A toolbox of simple solutions for common data cleaning problems.
Data IO
- Arrow
- Parquet
- CSV
- JSON3
- BSON
Notebook
- IJulia
- Pluto
Visualization
- Plots
  - Backends
    - GR
    - PlotlyJS, PlotlyBase
    - PyPlot
  - Extensions
    - StatsPlots
    - GraphRecipes
- Makie
  - Backends
    - CairoMakie
    - GLMakie (need GPU)
    - WGLMakie
  - Extensions
    - AlgebraOfGraphics
    - GraphMakie
    - GeoMakie
- VegaLite
- Gadfly
- Compose
- Colors
- ColorSchemes
Text processing
- LaTeXStrings
Built-in packages (standard library)
- LinearAlgebra
- Statistics
- Markdown
- Printf