Agent Skill
2/7/2026
julia-performance-tips
Apply Julia performance optimization techniques when writing or optimizing Julia code. Use when optimizing functions, diagnosing performance issues, reviewing code for performance, or when the user asks about Julia performance, type stability, memory allocation, or code optimization.
kristianholme
npx skills add KristianHolme/.dotfiles
SKILL.md
---
name: julia-performance-tips
description: Apply Julia performance optimization techniques when writing or optimizing Julia code. Use when optimizing functions, diagnosing performance issues, reviewing code for performance, or when the user asks about Julia performance, type stability, memory allocation, or code optimization.
---
Julia Performance Tips
Essential performance optimization guidelines for Julia code. Reference: https://docs.julialang.org/en/v1/manual/performance-tips/
Core Principles
Functions and Globals
- Put performance-critical code in functions - code inside functions runs faster than top-level code
- Avoid untyped global variables - use `const` for globals, or pass as function arguments
- Break functions into multiple definitions - prefer `f(x::Vector) = ...` over `if isa(x, Vector) ... end`
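
A minimal sketch of the three points above. The names `SCALE`, `scaled_sum`, and `describe` are hypothetical and chosen only for illustration.

```julia
# A `const` global has a fixed type, so the compiler can optimize reads of it.
const SCALE = 2.5

# Performance-critical work lives in a function and receives its data as an
# argument instead of reading an untyped global.
function scaled_sum(xs)
    total = SCALE * zero(eltype(xs))   # accumulator already has the result type
    for x in xs
        total += SCALE * x
    end
    return total
end

# One method per argument type instead of `isa` branches inside one function.
describe(x::Vector) = "vector of length $(length(x))"
describe(x::Number) = "number"
```

Calling `scaled_sum([1.0, 2.0])` and `describe([1, 2, 3])` exercises both patterns.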
Type Stability
- Write type-stable functions - return consistent types: use `zero(x)` not `0`, `oneunit(x)` not `1`
- Avoid changing variable types - initialize with correct type: `x::Float64 = 1` not `x = 1` then `x /= ...`
- Use function barriers - separate type-unstable setup from type-stable computation
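
The following sketch illustrates a type-stable return value and a function barrier. It is adapted from the pattern shown in the Julia manual; the function names are illustrative.

```julia
# Type-stable: the result always has the same type as the argument.
pos(x) = x < 0 ? zero(x) : x

# Type-stable kernel: compiled separately for each concrete array type.
function fill_twos!(a)
    for i in eachindex(a)
        a[i] = 2
    end
    return a
end

# Type-unstable setup: the element type is only known at run time.
function strange_twos(n)
    T = rand(Bool) ? Int64 : Float64
    a = Vector{T}(undef, n)
    # Function barrier: the loop runs inside `fill_twos!`, where the
    # concrete type of `a` is known to the compiler.
    return fill_twos!(a)
end
```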
Type Annotations
- Avoid abstract type parameters - prefer `Vector{Float64}` over `Vector{Real}`
- Use parametric types for struct fields - `struct MyType{T}; a::T; end` not `struct MyType; a::AbstractFloat; end`
- Annotate values from untyped locations - `x = a[1]::Int32` when working with `Vector{Any}`
- Force specialization when needed - `f(t::Type{T}) where T` not `f(t::Type)` for `Type`, `Function`, `Vararg`
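
As a rough illustration of the field and annotation advice above, the sketch below contrasts an abstractly typed field with a parametric one. `LooseBox`, `TightBox`, and `first_plus_one` are hypothetical names.

```julia
# Abstract field type: the compiler cannot know the concrete type of `a`.
struct LooseBox
    a::AbstractFloat
end

# Parametric field type: `a` is concrete for every concrete `TightBox{T}`.
struct TightBox{T<:AbstractFloat}
    a::T
end

# Annotating a value read from an untyped container gives the compiler a
# concrete type for everything that follows (and throws if the claim is wrong).
function first_plus_one(a::Vector{Any})
    x = a[1]::Int32
    return x + one(x)
end
```

`isconcretetype(fieldtype(LooseBox, :a))` is `false`, while `isconcretetype(fieldtype(TightBox{Float64}, :a))` is `true`.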
Memory Management
- Pre-allocate outputs - use in-place functions `f!(out, args...)` and pre-allocate `out`
- Use views for slices - `@views` or `view()` instead of `array[1:5, :]` when possible
- Fuse vectorized operations - `@. 3x^2 + 4x` fuses into single loop, `3x.^2 + 4x` creates temporaries
- Unfuse when recomputing - if broadcast recomputes constant values, pre-compute: `let s = sqrt.(d); x ./= s end`
- Access arrays column-major - inner loop should vary first index: `for col, row` not `for row, col`
- Copy irregular views when beneficial - copying non-contiguous views can speed up repeated operations
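
A short sketch combining several of these memory tips. All function names (`scale!`, `poly`, `first_rows_sum`, `colmajor_sum`) are hypothetical.

```julia
# In-place kernel: the caller pre-allocates `out`, so no new array is created.
function scale!(out, x, c)
    @. out = c * x
    return out
end

# `@.` fuses the whole expression into a single loop with one output array.
poly(x) = @. 3x^2 + 4x

# A view avoids copying the slice before reducing over it.
first_rows_sum(A) = sum(@view A[1:2, :])

# Column-major traversal: the last loop variable is the innermost one,
# so the row (first) index varies fastest.
function colmajor_sum(A)
    s = zero(eltype(A))
    for col in axes(A, 2), row in axes(A, 1)
        s += A[row, col]
    end
    return s
end
```

Typical usage pre-allocates once, for example `out = similar(x)`, and then calls `scale!(out, x, 2.0)` repeatedly.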
Closures
- Type-annotate captured variables - `r::Int = r0` in closure scope
- Use `let` blocks - `f = let r = r; x -> x * r end` avoids boxing
- Use `@__FUNCTION__` for recursive closures - `(@__FUNCTION__)(n-1)` instead of `fib(n-1)`
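
The `let` block tip can be sketched as follows, following the pattern in the Julia manual. The function name `abmult` is illustrative.

```julia
function abmult(r0::Int)
    r = r0
    if r < 0
        r = -r          # reassignment: capturing `r` directly would box it
    end
    # The `let` block introduces a fresh binding that is never reassigned,
    # so the closure captures a plain `Int` instead of a boxed variable.
    f = let r = r
        x -> x * r
    end
    return f
end
```

For example, `abmult(-3)(4)` multiplies 4 by the absolute value 3.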
Advanced Types
- Use `Val` for compile-time values - `f(::Val{N}) where N` when dimension known at compile time
- Avoid excessive type parameters - only use values-as-parameters when processing homogeneous collections
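
A small sketch of passing a dimension through `Val`, modeled on the example in the Julia manual. It assumes the caller knows `N` as a literal or constant; otherwise constructing `Val(N)` at run time reintroduces dynamic dispatch.

```julia
# `N` is part of the argument's type, so the number of dimensions of the
# result can be inferred at compile time.
array3(fillval, ::Val{N}) where {N} = fill(fillval, ntuple(d -> 3, Val(N)))

# Usage with a literal dimension: a 3x3 matrix filled with 5.0.
m = array3(5.0, Val(2))
```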
Performance Tools
- `@time` - measure time and allocations (ignore first run, it's compilation)
- `@code_warntype` - find type instabilities (red = non-concrete types)
- `@allocated` - measure memory allocations
- Profiling - use Profile.jl or ProfileView.jl for bottlenecks
- JET.jl - static analysis for performance issues
- `--track-allocation=user` - find allocation sources
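
As a hedged illustration of the built-in macros listed above, the sketch below inspects a deliberately unstable function next to a stable one. `unstable` and `stable` are hypothetical names, and the exact timing and allocation figures depend on the machine and Julia version.

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

unstable(x) = x < 0 ? 0 : x        # returns an Int or the type of x
stable(x) = x < 0 ? zero(x) : x    # always returns the type of x

# Warm-up call so that compilation is excluded from the measurements.
stable(1.0)

@time stable(1.0)                  # prints elapsed time and allocations
bytes = @allocated stable(1.0)     # bytes allocated by the call

# Prints the inferred types; the Union return type of `unstable` is flagged.
@code_warntype unstable(1.0)
```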
Performance Annotations
- `@inbounds` - disable bounds checking (use with caution)
- `@fastmath` - allow floating-point optimizations (may change results)
- `@simd` - promise independent loop iterations (experimental, use carefully)
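
One possible way to apply `@inbounds` and `@simd` safely is to validate the sizes once, before the loop. `dot_fast` is a hypothetical name for this sketch.

```julia
function dot_fast(x::Vector{Float64}, y::Vector{Float64})
    # Check lengths up front; inside the loop, bounds checks are disabled.
    length(x) == length(y) ||
        throw(DimensionMismatch("x and y must have equal length"))
    s = 0.0
    # @simd allows the additions to be reordered, which can change
    # floating-point rounding slightly compared with a plain loop.
    @inbounds @simd for i in eachindex(x)
        s += x[i] * y[i]
    end
    return s
end
```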
Miscellaneous
- Avoid unnecessary arrays - `x+y+z` not `sum([x,y,z])`
- Use `abs2` for complex numbers - `abs2(z)` not `abs(z)^2`
- Use `div`, `fld`, `cld` - not `trunc(x/y)`, `floor(x/y)`, `ceil(x/y)`
- Fix deprecation warnings - they add lookup overhead
- Avoid string interpolation for I/O - `println(file, a, " ", b)` not `println(file, "$a $b")`
- Use `LazyString` for conditional strings - `lazy"..."` for error paths
- Set `OPENBLAS_NUM_THREADS=1` when using `JULIA_NUM_THREADS>1` for multithreaded code
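
A few of these tips in code form. This is a sketch with illustrative variable names; the `lazy"..."` literal assumes Julia 1.8 or later.

```julia
z = 3.0 + 4.0im
sq = abs2(z)             # squared magnitude without computing a square root

q_trunc = div(7, 2)      # integer division, rounds toward zero
q_floor = fld(-7, 2)     # rounds toward negative infinity
q_ceil = cld(7, 2)       # rounds toward positive infinity

# Pass values to println directly instead of building an interpolated string.
io = IOBuffer()
println(io, 1, " ", 2)
line = String(take!(io))

# A LazyString is only rendered if it is actually displayed or converted,
# which keeps rarely taken error paths cheap.
n = 5
msg = lazy"unexpected value $n"
```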
Package Performance
- Use PrecompileTools.jl - reduce time-to-first-execution
- Minimize dependencies - use package extensions for optional features
- Avoid heavy `__init__()` - minimize compilation in initialization
- Use `@time_imports` - diagnose slow package loading