Joblib

 


Joblib is a set of tools that provide lightweight pipelining in Python. In particular, it offers transparent disk-caching of functions with lazy re-evaluation (the memoize pattern) and easy, simple parallel computing.

Why is it used?

  • Better performance
  • Reproducibility
  • Avoids computing the same thing twice
  • Persists results to disk transparently

Features

  • Transparent and fast disk-caching of output values
  • Embarrassingly parallel helper
  • Fast compressed persistence


Importing libraries

from joblib import Memory, Parallel, delayed, dump, load
import pandas as pd
import numpy as np
import math


Data Creation

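# Directory used for the cache and the saved files
my_dir = '/content/sample_data'
# Vandermonde matrix of [0, 1, 2]: each row is [x**2, x**1, x**0]
a = np.vander(np.arange(3))
print(a)
output:
[[0 0 1]
 [1 1 1]
 [4 2 1]]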


Memory

# Store cached results on disk under my_dir
mem = Memory(my_dir)
# Wrap np.square so repeated calls with the same input are read from the cache
sqr = mem.cache(np.square)
b = sqr(a)
print(b)
output:
[[ 0  0  1]
 [ 1  1  1]
 [16  4  1]]
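
To see that the cached wrapper really skips recomputation, here is a minimal sketch. The helper slow_square, the calls list, and the temporary cache directory are illustrative names that are not part of the original post; verbose=0 only silences joblib's log messages.

```python
import tempfile

import numpy as np
from joblib import Memory

# Temporary cache directory (an assumption; any writable path works)
cache_dir = tempfile.mkdtemp()
mem = Memory(cache_dir, verbose=0)

calls = []  # records every real execution of slow_square


def slow_square(x):
    # Hypothetical function standing in for an expensive computation
    calls.append(1)
    return np.square(x)


cached_square = mem.cache(slow_square)

a = np.vander(np.arange(3))
first = cached_square(a)   # computed and written to the cache
second = cached_square(a)  # same input, so the result is read back from disk

print(len(calls))
print(np.array_equal(first, second))
```

The function body runs only on the first call; the second call with the same input is served from the cache directory.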


Parallel

%%time
Parallel(n_jobs=1)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 2.85 ms, sys: 0 ns, total: 2.85 ms
Wall time: 3 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

%%time
Parallel(n_jobs=2)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 42.7 ms, sys: 762 µs, total: 43.5 ms
Wall time: 75.9 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]

%%time
Parallel(n_jobs=3)(delayed(np.square)(i) for i in range(10))
output: CPU times: user 92.9 ms, sys: 8.93 ms, total: 102 ms
Wall time: 151 ms
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
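
In the timings above, wall time grows as n_jobs increases, because squaring a single integer is far cheaper than starting workers and passing data to them. Parallel tends to pay off once each task does a meaningful amount of work. The sketch below assumes a hypothetical slow_square function that sleeps for 0.1 s to stand in for real work, and it uses prefer="threads" so that it runs in any environment; both are choices made for this illustration and are not part of the original post.

```python
import time

from joblib import Parallel, delayed


def slow_square(x):
    # Hypothetical stand-in for an expensive task
    time.sleep(0.1)
    return x * x


def timed_run(n_jobs):
    # Run ten tasks with the given number of workers and measure wall time
    start = time.perf_counter()
    results = Parallel(n_jobs=n_jobs, prefer="threads")(
        delayed(slow_square)(i) for i in range(10)
    )
    return results, time.perf_counter() - start


serial_results, serial_time = timed_run(1)
parallel_results, parallel_time = timed_run(2)

print(serial_results)
print(f"n_jobs=1: {serial_time:.2f} s, n_jobs=2: {parallel_time:.2f} s")
```

Exact timings depend on the machine, so none are shown here. The results are returned in the same order as the inputs regardless of n_jobs.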


Dump

dump(a,'/content/sample_data/a.job')
output: ['/content/sample_data/a.job']


Load

aa = load('/content/sample_data/a.job')
print(aa)
output:
[[0 0 1]
 [1 1 1]
 [4 2 1]]
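
The "fast compressed persistence" feature listed earlier can be tried through the compress argument of dump. This is a minimal sketch, assuming a temporary directory and the file name a_compressed.job (both chosen here for illustration) and a compression level of 3.

```python
import os
import tempfile

import numpy as np
from joblib import dump, load

a = np.vander(np.arange(3))

# Temporary location and file name are assumptions for this example
tmp_dir = tempfile.mkdtemp()
path = os.path.join(tmp_dir, "a_compressed.job")

# compress takes a level from 0 (none) to 9 (smallest file, slowest)
written = dump(a, path, compress=3)

# load detects the compression on its own; no extra argument is needed
restored = load(path)

print(written)
print(np.array_equal(a, restored))
```

Higher levels trade write speed for smaller files, so a moderate level is a reasonable starting point.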


References

https://github.com/SharathHebbar/Data-Science-and-ML/tree/main/codes/joblib

https://medium.com/@sharathhebbar24/joblib-78669d181069


Documentation: https://joblib.readthedocs.io

Download: https://pypi.python.org/pypi/joblib#downloads

Source code: https://github.com/joblib/joblib

Report issues: https://github.com/joblib/joblib/issues


