How to install `numpy` on M1 Max with the most accelerated performance (Apple's vecLib)? Here's the answer as of Dec 6, 2021.

    ---
    ## Steps
    ### I. Install miniforge
So that your Python runs natively on arm64 rather than being translated via Rosetta 2 (a quick sanity check for this is sketched at the end of this section).
1. Download [Miniforge3-MacOSX-arm64.sh](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-MacOSX-arm64.sh).
2. Run the script, then open a new shell:
    ```bash
    $ bash Miniforge3-MacOSX-arm64.sh
    ```
3. Create an environment (here named `np_veclib`):
    ```bash
    $ conda create -n np_veclib python=3.9
    $ conda activate np_veclib
    ```
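To confirm that the Python in this environment really runs natively on arm64 rather than under Rosetta 2, a quick check like the sketch below should print `arm64` (this is only a sanity check, not part of the installation):
```python
# Check whether this Python interpreter is a native arm64 build (not x86_64 under Rosetta 2).
import platform

print(platform.machine())   # expected: 'arm64' for a native Apple-silicon build
print(platform.platform())  # e.g. 'macOS-...-arm64-arm-64bit' on Apple silicon
```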
    ### II. Install Numpy with BLAS interface specified as vecLib
1. To compile `numpy`, first install `cython` and `pybind11`:
    ```bash
    $ conda install cython pybind11
    ```
2. Compile `numpy` with pip (thanks to @Marijn's [answer](https://stackoverflow.com/a/66536896/13571357)) - don't use `conda install`!
    ```bash
    $ pip install --no-binary :all: --no-use-pep517 numpy
    ```
3. An alternative to step 2 is to build `numpy` from source:
    ```bash
    $ git clone https://github.com/numpy/numpy
    $ cd numpy
    $ cp site.cfg.example site.cfg
    $ nano site.cfg
    ```
Edit the copied `site.cfg` and add the following lines:
    ```
    [accelerate]
    libraries = Accelerate, vecLib
    ```
    Then build and install:
    ```bash
    $ NPY_LAPACK_ORDER=accelerate python setup.py build
    $ python setup.py install
    ```
4. After either step 2 or step 3, test whether `numpy` is using vecLib:
```python
    >>> import numpy
    >>> numpy.show_config()
    ```
Then output containing `/System/Library/Frameworks/vecLib.framework/Headers` should be printed.
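If you prefer a programmatic check, the sketch below captures the output of `numpy.show_config()` and looks for the Accelerate/vecLib markers (it assumes `show_config()` prints to stdout, as it does in the NumPy versions discussed here):
```python
# Capture numpy.show_config() output and check that the Accelerate/vecLib framework shows up.
import io
from contextlib import redirect_stdout

import numpy as np

buf = io.StringIO()
with redirect_stdout(buf):
    np.show_config()
config_text = buf.getvalue().lower()

uses_veclib = 'accelerate' in config_text or 'veclib' in config_text
print(f'numpy {np.__version__} linked against vecLib/Accelerate: {uses_veclib}')
```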
### III. Installing further packages with conda
Make conda recognize packages installed by pip:
```bash
conda config --set pip_interop_enabled true
```
This step is necessary; otherwise, if you later run e.g. `conda install pandas`, `numpy` will show up in the `The following packages will be installed` list and be installed again - but that reinstalled copy comes from the conda-forge channel and is slow.
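To make sure a later `conda install` did not silently replace the pip-built `numpy`, you can re-check which copy is actually imported - a small sketch, assuming the `np_veclib` environment from the steps above:
```python
# After installing other packages with conda, confirm the vecLib-linked numpy is still in use.
import numpy as np

print(np.__version__)  # version of the numpy that actually gets imported
print(np.__file__)     # its location; should point inside the np_veclib environment
np.show_config()       # should still mention Accelerate/vecLib rather than openblas
```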

    ---

    ## Comparisons to other installations:

    ### 1. Competitors:
Besides the optimal installation above, I also tried several others:
    + A. `np_default`: `conda create -n np_default python=3.9 numpy`
    + B. `np_openblas`: `conda create -n np_openblas python=3.9 numpy blas=*=*openblas*`
    + C. `np_netlib`: `conda create -n np_netlib python=3.9 numpy blas=*=*netlib*`

The A/B/C options above are installed directly from the conda-forge channel. `numpy.show_config()` shows identical results for all of them; to see the difference, inspect `conda list` - e.g. the `openblas` packages are installed in B. Note that `mkl` and `blis` are not supported on arm64.
+ D. `np_openblas_source`: First install OpenBLAS with `brew install openblas`, then add the `[openblas]` path `/opt/homebrew/opt/openblas` to `site.cfg` and build `numpy` from source.

The results table also includes numbers from other machines:
+ `M1` and `i9–9880H`, as reported in this [post](https://towardsdatascience.com/m1-macbook-pro-vs-intel-i9-macbook-pro-ultimate-data-science-comparison-dde8fc32b5df).
+ My old `i5-6360U` (2 cores) on a 13-inch MacBook Pro (2016).

    ### 2. Benchmarks:
    Here I use two benchmarks:
1. `mysvd.py`: my SVD benchmark
    ```python
    import time
    import numpy as np
    np.random.seed(42)
    a = np.random.uniform(size=(300, 300))
    runtimes = 10

    timecosts = []
    for _ in range(runtimes):
    s_time = time.time()
    for i in range(100):
        a += 1
        np.linalg.svd(a)
    timecosts.append(time.time() - s_time)

    print(f'mean of {runtimes} runs: {np.mean(timecosts):.5f}s')
    ```
2. `dario.py`: A benchmark script by [Dario Radečić](https://gist.github.com/daradecic/a2ac0a75d7e5f22c9aa07174dcbbe061/raw/a56ee217e6d3f949b1d1f719a7a134cef130cd9f/macs.py), from the post linked above. Its results for `M1` and `i9–9880H` are reported in the table below.
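One way to collect the numbers below is to run each benchmark script inside every environment with `conda run`; the sketch below assumes the environment names used above and that `mysvd.py` / `dario.py` sit in the current directory:
```python
# Run a benchmark script inside each conda environment and print its output.
import subprocess

ENVS = ['np_veclib', 'np_default', 'np_openblas', 'np_netlib', 'np_openblas_source']
SCRIPT = 'mysvd.py'  # or 'dario.py'

for env in ENVS:
    result = subprocess.run(
        ['conda', 'run', '-n', env, 'python', SCRIPT],
        capture_output=True, text=True,
    )
    print(f'{env}: {result.stdout.strip() or result.stderr.strip()}')
```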

    ### 3. Results:
    ```
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
|  sec  | np_veclib | np_default | np_openblas | np_netlib | np_openblas_source | M1 | i9–9880H | i5-6360U |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| mysvd |  1.02300  |   4.29386  |   4.13854   |  4.75812  |      12.57879      |  / |     /    |  2.39917 |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
| dario |     21    |     41     |      39     |    323    |         40         | 33 |    23    |    78    |
+-------+-----------+------------+-------------+-----------+--------------------+----+----------+----------+
    ```