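Anaconda is a scientific-computing distribution of Python that bundles a wide range of scientific packages such as numpy, pandas, and sklearn. As students, besides using the Anaconda distribution itself, we can apply for Anaconda's academic license, which lets us download some extra packages that speed up computation.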

Register, apply, download

Registration URL: https://anaconda.org/

The email must be a school address, otherwise the application will not succeed. Mine is **@stu.xmu.edu.cn

After registering, click your avatar in the upper-right corner and choose My Setting.

Select add ons.

Download the three licenses listed on the right: MKL Optimizations, IOPro, and Anaconda Accelerate.

Installation

First open a command prompt and run the following command to find out where licenses are installed (note: I am using Windows as the example here).

conda info --license

(Note: in my output the path contains ASUS, which is my user's home folder; adjust this for your own machine.)

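Next, put the three downloaded licenses (the txt files) into the .continuum folder. If you open your home folder (the ASUS folder in my case) and do not see .continuum, remember to enable showing hidden files. If it still does not exist, create the .continuum folder yourself.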
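The copy step can also be scripted. The following is only a minimal sketch: the function name `install_licenses`, the `*.txt` pattern, and the demo file names are assumptions made for illustration, and it presumes the download folder holds nothing but the three license files. On a real machine the target would typically be `Path.home() / ".continuum"`.

```python
import shutil
import tempfile
from pathlib import Path


def install_licenses(download_dir, license_dir, pattern="*.txt"):
    """Copy every file matching pattern from download_dir into license_dir."""
    download_dir = Path(download_dir)
    license_dir = Path(license_dir)
    # Create the (usually hidden) license folder if it does not exist yet.
    license_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in sorted(download_dir.glob(pattern)):
        shutil.copy2(src, license_dir / src.name)
        copied.append(src.name)
    return copied


if __name__ == "__main__":
    # Demo on throwaway folders; the file names below are hypothetical.
    with tempfile.TemporaryDirectory() as tmp:
        downloads = Path(tmp) / "Downloads"
        downloads.mkdir()
        for name in ("license_mkl.txt", "license_iopro.txt"):
            (downloads / name).write_text("placeholder")
        print(install_licenses(downloads, Path(tmp) / ".continuum"))
```

The demo runs entirely inside a temporary directory, so it does not touch any real license folder.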

Then open a command prompt and run each of the following in turn:

conda install accelerate
conda install iopro

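Note that the commands are run one at a time. Installing accelerate also installs mkl to satisfy its dependencies, so there is no need to run conda install mkl separately.

When installing the accelerate module, for example, just follow the prompts; the download may be a little slow :)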
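Once the installs finish, a quick way to confirm that the packages are visible to Python is to ask the import system for them. This is a rough sketch: the helper name `installed` is made up here, and the module names are assumed to match the conda package names, which may not hold for every package.

```python
import importlib.util


def installed(names):
    """Map each top-level module name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed module names; adjust them if a package imports under another name.
    for name, ok in installed(["accelerate", "iopro", "mkl", "numba"]).items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

`find_spec` only locates a module without importing it, so the check stays cheap even for large packages.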

Testing

Taking accelerate as the example, the add ons page describes what accelerate does:

Fast Python for GPUs and multi-core with NumbaPro and MKL Optimizations.

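With accelerate installed, we can use the GPU to speed up computation. However, I searched the official documentation, Stack Overflow, Google, and Baidu without finding anything on using the accelerate library by itself. Most discussions focus on using numba after installing it, so here I use numba to measure how much faster things get. Before testing I updated the graphics driver. The next step is to check whether the machine's graphics card is supported. Open a command prompt, start Python, and enter: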

import numba.cuda.api,numba.cuda.cudadrv.libs
numba.cuda.cudadrv.libs.test()
numba.cuda.api.detect()

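The first line imports the libraries, the second checks that the libraries are installed correctly, and the third determines whether the graphics card supports acceleration.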
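The same check can be wrapped in a small function that degrades gracefully when numba or a CUDA driver is missing. This is a sketch, not the method used above: it relies on `numba.cuda.is_available()`, which I believe exists in recent numba releases but may be absent from older ones, and the function name `cuda_status` is made up for illustration.

```python
def cuda_status():
    """Return a short string describing whether numba can use a CUDA GPU."""
    try:
        from numba import cuda
    except ImportError:
        return "numba not installed"
    try:
        available = cuda.is_available()
    except Exception:
        # Driver or library problems can surface as exceptions here.
        return "cuda check failed"
    return "cuda available" if available else "cuda not available"


if __name__ == "__main__":
    print(cuda_status())
```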

Below is the test code (found online and slightly modified; I could not have written it myself yet):

import numpy as np
from numba import jit

nobs = 1000000

def proc_numpy(x, y, z):
    # These 4 lines represent use cases where the processing time is mostly
    # a function of, say, 50 to 200 lines of fairly simple numerical operations.
    x = x*2 - (y * 55)
    y = x + y*2
    z = x + y + 99
    z = z * (z - .88)
    return z

@jit
def proc_numba(xx, yy, zz):
    # As pointed out by Llopis, this for loop is not needed here. It is here by
    # accident because in the original benchmarks I was doing data creation
    # inside the function instead of passing it in as an array. In any case,
    # this redundant code seems to have something to do with the code running
    # faster. Without it, the numba and numpy functions are exactly the same.
    for j in range(nobs):
        x, y = xx[j], yy[j]
        x = x*2 - (y * 55)
        y = x + y*2
        z = x + y + 99
        z = z * (z - .88)
        zz[j] = z
    return zz

x = np.random.randn(nobs)
y = np.random.randn(nobs)
z = np.zeros(nobs)
res_numpy = proc_numpy(x, y, z)
z = np.zeros(nobs)
# The first call compiles proc_numba, so the timings below exclude compilation.
res_numba = proc_numba(x, y, z)
# %timeit is an IPython/Jupyter magic; run these two lines there.
%timeit proc_numpy(x,y,z)
%timeit proc_numba(x,y,z)

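In my run, the first line is the numpy version at 2.06 ms and the second is the numba-compiled version at 121 μs, so in this example it is about 17 times faster. Note that a plain @jit compiles the function to CPU machine code, so this figure reflects JIT compilation rather than GPU execution.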
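For readers running outside IPython, the comparison can be approximated with the standard `timeit` module. The sketch below is an assumption-laden stand-in for `%timeit`: the helper names `best_time` and `speedup` are made up for illustration, and `sum` merely stands in for the functions being measured.

```python
import timeit


def best_time(fn, *args, repeat=5, number=100):
    """Return the best per-call time in seconds over several timing runs."""
    runs = timeit.repeat(lambda: fn(*args), repeat=repeat, number=number)
    return min(runs) / number


def speedup(baseline_s, optimized_s):
    """How many times faster the optimized version is than the baseline."""
    return baseline_s / optimized_s


if __name__ == "__main__":
    # Timings reported above: 2.06 ms for numpy, 121 microseconds for numba.
    print(round(speedup(2.06e-3, 121e-6), 1))

    # best_time works on any callable; sum() is just a stand-in here.
    print(best_time(sum, range(1000)))
```

Taking the minimum over several repeats reduces the influence of background activity on the measurement.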