Learn Jupyter Notebook and You Are Already Ahead of 90% of Python Programmers

Source: 大数据

This article is about 2,800 characters long; suggested reading time is 8 minutes.

This article summarizes 28 tips for using Jupyter Notebook.

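[ Introduction ] I have been doing my recent experiments in Jupyter Notebook, which feels as convenient as working on scratch paper. I came across a blog post on Dataquest that summarizes 28 Jupyter Notebook tips. To make it easier to follow, I have explained and expanded some of the terser parts of the original. I hope it makes your work in Jupyter Notebook smoother.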

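Jupyter Notebook, formerly known as IPython Notebook, is a flexible tool for Python programming that helps you create readable analyses. It lets you keep code, images, comments, formulas and visualizations together in one place. This article covers some very practical advanced techniques that turn Jupyter Notebook into a powerful programming tool.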

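01 Keyboard shortcuts

Jupyter Notebook has many keyboard shortcuts, and using them will speed up your work. To see the full list, open Help > Keyboard Shortcuts in the menu bar, or press H in command mode. It is worth checking for new shortcuts every time you update Jupyter.
Another way to reach them is the command palette, opened with Ctrl + Shift + P. In this dialog you can run any command by typing its name: to restart the kernel, for example, type 'restart' and the palette lists the matching commands. It works much like Spotlight on a Mac.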

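Some shortcuts I use often:

Esc enters command mode
In command mode: A/B inserts a new cell above/below, M switches the cell to Markdown, Y switches it back to code, D + D deletes the current cell
Enter returns from command mode to edit mode
Shift + Tab shows the documentation for the object you just typed
Ctrl + Shift + - splits the current cell at the cursor
Esc + F finds and replaces in your code (outputs are not searched)
Esc + O toggles the cell output
You can also operate on several cells at once: Shift + J or Shift + Down extends the selection downward, Shift + K or Shift + Up extends it upward, and Shift + M merges the selected cells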

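02 Tidy variable output

When a cell ends with a variable name, its value is displayed without a print statement. This is especially handy for Pandas DataFrames.
A lesser-known trick: set the ast_node_interactivity option to "all" to display several variables from one cell without print.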

    from IPython.core.interactiveshell import InteractiveShell
    InteractiveShell.ast_node_interactivity = "all"

    from pydataset import data
    quakes = data('quakes')
    quakes.head()
    quakes.tail()
    # Both head and tail are displayed, not just tail

To make every Jupyter cell behave this way, create the file ~/.ipython/profile_default/ipython_config.py and add the following:

    c = get_config()

    # Run all nodes interactively
    c.InteractiveShell.ast_node_interactivity = "all"
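
To see what the setting changes, here is a minimal sketch of the idea in plain Python. It is a simplified model, not IPython's real implementation: the helper name run_cell is hypothetical, and the sketch only distinguishes the default "last_expr" mode from "all".

```python
import ast


def run_cell(source, interactivity="last_expr"):
    """Execute `source` and return the values a notebook would display.

    Hypothetical helper: a simplified model of the behaviour,
    not IPython's actual code.
    """
    tree = ast.parse(source)
    namespace = {}
    shown = []
    last_index = len(tree.body) - 1
    for index, node in enumerate(tree.body):
        if isinstance(node, ast.Expr):
            # Bare expressions are always evaluated ...
            code = compile(ast.Expression(node.value), "<cell>", "eval")
            value = eval(code, namespace)
            # ... but only displayed if the mode allows it.
            display = interactivity == "all" or index == last_index
            if display and value is not None:
                shown.append(value)
        else:
            # Statements (assignments, imports, ...) never display anything.
            code = compile(ast.Module([node], type_ignores=[]), "<cell>", "exec")
            exec(code, namespace)
    return shown


cell = "x = [1, 2, 3]\nx[0]\nx[-1]"
print(run_cell(cell))         # [3]
print(run_cell(cell, "all"))  # [1, 3]
```

In the default mode only the final expression is shown; with "all", every bare expression in the cell produces output.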

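03 Quick links to documentation

The Help menu links to the documentation of common libraries such as NumPy, Pandas, SciPy and Matplotlib. You can also prefix a method name with ? to view its documentation.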

    # Run this line in Jupyter Notebook
    ?str.replace

    # It displays the documentation:
    Docstring:
    S.replace(old, new[, count]) -> str

    Return a copy of S with all occurrences of substring
    old replaced by new.  If the optional argument count is
    given, only the first count occurrences are replaced.
    Type:      method_descriptor

04 Plotting in notebooks

Commonly used plotting libraries include matplotlib, Seaborn, mpld3, bokeh, plot.ly and Altair.

05-15 Magic commands

Because Jupyter is built on the IPython kernel, it can use the kernel's magic commands.
Run %lsmagic to list all of them.

%env: set environment variables

    # Running %env without any arguments
    # lists all environment variables

    # The line below sets the environment
    # variable OMP_NUM_THREADS
    %env OMP_NUM_THREADS=4

    # Output
    env: OMP_NUM_THREADS=4
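
For comparison, the same three operations can be done outside a notebook with the standard library. This is a rough plain-Python equivalent, not a description of how %env is implemented.

```python
import os

# Roughly `%env OMP_NUM_THREADS=4`: set the variable for this process.
# Environment values are always strings.
os.environ["OMP_NUM_THREADS"] = "4"

# Roughly `%env OMP_NUM_THREADS`: read a single variable back.
threads = os.environ.get("OMP_NUM_THREADS")
print(threads)

# Roughly a bare `%env`: take a snapshot of every variable.
all_vars = dict(os.environ)
print(len(all_vars), "variables set")
```

Either way, set such variables before importing the library that reads them, since many libraries only check the environment when they are first loaded.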

%run: execute Python code

    # This will execute and show the output from
    # all code cells of the specified notebook
    %run ./two-histograms.ipynb

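%load: insert an external script

Sometimes you want to run an external script but add some code to it in Jupyter first. In that case, load it into a cell.

    # Suppose you have a file hello_world.py containing:
    if __name__ == "__main__":
        print("Hello World!")

    # Load it in Jupyter with %load
    %load ./hello_world.py

    # After running %load ./hello_world.py, the cell contains the lines below
    # (the %load statement you ran is shown commented out)
    # %load ./hello_world.py
    if __name__ == "__main__":
        print("Hello World!")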

%store: pass variables between notebooks

    # In notebook A
    data = 'this is the string I want to pass to different notebook'
    %store data
    del data  # This has deleted the variable

    # In notebook B
    %store -r data
    print(data)
    # Prints: this is the string I want to pass to different notebook

%who: list variables, optionally filtered by type

    # A cell contains the following four lines
    one = "for the money"
    two = "for the show"
    three = "to get ready now go cat go"
    %who str

    # Output
    one   three   two
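
The filtering that %who performs can be approximated in a few lines. The sketch below is a simplified model: the helper who and the explicit namespace dictionary are hypothetical stand-ins for the magic and for the notebook's global variables.

```python
def who(namespace, type_name=None):
    """Return the sorted public names in `namespace`, optionally by type name."""
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("_")
        and (type_name is None or type(value).__name__ == type_name)
    )


# Stand-in for the notebook's global variables.
namespace = {
    "one": "for the money",
    "two": "for the show",
    "three": "to get ready now go cat go",
    "count": 3,
}

str_names = who(namespace, "str")
all_names = who(namespace)
print(str_names)  # ['one', 'three', 'two']
print(all_names)  # ['count', 'one', 'three', 'two']
```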

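%%time and %%timeit

%%time reports timing information for a single run of the cell. %%timeit runs the code many times and reports a statistical summary. The original post describes the default as 100,000 runs with the mean of the fastest three; current IPython versions instead pick the loop count automatically and report the mean and standard deviation over several repeats.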
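
Outside a notebook, a similar measurement can be made with the standard-library timeit module. The sketch below is an approximation of the idea rather than what the magic does internally; the function squares is just a sample workload.

```python
import timeit


def squares(n):
    """Sample workload to be timed."""
    return [i ** 2 for i in range(n)]


timer = timeit.Timer(lambda: squares(1000))

# Let timeit choose a loop count large enough for a stable measurement.
loops, _ = timer.autorange()

# Repeat the whole measurement a few times and keep the best run.
runs = timer.repeat(repeat=5, number=loops)
best_per_loop = min(runs) / loops

print(f"{loops} loops, best of 5: {best_per_loop * 1e6:.1f} usec per loop")
```

Taking the minimum of several repeats reduces the influence of other processes that happen to run during a measurement.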

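%%writefile and %pycat: export a cell's contents / show an external script's contents

%%writefile saves the contents of the cell to an external file. %pycat does the reverse: it displays the contents of an external file with syntax highlighting.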
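
In plain Python terms, the pair behaves roughly like writing a string to a file and reading it back. The sketch below assumes a hypothetical file name greet.py and uses a temporary directory so it leaves nothing behind.

```python
import tempfile
from pathlib import Path

# The text that would be the body of a `%%writefile greet.py` cell.
cell_source = 'def greet(name):\n    return f"Hello, {name}!"\n'

with tempfile.TemporaryDirectory() as tmp:
    script = Path(tmp) / "greet.py"  # hypothetical file name
    # Roughly what %%writefile does: save the cell body to disk.
    script.write_text(cell_source, encoding="utf-8")
    # Roughly what %pycat does, minus the syntax highlighting.
    shown = script.read_text(encoding="utf-8")

print(shown)
```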

%prun: show call statistics for each function in your program
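
The kind of report %prun produces can be reproduced outside a notebook with the standard-library profiler. This is a minimal sketch; slow_sum and work are made-up sample functions.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    """Sample function that does some measurable work."""
    return sum(i * i for i in range(n))


def work():
    """Call slow_sum repeatedly so it shows up in the profile."""
    return [slow_sum(20_000) for _ in range(20)]


profiler = cProfile.Profile()
profiler.runcall(work)

# Collect the report in a string instead of printing it directly.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(10)
report = buffer.getvalue()
print(report)
```

The report lists, for each function, how many times it was called and how much time was spent in it, sorted here by cumulative time.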

%pdb: debug code

The debugger commands are documented at:

https://docs.python.org/3.5/library/pdb.html#debugger-commands

High-resolution plots for Retina displays

    import matplotlib.pyplot as plt

    # Regular plot
    x = range(1000)
    y = [i ** 2 for i in x]
    plt.plot(x, y)
    plt.show();

    # Retina plot
    %config InlineBackend.figure_format = 'retina'
    plt.plot(x, y)
    plt.show();

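16 Suppress output with a trailing semicolon

Ending the last line of a cell with a semicolon suppresses the display of its return value, which is handy after plotting calls.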

17 Run shell commands

Prefix a shell command with ! to run it from a notebook cell.
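
In a notebook this looks like !ls or !pip list. For comparison, the sketch below does the same job in plain Python with subprocess. It uses the Python interpreter itself as a stand-in for a shell command so that it runs on any platform, and the two file names it prints are made up.

```python
import subprocess
import sys

# `sys.executable -c ...` stands in for an arbitrary shell command;
# in a notebook you might simply write `!ls` instead.
result = subprocess.run(
    [sys.executable, "-c", "print('a.csv'); print('b.csv')"],
    capture_output=True,
    text=True,
    check=True,
)

# Split the captured output into lines, similar to `files = !ls` in a notebook.
lines = result.stdout.splitlines()
print(lines)  # ['a.csv', 'b.csv']
```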


18 Write formulas with LaTeX

LaTeX written in a Markdown cell is rendered as a formula by MathJax.
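
As an illustration, typing the following into a Markdown cell should render Bayes' theorem as a displayed formula (the choice of formula is only an example):

```latex
$$ P(A \mid B) = \frac{P(B \mid A)\, P(A)}{P(B)} $$
```

Single dollar signs, as in $e^{i\pi} + 1 = 0$, produce an inline formula instead.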

19 Run code from different kernels in one notebook
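
One common way to do this is with IPython's script cell magics, such as %%bash, %%ruby or %%perl, which hand the body of the cell to another interpreter. Strictly speaking these start a subprocess rather than a second kernel. The sketch below shows that mechanism in plain Python under stated assumptions: the helper run_with is hypothetical, and the Python interpreter stands in for the other language so the example runs anywhere.

```python
import subprocess
import sys


def run_with(interpreter, source):
    """Pipe `source` to `interpreter` on stdin and return what it prints.

    Hypothetical helper illustrating the idea behind script cell magics.
    """
    completed = subprocess.run(
        interpreter,
        input=source,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout


# `python -` reads a program from stdin; ["bash"] or ["ruby"] would work
# the same way if those interpreters are installed.
output = run_with([sys.executable, "-"], "print(6 * 7)\n")
print(output)  # 42
```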


20 Install other kernels for Jupyter

Jupyter is not limited to Python: install an R kernel and you can use it for R programming.

    # Install with Anaconda
    conda install -c r r-essentials

    # Manual installation
    # First download and install R from https://cloud.r-project.org
    # Then run the following in the R console
    install.packages(c('repr', 'IRdisplay', 'crayon', 'pbdZMQ', 'devtools'))
    devtools::install_github('IRkernel/IRkernel')
    IRkernel::installspec()  # to register the kernel in the current R installation

21 Run R and Python in the same notebook

Install rpy2 with pip install rpy2.

    %load_ext rpy2.ipython
    %R require(ggplot2)
    # Output: array([1], dtype=int32)

    import pandas as pd
    df = pd.DataFrame({
        'Letter': ['a', 'a', 'a', 'b', 'b', 'b', 'c', 'c', 'c'],
        'X': [4, 3, 5, 2, 1, 7, 7, 5, 9],
        'Y': [0, 4, 3, 6, 7, 10, 11, 9, 13],
        'Z': [1, 2, 3, 1, 2, 3, 1, 2, 3]
    })

    # In a separate cell (%%R must be the first line of its cell)
    %%R -i df
    ggplot(data = df) + geom_point(aes(x = X, y = Y, color = Letter, size = Z))

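22 Write functions in other languages

Sometimes numpy is not fast enough and I need to write some fast code.
In principle, you can compile a function into a dynamic library and write a Python wrapper for it...
But it is much nicer to have that boring part done for you, right?
You can write functions in Cython or Fortran and call them directly from Python code.
First, install Cython and fortran-magic:

    !pip install cython fortran-magic

    %load_ext Cython

    # In a separate cell
    %%cython
    def multiply_by_2(float x):
        return 2.0 * x

    # In a separate cell
    multiply_by_2(23.)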

Personally, I recommend Fortran:

    %load_ext fortranmagic

    # In a separate cell
    %%fortran
    subroutine compute_fortran(x, y, z)
        real, intent(in) :: x(:), y(:)
        real, intent(out) :: z(size(x, 1))

        z = sin(x + y)
    end subroutine compute_fortran

    # In a separate cell
    compute_fortran([1, 2, 3], [4, 5, 6])

23 Multi-cursor editing

Jupyter supports multiple cursors: hold down Alt and drag the mouse to edit several lines at once.

24 Install Jupyter extensions

Jupyter-contrib extensions is a collection of extensions with many useful add-ons, including a spell checker and a code formatter.
To install Jupyter-contrib extensions, run pip install jupyter_contrib_nbextensions followed by jupyter contrib nbextension install --user.

After a successful installation, Jupyter-contrib extensions shows up in the interface as a menu bar entry.

25 Create slides from a notebook

With RISE installed, you can create PowerPoint-style presentations from an existing notebook.

Install RISE with conda install -c damianavila82 rise or pip install RISE.

    # Enable RISE
    jupyter-nbextension install rise --py --sys-prefix
    jupyter-nbextension enable rise --py --sys-prefix

26 The Jupyter output system

The IPython.display module lets you arrange and display multimedia content in a cell's output.
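
Your own classes can take part in this rich output too. The notebook looks for a _repr_html_ method on an object and, if it finds one, renders the returned HTML instead of the plain repr. The class below is a hypothetical example that needs no imports.

```python
class Swatch:
    """A colour that renders as a small coloured box in a notebook.

    Hypothetical example class, not part of any library.
    """

    def __init__(self, hex_code):
        self.hex_code = hex_code

    def __repr__(self):
        # Used in plain terminals and as a fallback.
        return f"Swatch({self.hex_code!r})"

    def _repr_html_(self):
        # Picked up by the notebook when the object is displayed.
        return (
            f'<div style="width:3em;height:1.5em;'
            f'background:{self.hex_code}"></div>'
        )


swatch = Swatch("#1f77b4")
html = swatch._repr_html_()
print(repr(swatch))  # Swatch('#1f77b4')
```

Ending a notebook cell with swatch would show the coloured box, while a plain Python shell would show the text repr.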

27 Big data analysis

For querying and processing big data, ipyparallel, pyspark and the %%sql magic command are recommended.

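28 Sharing notebooks

Sharing the *.ipynb file is usually the simplest option. To share with people who do not use Jupyter, you have several choices:

Convert the notebook to an HTML file with File - Download as - HTML
Share notebooks on github or gist.github.com
Set up your own sharing system with jupyterhub
Store your notebook on dropbox and paste the link into https://nbviewer.jupyter.org
Save the notebook as a PDF with File - Download as - PDF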

Finally, I hope this recommendation helps you enjoy using Jupyter Notebook~

References

28 Jupyter Notebook Tips, Tricks, and Shortcuts

https://www.dataquest.io/blog/jupyter-notebook-tips-tricks-shortcuts

Contact the author

github:

https://github.com/keloli

blog:

https://www.sigusoft.com/u/d055ee434e59

Editor: 于腾凯

Proofreader: 林亦霖
