Package Installation in Python and R
What are the packages pre-installed for every hub?
Package installation varies across the different hubs. We ensure that basic Python packages such as NumPy, pandas, scikit-learn, and matplotlib are installed on the main Datahub. Our R hubs support packages such as shiny, dplyr, tidyr, and RSQLite. You can customize the packages for a hub by requesting them using this template. If you want to check the list of packages installed:
- You can use the below command for Python,
```
!pip list
```
- You can use the below command for R,
```
installed.packages()
```
- You can check the packages installed in Julia by accessing the Julia Hub.
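The installed Python packages can also be listed from Python itself, which is handy when the output needs to be filtered or saved. The following is a minimal sketch using only the standard library; the helper name `installed_packages` is hypothetical and not part of any hub tooling.

```python
from importlib import metadata


def installed_packages():
    """Return a sorted list of (name, version) pairs for the current environment."""
    pairs = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip distributions whose metadata is incomplete
            pairs.add((name, dist.metadata.get("Version")))
    # Sort case-insensitively by package name, similar to `pip list`.
    return sorted(pairs, key=lambda pair: pair[0].lower())


for name, version in installed_packages():
    print(f"{name}=={version}")
```

The output reflects the environment of the running kernel, so it may differ between hubs.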
What should I do if I want to install more packages?
Use your Datahub instance to install the required version of the package. Installing packages yourself in your hub instance is a temporary measure, useful for identifying dependencies. If you require a permanent solution, you need to request that we install the required package(s) in your hub.
If you want to install packages for Python in your instance, then use the following syntax:
pip install <package-name>
Eg: pip install numpy
If you want to install packages for R in your instance, then use the following syntax:
install.packages("<package-name>")
Eg: install.packages("dplyr")
Check whether the installed package has specific dependencies. Include the package name, its version, and its dependencies in your request.
Raise a request using this template!
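To collect the details for a request, the version and declared dependencies of a package you have already installed in your instance can be read from its metadata. This is a rough sketch using the standard library; `describe_package` is a hypothetical helper, and the package name in the usage line is only an illustration.

```python
from importlib import metadata


def describe_package(name):
    """Return the version and declared dependencies of an installed package.

    Returns None if the package is not installed in the current environment.
    """
    try:
        version = metadata.version(name)
    except metadata.PackageNotFoundError:
        return None
    # requires() returns None when the package declares no dependencies.
    requires = metadata.requires(name) or []
    return {"name": name, "version": version, "requires": list(requires)}


# Illustrative usage: prints a dict if pandas is installed, otherwise None.
print(describe_package("pandas"))
```

The `requires` entries are the dependency specifiers declared by the package itself, which is usually what a maintainer needs to see in the request.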
Many Python packages are pre-installed on JupyterHub and available by default. This file on GitHub contains a list of all the packages (and corresponding versions) that are available by default. To use a pre-installed package such as numpy, you can simply type the line below into a code cell.
import numpy
Installing New Packages
You can also install other packages that are not on this list. There are two methods for installation. If you will be using a package regularly in your course, we recommend using the long-term installation method.
Notebooks provide support for bash commands in code cells. The line below, when run in a notebook code cell, will temporarily install numpy into a user’s personal account. numpy will then be available for use while the server is running. This cell must be run every time the user’s server is restarted. Note, this is not a system-wide installation. Running the cell below will only install numpy temporarily into a user’s personal account.
!pip install numpy
import numpy
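Because the install cell has to be re-run after every server restart, it can help to skip the install when the package is already importable. The sketch below is one possible way to do that; `ensure_installed` is a hypothetical helper, and it calls pip through `sys.executable` so that the package lands in the same environment the notebook kernel is using.

```python
import importlib
import importlib.util
import subprocess
import sys


def ensure_installed(module_name, pip_name=None):
    """Install a package with pip only if its module cannot be found.

    Returns True if an install was attempted, False if the module was
    already available.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False  # already available, nothing to do
    subprocess.check_call(
        [sys.executable, "-m", "pip", "install", pip_name or module_name]
    )
    importlib.invalidate_caches()  # let the import system see the new package
    return True


# json ships with Python, so this call installs nothing.
print(ensure_installed("json"))  # prints False

# In a notebook you would call, for example:
# ensure_installed("numpy")
```

This is still a temporary, per-user installation; it only avoids repeating work within a running server.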
This is the recommended method for packages that will be used frequently.
Our JupyterHub is deployed from the berkeley-dsep-infra/datahub GitHub repository. The staging branch reflects the state of the staging hub while the prod branch reflects the state of the production hub.
Make sure to specify a version for any library you install. If you do not, it is likely that the deployment process will break at some point during the semester. Omitting a version does not give the user environment the latest version at all times; it only gets the latest version that existed on the date the CI process ran. If you want to use an unreleased version of a library, specify the corresponding git SHA of that library’s repository.
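As an illustration of the pinning rule, the sketch below flags requirement lines that have neither an exact `==` version nor a git commit SHA. It assumes the packages are listed in pip requirements format, which may not match how a given hub image is actually configured; the function name, the example entries, and the repository URL are all hypothetical, and the SHA check is only a heuristic.

```python
import re

# An exact pin such as "numpy==1.26.4" (optionally with extras like pkg[extra]).
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?\s*==\s*[^\s;]+")
# A VCS requirement ending in "@<hex sha>", e.g. git+https://...@0123abc
GIT_SHA = re.compile(r"^git\+.+@[0-9a-f]{7,40}(#.*)?$")


def unpinned(lines):
    """Return requirement lines that carry neither an == pin nor a git SHA."""
    problems = []
    for raw in lines:
        # Drop comments (a '#' at line start or after whitespace) and blanks.
        line = re.sub(r"(^|\s)#.*$", "", raw).strip()
        if not line:
            continue
        if PINNED.match(line) or GIT_SHA.match(line):
            continue
        problems.append(line)
    return problems


example = [
    "# example entries, for illustration only",
    "numpy==1.26.4  # pinned",
    "pandas",
    "scikit-learn>=1.0",
    "git+https://github.com/example-org/example-lib@0123abc",
]
print(unpinned(example))  # prints ['pandas', 'scikit-learn>=1.0']
```

A range specifier such as `>=1.0` is reported here because it can still resolve to different versions on different build dates.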