Onboarding new users to the Hub

Note

New to Datahub? Interested in learning more about the services offered by the hub? If so, read on!

I am an instructor planning to teach using Datahub. How do I onboard myself?

Dear Instructor, here is some logistical information that should make onboarding easy for you.

Datahub Team - Communications Overview

  • GitHub Issues: The best way to request help with the Datahub is to file an issue on the GitHub page for the Datahub deployment: https://github.com/berkeley-dsep-infra/datahub/issues

  • Slack Channel: For urgent troubleshooting, you or your course staff can also use uctech.slack.com (anyone with a berkeley.edu account can get in) and join the #ucb-datahubs channel to touch base with the infra team.

  • Documentation: You can also refer to the FAQ section of this support documentation (Curriculum Guide), where we regularly update solutions to some of the reported issues.

Datahub Team - Common Requests

  • Packages: Check whether all the needed Python/R packages and their required versions are installed in your respective hub. If not, please use the following template to raise a request to the infra team.

  • Admin Access: You can use this template to add course staff instructors as admins for the requested hub. Your requests should get completed within two to three working days.

  • Big Assignments / High Use times: You can share the important date(s)/time(s) for workshops/exams/assignments etc. when you expect the compute requirement to be larger than usual during this semester. We will review your request and get back to you directly about the feasibility of increasing the compute resources during the mentioned timeframe. You can provide us with all the relevant information using this template.
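For the Packages item above, one quick way to see what is already installed is to run a short check in a notebook on your hub before filing a request. The snippet below is a minimal sketch, not an official Datahub tool: it uses Python's standard `importlib.metadata` module, and the package names in `required` are hypothetical placeholders that you would replace with your course's own list.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if absent."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Hypothetical course requirements; replace with your own list.
required = ["numpy", "pandas", "datascience"]

for name, installed in installed_versions(required).items():
    print(f"{name}: {installed if installed else 'NOT INSTALLED'}")
```

Any package reported as not installed, or installed at a version other than the one you need, is a candidate for a package request.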

Datahub Team - News and Updates

  • Newsletter: Are you interested in learning about the latest infrastructure changes, case studies of faculty doing interesting work, and scheduled workshops related to Datahub and other tools in the Berkeley Data Science Teaching Stack? If so, check out our monthly newsletter and subscribe if you find the information useful! Some new things to know about:

  • Nbgitpuller Plugin: There is a new plugin in the Google Chrome store that creates nbgitpuller links from within GitHub pages!

Finally, provide us with any feedback that will help us improve our hub operations. We want to ensure that you have a smooth experience teaching this semester.

How can I learn more about Datahub to onboard myself?

What languages are supported by the hub?

Datahub primarily supports three languages: Python, R, and Julia. However, we can also support other languages on a case-by-case basis. If you have a unique requirement for using a different programming language as part of your hub, email your exact requirement to Eric Van Dusen/Balaji Alwar or raise a GitHub issue.

How many hubs across the campus exist? Which courses use them extensively?

We have 15+ hubs that cater to the diverse needs of the campus audience. The main Datahub serves multiple departments/courses across the campus. In addition, we have separate hubs for courses such as Data 8, Data 100, Public Health, etc., serving the needs of the teaching teams and enrolled students. You can learn more about some of the hubs deployed through this link.

What is the default Memory/CPU requirement for every hub?

Datahub has a memory limit of 1 GB of RAM, which should meet the teaching/research needs of most of our users. If you want to know more about the memory consumption in your instance, follow these steps:

  • Look at the top right corner of your Python/R notebook for the term memory. It shows the amount of memory you have consumed out of the amount of memory provided to your instance.

  • If your consumed memory is greater than the available memory for your instance, you may require an increase in RAM.

../_images/memory.png

Fig. 6 Here is where you can find the memory related details!

Please contact us if your course/research has more complex computation requiring increased capacity.
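As a rough complement to the memory indicator described above, you can also estimate memory use from inside a Python notebook. The sketch below is an assumption-laden illustration rather than a Datahub feature: it uses the standard `resource` module (Unix only), it measures only the peak memory of the current kernel process rather than your whole instance, and it treats the 1 GB limit as 1024**3 bytes.

```python
import resource
import sys

# Assumption: the 1 GB default limit is taken to mean 1024**3 bytes.
LIMIT_BYTES = 1 * 1024**3


def peak_memory_bytes():
    """Peak resident memory of the current process, in bytes."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    return peak if sys.platform == "darwin" else peak * 1024


def fraction_of_limit(used_bytes, limit_bytes=LIMIT_BYTES):
    """Share of the memory limit represented by used_bytes."""
    return used_bytes / limit_bytes


used = peak_memory_bytes()
print(f"Peak kernel memory: {used / 1024**2:.0f} MB "
      f"({fraction_of_limit(used):.0%} of the assumed 1 GB limit)")
```

Because other kernels and terminals in the same instance also count toward the limit, the indicator in the notebook toolbar remains the better guide; this check is mainly useful for spotting a single notebook that is close to the limit on its own.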

What are the different services offered as part of the Datahub?

We offer user interfaces for Classic Jupyter Notebook, RStudio, and JupyterLab across different hubs. You can learn more about the services offered through this jupyterhubs services documentation.

What is the process to raise GitHub issues? How can I track the raised issues?

If you want to report a bug, you can use this link to submit your issue. You can also post messages to our Piazza channel if you require real-time troubleshooting!

When do I receive a response when an issue gets raised?

Note

Please refer to the Service Level Agreement (SLA) below for the different request types. The SLA clock starts once we have complete information regarding your request:

  • SLA for package installation: “Acknowledgement within two working days.”

  • SLA for RAM increase: “Acknowledgement within two working days.”

  • SLA for admin access: “Request completed within two working days.”

  • SLA for data archival request: “Request completed within three working days.”

Are there existing templates for submitting requests to the infrastructure team?

We have categorized the common requests we get from our users into templates that you can reuse to make your request. The following templates cover different scenarios:

Raise Bugs: If you found a bug in the workflow, please use this template to raise an issue.

Share a New Enhancement: If you envision a new feature/documentation that would help your existing workflow, please use this template to submit a request.

Request Package Addition/Change: If you want to install new packages in R/Python/Julia as part of your hub, please raise a request using this template.

Request for RAM/CPU: If you want to increase/decrease RAM for a specific hub, please use this template to make the request.

Request Admin Access: If you want members of your teaching team to have admin access, please use this template to make the request.

Request to recover hub data: If you want to recover data stored as part of your hub instance, please use this template to raise a request.

Note

If these templates are not exhaustive enough to cover the type of issue you are raising, you can use this generic template to raise your issue.

As an instructor, what do I need to do to set up the hub for my course?

Honestly, nothing! You are free to use the Datahub starting today.

Note

We expect all course members to log in using their UC Berkeley email ID. We also expect you to use the nbgitpuller service to distribute materials to your class. We can help you set up the links so that you can distribute them through your course website.
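To give a sense of what such a link looks like, here is a minimal sketch that assembles one in the query-string form described in the nbgitpuller documentation (`repo`, `branch`, `urlpath`). The helper name `nbgitpuller_link`, the repository `example-org/example-course`, and the file `lab01.ipynb` are hypothetical and used only for illustration; substitute your own hub address, repository, and notebook.

```python
from urllib.parse import urlencode


def nbgitpuller_link(hub_url, repo_url, branch, file_path, app="tree"):
    """Build an nbgitpuller link that pulls repo_url and opens file_path.

    app="tree" targets the classic notebook interface; "lab/tree" is
    the usual prefix for JupyterLab.
    """
    # nbgitpuller clones into a folder named after the repository.
    repo_name = repo_url.rstrip("/").split("/")[-1]
    query = urlencode({
        "repo": repo_url,
        "branch": branch,
        "urlpath": f"{app}/{repo_name}/{file_path}",
    })
    return f"{hub_url.rstrip('/')}/hub/user-redirect/git-pull?{query}"


# Hypothetical course repository and notebook, for illustration only.
link = nbgitpuller_link(
    hub_url="https://datahub.berkeley.edu",
    repo_url="https://github.com/example-org/example-course",
    branch="main",
    file_path="lab01.ipynb",
)
print(link)
```

When a student clicks a link of this form, they log in to the hub, the repository is pulled into their home directory, and the named notebook opens.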

What if I have a student outside UC Berkeley?

We can’t allow non-UC Berkeley users, as our authentication system only allows users with a UC Berkeley email ID. For such users, there are a couple of options we recommend below,

How do my students download their submissions as a PDF?

We recommend the following options:

  • For Jupyter Notebooks: Select File -> Download as -> PDF via HTML (.pdf) to get the PDF version of your notebook.

../_images/downloadhtml.PNG

Fig. 7 Here is where you can find the option to download the Python notebook as a PDF!

  • For R files: Select File -> Knit Document -> Select the target folder -> Select the Output Format as PDF to save the PDF version of the file.

../_images/knitting.PNG

Fig. 8 Here is where you can find the option to download the R file!

../_images/knittingpdf.PNG

Fig. 9 Here is where you can find the option to specify the download format as PDF!