Welcome
Join us in person for the October 2024 nf-core hackathon!
This hackathon will be held in advance of the Nextflow Summit 2024 in Barcelona, Spain, and the two events are organised jointly.
- Summit website: https://summit.nextflow.io/2024/barcelona/
- Dates: 28th - 30th October 2024
- Registration: Visit the summit registration page
- Schedule: Visit the summit agenda page
- Location: World Trade Center Barcelona
- Slack channel: #hackathon-oct-2024
This nf-core hackathon is sponsored by Seqera and Oxford Nanopore Technologies. Many thanks to both companies for making this event possible!
How the hackathon works
What to expect
The nf-core hackathons are collaborative, community-driven events where participants work together on projects.
Everyone is welcome; no prior experience of nf-core contributions is needed. However, we do expect you to have some experience with writing Nextflow code. Note that there is a separate training event for learning Nextflow and nf-core from scratch, running in parallel to the hackathon.
nf-core hackathons are not just about coding! We also have a lot of fun. We typically run a quiz and a bingo, with several small prizes for the winners. In addition to these small social games and contests during the event, we also have a social evening.
Prerequisites
Before you arrive at the hackathon, please make sure that you have:
- Registered to come!
- Joined the nf-core Slack
- Joined the nf-core GitHub organisation
- Added yourself to the #hackathon-oct-2024 Slack channel
Where to find tasks
We collect all tasks in the “Hackathon October 2024” GitHub project board.
If you are not yet familiar with the code base, a great starting point is to filter for issues labelled as good first issue.
Once you have found something, get in touch with the project group (e.g. ping them on Slack or find them in the room), assign yourself, and get started.
How to contribute code
We use GitHub to collaborate on code:
- Find a hackathon project
- Discuss with the group and assign yourself to an issue (create one if it does not exist yet)
- Fork the repository
- Work on a branch in your fork
- Once ready, open a PR to the parent repository
Only assign yourself to an issue if you are ready to work on it, typically one issue at a time.
See Helpful resources below for more information.
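The fork, branch and push steps above can be sketched on the command line. This is a minimal illustration rather than an official nf-core recipe: it uses local throwaway repositories as stand-ins for the parent repository and your fork so that it runs without network access, and the branch name `hackathon-fix`, the file `NOTES.md` and the `Demo` identity are hypothetical placeholders.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway sandbox so the sketch runs offline; nothing here touches GitHub.
sandbox="$(mktemp -d)"

# Stand-in for the parent repository (in practice: the repository on GitHub).
git init --quiet "$sandbox/parent"
cd "$sandbox/parent"
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit --quiet --allow-empty -m "Initial commit"

# Stand-in for your fork (in practice: created with the "Fork" button on GitHub).
git clone --quiet --bare "$sandbox/parent" "$sandbox/fork.git"

# Clone your fork and create a branch for the issue you assigned yourself to.
git clone --quiet "$sandbox/fork.git" "$sandbox/work"
cd "$sandbox/work"
git checkout --quiet -b hackathon-fix   # hypothetical branch name

# Make a change and commit it on the branch.
echo "Improved docs" > NOTES.md         # hypothetical file
git add NOTES.md
git -c user.name="Demo" -c user.email="demo@example.com" \
    commit --quiet -m "Improve docs"

# Push the branch to your fork; the pull request is then opened on GitHub.
git push --quiet origin hackathon-fix
```

In real use, you would clone the URL of your own fork instead of the local stand-in, and after the push open a PR from that branch to the parent repository on GitHub.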
Schedule
The hackathon will run from Monday 28th October to Wednesday 30th October. Registration opens on Monday at 9am. We will start at 10am every day and close at 5pm on Monday and Tuesday. On Wednesday we will wrap up at 1pm. For a complete schedule, visit the summit agenda page.
Projects
For this hackathon, we will organise the work into projects, as opposed to the broader groups of previous events. At the end of each day, we will group the projects into categories and sum up their progress. Projects can be anything from:
- Adding new features to existing pipelines
- Adding and improving components (modules / subworkflows)
- Improving the website and nf-core tooling
- Creating entirely new pipelines
- Discussing and planning community initiatives
- Working on special interest group topics
- …anything else
You can bring your own favourite topic or choose from a list of open issues in the community. Each project has a lead who can point you in the right direction.
You don’t need to commit to a single project and are free to move around groups and projects throughout the event.
Submit a new project
New projects can be proposed in the #hackathon-oct-2024 Slack channel.
Use the project proposal form to submit an idea.
After some community discussion, you can add your project to the list below so that others can find it.
If you are planning to start a new pipeline, please propose it on the #new-pipelines Slack channel ahead of the hackathon start to avoid delays during the event.
Once a project is approved, the project leaders should add it to this webpage and add issues to the GitHub project board ahead of the hackathon.
If appropriate, label them as good first issue.
Join a project
Joining a project is as simple as turning up and getting in touch with the group. If you don’t know where to find them in the room, ping the project lead on slack.
You can move freely between projects throughout the event.
List of projects
Pipelines
nf-core/seqinspector
#seqinspector
nf-core/seqinspector aims to be a pipeline for initial quality control of sequencing data. Input is either FASTQ files or a run folder, and the planned output is a global MultiQC report and, if desired, MultiQC files for groups that are defined in the sample sheet. By joining this group you can:
- Add existing modules to a pipeline (beginner friendly)
- Write a new module for your preferred QC tool if it doesn’t exist yet (intermediate level)
- Start with the implementation of long-read methods (advanced level; we have only limited experience in the group, so help would be more than appreciated!)
- Work on the display of the data in the MultiQC reports (beginner - intermediate level)
- Write documentation
Goal
Work towards a first release
Group Leaders
Single cell analysis: nf-core/scrnaseq and nf-core/scdownstream
nf-core/scrnaseq transforms FASTQ files into expression matrices, while nf-core/scdownstream receives expression matrices as input and performs quality control, integration, clustering and more. This project has two main parts:
- Move some local modules from scdownstream to nf-core/modules so that they can be re-used in scrnaseq.
- Add new features to scdownstream. See the open enhancement issues for details.
By joining this group you can:
- Create new modules and move existing local modules to nf-core/modules (beginner friendly)
- Integrate the newly added modules into the pipelines (beginner - intermediate level)
- Improve the scdownstream MultiQC report and documentation (beginner - intermediate level)
- Look into a potential extension of scdownstream to multi-omics analyses (advanced level, has not yet been tackled but help would be great!)
Prior experience with single cell analysis is not required, but helpful.
Goal
Move shared functionality to nf-core/modules and make #scdownstream ready for 1.0 release
Group Leaders
nf-core/deepmodeloptim
#deepmodeloptim
We proposed a new pipeline to nf-core, initially called STIMULUS (available here: https://github.com/mathysgrapotte/stimulus).
This pipeline aims to explore how deep learning models learn, depending on how the input data is processed (check GitHub or the #deepmodeloptim channel on the nf-core Slack).
Working on deepmodeloptim at the hackathon will mostly involve nf-core-izing the pipeline and making it a place where it is easy to contribute.
Goal
Reach nf-core/deepmodeloptim v1.0.0 release!
Group Leaders
nf-core/sarek
#sarek
General work on sarek with a focus on maintenance:
- Improve input validation
- Improve documentation
- Fix bugs
If you want to get started on a new addition, this is a great time to come by and chat.
Goal
Improve input validation, usability, and docs
Group Leaders
Sarek (preprocessing) goes GPU
#parabricks
Variant calling has multiple time-consuming steps that could be faster on GPUs than on CPUs. A first step towards that could be the integration of Parabricks, which is software developed by NVIDIA. The modules are ready and need to be integrated into sarek.
This project can also expand to other pipelines and include more tools that allow execution on GPUs.
Goal
Integration of Parabricks in sarek
Group Leaders
nf-core/genomeqc
#genomeqc
A pipeline to compare and contrast genome assemblies and their annotations.
When you sequence a new genome, or wish to use a published genome, it is important to gauge the quality of the assembly. There are basic tools, such as BUSCO (completeness), QUAST (contiguity), and AGAT (general statistics such as numbers of chromosomes and genes), but there is no nf-core pipeline that performs all of these tasks and also documents TE content, telomere locations, and contamination level.
In addition, we would like to plot these statistics on a phylogenetic tree to help compare them.
See the #genomeqc nf-core Slack channel to join!
Goal
Write a first draft of this pipeline
Group Leaders
nf-core/variantbenchmarking
#variantbenchmarking
This is a variant benchmarking pipeline. For now, the structural variant and small variant benchmarking parts for germline data are working, and there are plans to add somatic benchmarking, including the creation of a truth file for structural variants, as well as additional benchmarking tools. The pipeline needs to be tested extensively, and a set of reviews is required.
Goal
Publish the first version of this pipeline
Group Leaders
nf-core/proteinfold
#proteinfold
We have been adding new reporting capabilities to the pipeline lately, and we would like to finish adding these features and test them during the hackathon.
Apart from this, we would like to explore which other tools could be added to the pipeline (e.g. RoseTTAFold or OmegaFold) and discuss with the community what the future of the pipeline should be in terms of development.
Of course, as with any other pipeline, we will also look for more “house-keeping” issues that people joining the group can get involved in during the hackathon.
If you are interested, please join the group!
Goal
Towards release 1.2.0 and beyond
Group Leaders
nf-core/phaseimpute
#phaseimpute
nf-core/phaseimpute is a multi-step pipeline dedicated to genetic imputation, from simulation to validation. By joining this group you can:
- Contribute updated subworkflows and modules back to nf-core (beginner - intermediate level)
- Improve documentation and enhance readability (beginner)
- Assist with pre-release review (beginner to expert)
- Add support for SNP chip array data (simulation and imputation) (expert)
- Add support for sex chromosome imputation (expert)
Goal
Work towards a first release
Group Leaders
nf-core/differentialabundance
This is a pipeline for downstream gene expression analysis, with a main focus on differential expression analysis. Currently, we are working on a new branch, dev-ratio, with two objectives:
- Add new methods
- Convert the pipeline into a modular and unified framework that accommodates different ways of performing differential analysis
By joining this group you can:
- Implement and add nf-core modules that wrap other methods that can be used to perform differential analysis. Some modules (e.g. propd) have already been created but need to be updated (beginner friendly).
- Add these new modules to the pipeline and update the pipeline parameters correspondingly (beginner friendly).
- Add the existing modules to the new modular subworkflow in a way that will reproduce the original pipeline’s behaviour (beginner - intermediate level).
- Update the code required to generate the plots and reports (beginner - intermediate level).
- Restructure the pipeline architecture (intermediate level).
- Update documentation (beginner friendly).
Goal
Move towards the next release with the restructured pipeline and the new methods!
Group Leaders
Components
Image processing pipelines
This project focuses on creating modules and adding functionality to (highly multiplexed) imaging pipelines - nf-core/mcmicro and nf-core/molkart. By joining this group you can:
- Create new segmentation modules for nf-core/modules (beginner friendly) …
- … and integrate them into the pipelines (beginner - intermediate level)
- Support the spot detection implementation for MCMICRO (intermediate level)
- Work on improved QC metric reporting for both pipelines (beginner - intermediate level)
- Help us address open issues (beginner - intermediate level)
Goal
Work towards next Molkart release and implement an additional MCMICRO segmentation option.
Group Leaders
Update subworkflows meta.yml
We will work on updating the meta.yml file of subworkflows to have the proper description for the structure of input and output channels.
Check the issue describing the tasks and tracking the progress.
Beginners and first-time contributors are welcome!
Goal
Update the meta.yml file of all nf-core subworkflows
Group Leaders
Software packaging: ARM
Looking into making more packages build natively for linux/arm64 and improving performance of the important ones for faster and cheaper runs on ARM machines such as AWS Graviton.
Goal
Optimise run time on at least one tool
Group Leaders
Tooling
References
#references
Continuing the discussion from last year’s hackathon, this group will work on tasks related to the handling and management of reference genomes. Some work has started with nf-core/references, but it is at a very early stage. This hackathon group will work towards agreeing on a fundamental structure and plan.
Goal
Replacing iGenomes, then world domination.
Group Leaders
Tube map polishing
Everybody loves the nf-core tube maps, but they also need some special care to gleam in all their beauty. Come join us and refine your workflow representations to their full glory. Whether you already have a finished version and want a thorough review (🦅👀) or want to brainstorm ideas and concepts to start a new one, this group is for you. Disclaimer: This will not be an introduction to vector graphics tools. You bring the tools, we bring the eyes and brains.
Goal
Make the tube maps in pipelines even more fabulous.
Group Leaders
nf-test plugins
#nft-plugins
nf-test is a very important part of our modules infrastructure, used for continuous integration testing of all our modules.
However, writing tests for some file types, or writing more advanced tests, can be difficult.
In this group we will try to kickstart the creation of nf-test plugins to make our testing a lot easier.
This will mainly involve the development of nft-utils, the improvement of nft-bam and hopefully the creation of completely new nf-test plugins.
Goal
Fully develop nft-utils and start new nf-test plugins
Group Leaders
Infrastructure around nf-core/modules
#tools
For Pythonistas 🎉 Working on nf-core/tools by developing infrastructure related to nf-core/modules.
Issues:
- Make nf-core modules create use the same structure for local modules as for remote modules (beginner friendly)
- Fix bug: nf-core modules update deletes template files when there is a patch file (intermediate level)
- Fix the structure of modules meta.yml files (intermediate level)
Goal
Develop infrastructure for nf-core/modules
Group Leaders
Special Interest Groups
Regulatory
#regulatory
This group will work on tasks for the #regulatory special interest group. Most likely, we will draw up more detailed plans for addressing the different needs of subgroups within regulatory and agree on a strategy for aligning those subgroups. We also want to prepare plans and proposals for the wider community on what we could add to help, for example, auditors or authorities better understand what nf-core already provides.
Goal
Clarify the scope of the regulatory special interest group and discuss who will tackle the different subfields of the regulatory space for future improvements to nf-core guidelines and pipelines.
Group Leaders
Meta-omics
#meta-omics
We will work together on any or all of the meta-omics pipelines (mag, ampliseq, metatdenovo, magmap, eager, funcscan, createtaxdb, taxprofiler, etc.), extending functionality but also discussing how they can be made to integrate better with each other and with a number of downstream pipelines, both within and outside nf-core.
We will have a number of documentation and/or new-module requests for newcomers to get their hands dirty, and larger implementation tasks for more advanced developers.
Goal
Widened understanding of the implementation details of all pipelines in a larger group of developers.
Group Leaders
Teaching
#training
nf-core pipelines are playing a crucial role in standardising bioinformatics workflows and their user base is growing every day. Engaging training materials are essential to complement the pipelines that are being released. To achieve this, a problem-based learning approach could be developed for several nf-core pipelines, where tutorials follow a storyline based on carefully simulated data. A first attempt at this approach has been drafted for nf-core/sarek (https://lescai-teaching.github.io/sarek-tutorial) and nf-core/rnaseq (https://lescai-teaching.github.io/rnaseq-tutorial). This project intends to gather people who are willing to discuss and further develop similar materials for these and other nf-core pipelines.
Goal
Develop tailored course materials and hands-on tutorials.
Group Leaders
Social activities
During the hackathon, we will have light-hearted fun and games! Special prizes are up for grabs for the winners!
More details will be revealed at the start of the event, but you can expect a quiz, bingo and sock-related activities.
Connect game
While waiting for tests to pass, why not play a quick game?
In the game you have to match three of the same symbol. Don’t forget to post your high scores in the #connectgame channel on Slack!
Be sure to also check out the soundtrack! 🎧
Helpful resources
Bytesize talks
There are many talks about Nextflow and nf-core on the nf-core Bytesize playlist. In particular, the talk about using git and GitHub in an nf-core environment may be useful.
Tutorials and docs on the nf-core website
Help with coding and nf-core tools
- Learn how to use the Gitpod environment
- Install nf-core/tools
- Overview of all nf-core/tools commands for pipelines, modules and subworkflows
Adding to pipelines
- Adding new modules to an existing pipeline.
- How to write new modules / subworkflows.
Creating a new pipeline
- How to create a new pipeline that won’t be added to nf-core
- Guidelines for developing a pipeline for external use
- Guide for adding a new pipeline to nf-core
Code of conduct
Please note that our Code of Conduct applies to the Hackathon, and all participants need to abide by our guidelines to participate. We should all feel responsible for making nf-core events safe and fun for everyone.
You can also report any CoC violations directly to safety@nf-co.re. Our safety officers will contact you to follow up on your report.
In case of an immediate perceived threat at the hackathon, please reach out to any of the staff or organisers on site.