Data generation

Several high-throughput sequencing instruments are available at the LGTC. These instruments produce the primary (raw) data. Unless stated otherwise, the raw data of each machine, e.g. FASTQ files, is generated with the software provided by the manufacturer. This generally includes basecalling and basic quality control assessments. Consequently, the output format depends on the technology used; the summary below shows which primary files correspond to which technology. These files will always be handed over by the LGTC, and from this point on the data is suitable for secondary analysis. Data such as image, movie, or intensity files are not saved during production and are not available for transfer.

Primary files produced per technology:

  • Illumina: FASTQ
  • Ion Torrent and Ion Proton: FASTQ
  • PacBio RS II: .pls.h5 and .bas.h5
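
As a quick sanity check on delivered FASTQ files before secondary analysis, the records can be read and their mean base quality inspected. The following is only a minimal sketch, not the LGTC's own tooling: it assumes plain four-line FASTQ records (no wrapped sequence lines) and Phred+33 quality encoding, which may not hold for every instrument or software version. The names FastqRecord, read_fastq, and mean_phred are hypothetical.

```python
import io
from typing import Iterator, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    """One FASTQ entry (hypothetical helper type for this sketch)."""
    name: str
    sequence: str
    quality: str


def read_fastq(handle: TextIO) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-record FASTQ stream.

    Assumption: sequences are not wrapped over multiple lines.
    """
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        header = header.rstrip("\r\n")
        if not header:
            continue  # tolerate blank lines between records
        sequence = handle.readline().rstrip("\r\n")
        separator = handle.readline().rstrip("\r\n")
        quality = handle.readline().rstrip("\r\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(header[1:], sequence, quality)


def mean_phred(quality: str, offset: int = 33) -> float:
    """Mean Phred score of a quality string (assumes Phred+33 by default)."""
    if not quality:
        return 0.0
    return sum(ord(char) - offset for char in quality) / len(quality)


# Small in-memory example; a real file would be opened with open(path).
example = "@read1\nACGT\n+\nIIII\n@read2\nGG\n+\n!!\n"
for record in read_fastq(io.StringIO(example)):
    print(record.name, len(record.sequence), mean_phred(record.quality))
```

For compressed deliveries, the same function should work on a handle opened with gzip.open(path, "rt"). The .h5 files listed for PacBio are HDF5 containers and are not covered by this sketch.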

Secondary data

The LGTC performs data analysis only if this has been requested and included in the quotation. In that case, a meeting is scheduled with a specialist associated with the LGTC. During this meeting, the analysis plan is determined on the basis of a checklist provided by the LGTC that covers a wide variety of services. The analysis plan, the associated terms and conditions, and the prices are recorded in an agreement during the same meeting. When new techniques are used, it may be necessary to set up a custom analysis plan. Other possibilities, such as annotation against a local variant database, are discussed at this point as well; see the General Workflow NGS.

Sequencing applications

In conjunction with other members of the Center for Human and Clinical Genetics we provide analysis of next-generation sequencing (NGS) data, as well as advice.

We also give courses, such as the MGC course on NGS data analysis, Linux courses, and an introductory course on using our cluster. These and additional courses can be found on our course announcements page.

We have worked with data from many applications, including exome sequencing, ChIP-seq, DeepCAGE, DeepSAGE, RNA-seq, de novo sequencing, SNP detection, and miRNA profiling. For a selection of these applications, pipelines have been set up by the LGTC. However, modified or completely custom analyses are also possible.

Data delivery

The data can be delivered via an HTTP server or on external hard disks. The delivery includes the raw data specific to the instrument used and the output files of the agreed analysis. The media and file formats will be as agreed upon in the intake discussion. Once the analysis is complete, another meeting will be scheduled to review the results and conclude the project.

Software and hardware

The software used at the LGTC is open source wherever possible. This means it can be freely used and distributed at no additional cost. These tools should run on any Linux-based system. Our custom pipelines may require a cluster.

Regarding hardware, we can also provide services on our computer cluster for projects that need hundreds of cores or large amounts of memory.