
User Guide

This page contains tutorials for using VDJbase

Introduction | Search | Graphs | Downloading | Submission | Summary | Frequently Asked Questions (FAQ)

Introduction

VDJbase offers users a fast and convenient way to browse for genotypes and haplotypes, compare published datasets, generate interactive visual analyses, and submit AIRR-seq data to foster continuous growth. No login is required, and the data is freely downloadable.

Graphical structure of VDJbase


Query options

How to generate graphs for selected data?

The ‘Export Graphs’ menu creates a visual analysis of the user’s selected samples. Using a set of drop-down tag lists, users can filter all visualizations by gene, allele, or the certainty level of inference (Kdiff) for each genotype/haplotype decision.

Downloading Data

Metadata

The 'Download Selected' tab provides an Excel file with the full metadata for the selected samples, along with their genotype/haplotype files. Metadata fields follow the MiAIRR-compliant format.

Where sample metadata is missing, the information was not published.

Metadata name | Metadata description | Example

STUDY
Study ID | Unique ID assigned by study registry | PRJNA001
Contact information (data collection) | Full contact information of the data collector, i.e. the person who is legally responsible for data collection and release. This should include an e-mail address. | Dr. P. Stibbons, p.stibbons@unseenu.edu
Lab name | Department of data collector | Department for Planar Immunology
Lab address | Institution and institutional address of data collector | Bar Ilan University
Relevant publications | Publications describing the rationale and/or outcome of the study | "PMID:85642"

SUBJECT
Subject ID | Subject ID assigned by submitter, unique within study | SUB856413
Organism | Binomial designation of subject's species (format: ontology) | Id: 9606, Value: Homo sapiens
Sex | Biological sex of subject | Male, female
Age | Absolute age of subject at time point `Age event` | "65 yr"
Ethnicity | Ethnic group of subject (defined as cultural/language-based membership) | English, Kurds
Race | Racial group of subject (as defined by NIH) | White, American Indian or Alaska Native
Study group description | Designation of the study arm to which the subject is assigned | Celiac disease (control)
Diagnosis | Diagnosis of subject | Multiple myeloma
Length of disease | Time duration between initial diagnosis and current intervention | 23 months
Disease stage | Stage of disease at current intervention | Stage II
Prior therapies for primary disease under study | List of all relevant previous therapies applied to subject for treatment of `Diagnosis` | melphalan/prednisone
Immunogen/agent | Antigen, vaccine or drug applied to subject at this intervention | bortezomib
Intervention definition | Description of intervention | systemic chemotherapy, 6 cycles, 1.25 mg/m2
Other relevant medical history | Medical history of subject that is relevant to assess the course of disease and/or treatment | MGUS, first diagnosed 5 years prior

SAMPLE
Biological sample ID | Sample ID assigned by submitter, unique within study | SUP52415
Sample type | The way the sample was obtained, e.g. fine-needle aspirate, organ harvest, peripheral venous puncture | Biopsy
Tissue | The actual tissue sampled, e.g. lymph node, liver, peripheral blood | Bone marrow
Anatomic site | The anatomic location of the tissue, e.g. inguinal, femur | Iliac crest
Collection time event (sample) | Event in the study schedule to which `Sample collection time` relates | Primary vaccination

PCR TARGET
Target locus for PCR | Designation of the target locus according to IMGT nomenclature | IGH, IGK, IGL, TRA, TRB, TRD, TRG
Forward PCR primer target location | Position of the primer set used for the 5' end | 5' UTR
Reverse PCR primer target location | Position of the primer set used for the 3' end | Constant region

NUCLEIC ACID PROCESSING
Target substrate | The class of nucleic acid that was used as primary starting material for the following procedures | RNA/DNA
Library generation protocol | Description of processes applied to substrate to obtain a library that is ready for sequencing | cDNA was generated using

SEQUENCING RUN
Total reads | Number of usable reads for analysis (functional clones) | 10365118
Sequencing platform | Designation of sequencing instrument used | Alumina LoSeq 1000
Sequencing length | Read length in bases for each direction | Short reads, Full length
Date of sequencing run | Date of sequencing run | 2016-12-16 (date)

GENOTYPE AND HAPLOTYPE PROCESSING PROTOCOL
Processing version | Indicate the version of the processing pipeline | Full_Pipeline_11_07_19
Aligner tool | Indicate which computational tool was used to perform V(D)J alignment | IgBLAST
Aligner tool version | Version number and/or date of the computational tool used to perform V(D)J alignment | 1.7.0
Preprocessing tool | Indicate which computational tool was used to perform initial processing of the data (pRESTO) |
Germline gene set | The reference sequence set used by the aligner | IMGT full - version 4/12/2018
Genotype tool | Indicate which computational tool was used to perform V(D)J genotyping | TIgGER
Genotyping tool version | Indicate the version of the computational tool that was used to perform V(D)J genotyping | 0.3.1
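Because unpublished metadata is simply left empty in the download, it can help to check which fields are blank before analysis. The following is a minimal sketch, assuming one row of the downloaded file has already been loaded into a Python dictionary keyed by metadata name. The loading step and the `missing_fields` helper are illustrative assumptions, not part of VDJbase.

```python
def missing_fields(record, expected_fields):
    """List the expected metadata fields that are absent or blank in a record."""
    return [
        field
        for field in expected_fields
        if record.get(field) is None or str(record[field]).strip() == ""
    ]


# Hypothetical record: field names come from the metadata table,
# the values are the table's own examples.
expected = ["Study ID", "Sex", "Age", "Tissue"]
record = {"Study ID": "PRJNA001", "Sex": "", "Tissue": "Bone marrow"}

unpublished = missing_fields(record, expected)
print(unpublished)
```

Fields reported here were not published with the study, so they should be treated as unknown rather than as negative values.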

Data Submission

A straightforward submission form is available upon request via our ‘Discussion Forum’. Submitted forms are validated and the associated data processed by the site administrator. This will allow the repository to grow, and for contributors to receive appropriate credit from database users.

Data Summary

Currently we have populated the database with more than 500 samples, which are associated with various diseases (e.g., MS, celiac, HCV, influenza) and tissue types (e.g., brain lesion, lymph node, blood) originating from ten studies.
On the 'Explore Data' page you can find a summary of the number of studies, samples, and their related metadata currently in the database.

Frequently Asked Questions (FAQ)

If you have any problems with VDJbase, or would like to ask questions or suggest enhancements, visit our discussion forum!

A data table can be searched by entering a search term in the input field at the top of the table. The search starts after pressing 'Search'. Multiple objects can be used for sorting by holding down the Ctrl key while clicking on a further object (for example, in the studies and alleles fields). The search can be performed with one or more search terms. When more than one term is given, the terms are combined with a logical AND by default.
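To illustrate what combining terms with a logical AND means, here is a minimal sketch. It assumes each table row is a dictionary and that a term matches when it occurs, case-insensitively, in any field. The row layout, the sample names, and the matching rule are assumptions for illustration, not a description of how VDJbase implements its search.

```python
def matches_all_terms(row, terms):
    """Return True when every search term occurs in at least one field of the row."""
    fields = [str(value).lower() for value in row.values()]
    return all(any(term.lower() in field for field in fields) for term in terms)


def search_table(rows, terms):
    """Keep only the rows that satisfy every term (logical AND)."""
    return [row for row in rows if matches_all_terms(row, terms)]


# Hypothetical rows; the field names are illustrative, not the VDJbase schema.
rows = [
    {"name": "P1_I1_S1", "tissue": "blood", "study": "P1"},
    {"name": "P1_I2_S1", "tissue": "lymph node", "study": "P1"},
    {"name": "P2_I1_S1", "tissue": "blood", "study": "P2"},
]

hits = search_table(rows, ["blood", "P1"])
print([row["name"] for row in hits])
```

Adding a term can only narrow the result, because a row must satisfy every term to be kept.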

A specific sample can be found by entering its name in the “Name” input field at the top right of the page. Click on the sample row to view additional information about the sample.

The "Explore Data" page contains a table that links the VDJbase project number with the study ID.

You can find these details on the

Click on a sample row to open that sample’s metadata. Click on the “Genotype” row to view version information for the genotype inference.

A specific sample can be found by entering its name in the “Name” input field at the top right of the page. Click on a sample row to open that sample’s metadata. Click on the “Sequence Protocols” row to view additional information about the sequencing protocol.

The search section at the top filters the samples according to selectable criteria, and the graphs are generated from the filtered samples. Users can select samples for visualization by checking the box next to each sample. If no specific samples are selected, graphs are generated for all filtered samples. Users can generate a plot for a single individual from the genotype and haplotype columns on the right. There are three format options: downloadable PDF, interactive HTML graph, and downloadable summary statistics table. Graphs comparing multiple samples are available from the black bar on the left.
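The selection rule described above (checked samples if any, otherwise every filtered sample) can be sketched as follows. The function name and the use of plain sample-name strings are assumptions for illustration.

```python
def samples_to_plot(filtered_samples, checked_samples):
    """Return the checked samples, or every filtered sample when none is checked."""
    checked = [s for s in filtered_samples if s in checked_samples]
    return checked if checked else list(filtered_samples)


# Hypothetical sample names.
filtered = ["S1", "S2", "S3"]

print(samples_to_plot(filtered, {"S2"}))   # only the checked sample
print(samples_to_plot(filtered, set()))    # nothing checked: all filtered samples
```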

Find an explanation for optional exports graphs by the

If there are multiple genes/alleles listed for the condition in which you are interested, you can focus on those of most interest by selecting from the filter options in the left column, called “Advanced Filters”. The advanced filter allows filtering the figures by gene type, pseudogenes, ORF genes, and the certainty level of inference (Kdiff). By default, Kdiff is zero and pseudogenes are filtered out.
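As a rough illustration of these defaults, the sketch below filters a list of genotype calls. It assumes that the Kdiff setting acts as a minimum threshold and that each call carries a flag marking pseudogenes. Both points, along with the field names and gene labels, are assumptions made for this example and may not match how VDJbase applies the filter.

```python
def apply_advanced_filter(calls, min_kdiff=0.0, include_pseudogenes=False):
    """Keep calls at or above the Kdiff threshold, dropping pseudogenes by default."""
    kept = []
    for call in calls:
        if call["kdiff"] < min_kdiff:
            continue  # below the requested certainty level
        if call["is_pseudogene"] and not include_pseudogenes:
            continue  # pseudogenes are excluded unless explicitly requested
        kept.append(call)
    return kept


# Hypothetical calls with placeholder gene labels.
calls = [
    {"gene": "GENE_A", "kdiff": 12.5, "is_pseudogene": False},
    {"gene": "GENE_B", "kdiff": 1.2, "is_pseudogene": False},
    {"gene": "PSEUDO_C", "kdiff": 8.0, "is_pseudogene": True},
]

default_view = [c["gene"] for c in apply_advanced_filter(calls)]
strict_view = [c["gene"] for c in apply_advanced_filter(calls, min_kdiff=5.0)]
print(default_view, strict_view)
```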

Query for alleles by selecting 'Alleles' from the pull-down menu of the “Database Search” button on the navigation bar. This page provides a summary of the alleles that appear in the database. It provides information about their sequences and indicates the number of samples that contain each allele in their genotype. Clicking on this number opens a new page listing these samples.
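The per-allele sample count shown on that page can be pictured with a small sketch. It assumes genotypes are available as a mapping from sample name to the list of alleles in that sample's genotype. The mapping and the sample names are illustrative, not an export format provided by VDJbase.

```python
from collections import Counter


def samples_per_allele(genotypes):
    """Count, for each allele, how many samples carry it in their genotype."""
    counts = Counter()
    for alleles in genotypes.values():
        counts.update(set(alleles))  # count each allele once per sample
    return counts


# Hypothetical genotypes keyed by sample name.
genotypes = {
    "S1": ["IGHV1-2*02", "IGHV1-2*04"],
    "S2": ["IGHV1-2*02"],
    "S3": ["IGHV1-2*02", "IGHV3-23*01", "IGHV3-23*01"],
}

counts = samples_per_allele(genotypes)
print(counts.most_common())
```

Converting each genotype to a set keeps a sample from being counted twice when an allele is listed more than once.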

The genotype column contains four format options: downloadable PDF, interactive graph by HTML, graph by PDF, and downloadable summary statistics table file.