CATA Database V1.01

1. Introduction

We developed a comprehensive Cancer ATAC-seq database (CATA) that aims to provide a large collection of resources on cancer accessible chromatin regions. The database is annotated with the potential functions of accessible regions in cancer. The current version of CATA documents a total of 2,991,163 ATAC-seq peaks from 410 tumor samples collected across 23 different cancers. To help researchers better identify biotherapeutic targets, we have assembled common SNPs, motif changes, expression quantitative trait loci (eQTLs), TFBSs, motif data, methylation, copy number variation, somatic mutations, enhancers, and the corresponding clinical data. We also provide survival analysis, pathway enrichment, and associated accessible region analysis, which make it more convenient for researchers to mine the data. The CATA database will help elucidate the functions of accessible chromatin regions and identify potential biological effects and therapeutic targets.


2. Search

Users can search for cancer accessible chromatin regions through four paths: search by tumor category, search by gene, search by TF (transcription factor), and an advanced search by genome location.

2.1 Search by tumor samples

Users can query by choosing a cancer type of interest from the 24 cancer types available.

2.2 Search by gene

Users can query by entering a TF name of interest and selecting the lncRNA source and lncRNA promoter region. The search results are displayed on the next page.

2.3 Search by TF

Users can query by entering an SNP ID of interest (common SNPs from dbSNP Build 150) and selecting the lncRNA source and lncRNA promoter region. The search results are displayed on the next page.

2.4 Advanced search

Users can query by entering the genomic position of an enhancer and selecting the lncRNA source. The search results are displayed on the next page.

3. Search results

This page shows summary information about accessible regions, including PEAK ID, genome location, accessible scores, annotation, TFBS numbers, motif numbers, enhancer numbers, and so on. Users can click a PEAK ID to open the detail page.

4. Detail page

4.1 Overview information

Overview information includes PEAK ID, cancer type, tissue type, gene symbol, genome location, accessible scores, RNA expression, Genome Browser, and TCGA sample counts.

4.2 Accessible region annotation

On the SNP annotation page, users can click the numbers to open other pages with more information.

4.3 RNA-expression

Users can choose an associated gene to display its expression in the cancer.

4.4 Associative clinical data

Our database provides clinical data for the patients from whom the samples were taken.

4.5 Survival analysis

Users can choose an associated gene for survival analysis.
Methods: Select the OS (overall survival) or DFS (disease-free survival) method.
Axis Units: Select months or days as the unit for plotting.
Group Cutoff: Select a suitable expression threshold for splitting the high-expression and low-expression cohorts.
Cutoff-High(%): Samples with an expression level higher than this threshold are assigned to the high-expression cohort.
Cutoff-Low(%): Samples with an expression level lower than this threshold are assigned to the low-expression cohort.
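
The cutoff options above can be illustrated with a small sketch. It assumes the two cutoffs are percentiles of the expression distribution and uses a linear-interpolation percentile; the function name split_cohorts and the sample IDs are hypothetical, and this is not necessarily how CATA computes the cohorts internally.

```python
def split_cohorts(expression, cutoff_high=50.0, cutoff_low=50.0):
    """Split samples into high- and low-expression cohorts.

    expression: dict mapping sample ID -> expression value (hypothetical input).
    cutoff_high, cutoff_low: percentiles (0-100), assumed to mirror
    the Cutoff-High(%) and Cutoff-Low(%) options.
    """
    values = sorted(expression.values())
    if not values:
        raise ValueError("no expression values given")

    def percentile(p):
        # Linear interpolation between the two nearest ranks.
        k = (len(values) - 1) * p / 100.0
        lo = int(k)
        hi = min(lo + 1, len(values) - 1)
        return values[lo] + (values[hi] - values[lo]) * (k - lo)

    high_threshold = percentile(cutoff_high)
    low_threshold = percentile(cutoff_low)
    # Strict comparisons: samples exactly at a threshold join neither cohort.
    high = sorted(s for s, v in expression.items() if v > high_threshold)
    low = sorted(s for s, v in expression.items() if v < low_threshold)
    return high, low


samples = {"s1": 1, "s2": 2, "s3": 3, "s4": 4, "s5": 5, "s6": 6, "s7": 7, "s8": 8}
high, low = split_cohorts(samples, cutoff_high=75, cutoff_low=25)
print(high, low)
```

With Cutoff-High above Cutoff-Low, samples between the two thresholds are left out of both cohorts, which sharpens the contrast between groups at the cost of sample size.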

4.6 Methylation visualization

We provide methylation visualization for 24 cancers.

4.7 Upstream Transcription Factor Enrichment

CATA provides upstream TF enrichment for each accessible region. Users can choose the number of interacting TFs to display in the network visualization.

5. Exploration

The 'Data-Browse' page is an interactive, alphanumerically sortable table that lets you quickly search for accessible regions and customize filters by 'Tissue', 'Cancertype', 'Annotation', and 'Chromosome'. Users can click a PEAK-ID to go directly to the detail page.


6. Analysis

We provide two analysis tools: Pathway downstream analysis and Associated Accessible Region Analysis.

6.1 Pathway downstream analysis

Users enter a PEAK-ID, a gene symbol, and an FDR value, and select at least one pathway database (e.g. KEGG). The enriched pathways associated with the accessible region are then calculated with a hypergeometric test based on the TFs binding to the accessible region.
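
As a minimal sketch of the hypergeometric test mentioned above, the following computes the upper-tail probability of seeing at least the observed overlap between a TF gene set and a pathway. The function names, the gene and pathway names, and the background size are all hypothetical, and no FDR correction is applied because the correction method CATA uses is not stated here.

```python
from math import comb


def hypergeom_pvalue(population, pathway_genes, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `population` genes, of which `pathway_genes` are in the pathway."""
    upper = min(pathway_genes, selected)
    total = comb(population, selected)
    tail = sum(
        comb(pathway_genes, k) * comb(population - pathway_genes, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def enrich(tf_genes, pathways, background_size):
    """Score every pathway against the TF gene set, smallest p-value first.

    tf_genes: set of TF gene symbols bound to the region (hypothetical input).
    pathways: dict mapping pathway name -> set of member genes.
    """
    results = []
    for name, members in pathways.items():
        overlap = len(tf_genes & members)
        p = hypergeom_pvalue(background_size, len(members), len(tf_genes), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


toy_pathways = {"P1": {"A", "B", "X", "Y"}, "P2": {"Z"}}
for name, overlap, p in enrich({"A", "B", "C"}, toy_pathways, background_size=10):
    print(name, overlap, round(p, 4))
```

The p-values returned here are raw; comparing them with an FDR value would first require a multiple-testing adjustment of the user's choice.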

6.1.1 Pathway enrichment results

The pathway enrichment results include pathway ID, pathway name, source, FDR, Ann gene, and so on.

6.2 Associated Accessible Region Analysis

Users can input a genome location or a BED file to retrieve the associated accessible chromatin regions.
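
The lookup described above can be sketched as a simple interval-overlap query. This is an illustration only: the peak IDs and coordinates are made up, the "chr:start-end" input format is an assumption, and all intervals are treated as half-open (the BED convention) for simplicity.

```python
def parse_location(text):
    """Parse a location such as 'chr1:1,000-2,000' into (chrom, start, end)."""
    chrom, _, span = text.strip().partition(":")
    start, _, end = span.replace(",", "").partition("-")
    return chrom, int(start), int(end)


def parse_bed(lines):
    """Read (chrom, start, end) from the first three columns of BED lines."""
    regions = []
    for line in lines:
        line = line.strip()
        # Skip blank lines, comments, and optional BED header lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        regions.append((fields[0], int(fields[1]), int(fields[2])))
    return regions


def overlapping_peaks(query, peaks):
    """Return IDs of peaks that overlap the query interval.

    peaks: list of (peak_id, chrom, start, end) tuples (hypothetical data).
    """
    chrom, start, end = query
    return [pid for pid, c, s, e in peaks if c == chrom and s < end and start < e]


peaks = [
    ("PEAK_1", "chr1", 100, 200),
    ("PEAK_2", "chr1", 500, 600),
    ("PEAK_3", "chr2", 100, 200),
]
print(overlapping_peaks(parse_location("chr1:150-550"), peaks))
for region in parse_bed(["# example", "chr2\t50\t120"]):
    print(region, overlapping_peaks(region, peaks))
```

A linear scan is enough for a sketch; for millions of peaks an interval index sorted by chromosome and start position would be the more practical choice.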

6.2.1 Results

The results include the accessible region location, the Peak-ID, and the user's input information. Users can click a Peak-ID to open the accessible region detail page.

7. Genome-Browser

To help users intuitively view the transcriptional regulatory information of accessible regions in the genome, we developed a personalized genome browser using GIVE and added many useful tracks.

8. Download

We provide chromatin accessible region data for download, covering 24 cancers.

9. Contact us

If you have any questions about CATA, please contact us.
First author email: 343856348@qq.com
Corresponding author email: lic@163.com