直接把 bam 导入IGV看峰图太慢,把 bam 转为 bigWig 格式后,文件会小很多,载入也很快了。

1. 安装软件

启发: GEO 看到一句:
Bigwig files were obtained with the bamCoverage function from deepTools. Values represent number of reads per bin. We used a bin size of 50.

https://deeptools.readthedocs.io/en/develop/content/list_of_tools.html

安装软件

$ pip3 install deeptools 

$ deeptools --version
deeptools 3.5.1

$ bamCoverage --help
usage: An example usage is:$ bamCoverage -b reads.bam -o coverage.bw

This tool takes an alignment of reads or fragments as input (BAM file) and generates a coverage track (bigWig or bedGraph) as output. 

The coverage is calculated as the number of reads per bin, where bins are short consecutive counting windows of a defined size. It is possible to extended the length of the reads to better reflect the actual fragment length. 

*bamCoverage* offers normalization by scaling factor, Reads Per Kilobase per Million mapped reads (RPKM), counts per million (CPM), bins per million mapped reads (BPM) and 1x depth (reads per genome coverage, RPGC).

2. 测试

对于基因 EEF1D, 10x bam 最大366;

(1) 基础测试

# v1 # 去掉的过多了! 最大是 75
$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.bigWig \
--outFileFormat bigwig \
--ignoreDuplicates \
-p 12 \
--scaleFactor 0.5
其中 -p 12 是多进程,测试机器只有12个核;


# v2 no --ignoreDuplicates \  #最大是 3851
$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v2.bigWig \
--outFileFormat bigwig \
-p 12 \
--scaleFactor 0.5
--ignoreDuplicates    If set, reads that have the same orientation and start position will be considered only once. 
	If reads are paired, the mate's position also has to coincide to ignore a read. (default: False)
	# 方向一致,起点一致就算一次! 如果是配对的,另一个位置也要一致才可以忽略一个 read。 默认 F

# v3 no --ignoreDuplicates , no --scaleFactor 0.5 #最大是 v2 的2倍,7701
$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v3.bigWig \
--outFileFormat bigwig \
-p 12

(2) --binSize 1 更精细的变化, 默认 50

$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v4.bigWig \
--outFileFormat bigwig \
--binSize 10 \
-p 12

--binSize INT bp, -bs INT bp
     Size of the bins, in bases, for the output of the bigwig/bedgraph file. (Default: 50)
	 分区间的大小,默认 50

当改为1bp分辨率时,生成时间突然变长了:
$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v4_1.bigWig \
--outFileFormat bigwig \
--binSize 1 \
-p 12


$ ls -lth bam/bw
total 29M
-rw-rw-r-- 1 wangjl wangjl  13M May 23 13:42 BMMC_donor1_cluster6.v4_1.bigWig
-rw-rw-r-- 1 wangjl wangjl 4.6M May 23 13:34 BMMC_donor1_cluster6.v4.bigWig
-rw-rw-r-- 1 wangjl wangjl 1.9M May 23 13:19 BMMC_donor1_cluster6.v3.bigWig
-rw-rw-r-- 1 wangjl wangjl 1.9M May 23 11:30 BMMC_donor1_cluster6.v2.bigWig
-rw-rw-r-- 1 wangjl wangjl 1.9M May 22 22:44 BMMC_donor1_cluster6.bigWig
-rw-rw-r-- 1 wangjl wangjl 3.2M May 22 22:44 BMMC_donor1_cluster1.bigWig
-rw-rw-r-- 1 wangjl wangjl 2.5M May 22 22:42 BMMC_donor1_cluster3.bigWig

(3) --smoothLength 平滑化,巨慢

会特别慢 13:54–>14:11 还没结束,算了,强制退出。其他都是1-2min结束。

$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v5.bigWig \
--outFileFormat bigwig \
--binSize 10 \
--smoothLength 20
-p 12

--smoothLength INT bp
	The smooth length defines a window, larger than the binSize, to average the number of reads. 
	For example, if the --binSize is set to 20 and the --smoothLength is set to 60, then, for each bin, the average of the bin and its left and right neighbors is considered.
	Any value smaller than --binSize will be ignored and no smoothing will be applied. (default: None)
	平滑长度,低于 --binSize 则忽略,不做平滑化。默认不做平滑化。

(4) 按照正负链生成2个bw文件 --filterRNAstrand forward or --filterRNAstrand reverse

起初,这个无脑做出来的bw文件在 igv上看是反的,仔细看参数,是过滤掉 forward,那么剩下的是 reverse。同理 filterRNAstrand reverse 得到的是 forward 链的reads。

下面是正确的命名:

$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v6r.bigWig \
--outFileFormat bigwig \
--binSize 1 \
--filterRNAstrand forward \
-p 12

$ bamCoverage --bam BMMC_donor1_cluster6.bam \
--outFileName bw/BMMC_donor1_cluster6.v6f.bigWig \
--outFileFormat bigwig \
--binSize 1 \
--filterRNAstrand reverse \
-p 12

总结:按方向生成 1bp 分辨率的bw,再使用 igv 查看。

效果图:

EEF1D gene

load BigWig to IGV
看图中,cluster6的bam(最大366)和后面的v6 (最大5791) 最高值差异很大,可能 bam 还是有抽样?或者 bw 值偏大?

换几个基因看看:

CTCF gene

在这里插入图片描述

仅留下 bam 和 bw bin=1的

在这里插入图片描述
BigWig 和 原bam 的峰形状为什么不一样呢? //todo 不知道。
这不是 igv.js 特有的,igv 桌面版也是这样:
在这里插入图片描述

附录

网页版 IGV.js 的代码:

<meta charset=utf8>
<span>
help: https://github.com/igvteam/igv.js/wiki 
| 
test3: show tracks using BigWig file as input
</span>

<script src="js/igv.min.js"></script>
<div id="igv-div"></div>

<script>
	var igvDiv = document.getElementById("igv-div");

    var options =
	{
		genome: "hg19",
		locus: "chr8:144,661,785-144,662,802",
		
		tracks: [
			{
				"name": "Cluster1_NK",
				"url": "bam/BMMC_donor1_cluster1.bam",
				"indexURL": "bam/BMMC_donor1_cluster1.bam.bai",
				"format": "bam",
				"type": "alignment",
				
				"color": "red", //set colors

				showAlignments: false,

				height:50, //total height: default 300
				coverageTrackHeight:50, //coverage height: default 50
				alignmentRowHeight:0, //height of reads, default 14 per reads;
			},
			
			{
				"name": "Cluster3_B",
				"url": "bam/BMMC_donor1_cluster3.bam",
				"indexURL": "bam/BMMC_donor1_cluster3.bam.bai",
				"format": "bam",
				"type": "alignment", //"annotation",
				"color": "blue", //set colors
				showAlignments: false,
				height:50,
				coverageTrackHeight:50,
				//visibilityWindow: -1,
				alignmentRowHeight:0, 
			},
			
			{
				"name": "Cluster6_mono",
				"url": "bam/BMMC_donor1_cluster6.bam",
				"indexURL": "bam/BMMC_donor1_cluster6.bam.bai",
				"format": "bam",
				"type": "alignment", //"annotation",
				"color": "purple", //set colors
				showAlignments: false,
				height:50,
				coverageTrackHeight:50,
				alignmentRowHeight:0, 
			},

// bw
			{
				type: "wig",
				name: "v1-bin50-noDuplicate-scale0.5",
				url: "bam/bw/BMMC_donor1_cluster6.bigWig",
				height:50,
				color: "#FFAEC9",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},

			{
				type: "wig",
				name: "v2-bin50-scale0.5",
				url: "bam/bw/BMMC_donor1_cluster6.v2.bigWig",
				height:50,
				color: "#FFC90E",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},

			{
				type: "wig",
				name: "v3-bin50-scale1",
				url: "bam/bw/BMMC_donor1_cluster6.v3.bigWig",
				height:50,
				color: "#B5E61D",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},

			{
				type: "wig",
				name: "v4-bin10",
				url: "bam/bw/BMMC_donor1_cluster6.v4.bigWig",
				height:50,
				color: "#99D9EA",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},

			{
				type: "wig",
				name: "v4_1-bin1",
				url: "bam/bw/BMMC_donor1_cluster6.v4_1.bigWig",
				height:50,
				color: "#22B14C",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},

			{
				type: "wig",
				name: "v6-forward", //binSize 1, default:50
				url: "bam/bw/BMMC_donor1_cluster6.v6f.bigWig",
				height:50,
				color: "orange",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},

			{
				type: "wig",
				name: "v6-reverse", //binSize 1, default:50
				url: "bam/bw/BMMC_donor1_cluster6.v6r.bigWig",
				height:50,
				color: "#FF7F27",
				min: "0",
				//max: "30",
				//color: "rgb(0, 0, 150)",
				guideLines: [
					{color: 'green', dotted: true, y: 2500}, 
					{color: 'red', dotted: false, y: 5}
				]
			},
		]
	};

	igv.createBrowser(igvDiv, options)
		.then(function (browser) {
			console.log("Created IGV browser");	
		})
</script>

refer

目前,该issue还没有收到回复:https://github.com/deeptools/deepTools/issues/1139

Logo

为开发者提供学习成长、分享交流、生态实践、资源工具等服务,帮助开发者快速成长。

更多推荐