SNEP Homepage

For details, see our paper (Fujisawa et al., 2009, BMC Bioinformatics, Vol.10, No.131).


Warranty: The software comes with no warranty. It is free only for non-commercial users. Non-academic users should contact the corresponding author.


1. Download the following function file and load it in R.

Download: SNEP.R
Run: source("SNEP.R");


Steps 2-6 explain how to use SNEP. If you want to analyze your own CEL files directly, read Step A, which follows Step 6.


2. Prepare two objects in R, ProbeNamePair and Xdata, e.g. from .CEL files.

nprobe: Number of probes.
nk: Number of replicates per strain.
ProbeNamePair: nprobe * 2 matrix. The first and second columns are "ProbeSetName" and "PairIndex" from a .CEL file.
Example. ProbeNamePair.txt.
Xdata: nprobe * (2*nk) matrix. The rows correspond to the rows of ProbeNamePair. Columns 1 to nk correspond to one strain and columns nk+1 to 2*nk correspond to the other strain. The elements are the log10 values of the PM signal intensity.
Example. Xdata.txt.

You can load the example data by running the following commands in R:

ProbeNamePair = read.table("ProbeNamePair.txt", header=TRUE);
Xdata = read.table("Xdata.txt");
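If the example files are not at hand, the layout described above can be reproduced with made-up data. The following is only a sketch: the probe set names, the number of probe sets and the simulated intensities are assumptions for illustration, not values from the SNEP distribution.

```r
# Toy objects with the layout described in Step 2 (all values are made up).
set.seed(1)
nk <- 5                            # replicates per strain
J <- 11                            # probes per probe set (typical value)
set.names <- c("setA", "setB")     # hypothetical probe set names

ProbeNamePair <- data.frame(
  ProbeSetName = rep(set.names, each = J),
  PairIndex    = rep(1:J, times = length(set.names))
)
nprobe <- nrow(ProbeNamePair)

# Simulated raw PM intensities, log10-transformed as Step 2 requires.
# Columns 1..nk stand for one strain, columns nk+1..2*nk for the other.
raw <- matrix(runif(nprobe * 2 * nk, min = 50, max = 5000), nrow = nprobe)
Xdata <- as.data.frame(log10(raw))

dim(ProbeNamePair)   # 22  2
dim(Xdata)           # 22 10
```

Xdata is stored as a data frame here because read.table also returns one.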


3. Run the following commands in R to create the required global variables.

nprobe = nrow(ProbeNamePair);
ProbeNamePair$ID = 1:nprobe;
ProbeFirstNum = subset(ProbeNamePair,PairIndex==1)$ID;
npset = length(ProbeFirstNum);
ProbeFirstNum = c(ProbeFirstNum,nprobe+1);
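After these commands, ProbeFirstNum holds the row at which each probe set starts, followed by nprobe+1 as an end marker. One way to read this (an interpretation of the commands above, not a documented SNEP interface) is that the rows of the a-th probe set run from ProbeFirstNum[a] to ProbeFirstNum[a+1]-1. The sketch below uses made-up probe set names, and probe_rows is a hypothetical helper.

```r
# Toy ProbeNamePair: two probe sets of 11 probes each (names are made up).
ProbeNamePair <- data.frame(
  ProbeSetName = rep(c("setA", "setB"), each = 11),
  PairIndex    = rep(1:11, times = 2)
)

# The Step 3 commands, unchanged in behavior.
nprobe <- nrow(ProbeNamePair)
ProbeNamePair$ID <- 1:nprobe
ProbeFirstNum <- subset(ProbeNamePair, PairIndex == 1)$ID
npset <- length(ProbeFirstNum)
ProbeFirstNum <- c(ProbeFirstNum, nprobe + 1)

# Hypothetical helper: row indices of the a-th probe set.
probe_rows <- function(a) ProbeFirstNum[a]:(ProbeFirstNum[a + 1] - 1)

ProbeFirstNum   # 1 12 23
npset           # 2
probe_rows(2)   # 12 13 14 15 16 17 18 19 20 21 22
```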


4. Run the following commands in R.

##### Results
wres = MulTestForGenes(Xdata,npset);
pvalue = wres[1:nprobe];              # first nprobe elements: p-values for the probes
tv.gene = wres[nprobe + (1:npset)];   # next npset elements: t-values for the genes
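The index used for tv.gene relies on R's operator precedence: the colon operator binds more tightly than +, so adding nprobe to 1:npset selects elements nprobe+1 through nprobe+npset. A minimal check follows, using a made-up stand-in for wres (the real vector comes from MulTestForGenes).

```r
nprobe <- 22
npset <- 2
# Made-up stand-in: 22 "p-values" followed by 2 "t-values".
wres <- c(rep(0.5, nprobe), 0.1, 5.5)

idx <- nprobe + 1:npset      # parsed as nprobe + (1:npset)
idx                          # 23 24
wres[idx]                    # 0.1 5.5

# Same positions as the explicitly parenthesized range.
all(idx == (nprobe + 1):(nprobe + npset))   # TRUE
```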


5. Results.

pvalue: nprobe * 1 vector of p-values, one for each probe.
The null hypothesis is that the probe is not an SFP (single feature polymorphism) and the alternative hypothesis is that the probe is an SFP for the second strain.
pvalue=(p_1,...,p_npset), where p_a=(p_{a1},...,p_{aJ}), p_{aj} is the p-value for the j-th probe in the a-th probe set, and J is the number of probes in the a-th probe set, typically 11.
The p-value for the alternative hypothesis that the probe is an SFP for the first strain is 1-pvalue.
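The structure of pvalue can be sketched as follows, assuming made-up p-values for two probe sets of three probes each. The names set.id, p.by.set and pvalue.first are introduced here for illustration only.

```r
# Made-up p-values: two probe sets with 3 probes each (illustration only).
pvalue <- c(0.40, 0.55, 0.61, 0.48, 0.001, 0.52)
set.id <- rep(1:2, each = 3)       # probe set to which each probe belongs

# p_a = (p_a1, ..., p_aJ): one vector of p-values per probe set.
p.by.set <- split(pvalue, set.id)
p.by.set[[2]]                      # 0.480 0.001 0.520

# p-values for the alternative "SFP for the first strain".
pvalue.first <- 1 - pvalue
pvalue.first[5]                    # 0.999
```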

tv.gene: npset * 1 vector with t-values based on the test statistic T0.
The null hypothesis is that the expression levels of the two strains do not differ.
tv.gene=(t_1,...,t_npset), where t_a is the t-value for the a-th gene.


6. Analysis of Example.

Running the above R commands on the example files, ProbeNamePair.txt and Xdata.txt, gives the following results.

> pvalue
[1] 0.53554154 0.36027663 0.65617879 0.53554154 0.37713016 0.63959028 0.42898901 0.60564204 0.37713016 0.60564204 0.32739966 0.45266215 0.45266215 0.54828344 0.45266215 0.64117058 0.53239102 0.00000000 0.50047622 0.51644679 0.26124566 0.50047622
> tv.gene
[1] 0.07209695 5.55782153

If you need a p-value for the test statistic T0, you can calculate an approximate two-sided p-value as follows:
> pv.gene = 2*(0.5-abs(0.5-pnorm(tv.gene))); pv.gene
[1] 9.425247e-01 2.731625e-08
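The expression above equals twice the smaller of pnorm(t) and 1-pnorm(t), that is, a two-sided p-value under a standard normal approximation. As a sketch, the same quantity can be written in tail form, which avoids the subtraction from 0.5 when |t| is large. The names pv.orig and pv.tail are introduced here for illustration.

```r
tv.gene <- c(0.07209695, 5.55782153)   # t-values from the example output

pv.orig <- 2 * (0.5 - abs(0.5 - pnorm(tv.gene)))   # expression from the text
pv.tail <- 2 * pnorm(-abs(tv.gene))                # equivalent tail form

all.equal(pv.orig, pv.tail)
pv.tail
```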

The example consists of two probe sets, which are illustrated in the following two graphs. Black and red correspond to the first and second strains, respectively, each with 5 replicates.
[Figure 1: first probe set. Figure 2: second probe set.]

In the first figure, no SFP appears to be present and the two gene expression levels are almost the same. None of the p-values in the first probe set, pvalue[1:11], is significant. The p-value for the difference in gene expression level between the two strains, pv.gene[1], is not significant either.

In the second figure, the 7th probe appears to be an SFP and the two gene expression levels differ. The p-value for the 7th probe, pvalue[11+7]=0.00000000, is highly significant. The p-value for the difference in gene expression level between the two strains, pv.gene[2]=2.731625e-08, is also highly significant.
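The position pvalue[11+7] can also be found programmatically. The sketch below uses the example p-values printed above and the probe set start rows that Step 3 gives for two sets of 11 probes. The cutoff alpha is a hypothetical value chosen for illustration, not a recommendation from the paper.

```r
# p-values from the example output in Step 6.
pvalue <- c(0.53554154, 0.36027663, 0.65617879, 0.53554154, 0.37713016,
            0.63959028, 0.42898901, 0.60564204, 0.37713016, 0.60564204,
            0.32739966, 0.45266215, 0.45266215, 0.54828344, 0.45266215,
            0.64117058, 0.53239102, 0.00000000, 0.50047622, 0.51644679,
            0.26124566, 0.50047622)
ProbeFirstNum <- c(1, 12, 23)   # start rows of the two probe sets, plus nprobe+1
alpha <- 0.001                  # hypothetical cutoff (assumption)

hits <- which(pvalue < alpha)                        # rows below the cutoff
set.of.hit <- findInterval(hits, ProbeFirstNum)      # probe set of each hit
index.in.set <- hits - ProbeFirstNum[set.of.hit] + 1 # position within the set

data.frame(row = hits, probe.set = set.of.hit, probe = index.in.set)
# row 18, probe.set 2, probe 7
```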


A. Treatment of CEL files.

You must install "Bioconductor" in advance. After that, create "Xdata" and "ProbeNamePair" by following this file.


Last modified on Jun 12, 2009.