CLEAVER: A publicly available web site for supervised analysis of microarray data

Soumya Raychaudhuri*, Patrick D. Sutphin**, Joshua M. Stuart*, and Russ B. Altman*†.

Departments of Medicine (Biomedical Informatics)* and Radiation Oncology**, Stanford University, Stanford, CA. † Correspondence should be addressed to R.B.A. (e-mail: rba@stanford.edu).

 

 

Supplement: typical output from Cleaver Classification for a variety of problems. Cleaver Classification is an algorithm developed for supervised learning on microarray data. It will be made publicly available as a web site, along with other statistical procedures. The table below gives references, links to the data files used for each test case, the graphical and text output of each run, and a summary of performance. A case that is given a positive score is predicted to be a member of the positive set; otherwise it is predicted to be a member of the negative set. The magnitude of the score indicates the confidence with which the assignment is made. Each feature is also scored to indicate its relative importance in prediction; the graphical output shows only the most influential features. All of the runs below are the result of cross-validation on the listed data.
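The scoring convention described above can be illustrated with a minimal Python sketch. It is not the Cleaver implementation: the function names, the example scores, and the feature names are all hypothetical, and ranking features by the absolute value of their score is an assumption, since this page only says that the most influential features are shown.

```python
from typing import Dict, List, Tuple


def predict(score: float) -> Tuple[str, float]:
    """Map a signed case score to (predicted set, confidence).

    A positive score means the positive set; anything else means the
    negative set. The magnitude is used as the confidence.
    """
    label = "positive" if score > 0 else "negative"
    return label, abs(score)


def accuracy(scores: List[float], truths: List[str]) -> float:
    """Fraction of held-out cases whose predicted set matches the true set."""
    correct = sum(predict(s)[0] == t for s, t in zip(scores, truths))
    return correct / len(scores)


def top_features(feature_scores: Dict[str, float], k: int) -> List[str]:
    """Return the k features with the largest absolute score.

    ASSUMPTION: influence is measured by absolute score.
    """
    ranked = sorted(feature_scores, key=lambda f: abs(feature_scores[f]), reverse=True)
    return ranked[:k]


# Hypothetical cross-validation scores and true set memberships.
scores = [2.4, 0.3, -1.7, -0.2, 0.9]
truths = ["positive", "positive", "negative", "positive", "positive"]

for s in scores:
    print(s, predict(s))
print("accuracy:", accuracy(scores, truths))

# Hypothetical feature scores.
print(top_features({"gene_a": 0.2, "gene_b": -1.5, "gene_c": 0.9}, k=2))
```

In this made-up example the fourth case has a small negative score, so it is assigned to the negative set with low confidence, and that is the one assignment that disagrees with its true set.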

 

CLEAVER web site

 

Table 1

| Problem | Gold Standard Reference | Positive Set | N | Negative Set | N | Features | Cross Validation Accuracy (Penalty = 0.1) | Cross Validation Accuracy (Penalty = 10) |
|---|---|---|---|---|---|---|---|---|
| Acute Leukemia (Golub) | Golub | Lymphoid (data) | 47 | Myeloid (data) | 25 | 7129 genes | 95.83% (gif output, text output) | 97.22% (gif output, text output) |
| Lymphoma (Alizadeh) | Alizadeh | Diffuse Large Cell Lymphoma (data) | 42 | Non-DLCL (data) | 54 | 4026 genes | 95.83% (gif output, text output) | 94.79% (gif output, text output) |
| Ribosomal Genes (Eisen) | Mewes (MIPS catalogue) | Ribosomal Genes (data) | 121 | Other Genes (data) | 2346 | 79 exp. | 99.23% (gif output, text output) | 95.18% (gif output, text output) |
| Sporulation Promoters (Chu, Spellman, DeRisi) | Chu, Mitchell | early genes (urs1 promoters) (data) | 13 | middle genes (mse promoters) (data) | 23 | 103 exp. | 97.22% (gif output, text output) | 94.44% (gif output, text output) |
| VHL expression/hypoxia profile (Scherf, Ross) | | VHL induced (data) | 17 | VHL repressed (data) | 15 | 6828 | 75% (gif output, text output) | 75% (gif output, text output) |
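As a quick consistency check on Table 1, the following Python sketch tests whether each reported accuracy corresponds to a whole number of correctly classified cases. It assumes that accuracy is the fraction of all cases (positive N plus negative N) assigned to the correct set, which this page does not state explicitly. The helper name `implied_correct` is hypothetical, and the implied counts are inferred from the percentages rather than reported by the authors.

```python
# (problem, N positive, N negative, [accuracy at penalty 0.1, accuracy at penalty 10])
# Values copied from Table 1; accuracies are in percent.
ROWS = [
    ("Acute Leukemia", 47, 25, [95.83, 97.22]),
    ("Lymphoma", 42, 54, [95.83, 94.79]),
    ("Ribosomal Genes", 121, 2346, [99.23, 95.18]),
    ("Sporulation Promoters", 13, 23, [97.22, 94.44]),
    ("VHL expression/hypoxia profile", 17, 15, [75.0, 75.0]),
]


def implied_correct(total, pct):
    """Return the whole number of correct cases that reproduces pct
    to two decimal places, or None if no such count exists.

    ASSUMPTION: accuracy = correct cases / total cases.
    """
    k = round(total * pct / 100)
    return k if abs(100 * k / total - pct) < 0.005 else None


for name, n_pos, n_neg, accs in ROWS:
    total = n_pos + n_neg
    counts = [implied_correct(total, a) for a in accs]
    print(f"{name}: {total} cases, implied correct counts {counts}")
```

Under this assumption every entry in the table matches a whole number of cases; for example, the Acute Leukemia accuracies correspond to 69 and 70 correct assignments out of 72.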

 

For queries or suggestions, contact sxr@smi.stanford.edu or sutphin@smi.stanford.edu.

 

 

 

Golub, T.R. et al. Science 286, 531-537 (1999).

Alizadeh, A.A. et al. Nature 403, 503-511 (2000).

Eisen, M.B., Spellman, P.T., Brown, P.O., Botstein, D. Proc. Natl. Acad. Sci. USA 95, 14863-14868 (1998).

Mewes, H.W. et al. Nucleic Acids Res. 25, 28-30 (1997).

Spellman, P.T. et al. Mol. Biol. Cell 9, 3273-3297 (1998).

Chu, S. et al. Science 282, 699-705 (1998).

DeRisi, J.L. et al. Science 278, 680-686 (1997).

Chu, S., Herskowitz, I. Mol. Cell 1, 685-696 (1998).

Mitchell, A. Microbiological Reviews 58, 56-70 (1994).

Scherf, U. et al. Nat. Genet. 24, 236-244 (2000).

Ross, D.T. et al. Nat. Genet. 24, 227-235 (2000).