Machine Learning in Bioinformatics (Wiley Series in Bioinformatics #4)
By: Yanqing Zhang, Jagath C. Rajapakse
eBook Publisher: John Wiley & Sons
Imprint: John Wiley & Sons
Format: Adobe Encrypted (DRM)
An introduction to machine learning methods and their applications to problems in bioinformatics
Machine learning techniques are increasingly being used to address problems in computational biology and bioinformatics. Novel computational techniques to analyze high throughput data in the form of sequences, gene and protein expressions, pathways, and images are becoming vital for understanding diseases and future drug discovery. Machine learning techniques such as Markov models, support vector machines, neural networks, and graphical models have been successful in analyzing life science data because of their capabilities in handling randomness and uncertainty of data noise and in generalization.
From an internationally recognized panel of prominent researchers in the field, Machine Learning in Bioinformatics compiles recent approaches in machine learning methods and their applications in addressing contemporary problems in bioinformatics. Coverage includes: feature selection for genomic and proteomic data mining; comparing variable selection methods in gene selection and classification of microarray data; fuzzy gene mining; sequence-based prediction of residue-level properties in proteins; probabilistic methods for long-range features in biosequences; and much more.
Machine Learning in Bioinformatics is an indispensable resource for computer scientists, engineers, biologists, mathematicians, researchers, clinicians, physicians, and medical informaticists. It is also a valuable reference text for computer science, engineering, and biology courses at the upper undergraduate and graduate levels.
| Title of Computers eBook | Machine Learning in Bioinformatics (Wiley Series in Bioinformatics #4) |
|---|---|
| Release Date | 02-23-2009 |
| Publisher | John Wiley & Sons |
This eBook download is available in the following formats:
| Parent title | Machine Learning in Bioinformatics... |
|---|---|
| Encrypted (DRM) | Yes |
| SKU | 9780470397411 |
| File size | 9619 |
| Security | n/a |
| Printing | Not allowed |
| Copying | Not allowed |
| Read aloud | No |
| Devices | Samsung Tablet, Apple iPad & iPhone, Barnes & Noble Nook, Kobo eReader, Aluratek Libre, Iliad, Nokia, Blackberry, Hanlin |
| Note | Adobe provides navigation features such as bookmarks, a quick-access table of contents, and text search. An Adobe DRM-protected file differs from a PDF file in that it uses Adobe DRM (Digital Rights Management) technology, which authors and publishers use to protect their content from illegal online distribution and to set restrictions such as limits on copying and printing. |
Machine Learning in Bioinformatics (Wiley Series in Bioinformatics #4)
Chapter One
FEATURE SELECTION FOR GENOMIC AND PROTEOMIC DATA MINING
Sun-Yuan Kung and Man-Wai Mak
1.1 INTRODUCTION
The extreme dimensionality (also known as the curse of dimensionality) of genomic data has traditionally been a serious concern in many applications. This has motivated much research in feature representation and selection, both of which aim to reduce the dimensionality of features to facilitate training and prediction on genomic data.
In this chapter, N denotes the number of training data samples, M the original feature dimension, and the full feature is expressed as an M-dimensional vector process
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]
The subset of features is denoted as an m-dimensional vector process
$$y(t) = [y_1(t), y_2(t), \ldots, y_m(t)]^T \qquad (1.1)$$
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (1.2)
where $m \le M$ and $s_i$ stands for the index of a selected feature.
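As a rough illustration of the notation, selecting a feature subset amounts to picking m of the M entries of the full feature vector by index. The sketch below is only an assumption-laden example: the function name `select_features`, the variable names, the sample values, and the use of 0-based indices are all hypothetical and do not come from the chapter.

```python
def select_features(full_vector, selected_indices):
    """Return the m-dimensional sub-vector picked out by selected_indices.

    full_vector      -- the M feature values observed for one sample t
    selected_indices -- the indices s_1, ..., s_m (0-based here, an assumption)
    """
    if len(selected_indices) > len(full_vector):
        raise ValueError("m must not exceed M")
    return [full_vector[s] for s in selected_indices]


# Hypothetical sample with M = 5 features; keep m = 2 of them.
full = [0.2, 1.5, -0.7, 3.1, 0.0]
subset = select_features(full, [0, 3])
print(subset)  # [0.2, 3.1]
```

How the indices are chosen is the feature selection problem itself; this sketch only shows the indexing step.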
From a machine learning perspective, one metric of special interest is the sample-feature ratio N/M. For many multimedia applications, the sample-feature ratio lies in a desirable range. For example, for speech data, the ratio can be as high as 100:1 or 1000:1 in favor of training data size. For machine learning, such a favorable ratio plays a vital role in ensuring the statistical significance of training and validation.
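The ratio N/M is simple to compute for any dataset. The following is a minimal sketch, assuming hypothetical dataset sizes chosen purely for illustration; the function name `sample_feature_ratio` and both example datasets are not from the chapter.

```python
def sample_feature_ratio(n_samples, n_features):
    """Return N/M, the number of training samples per feature dimension."""
    if n_features <= 0:
        raise ValueError("feature dimension M must be positive")
    return n_samples / n_features


# Hypothetical dataset with many samples and few features (ratio 1000:1).
print(sample_feature_ratio(100_000, 100))

# Hypothetical high-dimensional dataset with far fewer samples than features.
print(sample_feature_ratio(50, 5_000))
```

A ratio well above 1 means each feature dimension is backed by many samples; a ratio below 1 means there are more feature dimensions than training samples.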
Unfortunately, for genomic data, this is o...








