Building Machine Learning Systems with Python : master the art of machine learning with Python and build effective machine learning systems with this intensive hands-on guide

Bibliographic Details
Main Author: Richert, Willi. (Author)
Other Authors: Coelho, Luis Pedro. (Author)
Format: E-Book
Language: English
Published: Birmingham : Packt Publishing, 2013.
Series: Open source*.
Summary: This is a tutorial-driven and practical, but well-grounded book showcasing good Machine Learning practices. There will be an emphasis on using existing technologies instead of showing how to write your own implementations of algorithms. This book is a scenario-based, example-driven tutorial. By the end of the book you will have learnt critical aspects of Machine Learning Python projects and experienced the power of ML-based systems by actually working on them. This book primarily targets Python developers who want to learn about and build Machine Learning into their projects, or who want to pro
Online Access: Access the e-book
Link: Main series: Open source* : community experience distilled
Table of Contents:
  • Cover; Copyright; Credits; About the Authors; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
  • Chapter 1: Getting Started with Python Machine Learning; Machine learning and Python - the dream team; What the book will teach you (and what it will not); What to do when you are stuck; Getting started; Introduction to NumPy, SciPy, and Matplotlib; Installing Python; Chewing data efficiently with NumPy and intelligently with SciPy; Learning NumPy; Indexing; Handling non-existing values; Comparing runtime behaviors; Learning SciPy; Our first (tiny) machine learning application; Reading in the data; Preprocessing and cleaning the data; Choosing the right model and learning algorithm; Before building our first model; Starting with a simple straight line; Towards some advanced stuff; Stepping back to go forward - another look at our data; Training and testing; Answering our initial question; Summary
  • Chapter 2: Learning How to Classify with Real-world Examples; The Iris dataset; The first step is visualization; Building our first classification model; Evaluation - holding out data and cross-validation; Building more complex classifiers; A more complex dataset and a more complex classifier; Learning about the Seeds dataset; Features and feature engineering; Nearest neighbor classification; Binary and multiclass classification; Summary
  • Chapter 3: Clustering - Finding Related Posts; Measuring the relatedness of posts; How not to do it; How to do it; Preprocessing - similarity measured as similar number of common words; Converting raw text into a bag-of-words; Counting words; Normalizing the word count vectors; Removing less important words; Stemming; Installing and using NLTK; Extending the vectorizer with NLTK's stemmer; Stop words on steroids; Our achievements and goals; Clustering; KMeans; Getting test data to evaluate our ideas on; Clustering posts; Solving our initial challenge; Another look at noise; Tweaking the parameters; Summary
  • Chapter 4: Topic Modeling; Latent Dirichlet allocation (LDA); Building a topic model; Comparing similarity in topic space; Modeling the whole of Wikipedia; Choosing the number of topics; Summary
  • Chapter 5: Classification - Detecting Poor Answers; Sketching our roadmap; Learning to classify classy answers; Tuning the instance; Tuning the classifier; Fetching the data; Slimming the data down to chewable chunks; Preselection and processing of attributes; Defining what is a good answer; Creating our first classifier; Starting with the k-nearest neighbor (kNN) algorithm; Engineering the features; Training the classifier; Measuring the classifier's performance; Designing more features; Deciding how to improve; Bias-variance and its trade-off; Fixing high bias; Fixing high variance; High bias or low bias; Using logistic regression; A bit of math with a small example; Applying logistic regression to our postclassification problem; Looking behind accuracy - precision and recall