Published on 2024-11-22
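Learn how to use R to turn raw data into insight, knowledge, and understanding. R for Data Science (《数据科学:R语言实现》, English-language reprint edition) introduces R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science faster, more fluent, and more fun. The book aims to get you doing data science as quickly as possible and assumes no previous programming experience.
Authors Hadley Wickham and Garrett Grolemund guide you step by step through importing, wrangling, exploring, and modeling your data and communicating the results. Along with the basic tools you need to work with data, you will gain a complete, big-picture understanding of the data science cycle.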
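Hadley Wickham is Chief Scientist at RStudio and a member of the R Foundation. He builds tools that make data science faster and more fun. You can learn more on his personal website: http://hadley.nz.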
Garrett Grolemund is a statistician, teacher, and Master Instructor at RStudio. He is also the author of Hands-On Programming with R (O'Reilly). Many of Garrett's video lessons can be found at oreilly.com/safari.
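"Hadley Wickham is a legendary figure in data science who has created a whole new approach to data analysis that no one had thought of before. This new book, coauthored with Garrett Grolemund, demonstrates that novel approach in code and can fairly be called the bible of data analysis." (Roger D. Peng, Professor of Biostatistics, Johns Hopkins Bloomberg School of Public Health)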
Preface
Part I. Explore
1. Data Visualization with ggplot2
Introduction
First Steps
Aesthetic Mappings
Common Problems
Facets
Geometric Objects
Statistical Transformations
Position Adjustments
Coordinate Systems
The Layered Grammar of Graphics
2. Workflow: Basics
Coding Basics
What's in a Name?
Calling Functions
3. Data Transformation with dplyr
Introduction
Filter Rows with filter()
Arrange Rows with arrange()
Select Columns with select()
Add New Variables with mutate()
Grouped Summaries with summarize()
Grouped Mutates (and Filters)
4. Workflow: Scripts
Running Code
RStudio Diagnostics
5. Exploratory Data Analysis
Introduction
Questions
Variation
Missing Values
Covariation
Patterns and Models
ggplot2 Calls
Learning More
6. Workflow: Projects
What Is Real?
Where Does Your Analysis Live?
Paths and Directories
RStudio Projects
Summary
Part II. Wrangle
7. Tibbles with tibble
Introduction
Creating Tibbles
Tibbles Versus data.frame
Interacting with Older Code
8. Data Import with readr
Introduction
Getting Started
Parsing a Vector
Parsing a File
Writing to a File
Other Types of Data
9. Tidy Data with tidyr
Introduction
Tidy Data
Spreading and Gathering
Separating and Pull
Missing Values
Case Study
Nontidy Data
10. Relational Data with dplyr
Introduction
nycflights13
Keys
Mutating Joins
Filtering Joins
Join Problems
Set Operations
11. Strings with stringr
Introduction
String Basics
Matching Patterns with Regular Expressions
Tools
Other Types of Pattern
Other Uses of Regular Expressions
stringi
12. Factors with forcats
Introduction
Creating Factors
General Social Survey
Modifying Factor Order
Modifying Factor Levels
13. Dates and Times with lubridate
Introduction
Creating Date/Times
Date-Time Components
Time Spans
Time Zones
Part III. Program
14. Pipes with magrittr
Introduction
Piping Alternatives
When Not to Use the Pipe
Other Tools from magrittr
15. Functions
Introduction
When Should You Write a Function?
Functions Are for Humans and Computers
Conditional Execution
Function Arguments
Return Values
Environment
16. Vectors
Introduction
Vector Basics
Important Types of Atomic Vector
Using Atomic Vectors
Recursive Vectors (Lists)
Attributes
Augmented Vectors
17. Iteration with purrr
Introduction
For Loops
For Loop Variations
For Loops Versus Functionals
The Map Functions
Dealing with Failure
Mapping over Multiple Arguments
Walk
Other Patterns of For Loops
Part IV. Model
18. Model Basics with modelr
Introduction
A Simple Model
Visualizing Models
Formulas and Model Families
Missing Values
Other Model Families
19. Model Building
Introduction
Why Are Low-Quality Diamonds More Expensive?
What Affects the Number of Daily Flights?
Learning More About Models
20. Many Models with purrr and broom
Introduction
gapminder
List-Columns
Creating List-Columns
Simplifying List-Columns
Making Tidy Data with broom
Part V. Communicate
21. R Markdown
Introduction
R Markdown Basics
Text Formatting with Markdown
Code Chunks
Troubleshooting
YAML Header
Learning More
22. Graphics for Communication with ggplot2
Introduction
Label
Annotations
Scales
Zooming
Themes
Saving Your Plots
Learning More
23. R Markdown Formats
Introduction
Output Options
Documents
Notebooks
Presentations
Dashboards
Interactivity
Websites
Other Formats
Learning More
24. R Markdown Workflow
Index