name: inter-slide
class: left, middle, inverse

{{ content }}

---
name: layout-general
layout: true
class: left, middle

<style>
.remark-slide-number {
  position: inherit;
}
.remark-slide-number .progress-bar-container {
  position: absolute;
  bottom: 0;
  height: 4px;
  display: block;
  left: 0;
  right: 0;
}
.remark-slide-number .progress-bar {
  height: 100%;
  background-color: red;
}
/* custom.css */
.plot-callout {
  width: 300px;
  bottom: 5%;
  right: 5%;
  position: absolute;
  padding: 0px;
  z-index: 100;
}
.plot-callout img {
  width: 100%;
  border: 1px solid #23373B;
}
</style>

---
class: middle, left, inverse

# Analyse des Données : Introduction to R and Data Analysis

### 2023-10-10

#### [Master ISIFAR]()

#### [Analyse de Données](http://stephane-v-boucheron.fr/courses/isidata/)

#### [Stéphane Boucheron](http://stephane-v-boucheron.fr)

---
template: inter-slide

##
### [R](#history)

### [Rstudio](#rstudio)

### [Books](#books)

???

---

.f1[R] is a software environment for

- statistical computing and
- graphics

distributed freely at `CRAN` under a `GPL 2/3 licence`

---

###
<iframe src="https://www.r-project.org" width="3600" height="400px" data-external="1"></iframe>

---
template: inter-slide
name: history

## Brief history of R

---

.fl.w-70.pa[

- 1976 _S_ programming language designed at Bell Labs by John Chambers et al.
- 1988 _The New S Language_ (Blue Book)
- 1988 _S-PLUS_ first produced by a Seattle-based start-up company called Statistical Sciences, Inc.
- 1991-93 R initiated by [R. Ihaka](https://en.wikipedia.org/wiki/Ross_Ihaka) and <a href="https://en.wikipedia.org/wiki/Robert_Gentleman_(statistician)">R. Gentleman</a>
- 1997 Formation of R Core Team
- 2001 [Bioconductor](https://www.bioconductor.org)
- 2009-... A group of packages called the [Tidyverse](https://www.tidyverse.org), which can be considered a _dialect of the R language_, is increasingly popular in the R ecosystem

]

.fl.w-30.top.pa[

![](./img/gnu.png)

R is one of 5 languages with an Apache `Spark` API, the others being `Scala`, `Java`, `Python`, and `SQL`

]

???

---
exclude: true
---

### Aspects of R

.fl.w-50[

### R as a Programming Language

- Functional (inherited from <a href="https://en.wikipedia.org/wiki/Scheme_(programming_language)">Scheme</a>)
- Object Oriented
- Interpreted
- Written in `C` and `Fortran`
- Interfaced with `C, C++, Java`, `Python`, ...
- Interfaced with SQL
]

--

.fl.w-50[

### Environment REPL

- REPL: Read-Eval-Print-Loop
- R Application
- `ESS` Emacs Speaks Statistics
- `Rstudio` IDE
- `Jupyter` notebooks
- VS Code ...

]

---

### Kernel and packages

R is made of a relatively stable core part (the kernel) and an ever-growing collection of packages

Packages enrich the kernel in many ways:

- new modeling techniques (`nlme`, ...)
- visualization (`ggplot2`, `plotly`, ...)
- cleaner, more efficient data structures (`tibble`, `data.table`, ...)
- cleaner APIs (`lubridate`, `stringr`, ...)

---

### CRAN - Comprehensive R Archive Network

<iframe src="https://pbil.univ-lyon1.fr/CRAN/" width="3600" height="400px" data-external="1"></iframe>

Packages can also be installed from GitHub
and other sources

---

### R-bloggers

<iframe src="https://www.r-bloggers.com" width="3600" height="400px" data-external="1"></iframe>

---
template: inter-slide
name: rstudio

## Rstudio IDE

---

<img src="./img/rstudio_Opera Snapshot_2021-09-09_115435_www.rstudio.com.png" width="218" />

### [Rstudio Desktop IDE](https://www.rstudio.com/products/rstudio/)

---

### IDE

- Editor(s)
- Console(s)
- Viewer(s)
- Misc.
  + Versioning
  + Environment monitoring
  + ...

---

### Use cases

- Programming/Package development
- **Reporting**
- ...

---

### RMarkdown

<iframe src="https://rmarkdown.rstudio.com/index.html" width="3600" height="400px" data-external="1"></iframe>

---

### [Literate programming](https://en.wikipedia.org/wiki/Literate_programming)

> Literate programming is a programming paradigm introduced by Donald Knuth in which a computer program is given an explanation of its logic in a natural language, such as English, interspersed with snippets of macros and traditional source code, from which compilable source code can be generated. The approach is used in scientific computing and in data science routinely for reproducible research and open access purposes. Literate programming tools are used by millions of programmers today.

---

### Reproducible (statistical) research

.fl.w-60.pa2[

![](img/pipeline.svg)

]

.fl.w-40.pa2[

**A pipeline**

- Data gathering
- Data wrangling
- Exploratory Data Analysis (and Visualization)
- Modeling and prediction
- Reporting (paper, html, interactive dashboard, ...)

]

---

### Reproducible (statistical) research (cont'd)

.fl.w-60.pa2[

![](img/pipeline.svg)

]

.fl.w-40.pa2[

### A pipeline

The pipeline is defined by the solid arrows

It may be (partially) iterated, possibly several times

The reports need to be updated every time

- data are edited/reshaped
- models are modified

]

---

### Reproducible (statistical) research (cont'd)

.fl.w-50.pa2[

**A pipeline**

- Data gathering
- Data wrangling
- Exploratory Data Analysis (and Visualization)
- Modeling and prediction
- Reporting (paper, html, interactive dashboard, ...)

]

.fl.w-50.pa2[

- The pipeline may be iterated several times
- Gluing the different steps in a manageable way is vital
- `RMarkdown` does just that
- [`Quarto`](https://quarto.org) is an improvement/extension of `RMarkdown`

]

---
template: inter-slide
name: books

## Books and references

---

The Web offers many valuable resources for learning R, either as a beginner or as a seasoned statistician

- [Online books](https://bookdown.org)
- [Cheatsheets](https://www.rstudio.com/resources/cheatsheets/)
- [Datacamp](https://www.datacamp.com/)
- [Udacity](https://www.udacity.com)
- [Stackoverflow](https://stackoverflow.com/questions/tagged/r)
- [R job interviews](https://www.edureka.co/blog/interview-questions/r-interview-questions/)
- ...

---

<iframe src="https://rstudio-education.github.io/hopr/" width="3600" height="400px" data-external="1"></iframe>

---

<iframe src="https://r4ds.had.co.nz" width="3600" height="400px" data-external="1"></iframe>

---

<iframe src="https://adv-r.hadley.nz" width="3600" height="400px" data-external="1"></iframe>

---
class: middle, center, inverse
background-image: url('./img/pexels-cottonbro-3171837.jpg')
background-size: cover

# The End
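
---

### Appendix: a pipeline sketch in base R

The pipeline from the reproducible-research slides (data gathering, data wrangling, exploratory analysis, modeling, reporting) can be sketched with base R alone. This is only a minimal illustration under stated assumptions, not the course's own code: the built-in `mtcars` data set stands in for gathered data, and the function names `gather`, `wrangle`, `explore`, `model` and `report` are hypothetical names chosen for this example.

```r
# Data gathering: `mtcars` ships with base R, so nothing is downloaded here
# (assumption: a built-in data set stands in for real data collection)
gather <- function() datasets::mtcars

# Data wrangling: recode transmission as a labelled factor, convert weight to kg
wrangle <- function(raw) {
  transform(
    raw,
    am    = factor(am, levels = c(0, 1), labels = c("automatic", "manual")),
    wt_kg = wt * 1000 * 0.45359237  # `wt` is recorded in units of 1000 lb
  )
}

# Exploratory data analysis: mean miles per gallon for each transmission type
explore <- function(cars) aggregate(mpg ~ am, data = cars, FUN = mean)

# Modeling: linear regression of mpg on weight and transmission type
model <- function(cars) lm(mpg ~ wt_kg + am, data = cars)

# Reporting: a small coefficient table, ready to print or embed in a report
report <- function(fit) {
  data.frame(term = names(coef(fit)), estimate = unname(coef(fit)))
}

# Glue the steps together; re-running these lines replays the whole pipeline
cars    <- wrangle(gather())
by_am   <- explore(cars)
fit     <- model(cars)
results <- report(fit)

print(by_am)
print(results)
```

Wrapping each stage in a function means the whole chain can be replayed whenever the data or the model change. Rendering an `RMarkdown` or `Quarto` document performs the same kind of replay and regenerates the report along the way.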