Pre-Processing
Data Science 2.0 equips students with the essential knowledge and practical skills needed to prepare data they have collected themselves or obtained from other sources for analysis (preprocessing). This unit covers fundamental data processing skills and the common challenges of processing environmental science data, using a practical, hands-on approach with R exercises. Students learn to describe the characteristics of their datasets using the appropriate technical terminology, to interpret metadata, and to critically assess its implications for their own analysis projects. The lesson emphasises key concepts such as levels of measurement (scale levels), data types, date and time data, and type conversions.
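The concepts named above (scale levels, data types, time data, type conversions) can be illustrated with a minimal base R sketch. The data frame `raw` and all of its column names and values are invented for illustration and are not taken from the course material.

```r
# Hypothetical raw measurements: every column arrives as character strings
raw <- data.frame(
  site    = c("A", "B", "A"),
  temp_c  = c("12.5", "13.1", "n/a"),
  quality = c("low", "high", "medium"),
  sampled = c("2024-02-20 08:30", "2024-02-20 09:00", "2024-02-27 08:45"),
  stringsAsFactors = FALSE
)

# Type conversion: character -> numeric; unparseable entries become NA
raw$temp_c <- suppressWarnings(as.numeric(raw$temp_c))

# Nominal scale: categories without an order
raw$site <- factor(raw$site)

# Ordinal scale: categories with a meaningful order
raw$quality <- factor(raw$quality,
                      levels = c("low", "medium", "high"),
                      ordered = TRUE)

# Time data: parse with an explicit format and time zone
raw$sampled <- as.POSIXct(raw$sampled, format = "%Y-%m-%d %H:%M", tz = "UTC")

# Inspect the resulting column types
str(raw)
```

Converting to an ordered factor allows comparisons such as `raw$quality[1] < raw$quality[2]`, and parsed timestamps support arithmetic such as `difftime()`, neither of which is meaningful on plain character strings.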
This lesson focuses on the central skills required for preprocessing structured data, a fundamental part of environmental science research. It covers combining datasets (joins) and transforming them (“reshape”, “split-apply-combine”). Because data seldom arrives in a format ready for statistical analysis or visualisation, students learn the key concepts and R tools needed to carry out these often intricate preprocessing tasks effectively.
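As a rough illustration of the three operations mentioned (join, reshape, split-apply-combine), here is a minimal sketch in base R. The tables `sites` and `obs` and their values are hypothetical, and the course exercises may well use other tools for the same tasks.

```r
# Hypothetical site attributes and repeated observations
sites <- data.frame(
  site        = c("A", "B", "C"),
  elevation_m = c(410, 520, 1280)
)
obs <- data.frame(
  site   = c("A", "A", "B", "B"),
  week   = c("w1", "w2", "w1", "w2"),
  temp_c = c(2, 4, 1, 3),
  stringsAsFactors = FALSE
)

# Join: keep every observation and attach the matching site attributes
# (all.x = TRUE makes this a left join on the key column "site")
joined <- merge(obs, sites, by = "site", all.x = TRUE)

# Split-apply-combine: split by site, apply mean(), combine into one table
site_means <- aggregate(temp_c ~ site, data = joined, FUN = mean)

# Reshape long -> wide: one row per site, one temperature column per week
wide <- reshape(obs, idvar = "site", timevar = "week", direction = "wide")

print(joined)
print(site_means)
print(wide)
```

Site "C" has no observations, so it does not appear in the left join. Swapping the arguments or using `all = TRUE` would keep it, with `NA` in the measurement columns.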
Title | Date | Lesson | Topic |
---|---|---|---|
Preparation | 2024-02-20 | PrePro1 | Preparation |
PrePro 1: Demo | 2024-02-20 | PrePro1 | Data Types |
PrePro 1: Exercise | 2024-02-20 | PrePro1 | Data Types |
PrePro 2: Demo | 2024-02-27 | PrePro2 | Piping / Joins |
PrePro 2: Exercise A | 2024-02-27 | PrePro2 | Piping / Joins |
PrePro 2: Exercise B | 2024-02-27 | PrePro2 | Piping / Joins |
PrePro 3: Demo | 2024-03-05 | PrePro3 | Split-Apply-Combine |
Prepro 3: Exercise | 2024-03-05 | PrePro3 | Split-Apply-Combine |