EASI Short Courses

EASI is committed to innovation and excellence in statistical methodologies for sustainable development.

EASI Camp-I: R and Data Analysis I

Course Overview

This training camp equips participants with basic R programming skills through basic statistical theory followed by hands-on R training. By the end of the training, participants will be able to manage and analyze data using R. The training covers the basic structure and functions of R; data import, export, and management; and key ways to describe, visualize, and analyze data. An introductory statistics background is helpful but not a prerequisite: the necessary theory will be introduced so that everyone, including those with experience in other statistical analysis programs, can get up to speed quickly. Each participant is expected to bring a laptop on which R will be installed. The workshop has three modules.

Course Content

Module 1 presents an overview of R and the RStudio interface, along with how to configure associated packages and libraries. It then covers the R language and how to work with variables, vectors, matrices, data frames, and arrays. It concludes with data entry and management, with hands-on practice in reading data into R and in subsetting, cleaning, and recoding missing data.

Module 2 explores the many options for describing data in R. In addition to reviewing functions for calculating measures of central tendency and dispersion, we will discuss how to apply survey weights to produce weighted estimates for complex survey data. The module ends with a description of basic and intermediate data visualization.
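As an illustration of the kind of functions Module 2 reviews, here is a minimal base-R sketch. The `income` and `weight` vectors are made-up toy values, not course data, and `weighted.mean()` gives only a weighted point estimate; design-based standard errors for complex surveys are usually handled with a dedicated package such as `survey`, which is an assumption about tooling rather than something the course description states.

```r
# Hypothetical toy data: five survey responses with made-up sampling weights
income <- c(12, 15, 9, 30, 22)
weight <- c(1.0, 2.0, 1.5, 0.5, 1.0)

# Measures of central tendency
mean(income)
median(income)

# Measures of dispersion
sd(income)
IQR(income)

# Weighted point estimate of the mean using the sampling weights
weighted.mean(income, weight)
```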

Module 3 focuses on measures of association and inferential statistics in R. We begin with a discussion of hypothesis testing in different situations and analysis of variance. We then review simple regression and correlation. The module concludes with an introduction to R libraries and functions for further data visualization.

EASI Camp-II: Monitoring and Evaluation

Course Overview

This is a unique program, unlike any other with a similar name, offered only by the East African Statistics Institute. The training workshop introduces monitoring and evaluation (M&E) from a systems perspective. By the end of the course, participants will understand how to set up and implement a functional M&E system. The course is covered in five modules: an introduction to M&E, linking project design to M&E, the problem tree, the log frame matrix, and setting up a functional M&E system (participatory M&E, plan and budget).

Course Content

Module 1 presents an introduction to Monitoring and Evaluation. Participants will learn what M&E is and why it matters, and will discuss common myths, the difference between monitoring and evaluation, guiding principles, important concepts, and the importance of M&E systems.

Module 2 links project design to M&E so that participants understand the stages of project management and how M&E is an integral part of the project management cycle. The components of a project cycle, things to consider when developing one, and its uses will be discussed.

Module 3 defines the problem tree so that participants understand problem tree analysis, the process of developing a problem tree, and its uses, and can practice developing one.

Module 4 presents the Log Frame Matrix so that participants understand what a log frame is, its importance, its different elements, and its vertical, horizontal, and diagonal logic, and can practice developing a log frame.

Module 5 discusses how to set up functional M&E systems. Participants will be able to understand what Participatory Monitoring & Evaluation (PM&E) is, the critical decisions to make in introducing it, and the differences between standard approaches to evaluation and participatory evaluation, and to describe the advantages and disadvantages of PM&E. The module also covers the scope of the M&E system, including information, indicators, tools and sources, responsibility, data analysis, and data uses. Where necessary, participants will have hands-on practice.

EASI Camp-III: R and Statistical Analysis II

Course Overview

The main objective of this training workshop is to build further proficiency in R for data management and data analysis. It focuses on how to transfer knowledge of data analysis in another software program to R. We introduce research methods and reporting using RStudio. Further data manipulation and visualization, multivariate normality tests, and factor analysis will be introduced. Instruction on the specific statistics and statistical models will be minimal. Participants will be encouraged to use their own data to produce scientific reports, which they will present orally. Each report will be formatted and developed into a nearly publication-ready manuscript.

Course Content

Module I: Research methods: scientific writing using R Markdown, an overview of R and RStudio, and principles of research methods using statistical methods.

Module II: Working with data in R and RStudio: introduction to data manipulation with the dplyr package, descriptive statistics, data transformations with tidyr, further data visualization with ggplot2, and bivariate correlations and reliability tests with the psych package.

Module III: Statistical modeling using R and RStudio: basic inference tests (t-tests and chi-square tests), analysis of variance, time-series analysis, logistic regression, and factor analysis.

EASI Camp-V: Data Preprocessing in Data Mining

Course Overview

This hands-on short course introduces techniques for preprocessing data before mining. It discusses the key concepts of data preprocessing: data cleaning, data integration, data transformation, data reduction, data normalization, and data partitioning. The expected outcome of these preprocessing tasks is a final dataset that can be considered accurate and reliable for applying data mining algorithms. The short course will conclude with project(s) developed using R to provide efficient real-life solutions to problems emanating from core statistical methods. Participants will be encouraged to use their own data to produce a presentable workshop project. Each report will be formatted and developed into a nearly publication-ready manuscript.

Learning Objectives/Outcomes

By the end of the short course, participants should be able to:

  • Understand the rationale and context of data preprocessing.
  • Understand the basic steps of data preprocessing and their data needs.
  • Derive and understand the basic formulae for the various stages of data preprocessing.
  • Demonstrate with examples the application of data preprocessing to practical real-life data mining problems.
  • Evaluate the effectiveness and accuracy of data preprocessing in predictive tasks.

Course Content

Module I: Data Cleaning

  • Rationale for data preprocessing
  • Dealing with missing values
  • Handling irrelevant data
  • Smoothing out noisy data

Module II: Data integration

  • Challenges in data integration
  • Sources of coherent meta-data
  • Integration of multiple databases

Module III: Data transformation

  • Scaling numeric attributes
  • Normalization methods
  • Dealing with out-of-range values

Module IV: Data reduction

  • Dimensionality reduction
  • Discretization
  • Reducing instances

Module V: Data Partitioning

  • Training dataset
  • Testing dataset
  • Challenges encountered in data partitioning
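To give a flavour of the transformation and partitioning topics listed above, the following is a minimal base-R sketch. The vector `x`, the data frame `df`, and the 70/30 split ratio are hypothetical choices made for illustration; the course description does not specify any of them.

```r
set.seed(42)  # make the random split reproducible

# Hypothetical numeric attribute
x <- c(2, 4, 6, 8, 10)

# Min-max normalization: rescale values to the interval [0, 1]
min_max <- (x - min(x)) / (max(x) - min(x))

# Z-score standardization: centre at 0 with unit standard deviation
z <- (x - mean(x)) / sd(x)

# Hypothetical dataset of 10 rows, split 70% training / 30% testing
df <- data.frame(id = 1:10, y = rnorm(10))
train_idx <- sample(nrow(df), size = floor(0.7 * nrow(df)))
train <- df[train_idx, ]   # rows drawn for training
test  <- df[-train_idx, ]  # remaining rows held out for testing
```

A simple random split like this ignores class balance; stratified or time-aware splits are among the partitioning challenges a real project may need to address.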

References:

1) García, S., Luengo, J., & Herrera, F. (2015). Data Preprocessing in Data Mining. In J. Kacprzyk & L. Jain (Eds.), Intelligent Systems Reference Library (Vol. 72). https://doi.org/10.1007/978-3-319-10247-4

2) Aggarwal, C. (2015). Data Mining: The Textbook. Springer. https://doi.org/10.1007/978-3-319-14142-8_14

3) Benhar, H., Idri, A., & Fernandez-Aleman, J. (2020). Data preprocessing for heart disease classification: A systematic literature review. Computer Methods and Programs in Biomedicine. https://doi.org/10.1016/j.cmpb.2020.105635

EASI Camp-IV: Introduction to Artificial Neural Networks with R

Course Overview

The main objective of this training workshop is to introduce the fundamental techniques and principles of artificial neural networks (ANNs) and investigate their application using R. The course will provide essential hands-on skills for statistical data management and analysis with neural networks. We will focus on the underlying principles that make ANNs universal statistical computing frameworks. The short course will conclude with project(s) developed using R to provide efficient real-life solutions to problems emanating from core statistical methods. Participants will be encouraged to use their own data to produce a presentable workshop project. Each report will be formatted and developed into a nearly publication-ready manuscript.

Learning Objectives/Outcomes

The short course aims at achieving the following learning outcomes:

  1. Understand basic machine learning and the context of ANN.
  2. Understand the rationale for the most popular types of ANNs and their data needs.
  3. Derive and understand the basic formulae for ANNs.
  4. Design and implement ANN models for practical real-life classification problems.
  5. Evaluate model performance and interpret the results.

Course Content

Module I: General introduction

  • History of ANNs
  • Comparison between ANNs and Biological neurons
  • Characteristics of Neural networks
  • Examples of ANNs/Problem Areas addressed by ANNs

Module II: Learning Modes in ANNs

  • Supervised learning
  • Unsupervised learning
  • Reinforcement learning

Module III: Types of ANNs

  • Feedforward Neural Network
  • Feedback Neural Network
  • Back Propagation

Module IV: Developing, Training, and Testing ANNs

  • Components of ANNs (Nodes, Connections, Weights, Layers)
  • Algorithm for deriving ANNs
  • Visualizing ANNs using R
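The components listed above (nodes, connections, weights, layers) can be sketched as a single forward pass in base R. This is an illustrative toy only: the network shape (two inputs, two hidden nodes, one output), the sigmoid activation, and every weight and bias value are assumptions made up for the example, and no training is performed.

```r
# Sigmoid activation applied at each node
sigmoid <- function(z) 1 / (1 + exp(-z))

# Input layer: two hypothetical input values
x <- c(0.5, -1.0)

# Hidden layer: each row of W1 holds the connection weights into one hidden node
# (matrix() fills column by column, so row 1 is c(0.1, -0.3), row 2 is c(0.4, 0.2))
W1 <- matrix(c(0.1, 0.4, -0.3, 0.2), nrow = 2)
b1 <- c(0.0, 0.1)

# Output layer: one node connected to both hidden nodes
W2 <- matrix(c(0.7, -0.5), nrow = 1)
b2 <- 0.05

# Forward pass: weighted sum plus bias, then activation, layer by layer
h     <- sigmoid(W1 %*% x + b1)   # hidden-layer activations (2 x 1)
y_hat <- sigmoid(W2 %*% h + b2)   # network output (1 x 1), between 0 and 1
```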

Module V: Assessing Goodness of Fit for ANNs

  • Evaluation metrics
  • Model validation and Generalizability
  • Introduction to projects using real-life data

Course textbooks/Reading materials

  1. Callan, R. (1999). The Essence of Neural Networks.
  2. Haykin, S. (1999). Neural Networks: A Comprehensive Foundation.
  3. Fyfe, C. (2000). Artificial Neural Networks and Information Theory (1.2).
  4. Titterington, M. (2010). Neural Networks. Wiley Interdisciplinary Reviews: Computational Statistics, 2(1), 1–8.
  5. Taylor, B. (2006). Methods and Procedures for the Verification and Validation of Artificial Neural Networks. Springer.
  6. Yegnanarayana, B. (2005). Artificial Neural Networks. Prentice Hall of India. http://cdn.iiit.ac.in/cdn/speech.iiit.ac.in/svlpubs/book/Yegna1999.pdf

Book a training

To book the “R camp”, please fill in this form and provide all the information requested. Our staff will process your request promptly.