This app estimates the probability that a secondary school will be awarded each Ofsted grade following an inspection. To use it, select the characteristics of the school you want a prediction for and then choose a prediction algorithm. The estimated probabilities are shown in the chart below. See the Methods tab for limitations.
There are six predictor variables and one outcome variable in the dataset used for this app. More information on the classification algorithms can be found in the Methods tab.
Region: this is the region of England where the school is located.
School type: this is the type of school according to its governance structure.
KS2 APS (key stage 2 average point score): this is the average point score that the school's year 11 cohort (16-year-olds) achieved when they were aged 11. It has been found to be a good predictor of the results 16-year-olds achieve in their key stage 4 tests.
Religious denomination: this is whether the school has a religious affiliation.
Gender composition: this is whether the school is single-sex or mixed in gender composition.
Total pupils: this is the number of pupils enrolled at the school.
Ofsted grade: this is the overall grade given to schools by the Office for Standards in Education, Children's Services and Skills (Ofsted) following a school inspection. Four grades are possible: Outstanding, Good, Requires Improvement and Inadequate. This is the outcome variable.
Two datasets were used in this Shiny app. Both come from the UK Department for Education's statistics page.
The caret package was used to fit the machine learning algorithms.
The school dataset was split into training (70%) and testing (30%) datasets. All models were fit using the same training parameters and the same random seeds. Defaults were used, except that models were tuned using 5-fold (rather than 10-fold) cross-validation without repeats. Parallel computation was employed using the parallel and doParallel packages.
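The setup described above can be sketched roughly as follows. This is a minimal illustration, not the app's actual code: the schools data frame, its column names, the seed and the two-worker cluster are all made up for the example.

```r
library(caret)       # createDataPartition(), trainControl()
library(doParallel)  # registerDoParallel(); also attaches parallel and foreach

set.seed(1)  # arbitrary seed, chosen only for this illustration

# Made-up stand-in for the school dataset; column names are hypothetical.
n <- 200
schools <- data.frame(
  ks2_aps      = rnorm(n, mean = 28, sd = 2),
  total_pupils = round(runif(n, 300, 1800)),
  ofsted_grade = factor(sample(
    c("Outstanding", "Good", "Requires Improvement", "Inadequate"),
    n, replace = TRUE
  ))
)

# 70/30 split, stratified on the outcome variable.
in_train  <- createDataPartition(schools$ofsted_grade, p = 0.7, list = FALSE)
train_set <- schools[in_train, ]
test_set  <- schools[-in_train, ]

# 5-fold cross-validation without repeats; everything else left at defaults.
ctrl <- trainControl(method = "cv", number = 5)

# Register a small local cluster so caret can run resamples in parallel.
cl <- makeCluster(2)
registerDoParallel(cl)
# ... calls to train(..., trControl = ctrl) would go here ...
stopCluster(cl)
registerDoSEQ()  # return foreach to sequential execution
```

Stratifying the split on the outcome keeps the proportion of each grade roughly equal in the training and testing sets, which matters when some grades are rare.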
The specific algorithms:
lda method
rf method
naiveBayes method
knn method
For more information on these methods, see the caret package's GitHub page.
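As a rough sketch of how several caret methods can be fit with identical settings, the example below uses the built-in iris data as a stand-in, since the school dataset is not reproduced here. Only lda and knn are fit, to keep package requirements small. The seed value is arbitrary, and the "rf" and "nb" method strings mentioned in the comments are caret's names for random forest and naive Bayes, which may differ from what the app itself used.

```r
library(caret)

# iris is only a stand-in dataset; the app's school data is not reproduced here.
set.seed(42)
in_train  <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_set <- iris[in_train, ]
test_set  <- iris[-in_train, ]

ctrl <- trainControl(method = "cv", number = 5)

# "lda" needs MASS and "knn" ships with caret. "rf" (randomForest) and
# "nb" (klaR) can be appended if those packages are installed.
methods <- c("lda", "knn")

fits <- lapply(methods, function(m) {
  set.seed(42)  # same seed before every fit, so all models see the same folds
  train(Species ~ ., data = train_set, method = m, trControl = ctrl)
})
names(fits) <- methods

# Class probabilities for one unseen row, one column per class.
predict(fits$lda, newdata = test_set[1, ], type = "prob")
```

Resetting the seed immediately before each call to train() is what makes the cross-validation folds identical across models, so differences in accuracy reflect the algorithm rather than the resampling.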
The best performer was the lda model (although its predictions as KS2 APS increases are very odd). However, none of the algorithms achieved an out-of-sample error rate below 40%. This indicates that other variables are needed to make better predictions and that the models need further tuning and refinement. Even so, the variables and models included here explain a reasonable amount of the variation in gradings. Three other algorithms were also implemented but left out of the app because they were less accurate than the four ultimately chosen. The choice of algorithms offered here is one of many examples of how statistical analyses and data aren't as “objective” as is often claimed.
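For readers unfamiliar with the term, the out-of-sample error rate is simply the share of held-out schools whose grade the model gets wrong. The sketch below computes it in base R on ten made-up predictions; the vectors are invented for illustration and are not results from the app.

```r
# Made-up inspection grades for ten held-out schools (illustration only).
grades <- c("Outstanding", "Good", "Requires Improvement", "Inadequate")
actual <- factor(
  c("Good", "Good", "Outstanding", "Requires Improvement", "Good",
    "Inadequate", "Good", "Requires Improvement", "Outstanding", "Good"),
  levels = grades
)
predicted <- factor(
  c("Good", "Good", "Good", "Good", "Good",
    "Requires Improvement", "Good", "Requires Improvement", "Good", "Good"),
  levels = grades
)

# Confusion table: rows are predictions, columns are the true grades.
conf_table <- table(predicted = predicted, actual = actual)
print(conf_table)

# Out-of-sample error rate = share of held-out schools graded incorrectly.
error_rate <- mean(predicted != actual)
error_rate  # 0.4 for these made-up vectors
```

The confusion table is worth reading alongside the single error figure, because it shows which grades are being confused with which. In this invented example the model never predicts Outstanding or Inadequate at all.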
You can find out more about building applications with Shiny here: http://shiny.rstudio.com
I referred to miningthedetails' Shiny app and hack-r's server.R code to help me build my app.
Finally, this app was partly inspired by the one created by Trevor Burton. My app extends Burton's: it uses a more recent dataset, lets the user choose a classification algorithm, and employs more variables in the prediction. To quote Burton:
“I've only included the things you don't have any control over, so you can see how wrong it is that these should be correlated with Ofsted judgements.”