The term big data is often used to refer to datasets that are too large for traditional data management applications. While the SGP project has assembled unprecedented amounts of student assessment data for analysis, it is not a case of big data by any stretch of the imagination. In fact, it is quite modest compared with, for example, the volume of global Facebook interactions.
The SGPdata package includes exemplar WIDE and LONG format data sets (sgpData and sgpData_LONG) to assist in the preparation of SGP data for analysis. In general, students are represented by a single case/row in the WIDE data set, with variables associated with the student at different times represented by columns. In the LONG data set, time-dependent data for the same student are spread across multiple rows, one row per student per time point.
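To make the difference between the two layouts concrete, here is a minimal Python sketch that reshapes a tiny, made-up WIDE table into LONG form. It is not part of SGPdata: the records, the column names ID, YEAR and SCALE_SCORE, and the helper wide_to_long are all assumptions chosen for illustration.

```python
# Hypothetical WIDE records: one row per student, one score column per year.
wide_rows = [
    {"ID": "1001", "SS_2013": 410, "SS_2014": 432},
    {"ID": "1002", "SS_2013": 388, "SS_2014": None},  # no 2014 score on file
]


def wide_to_long(rows, prefix="SS_"):
    """Return one record per student per year, skipping missing scores."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column.startswith(prefix) and value is not None:
                long_rows.append({
                    "ID": row["ID"],
                    "YEAR": column[len(prefix):],  # e.g. "SS_2013" -> "2013"
                    "SCALE_SCORE": value,
                })
    return long_rows


long_rows = wide_to_long(wide_rows)
for record in long_rows:
    print(record)
```

Each student now occupies as many rows as there are years with a score, which is why adding a new year to LONG data means appending rows rather than adding columns.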
Student Growth Percentiles (SGPs) are calculated by comparing a student's current assessment score with the scores of academic peers, that is, students with similar score histories in previous years, using normative models. SGPs are intended to provide a more holistic view of student achievement than traditional assessment reports, which focus on a single measure.
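The idea of ranking a student against academic peers can be sketched in a few lines of Python. This is a deliberate simplification and not the method used operationally, which relies on quantile regression over prior score histories. Here the peer group is simply assumed to be a list of current scores from students who shared the same prior score, and the function growth_percentile, the sample scores, and the clamping to the range 1 to 99 are illustrative assumptions.

```python
def growth_percentile(current_score, peer_current_scores):
    """Percent of peers scoring strictly below current_score, clamped to 1..99."""
    if not peer_current_scores:
        raise ValueError("need at least one academic peer")
    below = sum(1 for score in peer_current_scores if score < current_score)
    percentile = round(100 * below / len(peer_current_scores))
    return min(99, max(1, percentile))


# Hypothetical peers: students who all had the same prior-year score.
peers = [400, 410, 415, 420, 430, 440, 450, 455, 460, 470]
print(growth_percentile(445, peers))
```

A student who outscores most peers with the same starting point receives a high value regardless of whether the raw score itself is high or low, which is the sense in which the measure is normative.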
A key feature of the SGP data is the inclusion of student and teacher information alongside the assessment scores. Academic peer groups are defined by prior achievement rather than by demographics; the demographic variables are used to summarize and report results for student subgroups. Functions for examining the student and teacher data, including graphical representations of the results, are provided by the companion SGP package, while SGPdata itself supplies the data sets.
A number of higher-level wrapper functions in the SGP package use the sgpData_LONG data set for operational analyses. These functions are designed to take advantage of the benefits of managing long-format data over wide-format data. Please consult the SGP data analysis vignette for more detailed documentation on these high-level functions.
The sgpData_LONG data sets include a teacher lookup table that identifies the teachers with whom each student has been associated in a given year. The SGPdata_INSTRUCTOR_NUMBER variable in sgpData_LONG can be used to associate teachers with specific students and test records.
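As a rough illustration of how a teacher lookup table can be attached to long-format test records, the Python sketch below joins the two on student and year. The data, the field names ID, YEAR and INSTRUCTORS, and the helper attach_teachers are hypothetical and do not reproduce the actual layout of the SGPdata tables.

```python
from collections import defaultdict

# Hypothetical lookup rows: (student ID, year, instructor number).
lookup = [
    ("1001", "2014", "T-17"),
    ("1001", "2014", "T-22"),  # a student may have several teachers in a year
    ("1002", "2014", "T-17"),
]

# Hypothetical long-format test records.
records = [
    {"ID": "1001", "YEAR": "2014", "SCALE_SCORE": 432},
    {"ID": "1002", "YEAR": "2014", "SCALE_SCORE": 395},
    {"ID": "1003", "YEAR": "2014", "SCALE_SCORE": 401},  # no teacher on file
]

# Index the lookup table by (student, year) so each record is matched once.
teachers_by_key = defaultdict(list)
for student_id, year, instructor in lookup:
    teachers_by_key[(student_id, year)].append(instructor)


def attach_teachers(rows, index):
    """Return copies of rows with an INSTRUCTORS list added to each."""
    return [
        {**row, "INSTRUCTORS": list(index.get((row["ID"], row["YEAR"]), []))}
        for row in rows
    ]


linked = attach_teachers(records, teachers_by_key)
for row in linked:
    print(row)
```

Keeping the lookup separate from the test records means a student with several teachers does not need duplicated score rows until the join is actually required.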
The sgpData_LONG data sets include a student identifier column and the columns SS_2013, SS_2014, SS_2015, and SS_2016, which identify the individual years for which SGPs have been calculated. Each student's record also contains a further column, SGP_PERCENTILES, that provides the percentile rank of the SGP calculated for that student in each of these four years.
This data is a valuable tool for students and educators as they prepare for future academic challenges. As more schools across the country implement SGP, the availability of this data will grow. The ability to compare student growth over time across schools and districts will allow teachers and administrators to develop informed strategies for student learning and achievement. We hope that the SGPdata website and analytic tools will help make it easier to access, analyze and interpret SGP data and share this information with others.