{ "cells": [ { "cell_type": "markdown", "id": "98bcfabf", "metadata": {}, "source": [ "Line & Scatter Plots\n", "====================\n", "This notebook introduces methods for creating bar charts from our data, including some of the key concepts of our plotting package, `matplotlib`. Bar charts are useful for comparing values that are close together because we are very good at evaluating the size of rectangles.\n", "\n", "Topics covered:\n", "\n", "- bar charts\n", "- styling charts (titles, labels, size, color, etc.)\n", "- grouped bar charts\n", "- stacked bar charts\n", "- new `school_data` package for custom code\n", "\n", "To work on these examples we will use the demographics and ELA/Math test score data." ] }, { "cell_type": "code", "execution_count": 1, "id": "a2d689e6", "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "import numpy as np\n", "import scipy\n", "from scipy.stats import pearsonr\n", "from IPython.display import Markdown as md\n", "\n", "from nycschools import schools, ui, exams\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "f2d33e74", "metadata": {}, "outputs": [], "source": [ "df = schools.load_school_demographics()\n", "tests = exams.load_math_ela_long()" ] }, { "cell_type": "markdown", "id": "e80995a0", "metadata": {}, "source": [ "Scatter plots and correlations\n", "------------------------------------------\n", "In the next section we're going to see which demographic factors of schools are correlated with\n", "student test scores on the NYS ELA exams, grades 3-8.\n", "\n" ] }, { "cell_type": "code", "execution_count": 3, "id": "757331f2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "\n", "
| | total_enrollment | asian_pct | black_pct | hispanic_pct | white_pct | swd_pct | ell_pct | poverty_pct | eni_pct |
|---|---|---|---|---|---|---|---|---|---|
| total_enrollment | 1.000000 | 0.350720 | -0.251269 | -0.095415 | 0.176726 | -0.187808 | -0.005513 | -0.149179 | -0.189138 |
| asian_pct | 0.350720 | 1.000000 | -0.450250 | -0.346040 | 0.191238 | -0.213357 | 0.151237 | -0.258741 | -0.307453 |
| black_pct | -0.251269 | -0.450250 | 1.000000 | -0.428332 | -0.456705 | 0.125320 | -0.356624 | 0.303223 | 0.250974 |
| hispanic_pct | -0.095415 | -0.346040 | -0.428332 | 1.000000 | -0.385142 | 0.077647 | 0.442527 | 0.465564 | 0.531882 |
| white_pct | 0.176726 | 0.191238 | -0.456705 | -0.385142 | 1.000000 | -0.080427 | -0.186733 | -0.790558 | -0.760384 |
| swd_pct | -0.187808 | -0.213357 | 0.125320 | 0.077647 | -0.080427 | 1.000000 | 0.031075 | 0.191904 | 0.266450 |
| ell_pct | -0.005513 | 0.151237 | -0.356624 | 0.442527 | -0.186733 | 0.031075 | 1.000000 | 0.346132 | 0.396029 |
| poverty_pct | -0.149179 | -0.258741 | 0.303223 | 0.465564 | -0.790558 | 0.191904 | 0.346132 | 1.000000 | 0.912243 |
| eni_pct | -0.189138 | -0.307453 | 0.250974 | 0.531882 | -0.760384 | 0.266450 | 0.396029 | 0.912243 | 1.000000 |